jbetker – Page 2 – Non_Interactive

On the efficiency of human intelligence

Posted on July 5, 2023July 5, 2023 by jbetker

A pet peeve of mine that often shows up in ML discourse is the claim that humans are much more data efficient at learning than the models we are currently training. The argument typically goes like this: “I’m blown away by how much knowledge my 3 year old has. They are smarter than most language…

Techniques for debugging neural networks

Posted on July 1, 2023July 1, 2023 by jbetker

In my last post, I briefly discussed the infuriating fact that a neural network, even when deeply flawed, will often “work” in the sense that it’ll do above-random at classification or a generative network might create things that may sometimes look plausibly from the dataset. Given an idea that you’re testing out that is performing…

Ablations are really important

Posted on June 25, 2023 by jbetker

I don’t read as many papers as I once did. I find this surprising as I always assumed that when I made ML my full-time job, I would spend a lot more time reading up on all of the things that other folks in the field are up to. To some extent, this is a…

The “it” in AI models is the dataset.

Posted on June 10, 2023 by jbetker

I’ve been at OpenAI for almost a year now. In that time, I’ve trained a lot of generative models. More than anyone really has any right to train. As I’ve spent these hours observing the effects of tweaking various model configurations and hyperparameters, one thing that has struck me is the similarities in between all…

GPT might be an information virus

Posted on March 9, 2023March 9, 2023 by jbetker

Obligatory: the views and opinions expressed in this post are my own and do not represent the views and opinions of my employer. In light of all the hype going around about ChatGPT, I wanted to offer my “hot take” on what the next 2-5 years of the web look like. One aspect of the…

The Fundamental Building Blocks of DL

Posted on December 6, 2022July 16, 2023 by jbetker

I’m going to take a stab at nailing down what I believe to be the five fundamental components of a deep neural network. I think there’s value in understanding complex systems at a simple, piecewise level. If you’re new to the field, I hope that these understandings I’ve built up over the last few years…

Grokking Diffusion Models

Posted on October 31, 2022 by jbetker

Since joining OpenAI, I’ve had the distinct pleasure of interacting with some of the smartest people on the planet on the subject of generative models. In these conversations, I am often struck by how many different ways there are to “understand” how diffusion works. I don’t think most folk’s understanding of this paradigm is “right”…

I’ve Joined OpenAI

Posted on October 28, 2022October 28, 2022 by jbetker

I’ve been meaning to write this for a couple of months now, but simply haven’t found the time. Life has gotten quite busy for me lately, and I hope to explain why. First, the elephant in the room – I have left Google and finally stepped into the ML industry. I’ve accepted a position as…

The case for composite models

Posted on July 16, 2022July 16, 2022 by jbetker

In machine learning research, there is often a stated desire to build “end to end” training pipelines, where all of the models cohesively learn from a single training objective. In the past, it has been demonstrated that such models perform better than ones which are trained from multiple components, each with their own loss. The…

Lab notes: Cheater latents

Posted on June 18, 2022June 20, 2022 by jbetker

Lab notes is a way for me to openly blog about the things I am building. I intend to talk about things I am building and the methods I plan to use to build them. Everything written here should be treated with a healthy amount of skepticism. I’ve been researching something this week that shows…