A pet peeve of mine that often shows up in ML discourse is the claim that humans are much more data-efficient learners than the models we currently train. The argument typically goes like this: “I’m blown away by how much knowledge my 3-year-old has. They are smarter than most language…
Techniques for debugging neural networks
In my last post, I briefly discussed the infuriating fact that a neural network, even when deeply flawed, will often “work”: it will perform above random at classification, or a generative network will create outputs that sometimes look plausibly drawn from the dataset. Given an idea that you’re testing out that is performing…
Ablations are really important
I don’t read as many papers as I once did. I find this surprising, as I always assumed that when I made ML my full-time job, I would spend a lot more time reading up on all of the things that other folks in the field are up to. To some extent, this is a…
The “it” in AI models is the dataset.
I’ve been at OpenAI for almost a year now. In that time, I’ve trained a lot of generative models. More than anyone really has any right to train. As I’ve spent these hours observing the effects of tweaking various model configurations and hyperparameters, one thing that has struck me is the similarities between all…
GPT might be an information virus
Obligatory: the views and opinions expressed in this post are my own and do not represent the views and opinions of my employer. In light of all the hype around ChatGPT, I wanted to offer my “hot take” on what the next 2-5 years of the web look like. One aspect of the…
The Fundamental Building Blocks of DL
I’m going to take a stab at nailing down what I believe to be the five fundamental components of a deep neural network. I think there’s value in understanding complex systems at a simple, piecewise level. If you’re new to the field, I hope that these understandings I’ve built up over the last few years…
Grokking Diffusion Models
Since joining OpenAI, I’ve had the distinct pleasure of interacting with some of the smartest people on the planet on the subject of generative models. In these conversations, I am often struck by how many different ways there are to “understand” how diffusion works. I don’t think most folks’ understanding of this paradigm is “right”…
I’ve Joined OpenAI
I’ve been meaning to write this for a couple of months now, but simply haven’t found the time. Life has gotten quite busy for me lately, and I hope to explain why. First, the elephant in the room – I have left Google and finally stepped into the ML industry. I’ve accepted a position as…
The case for composite models
In machine learning research, there is often a stated desire to build “end to end” training pipelines, where all of the models cohesively learn from a single training objective. In the past, it has been demonstrated that such models perform better than ones assembled from multiple components, each trained with its own loss. The…
Lab notes: Cheater latents
Lab notes is a way for me to blog openly about the things I am building and the methods I plan to use to build them. Everything written here should be treated with a healthy amount of skepticism. I’ve been researching something this week that shows…