Non_Interactive – Software & ML

Author: jbetker

DALL-E for TTS: TortoiseTTS

Posted on February 3, 2022 by jbetker

In an earlier post, I walked you through a project I’ve been working on, which I called “triforce” at the time. I’ve finished training a first pass on this collection of models and want to write about the results. Deploying this speech CLIP model on the outputs of my autoregressive speech token generator made all…


Batch speech transcription with ocotillo

Posted on January 22, 2022 by jbetker

As I mentioned in my previous blog post, I’m currently working on text-to-speech models. I’m taking the “scale-it-to-the-moon” approach, so I need a lot of data. Fortunately, speech data is pretty easy to come by. Audio books, podcasts, YouTube and large archives of speeches and presentations are available all over the internet. The problem is…


Triforce: A general recipe for kickass Generative Models

Posted on November 3, 2021 by jbetker

For the past two years, I’ve been tinkering around with generative models in my spare time. I think I’ve landed on an approach that produces by far the most compelling results available today, and which scales like big language models. I’d like to outline the approach here. First of all, I want to touch on…


Switched Convolutions – Spatial MoE for Convolutions

Posted on April 15, 2021 by jbetker

Abstract: I present switched convolutions: a method for scaling the parameter count of convolutions by learning a mapping across the spatial dimension that selects the convolutional kernel to be used at each location. I show how this method can be implemented in a way that has only a…
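The idea in the abstract can be sketched in a few lines of PyTorch. This is a minimal illustration, not the post's implementation: the class name, the 1×1 selector conv, and the use of soft (softmax) selection are all my assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchedConv(nn.Module):
    """Minimal sketch of a spatial mixture-of-experts convolution:
    a learned per-location selector chooses among `num_kernels` kernels."""

    def __init__(self, in_ch, out_ch, num_kernels=4, kernel_size=3):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, kernel_size, padding=kernel_size // 2)
            for _ in range(num_kernels)
        )
        # Tiny network producing per-pixel selection logits.
        self.selector = nn.Conv2d(in_ch, num_kernels, kernel_size=1)

    def forward(self, x):
        logits = self.selector(x)                        # (N, K, H, W)
        weights = F.softmax(logits, dim=1).unsqueeze(2)  # (N, K, 1, H, W)
        # Run all expert kernels, then mix per location.
        outs = torch.stack([e(x) for e in self.experts], dim=1)  # (N, K, C, H, W)
        return (weights * outs).sum(dim=1)               # (N, C, H, W)
```

A hard argmax over the selector (instead of softmax mixing) would realize the parameter scaling without paying for all K kernels at inference, which is what the "selects the kernel to be used at each location" framing suggests.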


SRGANs and Batch Size

Posted on January 8, 2021 by jbetker

Batch size is one of the oldest hyperparameters in SGD, but it doesn’t get enough attention for super-resolution GANs. The problem starts with the fact that most SR algorithms are notorious GPU memory hogs. This is because they generally operate on high-dimensional images at high convolutional filter counts. To put this in context, the…


Training SRFlow in DLAS (and why you shouldn’t)

Posted on January 5, 2021 by jbetker

SRFlow is a really neat adaptation of normalizing flows for the purpose of image super-resolution. It is particularly compelling because it can potentially train SR networks with only a single negative-log-likelihood loss. Thanks to a reference implementation from the authors of the paper, I was able to bring a trainable SRFlow network into DLAS. I’ve had…


Translational Regularization for Image Super Resolution

Posted on January 1, 2021 by jbetker

Abstract: Modern image super-resolution techniques generally use multiple losses when training. Many techniques use a GAN loss to aid in producing high-frequency details. This GAN loss comes at a cost of producing high-frequency artifacts and distortions on the source image. In this post, I propose a simple regularization method for reducing those artifacts in any…


Deep Learning Art School (DLAS)

Posted on December 26, 2020 by jbetker

At the beginning of this year, I started working on image super-resolution on a whim: could I update some old analog-TV quality videos I have archived away to look more like modern videos? This has turned out to be a rabbit hole far deeper than I could have imagined. It started out by learning about…


Accelerated Differentiable Image Warping in Pytorch

Posted on September 17, 2020 by jbetker

Computing optical flow is an important part of video understanding. There are many ways to train a model to compute this, but one of the more compelling methods is to: feed a model an image pair; have it predict optical flow; apply that optical flow to the original image; and compute a pixel-wise loss against the…
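The "apply that optical flow to the original image" step is the differentiable-warping piece. A minimal sketch using PyTorch's `grid_sample` (the function name `warp` and the flow convention of per-pixel `(dx, dy)` displacements are my assumptions, not necessarily the post's):

```python
import torch
import torch.nn.functional as F

def warp(image, flow):
    """Differentiably warp `image` by a dense optical-flow field.

    image: (N, C, H, W) tensor
    flow:  (N, 2, H, W) tensor of per-pixel (dx, dy) displacements
    """
    n, _, h, w = image.shape
    # Base grid of pixel coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=image.dtype, device=image.device),
        torch.arange(w, dtype=image.dtype, device=image.device),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=0).unsqueeze(0)  # (1, 2, H, W)
    # Displace every pixel by the predicted flow.
    coords = grid + flow
    # Normalize coordinates to [-1, 1], as grid_sample expects.
    cx = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0
    cy = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    sample_grid = torch.stack((cx, cy), dim=-1)       # (N, H, W, 2)
    return F.grid_sample(image, sample_grid, align_corners=True)
```

Because `grid_sample` is bilinear and differentiable, a pixel-wise loss on the warped image backpropagates into the flow prediction, which is what makes this self-supervised training loop possible.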


Batch Normalization is a Hack

Posted on July 19, 2020 by jbetker

Batch normalization has a simple goal: stabilize the gradients of large computational graphs. In doing so, this technique has enabled the deep learning renaissance that almost every major ML breakthrough in the last 5 years has relied on. The concept is sound: by regularizing the mean and variance of the inputs of nearly every layer…
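The mean-and-variance regularization the excerpt describes is compact enough to write out directly. A bare-bones sketch of the normalization step (training-mode only, no learned scale/shift or running statistics, which real batch norm also carries):

```python
import torch

def batch_norm(x, eps=1e-5):
    """Normalize each channel of a (N, C, H, W) tensor over the batch
    and spatial dimensions to zero mean and unit variance."""
    mean = x.mean(dim=(0, 2, 3), keepdim=True)
    var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
    # eps guards against division by zero for near-constant channels.
    return (x - mean) / torch.sqrt(var + eps)
```

Stabilizing every layer's input distribution this way is exactly what keeps gradients well-scaled in deep graphs; the post's argument is about the cost of relying on batch statistics to do it.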
