Les billets libellés: pytorch. Afficher tous les billets.

VAE GAN

dimanche 08 décembre 2019

I had been trying to train a version of VAE-GAN for a few weeks and it wasn't working as well as I had hoped it would. I had added an auxiliary output to the discriminator which was attempting to predict the 40 features of each image provided with the celeb-a dataset as suggested in the VAE-GAN paper and I was scaling that loss to try to bring it in line with the GAN discriminator loss, but I was doing that incorrectly so that loss ended up overwhelming the GAN loss. (I was summing, rather than averaging the losses, and the lambda I was using to scale the loss was appropriate for a mean loss, but with 40 features the auxiliary loss was 40x the GAN loss at base, so I needed to divide the lambda by 40 to get the effect I wanted.)

After having corrected that error I am finally making some progress with these models. Below are sample images from two models I am training. The first outputs images at 160x160, the second at 128x128.

I guess the moral of this story is if something isn't working the way you expect it to, double check your math before you continue training it!

Libellés: python, machine_learning, pytorch, gan
3 commentaires

VAE GAN

dimanche 22 septembre 2019

I started working on a variational auto-encoder (VAE) for faces a few months ago. I was easily able to make a non-variational autoencoder to reproduce images that worked incredibly well, but since it was not variational there wasn't much you could do with it other than compress images. I wanted to be able to play with interpolation and such, and for that you need a VAE. So I converted my auto-encoder to a variational one, but the problem was that the resulting images were very blurry and the quality wasn't all that great. So I thought maybe I could attach a GAN to this to make the images look more realistic. And I tried that but unfortunately it didn't work very well, the GAN was trying to produce to generate images of what it though were faces will the autoencoder was trying to reproduce its input, as seen in the images below:

 

After fighting with this for a few months I decided to try to make sure that the GAN was working properly before I added on the autoencoder, and although I had to fight with the GAN quite a bit and was never able to get it to generate really high quality images, I was sure that it was working properly. So I decided to try to hook it up to the autoencoder again.

Then I discovered this paper Autoencoding beyond pixels using a learned similarity metric, which does the same thing I was trying to do but in a much smarter way. What I had been doing was using the MSE between the input and the generated images for my VAE loss, and training both the encoder and the decoder with the GAN loss. Obviously this did not work.

What they do in the paper is basically separate the encoder and leave the decoder and discriminator as the GAN, which is trained as usual. I had tried to think of ways to train the encoder and decoder separately, but my ideas were much more primitive and didn't work at all. What they do that is train the encoder separately, using the KLD loss and - this is the brilliant part - instead of using MSE between the input and the recreation they use the MSE between a feature map from an intermediate layer of the discriminator for the real and faked images. So rather than trying to produce an exact duplicate of the input, the encoder is trying to produce something that the discriminator thinks is close to the input.

It took me a few hours to rewrite my code to make use of this new loss, and come up with a version that would be able to run without having to keep all of the graphs in memory and be able to train in a reasonable amount of time, and I think everything is finally working. Hopefully this works better than my previous attempts, and next time I will try to remember to review the literature before trying to implement a new idea on my own.

Libellés: pytorch, autoencoders, gan
1 commentaires

It is difficult to play around with the structure for the GAN I am working on in Colab since it trains so slowly. I can usually get maybe 2 or 3 epochs in a day, which means that I need to wait a day before evaluating each change I make. I decided to rent a GPU in the cloud for a few days so I could train it a bit more quickly and figure out what works and what doesn't work before going back to Colab.

I already have a Google Cloud GPU instance I was using for my work with mammography, but it was running CUDA 9.0 which apparently is not supported by PyTorch out of the box. I tried to upgrade CUDA to 10, but I think I ended up just making things worse. Rather than spend a whole day trying to fix the GCS instance, and since I have some AWS credits, I decided to try to use an AWS Deep Learning AMI instance, which already has everything configured.

It was incredibly easy to get set up, it comes pre-configured with virtual environments for different deep learning frameworks and packages, so there is no need to install CUDA or drivers or anything like that, which is a huge advantage, since back when I was setting up the GCS instance it took me a few days to get everything installed and working. One thing I quickly noticed was that the default disk size was not even close to big enough - after downloading a few data files I was already running out of disk space, but it was very easy to increase the disk size.

Then all I had to do was activate the pytorch environment, launch a notebook and everything was running smoothly. I did run into a few minor issues, none of which were difficult to resolve:

  • If I launch tmux from within a virtual environment it launches a session that does NOT have the environment activated. Then if I activate the environment from within tmux it doesn't have access to the proper modules. This was resolved by launching tmux from outside of the venv, and then activating the venv from inside tmux.
  • In my notebook it didn't seem to have access to pytorch, but this was because I hadn't selected the proper kernel from the kernel -> change kernel menu. I wasn't even aware that one could select the kernel like that.

I used to prefer GCS to AWS because it was more configurable and easier to use. While AWS does have a bit of a learning curve, they really have thought of and provided for just about every possible contingency. We use AWS at my work, and it really is very impressive. I still like the simplicity of GCS, but even simple things like AMIs make such a huge difference in set-up time that I think I'll be using AWS more often now.

Libellés: machine_learning, aws, pytorch
Aucun commentaire

Archives du Blogue