Tweaking the params

Making the most out of torch-dreams.

In a nutshell, you can use torch-dreams to optimize an input image so that it activates various parts of a neural network. This helps build an intuition of what each part of the network "looks for".

import matplotlib.pyplot as plt
import torchvision.models as models
from torch_dreams.dreamer import dreamer

torch_dreams.dreamer is basically a wrapper over any PyTorch model that enables us to optimize the input image to activate various features within the neural network.

model = models.inception_v3(pretrained=True)
dreamy_boi = dreamer(model)

Config

The config is where we get to customize how exactly we want the optimization to happen.

config = {
    "image_path": "your_image.jpg",
    "layers": [model.Mixed_6c.branch1x1],
    "octave_scale": 1.2,
    "num_octaves": 10,
    "iterations": 20,
    "lr": 0.03,
    "custom_func": None,
    "max_rotation": 0.5,
    "gradient_smoothing_coeff": 0.1,
    "gradient_smoothing_kernel_size": 3
}
  • image_path: Specifies the relative path to the input image.

  • layers: This is a list where you pass the layers whose outputs are to be "stored" for optimization later on. For example, if we want to use 2 layers, we can simply pass both of them in this list (see the sketch after this list).

  • octave_scale: The algorithm in torch_dreams resizes the input image iteratively, from (original_size)/(octave_scale**num_octaves) back up to the original size. This is reminiscent of the "octave scale" used by Alexander Mordvintsev in his DeepDream TensorFlow tutorial.

  • num_octaves: specifies the number of times the image is scaled up in order to reach back to the original size while running the algorithm.

  • iterations: Number of gradient ascent steps taken per octave. Note: When using random noise as the input image, you'll need a lot more iterations per octave (around 100) than usual in order to get good results.

  • lr: Learning rate used in each step of the gradient ascent.

  • custom_func: Use this to build your own custom optimization functions to optimize on individual channels/units/etc. By default, it will optimize the input image on all of the layers mentioned in layers. More on this later.

  • max_rotation: Caps the maximum random rotation applied to the image before each gradient ascent step. Rotation transforms help reduce high-frequency noise.

  • gradient_smoothing_coeff: Use this to apply a Gaussian blur to the gradients before each gradient ascent step.

    Ideal values are around 0.5 if used (higher value -> stronger blur). Sometimes useful for removing high-frequency patterns.

  • gradient_smoothing_kernel_size: Kernel size to be used when applying the Gaussian blur.
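
For example, a minimal sketch of a config targeting two layers could look like this (model.Mixed_6a is just an illustrative second choice from Inception v3's submodules, not something the original example prescribes):

layers_to_use = [model.Mixed_6c.branch1x1, model.Mixed_6a]  # any number of layer modules can go here
config["layers"] = layers_to_use  # the optimization will now target the outputs of both layers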

And finally, to get things rolling all you have to do is:
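
A minimal sketch of that call, assuming the dreamer object exposes a deep_dream(config) method that runs the optimization and returns the resulting image (the method name and return format are assumptions, so check the API of the torch-dreams version you have installed):

out = dreamy_boi.deep_dream(config)  # assumed method name: runs gradient ascent using the config above
plt.imshow(out)  # assumes out is an image array that matplotlib can display
plt.show()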

And to save the images, you can:
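
One way to do that, assuming out is a NumPy image array with values in the [0, 1] range (an assumption about the return format), is matplotlib's imsave:

plt.imsave("dream_output.jpg", out)  # "dream_output.jpg" is a hypothetical output path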
