A lot of you guys have been playing with the Style Transfer notebook and template!
I wanted to share some tips with you today about how to make style transfer look better in Snap Camera.
In machine learning, since a model is trained on a specific type of data, it is critical that we pass in similar data at runtime. In the case of Style Transfer, the result will be most detailed and preserve the most style features if the input has the same aspect ratio as the model input.
The notebook we've provided exports a model with an output image aspect ratio of 0.5 (256 x 512, vertical), which makes it good for portrait mode (your phone). But on a horizontal screen (e.g. in Snap Camera), this model will produce either low detail or a very stretched result (if the Stretch checkbox is selected), since the input doesn't match the data the model was trained on.
To get the most out of the model's resolution, we can use an input transformer to rotate the input image, which provides a higher quality result without increasing the model resolution. You can find detailed info about the input transformer here.
That being said, we want to make sure the experience works well for everyone! So what we'll do is choose the right model setup based on the screen's aspect ratio.
Preparing an ML Component for each screen aspect ratio
I will create two ML Components with the same model but different settings. I could do this in a script, but the UI setup is a bit easier to understand. I will also add a couple of lines of code to the template to select one based on the screen aspect ratio. If the screen orientation is horizontal, I will rotate the input image and rotate the output back.
I'll start by opening the Style Transfer template and duplicating the ML Component.
Both my ML Components have the Auto Run and Auto Build checkboxes disabled, so that we can build only the one we want, depending on the screen's aspect ratio, and prevent two models from running at the same time.
Then I will configure one component's Input Transform to be rotated 90 degrees, so that its aspect ratio matches the model's input shape, and configure the Output Transform to apply the inverse transformation.
Choosing which model to run
Now that we have our models set up, we will pass them into our controller. To do this, I will open the StyleTransferController script and add one more input:
//@input Component.MLComponent mlComponent
//@input Component.MLComponent mlComponentLandscape
Then I will add an input to get the aspect ratio. I can access the screen's aspect ratio through the camera's aspect property:
//@input Component.Camera camera
With the new inputs added, set them up in the Inspector panel:
Warning: The camera's aspect is only available once the lens turns on, which is why I will also change a couple of lines of code. Instead of calling the ML Component's build in the script's init function, I will create an event to call it on the TurnOnEvent,
and create a callback function for this event with added logic to build the ML model based on the camera's aspect ratio:
var aspect = script.camera.aspect;
mlComponent = aspect < 1.0 ? script.mlComponent : script.mlComponentLandscape;
mlComponent.onLoadingFinished = wrapFunction(mlComponent.onLoadingFinished, onMLLoaded);
Key takeaways:
- Make sure to pass in data similar to what you trained your model on to get the highest quality results
- You can use the Input Transformer and other settings in the ML Component to let one model handle multiple situations
- You can have multiple ML models/components and choose which one to run depending on the situation
Please pay attention to these details:
1. Double-check that you are building and running only one model! (We are doing that from the script, so the checkboxes should be disabled.)
2. Check the render order of the newly added ML Component. It should be the same as the portrait one. (This means the MLComponent should finish processing before its output is rendered to the Orthographic camera.)
3. Note that this only makes sense if you have a rectangular output. If it is square, just disable the Stretch checkbox on the MLComponent to avoid stretching. That should be enough: padding will be added to the input and removed from the output, and style features will not be squeezed or stretched.