High-Resolution Multi-Scale Neural Texture Synthesis 2017

TL;DR If you’re doing neural texture synthesis, use a multi-scale Gaussian pyramid representation and everything will look better!

Demonstration of the texture synthesis algorithm running on marble
Demonstration of the texture synthesis algorithm from a high-resolution source (source credit Halei Laihaweadu)

To appear at SIGGRAPH Asia 2017: Read the paper. Get the code.

Overview

A Neural Algorithm of Artistic Style (Gatys, Ecker and Bethge 2015) was a groundbreaking paper. It uses the “Gram matrix” of feature activations to represent a texture: basically an (uncentered) covariance matrix of features that answers questions like “How often does the ‘stripy’ feature co-occur with the ‘green’ feature?” This work triggered an explosion of research into using convolutional neural-network based representations of textures and images to synthesize new ones.
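To make the Gram matrix concrete, here is a minimal NumPy sketch. It assumes the activations of one network layer arrive as a `(channels, height, width)` array; the function name and the normalization by the number of spatial positions are illustrative choices, not taken from the paper or its code.

```python
import numpy as np

def gram_matrix(features: np.ndarray) -> np.ndarray:
    """Channel co-occurrence statistics for one layer.

    features: (channels, height, width) activations (assumed layout).
    Returns a (channels, channels) matrix; entry [i, j] is large when
    features i and j tend to fire at the same spatial positions.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)   # one row per feature channel
    return flat @ flat.T / (h * w)      # average over positions (assumed normalization)

# Random activations stand in for real CNN features here.
rng = np.random.default_rng(0)
acts = rng.standard_normal((8, 16, 16))
g = gram_matrix(acts)
```

Because the spatial positions are summed out, the result depends only on which features co-occur, not where, which is what makes it a texture descriptor rather than an image descriptor.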

Much of this new research has been of limited use for general images, because the technique could only generate quite low-resolution textures with features at a single scale.

In the traditional texture synthesis world this issue had already been addressed with the concept of a [Gaussian pyramid](RCA Eng. 1984 Adelson.pdf) multi-scale image representation.
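For readers who have not met the Gaussian pyramid, the following is a rough sketch of one common construction: blur, then keep every second pixel, and repeat. The 5-tap binomial kernel, the reflect padding and the function names are assumptions made for illustration; they are not taken from the paper's code.

```python
import numpy as np

# 5-tap binomial approximation of a Gaussian (assumed kernel; sums to 1).
KERNEL = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0

def blur(img: np.ndarray) -> np.ndarray:
    """Separable blur of a 2-D grayscale image with reflect padding."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="reflect")
    # Filter horizontally, then vertically.
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(KERNEL))
    return sum(k * rows[i:i + h, :] for i, k in enumerate(KERNEL))

def gaussian_pyramid(img: np.ndarray, levels: int) -> list:
    """Return [full resolution, 1/2, 1/4, ...] with `levels` entries."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(blur(pyramid[-1])[::2, ::2])  # blur, then subsample
    return pyramid

pyr = gaussian_pyramid(np.ones((64, 64)), levels=3)
shapes = [level.shape for level in pyr]
```

Each level shows the same image at half the resolution of the one before, so a fixed-size feature detector sees progressively larger structures as it moves down the pyramid.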

It turns out that combining these two ideas is both simple and powerful for building high-resolution images. A block diagram of the loss function that we seek to minimize is:

Block diagram of our system

It consistently outperforms the Gatys technique when the source image is high-resolution or contains interesting features at multiple scales.
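The combined loss can be sketched roughly as follows: build a pyramid for the source and for the candidate image, compute Gram matrices at each level, and sum the squared differences. This is a toy illustration under loud assumptions, not the released code: `toy_features` is a hypothetical stand-in for CNN activations, `halve` uses 2x2 average pooling in place of a proper Gaussian blur, and all names and the unweighted sum are my own choices.

```python
import numpy as np

def gram_matrix(features):
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (h * w)

def toy_features(img):
    """Hypothetical stand-in for CNN activations: intensity plus x/y gradients."""
    gy, gx = np.gradient(img)
    return np.stack([img, gx, gy])

def halve(img):
    """Stand-in downsampler: 2x2 average pooling (a Gaussian pyramid would blur first)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid(img, levels):
    out = [img]
    for _ in range(levels - 1):
        out.append(halve(out[-1]))
    return out

def multiscale_gram_loss(source, candidate, levels=3, features_fn=toy_features):
    """Sum over pyramid levels of the squared Gram-matrix difference."""
    loss = 0.0
    for s, c in zip(pyramid(source, levels), pyramid(candidate, levels)):
        diff = gram_matrix(features_fn(s)) - gram_matrix(features_fn(c))
        loss += float(np.sum(diff ** 2))
    return loss

rng = np.random.default_rng(0)
source = rng.random((32, 32))
other = rng.random((32, 32))
same_loss = multiscale_gram_loss(source, source)
diff_loss = multiscale_gram_loss(source, other)
```

In an actual synthesis run the candidate image would be updated by gradient descent on this loss, which requires an automatic-differentiation framework and a real pretrained network rather than the toy features used here.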

Comparison to the Gatys technique
Click for many more examples at full resolution

More examples

Graffiti in a Toronto Laneway

Meowbot
My cat, Meowbot

Combining multiple Gram matrices

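Though not discussed in the paper due to space constraints, by averaging the Gram matrices at each scale for a large number of images, we can synthesize a single texture that combines them. The same approach also works for blending just two images.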

75 images of Toronto
Averaging the Gram matrices from 75 photographs, autumn along Dundas Street West in Toronto

Combining tiles with smoke
Combining the Gram matrices for a tile image and a smoke image
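A minimal sketch of the averaging step, assuming the Gram matrices for each image have already been computed and stored as one list per image with one matrix per scale. The data layout and the function name are hypothetical, and a plain unweighted mean is assumed.

```python
import numpy as np

def average_grams(per_image_grams):
    """Average Gram matrices across images, scale by scale.

    per_image_grams: list over images; each entry is a list of Gram
    matrices, one per pyramid scale (assumed layout).
    Returns one averaged Gram matrix per scale.
    """
    n_scales = len(per_image_grams[0])
    return [
        np.mean([grams[s] for grams in per_image_grams], axis=0)
        for s in range(n_scales)
    ]

# Two made-up "images", each with Gram matrices at two scales.
image_a = [np.eye(3), np.eye(2)]
image_b = [3 * np.eye(3), 3 * np.eye(2)]
target = average_grams([image_a, image_b])
```

The averaged matrices then serve as the synthesis target in place of the Gram matrices of any single source image.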

Citing this work

@inproceedings{Snelgrove:2017:ROD:3145749.3149449,
 author = {Snelgrove, Xavier},
 title = {High-Resolution Multi-Scale Neural Texture Synthesis},
 booktitle = {SIGGRAPH ASIA 2017 Technical Briefs},
 series = {SA '17},
 year = {2017},
 isbn = {978-1-4503-5406-6/17/11},
 location = {Bangkok},
 url = {http://doi.acm.org/10.1145/3145749.3149449},
 doi = {10.1145/3145749.3149449},
 acmid = {3005379},
 publisher = {ACM},
 address = {New York, NY, USA},
}