Generating wiggle animation images from depth maps

Written by Paul Bourke
July 2025

Examples used in this document were sourced from the Murten panorama scan
The following discussion concerns a particular way of presenting images, with an associated depth map, so as to reveal the 3D structure. Depth maps can be generated by a number of means, including but not limited to: 3D modelling/rendering software, derivation from stereo pairs, manual painting and focus stacking. More recently various AI engines have been able to create reasonable quality depth maps; the examples used here were generated with this approach. Once one has an image and depth map, a number of opportunities are available for its presentation in 3D. For example, traditional stereo pairs can be created for a stereoscopic display or virtual reality headset, or image sets can be created for lenticular displays such as the Looking Glass.

The approach used here to present depth cues using an image and depth map is normally called a "wiggle" image, sometimes piku-piku or twitching in Japanese. In essence, two or more images are presented in quick succession; the difference in viewing position between these images, combined with the resulting parallax changes, gives the sense of depth. Specifically, objects in the scene at the zero parallax distance don't move at all. Objects more distant than the zero parallax distance (positive parallax) vary in their relative positions between the views. Objects closer than the zero parallax distance (negative parallax) also vary in their relative positions, but in the opposite direction. These features can be seen in the following example; in these cases there are 10 views (the author prefers a small number of views instead of just 2). Zero parallax is set to the most distant objects, so the rest of the scene is at negative parallax. (It may be necessary to wait for the animated gif to download)

![]()
In this example the zero parallax distance is closer to the virtual camera. (It may be necessary to wait for the animated gif to download) ![]()
The depth map for both examples above is as follows. ![]()
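The parallax behaviour described above can be sketched numerically. For two cameras separated by a small baseline, with the sensors shifted so the zero parallax plane sits at a chosen distance, the on-screen shift of a point is proportional to the difference of inverse depths. The focal length, baseline and depths below are illustrative assumptions, not values from this article.

```python
# A minimal numeric sketch of zero/positive/negative parallax.
# For a camera pair separated by baseline 's', with sensor shift chosen
# so the zero parallax plane is at distance z0, the on-screen shift of a
# point at depth z is proportional to  s * (1/z0 - 1/z).
# 'f' (pixels) and all values here are illustrative assumptions.

def parallax(z, z0=10.0, s=0.1, f=800.0):
    """Horizontal on-screen shift (pixels) between two adjacent views."""
    return f * s * (1.0 / z0 - 1.0 / z)

if __name__ == "__main__":
    for z in (5.0, 10.0, 40.0):
        # closer than z0 -> negative, at z0 -> zero, beyond z0 -> positive
        print(f"depth {z:5.1f} -> parallax {parallax(z):+7.2f} px")
```

The sign behaviour matches the description: points at the zero parallax distance do not move between views, while nearer and farther points shift in opposite directions.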
A feature of presenting depth information by this approach is that no special display hardware is required, and there are no glasses or other headwear involved. Unfortunately a digital display is required, that is, there is no printed equivalent. One of the advantages of encoding wiggle images as an animated gif, as done here, is that due to early adoption of animated gif by browsers, they are widely supported on both computers and mobile devices. But in reality almost any movie format or player can be used, as long as it can uniformly and indefinitely loop a movie.
The approach here is intended to be maximally general. It starts with the conversion of the depth map (closest objects are white, furthest are black) to an OBJ mesh, in other contexts called a height field. To facilitate a standard rendering pipeline, the mesh is created centered at the origin and fitting within a unit cube bounding box. The mesh depth (z axis) is normalised to lie between 0 and the chosen vertical scaling factor (-z).

Usage: maketexturedmesh [options] imagefilename
Options
   -a n   Set subsampling level, default: 4
   -z n   Set vertical scaling of mesh, default: 1

The resulting mesh is illustrated below; in the first the depth map is used as the texture, in the second the original image is mapped onto the mesh as the texture.

![]()
![]()
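The conversion of a depth map to a height field mesh can be sketched as follows. This is only an illustration of the idea, not the actual maketexturedmesh tool; a tiny synthetic depth array stands in for a real greyscale image, and the exact sign convention for the depth axis is an assumption.

```python
# Sketch: convert a greyscale depth map (white = near, black = far) into
# an OBJ height field mesh, centered at the origin and fitting within a
# unit cube, with depth scaled by 'zscale' (the -z option in the text).
# The 4x4 array below stands in for a real depth image.

def depthmap_to_obj(depth, zscale=1.0):
    """depth: 2D list of values in 0..255. Returns OBJ file text."""
    rows, cols = len(depth), len(depth[0])
    span = max(rows - 1, cols - 1)              # uniform fit into unit cube
    lines = []
    for j in range(rows):
        for i in range(cols):
            x = (i - (cols - 1) / 2.0) / span
            y = ((rows - 1) / 2.0 - j) / span
            # white (near) extrudes toward -z; the convention is an assumption
            z = -zscale * depth[j][i] / 255.0
            lines.append(f"v {x:.6f} {y:.6f} {z:.6f}")
    for j in range(rows - 1):                   # two triangles per grid cell
        for i in range(cols - 1):
            a = j * cols + i + 1                # OBJ indices start at 1
            b, c, d = a + 1, a + cols, a + cols + 1
            lines.append(f"f {a} {b} {d}")
            lines.append(f"f {a} {d} {c}")
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    ramp = [[255, 170, 85, 0]] * 4              # simple left-to-right ramp
    print(depthmap_to_obj(ramp))
```

A real implementation would read the image with an imaging library, subsample it (the -a option), and also emit texture coordinates so the colour image can be mapped onto the mesh.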
At this point almost any rendering package can be used to create views of the model from multiple positions. In this case POVRay was used, mainly due to its powerful and precise scripting capabilities. Typically one wants the virtual camera quite distant from the depth map surface in order to avoid revealing the edges of the close to distant transitions. The mesh should be rendered without any lighting effects, in essence, just ambient light. In the example above, 10 views are captured as the camera rotates across a small angle. These can be assembled into an mp4 or, in this case, an animated gif using the following ffmpeg command.

ffmpeg -y -i scene%02d.png \
   -vf 'scale=800:-1,fps=50,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse' \
   -loop 0 animation.gif

Depending on the depth relationships and the quality of the depth map, some experimentation can be required for an optimal result. In particular, controlling the number of images, the range of virtual camera positions and the delay between each image. While these examples focus on horizontal camera offsets, a vertical offset is also possible, or indeed both at once. Another example below along with an AI derived depth map, 10 views. (It may be necessary to wait for the animated gif to download)

![]()
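The multi-view render step can be driven by a short script: the camera angles are spaced evenly across a small arc centred on zero, one render per frame. The sketch below builds the per-frame commands; the scene file name scene.pov, the Angle declare variable and the arc of 4 degrees are assumptions about the scene script, not taken from the article.

```python
# Sketch of driving the multi-view render: 10 camera angles swept
# symmetrically across a small arc, one POVRay invocation per frame.
# 'scene.pov' and the 'Angle' declare variable are assumed names.

def sweep_angles(nviews=10, arc=4.0):
    """Evenly spaced angles (degrees) spanning 'arc', centred on zero."""
    step = arc / (nviews - 1)
    return [-arc / 2.0 + i * step for i in range(nviews)]

def render_commands(nviews=10, arc=4.0):
    cmds = []
    for n, angle in enumerate(sweep_angles(nviews, arc)):
        cmds.append(f"povray +Iscene.pov +Oscene{n:02d}.png "
                    f"+W800 +H600 Declare=Angle={angle:.3f}")
    return cmds

if __name__ == "__main__":
    for cmd in render_commands():
        print(cmd)
```

The resulting scene00.png to scene09.png frames match the scene%02d.png pattern expected by the ffmpeg command above.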
At the time of writing, Facebook has a viewer for an image and depth map pair. One simply uploads the pair as separate files; if the colour image is called XXX.png then the depth map will be automatically applied if it is called XXX_depth.png, assuming it is a greyscale image of the same dimensions. Unfortunately there don't seem to be any controls for setting the degree of "look around", and the default attempts too much parallax, leaving large blurred out regions. The controller also doesn't appear to provide symmetric vertical parallax control. The pipeline used here is as follows. ![]()