Computer Assisted Analysis of Classical Cartoon Animations
The aim of this project is to understand the 2.5D structure of classical cartoon animations produced by cel- or paper-based techniques, where each animation frame is created as a planar composition of a static textural background and a dynamic, homogeneous foreground. Algorithms are developed to disassemble the original composition of layers and to estimate mutual correspondences between frames and regions in the animation.
Cartoon analysis framework
The cartoon analysis framework consists of three independent steps. First, an unsupervised image segmentation based on a robust outline detector (Fig. 1b) separates the input frame into a set of regions (Fig. 1d). Region area is used to roughly estimate whether a given region belongs to the background or to the foreground layer. In the following phase, the extracted foreground layer is optionally converted from raster to vector form, and visible fragments of the background layer are stitched together to produce one large image that is used to refine the foreground/background classification (Fig. 1c). Finally, patch-based structural similarity and neighborhood relations are exploited to estimate correspondences between animation frames and regions (Fig. 1e).
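The area heuristic of the first step can be sketched as follows. This is a minimal illustration that assumes the frame has already been segmented into a label image (the outline detector itself is not shown); `bg_area_fraction` is an assumed tuning parameter, not one from the framework:

```python
import numpy as np

def classify_regions_by_area(labels, bg_area_fraction=0.1):
    """Roughly split labeled regions into background/foreground by area.

    labels: 2-D int array with one label per segmented region (0..K-1).
    Regions covering at least `bg_area_fraction` of the frame are taken
    as background candidates; all the others as foreground.
    """
    areas = np.bincount(labels.ravel())
    threshold = bg_area_fraction * labels.size
    background = {r for r, a in enumerate(areas) if a >= threshold}
    foreground = set(range(len(areas))) - background
    return background, foreground

# Toy 6x6 label image: region 0 is a large backdrop, 1 and 2 are small shapes.
labels = np.zeros((6, 6), dtype=int)
labels[1:3, 1:3] = 1   # 4 pixels
labels[4, 4] = 2       # 1 pixel
bg, fg = classify_regions_by_area(labels, bg_area_fraction=0.25)
# region 0 covers 31 of 36 pixels -> background; regions 1 and 2 -> foreground
```

In the framework this rough guess is only a starting point; it is refined later using the reconstructed background image.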
The proposed framework greatly reduces the amount of manual intervention needed in various cartoon renewal tasks such as colorization, color restoration, dust-spot removal, and synthesis of new animations in the original style, and it also leads to an efficient video compression scheme for classical cartoon animations.
Colorization and Restoration
Since previous colorization techniques require extensive user intervention, the whole production pipeline is tedious and time consuming. The proposed cartoon analysis framework reduces the colorization workflow from tedious frame-by-frame scribbling to simple one-click corrections. This dramatic speed-up is possible thanks to the considerably different visual appearance of the detached layers (Fig. 2). Although the background layer is a complicated textural image, once reconstructed it can be processed a single time and then reused throughout the whole sequence. The foreground layer, on the other hand, contains only homogeneous regions; when these are properly extracted, it is easy to colorize them with one-click operations. The only complication is that the foreground layer is dynamic and so requires frame-by-frame care. However, once several animation frames have been colored, region-to-region correspondences can be used to predict a suitable color-to-region assignment.
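On a properly extracted homogeneous region, a one-click color assignment behaves essentially like a flood fill from the clicked pixel. The sketch below is an assumed stand-in for such a tool, not the framework's actual implementation; `tol` is a hypothetical tolerance parameter:

```python
from collections import deque
import numpy as np

def one_click_colorize(gray, seed, color, tol=0):
    """Flood-fill the homogeneous region around `seed` with `color`.

    gray:  2-D intensity array (a homogeneous foreground region has a
           near-constant value).
    seed:  (row, col) of the user's click.
    color: value written into the output color map for that region.
    """
    h, w = gray.shape
    out = np.zeros((h, w), dtype=int)
    target = gray[seed]
    queue, seen = deque([seed]), {seed}
    while queue:
        r, c = queue.popleft()
        out[r, c] = color
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < h and 0 <= nc < w and (nr, nc) not in seen \
                    and abs(int(gray[nr, nc]) - int(target)) <= tol:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return out

# A 3x3 toy frame: a dark 2x2 region next to a brighter one.
gray = np.array([[0, 0, 5],
                 [0, 0, 5],
                 [5, 5, 5]])
filled = one_click_colorize(gray, (0, 0), 7)  # click on the dark region
```

Region-to-region correspondences then let the predicted color follow the same region across subsequent frames, so each region typically needs only one click in the whole sequence.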
Another important feature of the proposed colorization framework is its automatic color brightness modulation, layer composition, and dust-spot removal techniques, which allow the final color images to be produced at broadcast quality without additional user intervention (Fig. 3). Moreover, they can also be applied to the restoration and enhancement of aged color cartoons (Fig. 4).
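How these pieces might fit together can be illustrated with a toy compositing step. The scalar `brightness` factor below is an assumed stand-in for the brightness-modulation technique, whose details are not given here:

```python
import numpy as np

def compose_layers(background, foreground, mask, brightness=1.0):
    """Composite a colorized foreground over the reconstructed background.

    mask selects foreground pixels; `brightness` modulates the flat
    foreground colors (an assumed, simplified stand-in for the
    brightness-modulation step described in the text).
    """
    fg = np.clip(foreground * brightness, 0, 255)
    return np.where(mask, fg, background)

# Toy 2x2 composition: two foreground pixels over a flat background.
background = np.full((2, 2), 50.0)
foreground = np.full((2, 2), 200.0)
mask = np.array([[True, False],
                 [False, True]])
frame = compose_layers(background, foreground, mask, brightness=0.5)
```

Because the background is composed once and reused, only the per-frame foreground mask and colors change across the sequence.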
Synthesis
In computer-assisted cartooning, a skilled artist first prepares a set of fragments from which more complex scenarios and animations are composed. However, a problem arises when one wants to bring new life to traditional cartoons for which stand-alone fragments are not available.
Extracting and compositing fragments from ready-made compositions is a tedious and time-consuming task with standard image manipulation tools. The proposed cartoon analysis framework reduces the burden of fragment extraction and composition: the user simply selects an interesting part of the original drawing and then adjusts it in a new position using only a few control scribbles (Fig. 5).
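As a rough illustration of fragment extraction and repositioning (leaving out the scribble-guided adjustment itself), one might sketch the cut-and-paste step like this; `mask` plays the role of the user's selection and `offset` the adjusted position:

```python
import numpy as np

def move_fragment(image, mask, offset, fill=0):
    """Lift the masked fragment out of `image` and paste it at `offset`.

    A toy stand-in for fragment composition: pixels selected by `mask`
    are cleared to `fill` and re-drawn shifted by (drow, dcol) = offset,
    clipped to the image bounds.
    """
    out = np.where(mask, fill, image)  # hole left by the removed fragment
    rows, cols = np.nonzero(mask)
    nr, nc = rows + offset[0], cols + offset[1]
    ok = (nr >= 0) & (nr < image.shape[0]) & (nc >= 0) & (nc < image.shape[1])
    out[nr[ok], nc[ok]] = image[rows[ok], cols[ok]]
    return out

# Toy example: a single marked pixel moved from (0, 0) to (2, 2).
image = np.zeros((3, 3), dtype=int)
image[0, 0] = 9
mask = np.zeros((3, 3), dtype=bool)
mask[0, 0] = True
moved = move_fragment(image, mask, (2, 2))
```

In the real framework the hole left behind is filled from the reconstructed background layer rather than with a constant value.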
Compression
Standard video compression approaches such as MPEG-2 assume that strong spatial and temporal discontinuities are infrequent in real-life image sequences. They exploit the discrete cosine (DCT) or wavelet (DWT) transform to rearrange the image energy so that it can be further quantized and compressed efficiently. However, due to the decomposition into blocks and to quantization errors, blocking and ringing artifacts may arise when the DCT or DWT is applied to classical cartoon animations, where each animation frame contains many sharp edges (Fig. 6).
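The ringing effect on cartoon-like edges is easy to reproduce with an orthonormal 8x8 DCT and coarse uniform quantization (the quantization step below is an arbitrary toy value, not one taken from any codec):

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis, as applied to 8x8 blocks in MPEG-style codecs."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C *= np.sqrt(2.0 / n)
    C[0] /= np.sqrt(2.0)
    return C

C = dct_matrix(8)

# A sharp cartoon-like edge inside one block: half dark, half bright.
block = np.zeros((8, 8))
block[:, 4:] = 255.0

coeffs = C @ block @ C.T                 # 2-D DCT of the block
quantized = np.round(coeffs / 40) * 40   # coarse uniform quantization (toy step)
recon = C.T @ quantized @ C              # inverse DCT

# The sharp edge spreads energy over many high-frequency coefficients,
# so coarse quantization leaves visible ripples (ringing) around it.
```

On smooth photographic content the same quantization would discard mostly negligible coefficients; it is the abundance of sharp outlines in cartoons that makes these artifacts prominent.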
The proposed cartoon analysis framework provides a natural decomposition of cartoon images that is much less sensitive to visual artifacts. The background layer is stored as a single image and the foreground layer as a sequence of vector images. Such a hybrid form can be encoded more compactly than with standard video compression approaches and therefore provides better visual quality at equivalent coding bit-rates (Fig. 6). Moreover, efficient hardware-accelerated playback with partial spatial scalability is possible.