XionBBS

The Ironworks => XAA Projects => Topic started by: Alexander on Jan 25, 2024, 01:42 PM

Title: Efficiently encoded animated images using reverse OAM methods
Post by: Alexander on Jan 25, 2024, 01:42 PM
(https://cdn.tohoku.ac/f/d53cdb0c4ea94fbca621890dd4711e6a/unknown.png)

I've spent the last few hours playing with this animation sequence from the 2001 movie Cowboy Bebop. I downloaded the film to source it myself, so I could have a high quality starting point to generate APNGs, Web-friendly MP4s, and GIFs as well. The thing about my results that really hurts is how heavy APNGs are, and they are the only thing that looks good. GIFs are also alright but still kind of big, and MP4s are nice if you really crank the quality coefficient to near-lossless.

But I've learned something key: there is a geometry to the image data that I need to be able to exploit in order to do colour mapping and help the compression algorithms crush the file size without reducing perceived image quality. I'm ideating a more manually steerable version of what the DCT algorithms in H.264 et al. do automagically (and blindly to the perceptive content boundaries we see). That could give me far superior image quality for the size. I'm thinking of creating a broad geometry, like an array of blobs in motion, that I would overlay on top of the animation as my dictation of "weights" for the different areas. Within each weight area I could allocate a portion of the palette arbitrarily, or even define a kind of "primary palette" that gets combined with a mixing coefficient to produce the final palette needed for that area.
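To make the "primary palette plus mixing coefficient" idea concrete, here's a minimal sketch (every function and variable name here is hypothetical, not from any existing tool): two primary palettes are linearly mixed per region, and the region's pixels are then quantized against the blended result.

```python
# Minimal sketch (all names hypothetical): blend two "primary palettes"
# with a per-region mixing coefficient, then quantize a region's pixels
# against the blended palette.
import numpy as np

def blend_palettes(primary_a, primary_b, mix):
    """Linear mix of two (N, 3) RGB palettes; mix in [0, 1]."""
    return (1.0 - mix) * primary_a + mix * primary_b

def quantize_region(pixels, palette):
    """Map each of the (M, 3) pixels to the index of its nearest palette entry."""
    # Squared Euclidean distance between every pixel and every palette entry.
    dists = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

# Two toy primary palettes: a warm ramp and a cool ramp.
warm = np.array([[255, 64, 0], [255, 160, 32], [255, 224, 128]], dtype=float)
cool = np.array([[0, 64, 255], [32, 160, 255], [128, 224, 255]], dtype=float)

# A "blob" with mixing coefficient 0.25 leans warm but borrows from cool.
region_palette = blend_palettes(warm, cool, 0.25)
pixels = np.array([[250, 80, 20], [60, 100, 240]], dtype=float)
indices = quantize_region(pixels, region_palette)  # one palette index per pixel
```

The brute-force distance matrix is fine for a sketch; a real encoder would want a k-d tree or similar once palettes and regions get large.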

Unfortunately, my tooling is beyond sparse given the hostile environment of modern Linux and macOS. My digging returned multiple GCPL-poisoned projects that lazily "innovated" on top of the existing state of the art here, which of course means little more than throwing together the biggest possible kitchen sink and demanding everyone use rustup. No thanks!

Anyway, what I'm describing above is a rough backwards approximation of what OAM achieves far more easily and effectively using layers sourced from an original animation studio project file. Whether it's approached forwards using OAM or in reverse using these algorithmic ideas, it's essentially about breaking an image down by its natural composition to seize upon the far more limited entropy in the original artistry, preserving the information needed to better feed end-user compression algorithms. Here, the idea is to approximate that seizure of information by letting the user draw compositional layer boundaries over a finished moving picture, netting most of the benefit of the process without needing the original files from before they were flattened. The hard part will be getting those boundaries to treat the pixels they cover well: balancing which palette section gets which pixels, among other things. OAM-based methods will do this far better, since they let the user directly dictate edge-blending algorithms with respect to the final output medium, with no mess.
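A rough sketch of the reverse direction (again, every name here is hypothetical): a hand-drawn label mask assigns each pixel of a frame to a compositional layer, and each layer is handed a share of a 256-entry global palette in proportion to a user-chosen weight.

```python
# Sketch of the "reverse OAM" decomposition (all names hypothetical):
# a label mask splits a frame into layers, and each layer gets a
# weighted share of the global palette budget.
import numpy as np

def palette_budget(weights, total=256):
    """Split a total palette size across layers in proportion to weights."""
    w = np.asarray(weights, dtype=float)
    raw = w / w.sum() * total
    sizes = np.floor(raw).astype(int)
    # Hand any leftover entries to the layers with the largest remainders.
    for i in np.argsort(raw - sizes)[::-1][: total - sizes.sum()]:
        sizes[i] += 1
    return sizes

def split_layers(frame, mask):
    """Collect each layer's pixels: frame is (H, W, 3), mask is (H, W) labels."""
    return {label: frame[mask == label] for label in np.unique(mask)}

# Toy 4x4 frame: background (label 0) with a 2x2 character blob (label 1).
frame = np.zeros((4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=int)
mask[1:3, 1:3] = 1

layers = split_layers(frame, mask)
sizes = palette_budget([1.0, 3.0])  # the character gets 3x the palette share
```

Each layer's pixel pile would then be quantized independently against its own palette slice, which is exactly the knob a flattened video never exposes.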

It's hard to achieve the kind of trivial size configurability made possible by DCT-based raster formats inside either PNG or GIF, but there are several algorithms I can think of creating that would get us 90% of the way there, with a far better-looking result as well. I will write more details in this thread as they come to me.