Thursday, September 07, 2006

Intersecting Summation Compression

If we have the summation of each row and each column for an image, how much of the image can we reconstruct? There are obviously multiple solutions once you get beyond the 2x1 pixel image, but what if we add extra information about images in general — like the fact that two nearby pixels are similar colors in general? And if this isn't enough information, what if we have the diagonal summations as well? Multiple diagonals? Other angles besides 45 degrees? What does the shape of the linear summation information versus recall accuracy function look like?

4 comments:

Jason said...

Sounds like a picross puzzle. The question really is, can an arbitrary image be made into a picross puzzle, and still be reconstructed?

Somehow, I doubt it, but with some clever algorithms (possibly AI techniques?) it might be doable. When I get some time I'll try poking at this one, sounds very fun.

Kyle said...

Kind of picross-ish, except with real instead of binary values. You're right about the AI techniques, I had a solution that worked on small images using genetic algorithms. For bigger problems... let me know if you find a solution.

Jason said...

Have you tried chopping up large images into an NxN grid of small images?

Kyle said...

In an earlier version of this problem I imagined an algorithm that would encode especially high-chaos areas for easier reconstruction later. This is, essentially, a generalization of "chopping up" the image: if you have a big area with nothing but "white space", there's just no reason to subdivide it. Here the connection to subdivision (in more of a graphics sense, but still related to the quadtree sense) becomes apparent, and subdivision is isomorphic to wavelet compression (hence, compression in general), inpainting/filling-in, and the general pattern recognition/learning problem. So once you start thinking of subdividing it, it immediately turns into a very general problem (surface subdivision/wavelet compression/inpainting/pattern recognition), which already has plenty of researchers :) I'm more curious if a "pure" version of this problem can be solved than whether related techniques can successfully dissolve the problem.