Similarity search in images, music, and other multimedia content has been researched to death. This idea is not about research in any of those areas (I save that for work at Temple); it is simply an implementation of those techniques: something like Google Images that lets you upload an image and query by similarity to it. Small-scale systems exist, but I have yet to find any as large as mainstream keyword-based image searches such as Google Image Search. I suggested this to Google when I was in their NYC office (I even gave them my BACH paper to suggest how they could do it for music!), but as far as I know they still lack this feature (as do all of the other large search engines).
Large-scale query-by-humming systems already exist, so audio is reasonably covered, but video could also benefit from such an approach (find video containing this sound, find video containing this frame, etc.). Images could be broken down using MPEG-7 descriptors, time-series analysis after linearization along a Hilbert curve, or vector quantization, among other techniques. Music could be broken down by Fourier transform / power spectrum analysis; according to the literature, even the mood of a piece can be accurately predicted this way. Video can then be treated as an array of images (frames) plus an audio track, and handled by combining the previous two methods.
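To make the Hilbert-curve idea concrete, here is a minimal sketch (the function names and the tiny grid are my own illustration, not from any particular system): walk an n×n grayscale image along a Hilbert curve, which keeps spatially nearby pixels nearby in the resulting 1-D series, and then compare images with an ordinary distance on those series.

```python
import math

def d2xy(n, d):
    """Map index d along a Hilbert curve to (x, y) on an n x n grid (n a power of 2)."""
    x = y = 0
    s = 1
    t = d
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:  # rotate the quadrant so the curve stays continuous
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def linearize(image):
    """Turn an n x n grayscale image (list of rows) into a 1-D series via the Hilbert curve."""
    n = len(image)
    return [image[y][x] for x, y in (d2xy(n, d) for d in range(n * n))]

def distance(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
```

Query-by-example then reduces to nearest-neighbor search over these series (in practice after dimensionality reduction, e.g. keeping only the first few DFT or wavelet coefficients of each series).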
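On the audio side, a minimal sketch of the Fourier transform / power spectrum idea (a direct DFT, fine for illustration; a real system would run an FFT over short overlapping windows): compute the power in each frequency bin, which is the kind of spectral fingerprint that similarity and mood-prediction work builds on.

```python
import cmath
import math

def power_spectrum(signal):
    """Power in each frequency bin of a real signal, via a direct DFT (O(n^2), illustration only)."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2):  # real input: bins above n/2 mirror the ones below
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum

# A pure tone at 5 cycles per window should put nearly all its power in bin 5.
tone = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
spectrum = power_spectrum(tone)
```

Comparing two clips then means comparing their spectra (or sequences of windowed spectra) with any standard distance measure, just as with the linearized images above.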