GoogleCompress
Feb. 22nd, 2005 09:43 am

Here’s an idea for a product that Google could make. Since they seem to have a copy of every file in the world, they could make a WinZip clone with a new compression scheme: if a file is already indexed by Google, replace the entire file with its URL. Huge ZIP files of porn would compress to nearly nothing. They just need a mechanism for the compressor to say, “Hey, don’t delete this file... EVER”. Sure, stuff you’ve written personally won’t compress as well, but right now I’m backing up my Mac and I see that most of the stuff I have is also available elsewhere.
They could also make a backup scheme based on this.
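A back-of-the-envelope sketch of the scheme (the index here is a plain dictionary standing in for whatever lookup service Google would actually have to expose, and the URL is made up):

    import hashlib

    # "Compress" a file: if its exact content is already indexed, store
    # just the URL; otherwise fall back to storing the raw bytes.
    def compress(data, index):
        digest = hashlib.sha256(data).hexdigest()
        if digest in index:
            return ("url", index[digest])  # the whole file shrinks to one URL
        return ("raw", data)               # personal files get no savings

    # Decompression re-downloads the URL, which only works if the indexed
    # copy is never deleted -- the "don't delete this file... EVER" part.
    def decompress(entry, fetch):
        kind, payload = entry
        return fetch(payload) if kind == "url" else payload

    # Toy demo with an in-memory "web" standing in for the real one.
    blob = b"some huge file everyone already has"
    web = {"http://example.com/that-file": blob}
    index = {hashlib.sha256(blob).hexdigest(): "http://example.com/that-file"}
    assert decompress(compress(blob, index), web.get) == blob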
no subject
Date: 2005-02-22 07:13 am (UTC)
For backing up porn files, sure. Anything else, maybe not.
best,
Joel. Who also wonders what sorts of creative bookkeeping chicanery would result.
no subject
Date: 2005-02-22 07:56 am (UTC)
It's not a bad idea. Then again, first they should implement web hosting. They already practically do that, except for pictures. But if Google went in on the deal with one of the photo sites, they'd rule the world (hey, here's space for your website, and don't worry, we've already copied what we can see to it, so your setup is that much easier... and it's only $5 a month).

Meanwhile, Google is just a cache of what's out there. So if someone deletes something that's already out there, then Google will eventually cycle it out of the cache...
no subject
Date: 2005-02-22 08:35 am (UTC)
The issue is getting Google to promise not to delete a file from the cache. That's easy to solve if they are getting paid for it.
no subject
Date: 2005-02-22 08:40 am (UTC)
It wasn't URLs then, but the idea was the same. I got it from the joke about the monks who know all the jokes so well that they can just refer to them by number.
no subject
Date: 2005-02-22 09:17 am (UTC)
For example, a general-purpose stream compression program for binary data, like ZIP, can compress any stream of bytes you feed it. If you feed it a random stream, it won't compress very much. If you feed it English text, it will, over the course of reading and compressing that data, adapt to compress English text well. But what if you wrote a compression/decompression program that already knew, ahead of time, that it was always going to be fed English text? It could use strategies that work really well for English - for example, it could look for words, rather than bits or bytes, as its basic unit, and already have predefined codes for the most common words and phrases. Such a compressor would do much better than ZIP on English text, but might be entirely unable to compress some other kinds of data.
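To make that concrete, here's a toy sketch of such a word-level coder (the word list and code assignments are invented for illustration, and a real coder would emit bits, not Python objects):

    # Common English words get short integer codes agreed on in advance;
    # anything else is passed through as a literal string. It wins on
    # English and does nothing at all for other data.
    COMMON = ["the", "of", "and", "to", "a", "in", "that", "is", "it", "for"]
    CODE = {w: i for i, w in enumerate(COMMON)}

    def compress(text):
        return [CODE.get(w, w) for w in text.split()]

    def decompress(tokens):
        # (whitespace is normalized -- it's a sketch, not a real codec)
        return " ".join(COMMON[t] if isinstance(t, int) else t for t in tokens)

    assert decompress(compress("the cat is in the hat")) == "the cat is in the hat"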
You can take this tradeoff to the extreme: a compression program optimized to compress exactly one message. The decompressor already knows the message and can spit it out, so no matter how large the message is, it compresses to nothing.

The example I tend to use is a compression program for Shakespeare plays. It can compress Shakespeare plays and nothing else. The decompressor has the text of all the plays. You feed the compressor a play, and it compresses it to the integer assigned to that play. Transmit the integer to the decompressor, and it emits the correct play.
That's still an extreme case, but it illustrates the other end of the tradeoff from general-purpose compression. Image-specific compression methods are another example of the same tradeoff.
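The Shakespeare scheme fits in a few lines (the short quotes below stand in for the full texts of the plays, which both ends would really share):

    # Both ends hold the complete corpus, so a "compressed" play is just
    # its position in that shared list.
    CORPUS = [
        "To be, or not to be, that is the question...",  # stands in for all of Hamlet
        "Double, double toil and trouble...",            # stands in for all of Macbeth
    ]

    def compress(play):
        return CORPUS.index(play)  # raises ValueError for anything that isn't a known play

    def decompress(n):
        return CORPUS[n]

    assert decompress(compress(CORPUS[1])) == CORPUS[1]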
no subject
Date: 2005-02-22 08:59 am (UTC)
Of course, it's only a joke...
no subject
Date: 2005-02-28 08:16 am (UTC)
I'm not sure whether anybody ever actually tried it. (If so, I'm sure their neighbours wouldn't have been pleased if they'd figured out what was going on.)
Ah, the olden days, when I was ..!seismo!dolqci!hqhomes!glenn
no subject
Date: 2005-02-22 02:21 pm (UTC)
After all, if you can make someone hang onto information for a price, why not pay the same price to have them remove it and never re-archive it? They get the same money, without all the storage costs. And hey, there'll be a huge market for it in a few short years, as yesterday's AOL-enhanced high-schoolers become tomorrow's public-service exposé fodder.
Oooh, just imagine, an eBay-like auction front end, where people can place competitive bids based on how important it is to them that their information live or die! It'd be, like, a totally democratized Ministry of Truth!