Do you read the ‘supplementary information’ in science articles? If you’re familiar with the way journal articles work, you know that journals publish a traditional, formally formatted article in the print version, but now they often also have a supplementary information section stored in an online database. It contains material that would be impractical or impossible to cram into print: raw data, spreadsheets, multimedia such as movie files. This is important stuff, especially if you want to dig deeper, re-analyze, or otherwise rework the information.
Another important function is, I think, preserving data. In a previous life, I moved into an old lab that was piled high with the cluttered debris of the previous tenant’s scientific career; some of it we boxed up and moved to a storage space (where it is probably moldering, untouched since the early 90s) and the remainder found a resting place in a dumpster. I felt terrible about that, but it was a necessity.
Maybe it won’t always be such a necessity, though. The ‘supplementary information’ section of science papers represents a way to archive data that would otherwise lie in heaps at the bottom of file cabinets until lost. Those sections have their own problems: ‘supplementary information’ is an amorphous category that can contain anything, it is clearly going to require some kind of formal metadata support, and it is going to be a storage headache for publishers. We might also wonder whether the big publishing companies are the appropriate repositories for what ought to be publicly accessible data.
One other possibility is storing raw data in the growing number of free online databases. YouTube is essentially a free database for storing almost any video data. I’ve seen some scientific work tucked away there, although it also creates new concerns: resolution is limited, you never know when the YouTube management might decide they dislike you and throw away your work, and a lot of raw scientific data isn’t going to have a large audience and therefore isn’t going to draw in a lot of ad revenue, so the host has little incentive to keep it around. Another interesting possibility is Google Base. Did you know that Google is providing a free online database in which you can store just about anything? Free storage, public access and searching, a reliable host: it’s a wonderful idea, as long as you don’t mind Google owning all of the information in the world.