16 April 2004

Preservation via LOCKSS?

After lunch a few of us retired to a quieter corner of the hotel to discuss whether it would be worth our time and effort to try to make LOCKSS more of a preservation tool. There was a clear consensus among this group that LOCKSS is not preservation today, and that the project (though it claims a preservation role) is not doing much (beyond its NSF grant attempt, anyway) to accommodate preservation issues in the software. These would include things like issue-level manifests with metadata, file format recognition and metadata (perhaps via JHOVE, which I saw was announced today), or picking up formats other than HTML (maybe an OAI harvest of metadata followed by a harvest of the related deeper-web items). Right now LOCKSS is, in essence, a “bit store”: a backup mechanism. In some ways, though, building up LOCKSS installations like this might also remove some of the wins the system brings in terms of ease of setup and maintenance.

An interesting experiment might be to use the Wayback Machine to figure out how many of the current Humanities titles are captured in the Internet Archive.
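One rough way to run that experiment would be to ask the Internet Archive, title by title, whether it holds any snapshot of the journal's URL. The sketch below assumes the Wayback Machine availability endpoint at `https://archive.org/wayback/available` and its `archived_snapshots.closest.available` response field; the function names and the list of title URLs are made up for illustration, and a real run would want rate limiting and a check of deeper pages than the journal home page.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(title_url):
    """Build the query URL asking whether title_url has any snapshot."""
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode({"url": title_url})


def is_captured(response):
    """Return True if the decoded JSON response reports an available snapshot."""
    closest = response.get("archived_snapshots", {}).get("closest")
    return bool(closest and closest.get("available"))


def count_captured(title_urls, fetch=None):
    """Count how many of the given title URLs have at least one snapshot.

    fetch is a hypothetical hook (URL in, decoded JSON out) so the counting
    logic can be exercised without touching the network.
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url, timeout=30) as resp:
                return json.load(resp)
    return sum(1 for u in title_urls if is_captured(fetch(availability_url(u))))
```

Calling `count_captured` with a list of journal home page URLs would give a first, very coarse number; it says nothing about how complete or how recent each capture is.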

