Life is a Mystery

30 April 2004 . 1 Comment


I had a nice day today at our local ARLD (Academic and Research Libraries Division of the Minnesota Library Association) Day conference at the Arboretum in Chanhassen. Most interesting to me was a presentation on the “Googlization of Library Values” by librarians from St. Cloud State, St. Catherine’s, and Carleton College. I was expecting the usual library lament about how we have to resist the dominion of Google, which is teaching our patrons that search is simple and everything is on the net. Surprise! Every one of the presenters spent their time sharing positive lessons we need to learn from Google. Robin Ewing asked us to learn from three core values demonstrated by Google: vision, usability, and whimsy. We all know Google thinks big and keeps stuff simple, but I had not really valued their sense of whimsy. Upon some reflection I find it true: Google takes the time to make things a little fun, from their name and logo to things like allowing their interface to be translated into Klingon and Elmer Fudd. Can we make library services anything other than deadly dull?

27 April 2004 . Comments Off on Powerless without Video

Powerless without Video

It is the smallest things! A server we use in Digital Collections had its video card go bad. The disk array on the server has a 4-hour service plan, but it turns out that the University and one local shop were quoting a four-day turnaround for the video card. Luckily a genius (Scott) at the Apple Store at the Mall of America was able to swap the card out while I waited there. Only one morning lost to shuttling the machine around campus and town. We are back up again. Almost. I forgot to tell the server not to go to sleep!

23 April 2004 . Comments Off on A Rose by any other name

A Rose by any other name

John clued me in to SRW today. I thought I had not heard of it, but have just found that it is the name for Z39.50 Next Generation, something I had heard of. Do you think Z39.50’s reputation is so poor this group had to choose a new name?

23 April 2004 . Comments Off on Censorship?


There has been a controversy swirling around the net about some pictures taken by a (now fired) Kuwait airport worker of the coffins of US soldiers being placed aboard transport home. We sometimes imagine that the net makes information free (though in this case it had something to do with the Freedom of Information Act), but I wonder how many people will be able to reach the site which was distributing similar pictures. It seems the US Government has (maybe) shut it down. A chilling sign of the times, and a reminder of just how fragile even the net can be? Or not? The NYT is running the story, with photos.

22 April 2004 . Comments Off on The Many Uses of Creative Commons

The Many Uses of Creative Commons

A nice mainstream article on the many uses of Creative Commons licenses is available at Business2.0. If you don’t know what the Creative Commons is, you should. This article may make for a gentle intro alongside real-life stories of how it is making a difference. [Source: OAN]

21 April 2004 . Comments Off on OAI on Apache

OAI on Apache

In a promising new twist, mod_oai will bring the Open Archives Initiative protocol to Apache. What sort of services will this make possible?

21 April 2004 . Comments Off on Simson on Google and Akamai

Simson on Google and Akamai

I’ve been a fan of Simson Garfinkel since my days in the NeXT Users Group at MIT. Simson was an active NeXT user and has since gone on to become a tech writer with a clear point of view, defending privacy as he engages the future. Today Simson wrote about Google and Akamai as competitors, or at least fellow travelers on the path to distributed terascale computing. What should Libraries be buying from Akamai and learning from Google?

20 April 2004 . 1 Comment


We talk a lot about “scale” and “robust” when discussing library systems. This posting about Google reminded me just how much the scale has shifted in the last 15 years. Amazing.

16 April 2004 . Comments Off on Public Access to the Public Domain

Public Access to the Public Domain

Brewster Kahle gave the talk at the closing plenary of the CNI Spring Task Force meeting. Brewster just keeps on doing; he never seems to be daunted by the scope of large tasks. The amazing thing is that it works! He set out to capture the web, and the Internet Archive (IA) does that better than any other entity. He called on us to “put the best we have to offer within the reach of our children.” Within reach, to Brewster (and to our children), means “on the web.” He then walked us through a back-of-the-napkin calculation of what it would take, concluding that the goal is within reach of us today and within our budgets to boot. Are we ready to answer the call?

Books. The Library of Congress = 20M volumes = 26TB = $60,000 disk space. At 2 hours/book (without destroying the books) this is doable. Output back to book form costs $1/book. This print-on-demand solution is being demonstrated today by the BookMobile the Internet Archive has put on the streets not just of the USA, but also India, Egypt, and most recently rural Uganda.
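For anyone who wants to poke at the napkin, here is a minimal sketch that simply redoes the arithmetic on the figures quoted above. The function name `napkin` and the choice of decimal terabytes are my own assumptions, not anything from the talk, and the derived numbers are only as good as the quoted inputs.

```python
# Figures as quoted in the talk (and reported in this post).
VOLUMES = 20_000_000          # Library of Congress, in volumes
STORAGE_TB = 26               # disk needed to hold them
DISK_COST_USD = 60_000        # quoted cost of that disk space
SCAN_HOURS_PER_BOOK = 2       # non-destructive scanning time
PRINT_COST_PER_BOOK_USD = 1   # print-on-demand output


def napkin():
    """Derive a few per-unit and total figures from the quoted numbers."""
    bytes_per_tb = 10**12  # assumption: decimal terabytes
    return {
        "mb_per_volume": STORAGE_TB * bytes_per_tb / VOLUMES / 10**6,
        "usd_per_tb": DISK_COST_USD / STORAGE_TB,
        "disk_usd_per_volume": DISK_COST_USD / VOLUMES,
        "total_scan_hours": VOLUMES * SCAN_HOURS_PER_BOOK,
        "reprint_all_usd": VOLUMES * PRINT_COST_PER_BOOK_USD,
    }


if __name__ == "__main__":
    for name, value in napkin().items():
        print(f"{name}: {value:,.3f}")
```

The storage is the cheap part: the disk works out to a fraction of a cent per volume, while the scanning hours are where the real effort lies.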

Audio. 2M “saleable objects” of audio exist, but much of it sits behind IP regs that make it hard to deal with. The IA approached the “taper” community: the people who record performance-oriented rock bands that followed the Grateful Dead’s lead in allowing fans to tape their music and exchange it for non-commercial use. “How would you like infinite bandwidth and infinite storage for free?” the IA asked the tapers. Guess what? They love the idea. 500 rock bands have given the IA permission to archive this material and share it for free. The tapers have already produced 10-20TB of concerts available on the IA.

Moving Images. Don’t just consider the 100-200,000 mainstream films (half of them from India). Consider the 2M films created in the 20th century that document daily life. Some of these may be in your very own basement. One hour of film costs about $100 to convert. One hour of video costs only $15. The IA is also now capturing 20 channels of video from around the world 24/7 for about $500,000. It is estimated there may be about 400 channels around the world.

Software. The IA has received a DMCA exception to circumvent copy protection for the purpose of ripping some of the 50,000 software packages that exist to date. They are only allowed to rip titles from no-longer-supported operating systems.

Web. The IA now captures 20TB/month of web content. The WayBackMachine holds over 30B (yes, billion) pages from 50M sites on 15M hosts. Anna Patterson’s search engine based on this corpus searches 4 times the number of sites covered by Google.

The Internet Archive does all this on a budget of about $4M or $5M each year. I don’t know about you, but this leaves me breathless.

In order to preserve this growing corpus (libraries, Brewster notes, traditionally burn eventually) the IA seeks out partners around the world who can host copies of the data. The more different they are from the US the better. Right now a copy is held at the new library in Alexandria and negotiations are under way with a Northern European country. Brewster estimates that the resources needed to maintain a mirror of IA are a PB of disk (that’s petabyte), a GB of bandwidth, and $100M to set up an appropriate endowment for continued operation.

But if the “Universal Access to All Human Knowledge” goal articulated by Raj Reddy of the Million Book Project is too vast, and even the “All Published Knowledge Available to the Kid in Uganda” is a bit far out, how about something easy, asks Brewster. What if we just tried to attack what we already have every right to collect? Let’s go for “Public Access to the Public Domain.”

In the USA the public domain is pre-1923 publications. In fact, Brewster points out, with the aid of Mike Klezman’s (?) recently completed electronic version of the copyright registry, it is now easy to find out which material from 1923-1964 did not have its copyright renewed and is now also in the public domain. Let’s go get this material! His proposal: give the IA a book and $10 and the IA will return to you the book unharmed plus a digital copy. Will we accept the offer? Oh, and by the way, the IA is also happy to accept video and $15/hour for the conversion of that to digital format. Oh, and did I mention that the IA will also host the digital documents on their servers “forever”?

I think we should take Brewster up on this offer. How much material do we have in the University of Minnesota collections which we could part with for a bit to let the IA digitize and store it? We should seriously consider a project to pump this material and the limited dollars required to the IA as fast as we can. This is a crazy idea at a crazy price point. Let’s try to sink Brewster under our enthusiastic response! The great thing is, we probably won’t; he has not sunk yet.

P.S. Brewster also tossed off an idea about how to archive blogs in response to a question. His thought was that we should be able to subscribe to blog RSS feeds and simply archive everything we see announced via that mechanism. I wonder if we could auto-harvest RSS from UThink.
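Here is a rough sketch of what “archive everything announced via RSS” might look like, assuming plain RSS 2.0 feeds. The names `harvest` and `SAMPLE_FEED` and the example.org links are made up for illustration; a real harvester would fetch each feed over HTTP on a schedule and would probably want to capture the linked pages as well, none of which is shown here.

```python
import xml.etree.ElementTree as ET


def harvest(feed_xml, archive):
    """Add unseen items from an RSS 2.0 document to `archive`.

    `archive` is a plain dict keyed by item identity; the return value
    lists the keys that were new on this pass.
    """
    root = ET.fromstring(feed_xml)
    new_keys = []
    for item in root.iter("item"):
        # Prefer <guid> as the item's identity, falling back to <link>.
        key = (item.findtext("guid") or item.findtext("link") or "").strip()
        if not key or key in archive:
            continue  # nothing to identify it by, or already archived
        archive[key] = {
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
        }
        new_keys.append(key)
    return new_keys


# Hypothetical feed content standing in for a real blog's RSS.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example blog</title>
  <item><title>First post</title><link>http://example.org/1</link>
    <guid>http://example.org/1</guid><description>Hello</description></item>
  <item><title>Second post</title><link>http://example.org/2</link></item>
</channel></rss>"""

if __name__ == "__main__":
    archive = {}
    print(harvest(SAMPLE_FEED, archive))  # both items are new
    print(harvest(SAMPLE_FEED, archive))  # nothing new the second time
```

Keying on guid (or link) is what lets the same feed be polled over and over without piling up duplicates.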

16 April 2004 . Comments Off on Our Role in P2P

Our Role in P2P

I am concerned that the work of the Joint Committee of the Higher Education and Entertainment Communities may do more harm than good by legitimizing some role for higher ed in killing off P2P file sharing. I don’t think we have a role. I think this is a fight between the RIAA and MPAA and American society, and we will just get trampled in the middle. Still, a session updating us on the P2P issue at CNI was interesting. It is clear that EDUCAUSE is finding little workable technology to help satisfy industry demands (tools like Audible Magic and ICARUS are throwing out the legitimate baby with the illegal bathwater). Brewster Kahle was in the audience and asked us to please remember that the Internet Archive depends on P2P for distribution of its legitimate content. If we need an example of real-life content dependent on P2P distribution, he welcomes us to point his way.

Eric Celeste / Saint Paul, Minnesota / 651.323.2009 /