S3 (Simple Storage Service) – Amazon and Libraries

Have you heard of Amazon’s S3 (Simple Storage Service)? From the site:

Amazon S3 is “storage for the Internet” with a simple Web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the Web.

It’s one of Amazon’s newer web services. At 15 cents/gig of storage, it’s a pretty cheap option. Caveat emptor: S3 is intended for developers as a storage option that can be queried with SOAP and REST web services, so they also get you for network traffic at 25 cents/gig. I wasn’t able to find anything in the fine print about checksum routines and the integrity of the objects, but I’m assuming backups and error checking are part of the Amazon routine. (Update from the horse’s mouth: I found this thread in the forums which talks about Amazon’s data protection routines. It’s reassuring…)
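To get a feel for what those two charges add up to, here is a minimal sketch of the arithmetic. The rates are assumptions read from the figures quoted above as dollars per gigabyte, and the 500 GB / 40 GB workload is a made-up example, so treat the result as a rough estimate and check Amazon's current price list.

```php
<?php
// Rough monthly S3 cost estimate.
// ASSUMPTION: rates are the figures quoted in this post, read as dollars
// per gigabyte. They are not taken from Amazon's price list.
const STORAGE_RATE_PER_GB = 0.15;   // dollars per GB stored per month
const TRANSFER_RATE_PER_GB = 0.25;  // dollars per GB of network traffic

function estimateMonthlyCost($storedGb, $transferredGb)
{
    $storage = $storedGb * STORAGE_RATE_PER_GB;
    $transfer = $transferredGb * TRANSFER_RATE_PER_GB;
    // Round to whole cents.
    return round($storage + $transfer, 2);
}

// Hypothetical workload: 500 GB of master files stored,
// 40 GB pulled down for patron requests in one month.
echo number_format(estimateMonthlyCost(500, 40), 2), "\n";
```

The point of splitting the two rates out is that the transfer charge, not the storage charge, is the one that grows with use.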

Can the library use this? I think so. Even with the mentioned caveats, in the end you are looking at taking the server management side out of the equation. That’s pretty liberating for the small digital shops that our libraries are. At work, we’re experimenting with using the service to store some of our master digitization objects. I mentioned that this was an experiment, right? We’ve got some objects on the S3 servers and are looking into building a web interface that will allow our Special Collections staff to pull down master files when they receive requests from patrons. We’re also working with a campus entity to store media files on S3 and then building a search interface to query S3 for the data. It’s all a work in progress, but something to consider. I can tell you that my library and university will never have the infrastructure or access to a network cloud like Amazon’s. That’s not a knock; them’s just the facts.

(Sidebar: If you’re interested in web services, think about browsing around the Amazon Web Services Developer Connection. Lots of code examples, “howtos” and discussion to get you thinking about web service applications. Don’t be afraid to get your hands dirty and make some mistakes. It’s the only way to learn.)

Gettin’ Edumacated… A Digital Library Curriculum

This post on “Beyond the Job” calling for applications to a digital librarian fellowship program at the University of Iowa SLIS came across my feedreader earlier in the month. I’m starting to see more of this, which is pretty exciting from where I’m sitting. It means some schools are taking the step to train students for digital library work. (Most of the schools have used seed money from an IMLS grant for library education.) Here’s a sample curriculum from the Iowa SLIS website:

Students enrolled in this special Digital Libraries track will take the 9-semester-hour core specified for the general MA in SLIS degree program.

021:120 Computing Foundations 3 s.h.
021:122 Conceptual Foundations 3 s.h.
021:101 Cultural Foundations 3 s.h.

Students in this track will also take the following 6 semester hours:

021:224 Electronic Publishing 3 s.h.
021:226 Digital Libraries 3 s.h.

Students will also enroll in the following course each semester.

021:239 Topics in Digital Libraries 1 s.h.

Additionally students will choose at least 6 semester hours from the following:

021:123 User Education: Multimedia 3 s.h.
021:242 Search and Discovery 3 s.h.
021:220 Programming for Text Manipulation 3 s.h.
021:124 Database Systems 3 s.h.
021:278 Information Policy 2 s.h.

22C:196 Human Computer Interaction 3 s.h.

The remaining 12 semester hours of course work may be taken from the other courses offered by the School as well as courses selected (with advisor approval) from other departments in the University.

Students are strongly encouraged to take a programming course such as Perl or Java.

I love the emphasis here on programming and relational databases. I use these skills daily in mapping out data structures and metadata crosswalks. It’s also nice to see “electronic publishing” get some face time. I’m seeing more of this “library as publisher” direction in my job and an epublishing course could really help out. The only piece I might add would be a digital library practical component – some internships in a local electronic publishing company, a semester practicum with the local digital content group. Give the students an opportunity to show their stuff in a real world setting. I’m sure this could be built into the course of study with those open 12 credits.

I’m also just a bit envious… I remember cobbling together bits and pieces of classes and work experiences that were going to help place me in a digital library shop after graduation. For the most part it worked… I worked for the UWDCC at the University of Wisconsin doing mostly grunt work – scanning, prepping for scanning, entering metadata, bit of interface design – and it was this experience that really gave me a fuller picture of digital library work. Perl and PHP programming was mostly learned on the job at my “fellowship” with the University of Wisconsin Division of Information Technology working in their corporate library and on the main web site for the communications team. Seems like this “cobbling” won’t be the reality anymore, not that there’s anything wrong with that. I’m liking the move to standardize a digital library curriculum. Here’s another program moving in the same direction and a recent article about a digital library curriculum from D-LIB Magazine:

University of North Carolina SILS – http://sils.unc.edu/news/releases/2006/01_digitalcurriculum.htm

The Core: Digital Library Education in Library and Information Science Programs
Jeffrey Pomerantz, Sanghee Oh, Seungwon Yang, Edward A. Fox and Barbara M. Wildemuth
D-Lib Magazine, November 2006, Volume 12, Number 11, ISSN 1082-9873

Just a little food for thought during the holiday season. Best wishes to all.

Anatomy of a Function

It’s been a little while since my last post. My recent work schedule and Turkey Day played a part in that. I’ve been working .com hours on a super cool project. (I mentioned the TERRA group in an earlier post.) I’ve learned so much in the last couple of weeks. It’s amazing what a hard deadline and a shifting set of requirements will do to your web programming skills. All the hard work is about to bear fruit as the new TERRA web site is about to go live. Have a look at a different kind of digital library.

But, that’s not the point of this post… I wanted to share a little code that made some data conversion very simple over the course of the TERRA project. It’s a simple little PHP function that converts a MySQL timestamp into an RFC 822 date format. (For the project, we stored the item update fields as timestamps and then converted them when we generated our various XML feeds. RFC 822 or RFC 2822 dates are necessary for valid feeds.) Here’s the PHP function in all its glory:

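//function converts mysql timestamp into rfc 822 date
function dateConvertTimestamp($mysqlDate) {
    //parse the MySQL timestamp; strtotime() returns false (-1 before PHP 5.1) on failure
    $rawdate = strtotime($mysqlDate);
    if ($rawdate === false || $rawdate == -1) {
        $convertedDate = 'conversion failed';
    } else {
        //H is the 24-hour clock that RFC 822 expects (h would give 12-hour time)
        $convertedDate = date('D, d M Y H:i:s T', $rawdate);
    }
    return $convertedDate;
}
//end dateConvertTimestamp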

You call the function by including it on the page and using the following code:

$newPubdate = dateConvertTimestamp($stringToConvert);
echo $newPubdate;

Where $stringToConvert would be any MySQL timestamp value that needs conversion.

In the end a string like this “2005-05-17 12:00:00” looks something like this “Tue, 17 May 2005 12:00:00 EST”. You could also reverse the conversion using this PHP function:

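//function converts rfc 822 date into mysql timestamp
function dateConvert($rssDate) {
    //parse the RFC 822 date; strtotime() returns false (-1 before PHP 5.1) on failure
    $rawdate = strtotime($rssDate);
    if ($rawdate === false || $rawdate == -1) {
        $convertedDate = 'conversion failed';
    } else {
        //H keeps the 24-hour clock that MySQL timestamps use
        $convertedDate = date('Y-m-d H:i:s', $rawdate);
    }
    return $convertedDate;
}
//end dateConvert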

NOTE: If/when you copy and paste the above code, check that all " (double quotes) and ' (single quotes) are plain straight quotes, and retype any that are not. WordPress tends to do a number on the proper format.
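For comparison, the same round trip can be sketched with PHP's DateTime class (PHP 5.3 or later). This is an alternative to the two functions above, not the code used on the TERRA project. The function names `mysqlToRfc822` and `rfc822ToMysql` are made up for this example, and it assumes the stored timestamps are in UTC unless another zone is passed in.

```php
<?php
// Sketch: MySQL timestamp <-> RFC 822 style date using DateTime.
// ASSUMPTION: timestamps are stored in UTC; pass another zone name if not.
// Both function names are hypothetical, chosen for this example only.

function mysqlToRfc822($mysqlDate, $zone = 'UTC')
{
    $date = DateTime::createFromFormat('Y-m-d H:i:s', $mysqlDate, new DateTimeZone($zone));
    if ($date === false) {
        return false; // input did not match the MySQL timestamp layout
    }
    // DATE_RSS is the built-in format 'D, d M Y H:i:s O'
    return $date->format(DATE_RSS);
}

function rfc822ToMysql($rssDate, $zone = 'UTC')
{
    $date = date_create($rssDate); // returns false if the string cannot be parsed
    if ($date === false) {
        return false;
    }
    // Normalize to the storage zone before formatting for MySQL.
    $date->setTimezone(new DateTimeZone($zone));
    return $date->format('Y-m-d H:i:s');
}

echo mysqlToRfc822('2005-05-17 12:00:00'), "\n";
echo rfc822ToMysql('Tue, 17 May 2005 12:00:00 +0000'), "\n";
```

Passing the time zone explicitly means the output does not depend on the server's default zone, and the numeric offset that DATE_RSS emits is valid in feeds.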

I just wanted to share the wealth a bit. If you’ve got questions or suggestions, don’t be shy about dropping a comment. I’ll be home in Wisconsin for the next several days, but I’ll have limited internet access there.  I’ll try to answer questions if they arise.

Dueling Ajax – couple of articles

I’ve been a bit of the Ajax poster boy lately. Two pieces that I wrote for library audiences have just been published.

“Building an Ajax (Asynchronous JavaScript and XML) Application from Scratch.” Computers in Libraries 26, no. 10 (November/December 2006).

“Ajax (Asynchronous JavaScript and XML): This Isn’t the Web I’m Used To.” Online 30, no. 6 (November/December 2006).
uri: http://www.infotoday.com/Online/nov06/Clark.shtml

Both articles fall into the “introductory” mode, although the CIL article walks you through a proof of concept Ajax page update script (mentioned in an earlier post…). I want to be clear: I’m not an Ajax evangelist. I find the suite of technologies that make Ajax go intriguing, and the improvements that the Ajax framework can make to some library applications are worth learning about and applying. I tried to point out the good and the bad. Although, it is a four letter word…

I did want to mention a couple of books that were really helpful in getting me up to speed with the Ajax method.

Ajax in Action by Dave Crane, Eric Pascarello and Darren James

DHTML Utopia Modern Web Design Using JavaScript & DOM by Stuart Langridge

(Click on the book covers if you are into book learnin’ and want to browse the Amazon records.) Dig in and discover (or rediscover) some of the possibilities when you put JavaScript to work in your apps.

Internet Librarian – Epilogue

I’ve been back for a little over a week and I’m still doing the catchup thing. Overall, Internet Librarian (IL) was a great experience. Got to do a little bit of everything: teaching, learning, networking… you get the idea. One of the great things about IL is the small scale of it all. Enrollment tops out at a little more than a grand, which makes it easier to connect with all of the attendees. I had some extended conversations about libraries, code and workplace scenarios with a whole range of people. I’ve blogged most of the sessions I attended in earlier posts, so I won’t bore you again with the details. Here are a couple of quotes from the conference:

Favorite Library Quote from the week: “Libraries are a collection of services, not books”

Favorite Non-Library Quote from the week: “I just checked the il2006 tag on Flickr. Who is that brunette you were talking to?” (from my wife jokingly…)

One of the major themes from IL 2006 was libraries using social software – Flickr, MySpace, Facebook, Wikis… It’s a rich topic and most of the tech is easily implemented. But there are other parts of the conference where I struggle to understand the “where does the rubber meet the road” code questions. Most of the IL sessions offer the broad ideas and have little time for explaining how to do it. (I’m including myself here…) That’s not a criticism; that’s just the structure of the event. There are some ways to get at the “how’d they do that?” code questions.

  1. Attend conference workshops to answer some of the in-depth code questions.
  2. Ask for code examples from presenters (I had a number of people stop me after the presentation and over the course of the conference to talk a bit more about code questions.)
  3. Network and talk to librarians with some programming chops. Talk shop with those who are building apps in your field of interest.

As I said earlier, it was a great conference and I haven’t even mentioned Monterey, CA, which is just a beautiful setting. I would suggest Internet Librarian for any public services librarian looking to understand what’s on the horizon for libraries. The smaller scale of the conference and the forward-looking nature of the presenters make for a great mix. If you have the means and the time, check it out.

Internet Librarian – Day 4: The Mobile Web

Trends in Mobile Tools and Applications for Libraries
Megan Fox, Web & Electronic Resources Librarian – Simmons College

I missed a similar session at a previous conference. Wanted to hear about how our resources are displayed in all kinds of settings. Fox has got a broad perspective on the topic and had all kinds of tips and gadgets.

Notes from the presentation:

Mobile tools – PDAs, cell phones, ipod
Mobile information needs – on a flight, stuck in traffic

The market
75% of all US adults have mobile phone service
90% of college students

The Devices
smartphones are the device of choice – nokia, samsung
-camera, calculator, phone, web browser, radio, interactive gaming
-large screens (relatively speaking) and keyboards
– other devices, UMPC, sony reader

The future network – 3G, even faster wireless to allow for streaming – think video conferencing on the bus
-not yet available in US, but taking off in Japan

The mobile web
– formatting content is the challenge – optimize for web
– limit images, simplify html, mobile css stylesheets, avoid device intensive programming languages (large javascript files)

Library mobile web options:
– hours pages
– quick lists
– library catalog search (Sirsi and Innovative)
– ready reference on the go – Handango, answers.com
– ebooks – mobipocket (U of Alberta has directions on downloading netLibrary into PDAs)
– databases (Ovid and LexisNexis have some mobile partnerships)
*Los Angeles public has ipods and smartphones, Thomas Ford Memorial, Duke Freshman Reading project – Billy Collins reading his poetry in a podcast

*check IMDB and New York times for examples of optimized web; convert web pages on the fly can be another option

Third party sites – google mobile optimizer, skweezer
– convert your sites for mobile “transcoding”

Mobilize your content with an RSS feed – simple format for display
Mobilize your mashups – Frucall (compare prices), RealTime Traffic
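The "mobilize your content with an RSS feed" idea can be sketched in a few lines of PHP. This is an illustration and was not shown in the session: the function names, channel details and sample item are made-up placeholders, and items are assumed to carry a Unix timestamp for their publication date.

```php
<?php
// Sketch: build a minimal RSS 2.0 feed that a mobile reader can display.
// ASSUMPTIONS: function names, channel info and item data are hypothetical;
// each item supplies 'title', 'link' and a Unix 'timestamp'.

function addTextElement(DOMDocument $doc, DOMNode $parent, $name, $text)
{
    $element = $doc->createElement($name);
    // A text node escapes characters such as & and < automatically.
    $element->appendChild($doc->createTextNode($text));
    $parent->appendChild($element);
    return $element;
}

function buildRssFeed($title, $link, $description, array $items)
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;

    $rss = $doc->createElement('rss');
    $rss->setAttribute('version', '2.0');
    $doc->appendChild($rss);

    $channel = $doc->createElement('channel');
    $rss->appendChild($channel);
    addTextElement($doc, $channel, 'title', $title);
    addTextElement($doc, $channel, 'link', $link);
    addTextElement($doc, $channel, 'description', $description);

    foreach ($items as $item) {
        $node = $doc->createElement('item');
        $channel->appendChild($node);
        addTextElement($doc, $node, 'title', $item['title']);
        addTextElement($doc, $node, 'link', $item['link']);
        // gmdate() with DATE_RSS yields an RFC 822 style date in UTC.
        addTextElement($doc, $node, 'pubDate', gmdate(DATE_RSS, $item['timestamp']));
    }
    return $doc->saveXML();
}

echo buildRssFeed('Library News', 'http://library.example.org/', 'Placeholder feed', array(
    array('title' => 'Hours & Locations', 'link' => 'http://library.example.org/hours', 'timestamp' => 1116331200),
));
```

Because the feed carries only titles, links and dates, it stays small enough for the limited screens and bandwidth discussed above.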

Drawbacks to mobile web:
1. Cost
2. Comfort with small size
*people are using the devices to text message because it’s much cheaper

Content via SMS – texting: more acceptance
Communicating with mobile users

Next generation prefers the texting environment – we need to get in their network
– check altarama.com.au
– extending ready reference or library record lists
– JOOPZ web texting – fill out simple web form and translate to text message

LibriVox – audiobook version of project Gutenberg
Expected in 2007 – PVR (personal video recording on your mobile)

What’s next?
*problem with data entry
– communicate using images “photo to search”
– voice recognition
– taking advantage of GPS – location interaction, geotagging
– face recognition for security

full url for presentation – web.simmons.edu/~fox/pda

Internet Librarian – Day 3: Repositories, a library opportunity

Repositories & The Impact on Digital Librarians
D. Scott Brandt

D. Scott Brandt talked about getting into the research network at the university. Brandt was pointing to a new library initiative at Purdue to involve libraries earlier during the research process. It’s not about managing finished, publishable objects; we need to insert ourselves in the data management, access and preservation needs when the research is happening in the lab. I like the sound of that.

Notes from the presentation:

The story of how Purdue Libraries got into the repository initiative

Need for approaches, protocols and systems to store all of our national data
New dean, new directions – collect, organize, describe, curate – for university community

“Librarians as participants – putting libraries salaries on grants for limited times”
-librarian does work or is project manager

Library faculty are better integrated into campus research agenda

How to foster interdisciplinary collaboration?

Researchers have data management needs
– seek researchers who understood that collecting, organizing and providing access to data could make a grant proposal stronger

Initial Questions:
Do researchers have data discovery management and organization needs?
Can library science solve some of these problems

Data related faculty needs they found:

  1. not sure how to share data
  2. lack of time to organize data sets
  3. help describing data
  4. want to find new ways to manage data
  5. need help archiving datasets/collections

Departments served – Ag, Chem Eng, Biology

Brandt used the onstar car data example

Collaboration: crucial to research
– different aspects of dealing with data = library opportunity

Agronomy example:
1. working with simple data generation model, determine data/metadata workflow
2. library’s role is helping data producers

Chem Engr – discovery informatics
1. investigation of small science data needs
2. issues of what gets shared, when and how
3. library’s role is developing dataset ontologies – utilizing language of electronic notebooks to define, navigate throughout research process

Library roles:

  • metadata creation
  • providing access
  • providing preservation
  • consult on ontologies and vocabularies

scholarly communication – data analysis

  1. published data/datasets
  2. unpublished research
  3. published research (non-traditional)
  4. published research traditional
  5. secondary/tertiary resources

Libraries are being invited into the levels of development before these top 5 pieces of scholarly communication are finished
– actually participating during research and development – COOL STUFF

Data Curation Matrix – Johns Hopkins, looking at 5 types of scholarly communication and the opportunities across disciplines

Repository as a research platform
– focus on active verbs: access, preserve
– Distributed Institutional Repository

Going forward – new role for librarians
– serve as “bridge” between researchers and libraries
– library data research scientist

Purdue e-Scholar library


Internet Librarian – Day 3: Mashups + APIs

Mashup Apps: Community Dev
Chris Deweese, Lewis and Clark Library System
John Blyberg, Ann Arbor District Library

I’ve been thinking about how we might open up some of our library data to an API. Chris Deweese and John Blyberg have taken some different approaches.

1. Use someone else’s – e.g., Google Maps API
2. Build it yourself – PatRest add on to Innovative OPAC (XML REST web service)

It was interesting to hear their thoughts on encouraging users to “re-purpose” library data. I was wondering about using XML-RPC as a quick and easy alternative to full-on SOAP and REST web services.

Notes from the presentation:


New buzzwords – Mashups = consuming and recombining two separate systems
Preparing for what’s next – many parts, loosely joined

What’s so great about them?
– don’t require wicked coding skillz
– results are instant
– results can be striking
– Mashups = the evolving web

Two categories of mashups
1. simple mashup
2. statement mashup – web as an authoring language – profound statement

“Net as a global operating system.”

Call for “an open standards based API”
A first part of the semantic web – web services

Can libraries mash?
We already have a goldmine of data

It’s all about markup
– xml + rdf (applying schema to logically group data)
– OWL ontology definitions
– help machines read the data

REST? Sounds lazy
– representational state transfer
– accessed through URI
– simple for developers
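To see why REST counts as "simple for developers", here is a minimal sketch of reading an XML response in PHP. The URL pattern, the `<record>` schema and the `parseRecordXml` name are all hypothetical. They are assumptions for illustration and do not describe PatREST's actual interface.

```php
<?php
// Sketch: consume an XML response from a REST-style web service.
// ASSUMPTION: the <record> schema and the URL in the comment below are
// invented for this example; they are not PatREST's real interface.

function parseRecordXml($xmlString)
{
    $xml = simplexml_load_string($xmlString);
    if ($xml === false) {
        return null; // response was not well-formed XML
    }
    return array(
        'id'     => (string) $xml['id'],     // attribute on the root element
        'title'  => (string) $xml->title,
        'author' => (string) $xml->author,
    );
}

// In a live application the XML would come from a plain HTTP GET on a URI, e.g.
// $response = file_get_contents('http://library.example.org/rest/record/12345');
$sample = '<record id="12345"><title>Walden</title><author>Thoreau, Henry David</author></record>';

$record = parseRecordXml($sample);
echo $record['title'], ' / ', $record['author'], "\n";
```

The whole exchange is one URI in and one XML document out, which is what keeps the barrier low for outside developers.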

PatREST (Patron REST)


- schema that was simple to use, easy to understand

Stuff you can do – electronic signage
Ed Vielmetti’s (SuperPatron) Wall of Books
Library Gadgets

Why let the public do it?
– creates a sense of stewardship
– unlocks a potential brain trust
– encourages innovation
– benefits other libraries
– solicits high-quality feedback
– puts library data into new contexts


Google maps mashup: Mash-it-up Google Style

Google Maps API

1. Get Google Maps API key
2. Get Hello World example
3. Add markers to the map – lat AND long

Google maps API allows you to plot a series of points using a custom xml file
Lewis and Clark Library System – delivery routes
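The "custom xml file" of points could be generated on the server. Below is a minimal sketch in PHP. The `<markers>`/`<marker>` element names, the attribute names, the function name and the sample stops are assumptions for illustration. They are not a format required by Google or the file used by the Lewis and Clark Library System.

```php
<?php
// Sketch: write a list of points to XML that a map page could load and plot.
// ASSUMPTION: element names, attribute names and coordinates are made up
// for this example; use whatever layout your map script expects.

function buildMarkersXml(array $points)
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;

    $markers = $doc->createElement('markers');
    $doc->appendChild($markers);

    foreach ($points as $point) {
        $marker = $doc->createElement('marker');
        $marker->setAttribute('name', $point['name']);
        // Each marker needs BOTH a latitude and a longitude.
        // number_format() keeps a fixed, locale-independent decimal layout.
        $marker->setAttribute('lat', number_format($point['lat'], 6, '.', ''));
        $marker->setAttribute('lng', number_format($point['lng'], 6, '.', ''));
        $markers->appendChild($marker);
    }
    return $doc->saveXML();
}

// Hypothetical delivery stops.
echo buildMarkersXml(array(
    array('name' => 'Stop A', 'lat' => 38.8114, 'lng' => -89.9532),
    array('name' => 'Stop B', 'lat' => 38.8906, 'lng' => -90.1843),
));
```

Keeping the points in a separate XML file means the route data can be updated without touching the map page itself.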

Internet Librarian – Day 2: Data Dump

Got a chance to see some great speakers today. Wanted to collect some of my thoughts about Steve McCann’s study of academic library web sites.

Analysis of US College and University Library Home Pages
Steve McCann, University of Montana

He’s been analyzing the organization of popular academic library index pages. Steve was mostly interested in collecting trends and then running a usability test to determine best practices. He hasn’t conducted usability testing to hammer out the last part of his thesis, but it was still really informative to see library web design trends catalogued with the long view in mind.

Here are my notes from the session:

Making data work harder; talked about his business experience

Study of library college websites by analyzing wayback machine

Users are bypassing on-site navigation – getting to object directly
-search engines, weblogs, rss
-commercial sites have had success – flickr, youtube

Library Access: a long tail problem
-social software strategy; it can’t all be on the front page
-how to address the bypassing of content
-does the index page even matter?


  1. google search: 1803 library websites found
  2. ranked with Alexa.com – tracking ip
  3. isolated sites ranked in top 100
  4. categorized top page strategy – browse OR search

Two Site Strategies for library web sites

Browse:

  • grid
  • cascade
  • frame
  • radial

Search:

  • no search box
  • library search box
  • site search box
  • university search box

Data results:

  1. grids and cascades are most popular
  2. browse strategy has become most popular on sites
  3. search strategies are pretty evenly distributed – search is becoming a standard

-browse = grid, cascade with grid showing most momentum
-search = some form of search strategy will exist

Next steps – usability testing with your user groups about preferences
hybrid – search and browse – u of oregon libraries (http://libweb.uoregon.edu/)

