Thursday, June 25, 2009

"Lick This": LOC, Flickr, and the Limits of Crowd Sourcing

[Update: This post has provoked quite the discussion over at the Flickr Commons board.]

In January of 2008 the Library of Congress and the photo-sharing web service Flickr announced a unique partnership. The Library of Congress Flickr Pilot Project put 3000 historic LOC photographs on the website Flickr and invited the public to view, annotate, tag, and generally mess with them. This was perhaps the LOC's first foray into the world of Web 2.0 and generated a tremendous buzz. "In the first 24 hours after launch, Flickr reported 1.1 million total views on our account, with 3.6 million views a week later," according to this LOC report on the project. The project--"a match made in photo heaven" according to the LOC blog--has been praised everywhere from the New York Times to the popular community weblog Metafilter.

The goals of the project are to "increase awareness of the Library and its collections; spark creative interaction with collections; provide LC staff with experience with social tagging and Web 2.0 community input; and provides leadership opportunities to cultural heritage and government communities." Especially talked about was the second goal--sparking interaction with the collections. The idea was that visitors to Flickr could add useful metadata to LOC images, such things as the names of people in the photographs, locations, models of cars or other machinery, etc.

The project may well be a success overall, but as a way to add useful metadata to historical documents, the Library of Congress Flickr Pilot Project is a disappointment. Let me explain...

Above is a screen shot of this photograph, from the very popular 1930s-40s in Color photograph set. This iconic photograph is also used as the cover image on the LOC's Final Report Summary for the project. This one photograph, and the user-generated metadata attached to it, demonstrate the problems with inviting the general public to contribute to a historical collection.

One of the most innovative features of Flickr is the ability of visitors to add notes to the pictures. You can create a rectangular box over some portion of an image and add a text note. This is especially useful for identifying individuals in group photos or pointing out specific details.

So what sort of metadata have users added to supplement the sparse LOC identification ("Bransby, David, photographer. Woman aircraft worker, Vega Aircraft Corporation, Burbank, Calif. Shown checking electrical assemblies, 1942 June") of the photo?

There are 20-30 notes on the photograph and not one contains useful historical information to give context or help us understand the photograph. Most are throw-away jokes or comments, such as "I love this fabric!" by Flickr user Mrelia and "Lick this" by user HeatherrFalk (referring to the woman's forehead!). Most of the rest of the notes refer to the woman's appearance or the composition of the picture. Almost useful is a little nested debate about the authenticity of the photograph--how staged was it?--but the discussion is hard to follow, since you must hover the mouse over each box to see the comment.

Flickr users may also add comments and tags to images, and organize them together into sets. But here again the crowdsourced noise overwhelms the signal of useful historical information. There are over 100 comments attached to this one photograph, all but a few devoted to the picture's composition (well, it is a photography website after all), to how pretty the woman is, or to posting just to post something. Within the chaff there are a few grains of wheat--as when user BeadMobile adds some pencil drawings made by his grandmother when she worked in a factory during World War Two. But you really have to dig.

What about tagging? User tagging is often presented as a simple and powerful way to crowdsource metadata in online archives. There are 71 user-generated tags for this image. Some are obvious and useful--"1942" and "rosie the riveter." Many others, however, are odd ("everyone did their part") or cryptic ("sfv", "LF").

And the sets? How have Flickr users organized this image with others? Well the woman in the picture should be proud that she is in the "Nation Of Domination. (We Rule The Universe)" photo pool and the "cable porn" pool.

The above might seem like a lot of text to bash on one image and its metadata, but the problems extend to all of the other images in the project. The notes are mostly smart-ass remarks, the comments are empty, the tags are idiosyncratic. The frustrating thing is that there really is some crowd sourced gold within the flood of junk, such as the transcriptions of hand-lettered signs in the windows of the Brockton Enterprise newspaper office in this photo.

The most useful comment I found in this project? User Catskills Grrl's comment: "Gee, I wish the stupid, smart-ass notes would be deleted off these photos."

I will pick up the topic of crowd sourcing again in a future post, pointing towards some archives that I believe are doing it correctly.


Patrick Peccatte said...

Hi Larry,
There is maybe plenty of useless user-generated content (notes, comments, tags) because ... it is the LoC.
On our own crowdsourcing project called PhotosNormandie, which has been active since January 2007, we have very little noise and almost all comments give us valuable information. This is probably because our project is more confidential, restricted to a more specific subject, and uses the French language (yes, we know that it could cut down the audience of our project...).
With best regards

Criz said...

Hey Larry, thanks for the post. It certainly is something to consider from all angles.

We started a discussion in the Flickr Commons discussion group on Flickr:

I hope you'll drop by and discuss it with us!

Rob Ketcherside said...

Hey Larry, your "Brockton Enterprise" link is missing. I think you meant this one.

Rather than a commentary on crowdsourcing's ability to document and research history, you've provided insight into design issues:
- A site made for individuals to moderate their own content has been extended to a public forum. It's missing community features like voting that Digg, Facebook etc have.
- On a similar note, there are few "rewards" in Flickr for behavior. The only way to passively observe the conversation is to post to a thread and then have responses show up in your "Activity" thread. No one will browse repeatedly for any interesting new comment.
- Taking it one step further, the institutions are not given many tools to reward "good" behavior. LOC is one of the few that even acknowledges when a piece of information is truly useful and worth posting to the master database. As a community member, it's important to know that your time is worthwhile. Right now, the high-5 chains are the only thing obviously impacting anyone else's lives.
- Photos are selected to represent the Commons on the home page and landing page, but they are not given extra attention to control bored posters and to make a positive impression on browsers.

That said, I think your sample set is way too small. You've selected the worst of the bunch. Don't photos with comments like "that's my aunt" (with personal biographical sketch) at least offset the issues with the Rosie photo? Even if very few photos have interaction that provides new insight into history, isn't it worth it just for the connections and perspective that were missing in the original card file and on the Internet?

I've learned too much from other users in The Flickr Commons to dismiss it as a failure. Reports issued by LoC indicate that they feel the same way.

Larry Cebula said...

Thanks all for the thoughtful comments.

Patrick: I think you are right that the problem with the LOC project is exactly its popularity. (Ooooh--I wish I had named this post "The Tragedy of the (Flickr) Commons." It is too late to change isn't it?) I look forward to checking out your Flickr site.

Criz: Thanks for the tip, see you there.

Rob: Thanks, I was missing a link, it is now fixed.

"I've learned too much from other users in The Flickr Commons to dismiss it as a failure." I hope I did not come off as dismissing the entire enterprise. The project has been a great success in generating publicity for the LOC (and I mean that sincerely--government institutions need positive publicity to ensure their support.) And it put some valuable historic photos into wide circulation.

I think you are right that the Flickr software encourages noise over signal (if I am not putting words into your keyboard). What is needed is a system to vote comments up or down and display options ("Show only comments rated good or better."). Another way to do it is to have the option of user flagging and administrative removal (à la Metafilter). But then you need an administrator, and who has time for that?

All: I want to follow up with a post of successful examples of crowd sourcing metadata. Right now I am thinking of, for example. Are there other sites I should feature?

Bill Youngs said...

Value of more focused sourcing

Larry, that's a great example of a crowd-sourced image-tagging process gone wrong. That said, some of these comments do rightly point out that sometimes the process goes right. I do like the idea of an approach that would limit the crowd for a more focused conversation -- say three colleagues discussing an image or a class sharing interpretations. That doesn't preclude public discussions -- one could even present the same image to different audiences for different objectives.

heatherfalk said...

I am the user Heatherrfalk and I would just like to LOL @ this.

In lighter news, check out my flickr page @

Larry Cebula said...

Hah! Thanks Heather for checking in--I feel like we have completed some kind of digital circle here...