Monday, April 24, 2017

The Long Lingering Death of the Google Newspaper Archive


Once there was a happy land where historians could go online and explore by date and keyword through most of the historic newspapers of their town. In this favored place, researchers could quickly hunt down the names of specific people, or keywords like "Indians" or "temperance" or "I.W.W." and perform what used to be months of painstaking research in moments. Gone were the painful days of rolling microfilm through a projector and squinting at columns of cramped text, hoping against hope that you would find something on your topic. The Google News Archive, which contained most of our historic newspapers, promised a revolution in local history research. And the people did rejoice.

"Bringing history online, one newspaper at a time," Google crowed in 2008, promising to make "billions of pages of newsprint from around the world searchable, discoverable, and accessible online." Spokane was among the first cities to get the Google News Archive treatment, as the company scanned the microfilm copies of the Spokesman-Review, the Spokane Chronicle, and other newspapers and added them to the database.

This stuff used to work.
It was exciting--but Google's support for the ambitious effort faded quickly. Google mostly abandoned the project in 2011, without ever providing a reason for doing so. Soon after that, the company merged the archived news into the same category as current news, in the process breaking many of the search features. With some clever manipulation of Google search parameters and patience you could still tease a lot of information out of search, but it was a crippled feature.

Then, one day last fall, search suddenly stopped working at all. Oh, a search like "Spokane Indian tribe" will come up with a few interesting hits from the historic newspapers, but compared to being able to narrow that search by newspaper, year, etc., it is pretty thin gruel. You can still (for now?) browse individual issues of Spokane newspapers at the Google News Archive (scroll down), but without search they are far less useful.

Users of the archives got another shock last fall when, without warning, Google yanked all of the Milwaukee Journal-Sentinel from public view. This is because that newspaper, which is still in business, worked out a deal with the commercial electronic publisher Newsbank to add its back issues to their pay-walled database. Google even deleted those issues that are in the public domain. Will this happen in other cities? (Hint: yes.)

Attempts to contact Google have proven fruitless, and it appears that the News Archive joins the long list of abandoned Google projects. It would be nice if we could get the newspapers that Google has imaged into the Chronicling America site, but I have not been able to get anyone at either institution to speak about the possibility. The most likely future for the Google News Archive is that it simply disappears one day.

There is a larger lesson here, a story about the perils of relying on private corporations to provide what could be a public utility. The huge scanning projects of Google Books and Google News Archive could be replicated with a public effort--a few hundred million would do it. We could have nice things.

1 comment:

Unknown said...

Nice article, Larry. I, too, used to use this service often and hoped that it would expand, rather than disappear.

But Google's book and newspaper efforts were attacked by so many disparate parties: copyright holders, academics, competitors, and people worried about too much control being centered in one such entity. It's obvious that the rewards to Google of providing the service came to be outweighed by the costs of doing so, including the legal defense against the onslaught. Not to mention the politics of being perceived as a heavy-handed monopolist.

Now, like every other serious researcher, I have been forced to use pay services. However, I have found that yearly subscriptions are well worth the cost for someone with my heavy usage. Students? Well, a few monthly payments are not that unreasonable for them, compared to the price of their other regular communications bills.

As for the government-run services or those provided by foundations such as HathiTrust or the Kentucky digital project ... they all started out with a bang, but have stagnated because of the enormous costs ... and possibly too few eyeballs.