Tuesday, February 12, 2013

Mapping Spokane's Dead: A Pedagogical Experiment in Flash-mob Data Visualization

I taught Digital History last quarter. The course was a lot of fun as a dozen traditionally trained MA students and I explored some of the new digital historical landscapes. The class is divided between reading discussions and a weekly "make" where we bust open some new digital tool and see what we can do with it. For our weekly make a few months ago we created, populated, and visualized a historical database in about an hour. This post is about what we did and how we did it.

I had a couple of different pedagogical goals. First, I wanted my students to understand the importance and power of constructing a database, rather than merely building a website. Second, I wanted them to explore one tool for doing so, Google Fusion Tables. Third, I wanted to use some of the rich historical resources of my employer, the Washington State Archives, Digital Archives. Finally, I wanted them to experience the difficulty, decisions, and compromises of building a database and extracting metadata from handwritten historical documents.

Working with my grad student, the excellent Lee Nilsson, we chose the Spokane County Death Returns, 1888-1907 as our data set. These records were interesting for several reasons: the images of the death certificates and some metadata were already online, they represented a broad cross-section of the population of early Spokane, and they presented certain complications as well, in terms of handwriting and uneven data. Here is a sample death return from the collection, that of the unfortunate Owen J. Jones:



When the State Archives originally scanned and indexed these records, they chose to record as the metadata the first and last names, age at death, and place of death. This was a good start, but missed some data that was recorded on the death certificates and that historians would find important. So we added race, occupation, place of birth and cause of death to the metadata fields that we wanted to capture.
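
For readers who think in code, the expanded set of metadata fields can be sketched as a simple record type. This is only an illustration in Python, not a description of how Fusion Tables stores anything. The class name `DeathReturn`, the field names, the helper `missing_fields`, and the sample record are all hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DeathReturn:
    """One row of the table (hypothetical field names)."""
    # Fields the State Archives had already indexed
    first_name: str
    last_name: str
    age_at_death: Optional[int] = None
    place_of_death: Optional[str] = None
    # Fields added for the class exercise
    race: Optional[str] = None
    occupation: Optional[str] = None
    place_of_birth: Optional[str] = None
    cause_of_death: Optional[str] = None


def missing_fields(record: DeathReturn) -> List[str]:
    """Return the names of the fields left blank on a record."""
    return [f.name for f in fields(record)
            if getattr(record, f.name) in (None, "")]


# Invented sample record, not taken from the collection
example = DeathReturn(
    first_name="Jane",
    last_name="Doe",
    age_at_death=35,
    place_of_death="Spokane",
    cause_of_death="phthisis",
)
print(missing_fields(example))
```

Making every field beyond the name optional reflects the uneven record keeping: a blank on the certificate stays a blank in the table rather than becoming a guess.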

Then Nilsson created a Google Fusion table with our metadata fields and entered the information from a few death returns. Right away he realized that one problem the students would run into was transcribing the causes of death--things like phthisis (tuberculosis of the lungs) and marasmus (malnutrition) written in sometimes terrible 19th-century handwriting. A quick Google search found us some lists of 19th-century causes of death. Nilsson added about 150 names from the 1888 death index to the Google Fusion table and used color highlighting to organize the list in groups of ten names. I took the email addresses of the students in the class and gave them permission to edit the table.

The actual lesson took about an hour. We gave each student ten names and asked them to read the death certificates and to add the metadata to the table. Nilsson and I circulated in the classroom to help people out. The names and dates went pretty smoothly. The students stumbled with causes of death at first, but the guides to 19th-century causes of death cleared up most questions. A bigger problem was missing data. 1888 was the first year of death records in Spokane and the record keeping was erratic. Places of birth, causes of death, and other items were not always filled out.

Then came the experiment part--visualizing the data. My original inspiration for the project was the idea that we would produce a map of where in Spokane people had died. This did not work so well:



In retrospect the reasons are clear. Only a few of the death certificates gave street addresses, and Google was able to map these accurately. In other cases no place was given, or it was given only as "Spokane." In some cases there was a location but the student transcribing was unable to read it or just did not bother. With better instruction and monitoring this map might have come out better.

Much more satisfying was this map of where the people were born. It nicely illustrates patterns of migration into 19th century Spokane. Don't miss the guy from China!



Google Fusion Tables allows for other types of visualization as well. Here is a pie graph of causes of death:



Those are some mighty thin slices of pie! The chart does not tell us very much, except that typhoid and pneumonia were common. We would have been better off creating categories--contagious disease, accidents, infant death, etc.
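
The grouping idea could look something like the sketch below. It is a rough illustration, not what we did in class: the keyword lists, the under-one-year rule for "infant death," the function name `categorize`, and the sample rows are all assumptions made up for the example. A real version would need a much fuller list drawn from the guides to 19th-century causes of death.

```python
from collections import Counter

# Hypothetical keyword lists; which causes belong in which group is an assumption
CATEGORY_KEYWORDS = {
    "contagious disease": ["typhoid", "pneumonia", "phthisis", "diphtheria"],
    "accident": ["accident", "drown"],
}


def categorize(cause, age_years=None):
    """Assign a transcribed cause of death to a broad category."""
    # Assumed rule: any death under one year of age counts as an infant death
    if age_years is not None and age_years < 1:
        return "infant death"
    if not cause:
        return "unknown"
    text = cause.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "other"


# Invented rows of (cause of death, age in years), not real records
rows = [
    ("Typhoid fever", 24),
    ("Pneumonia", 40),
    ("Drowned", 19),
    ("Marasmus", 0.5),
    ("", 60),
    ("Old age", 81),
]
counts = Counter(categorize(cause, age) for cause, age in rows)
for category, n in counts.most_common():
    print(category, n)
```

A pie chart built from a handful of categories like these would have far fewer slices than one built from every individual cause.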

You can also make bar graphs--here is one for the occupations of the deceased. The laboring classes seem to have had it bad in early Spokane, though you would need some demographic analysis to draw any conclusions here:


Finally, here is a frequency chart showing the distribution of deaths throughout the year. It is interactive; you can click and drag to explore it. Notice anything odd? The Grim Reaper seems to have forgotten Spokane entirely some months:



The students were quite puzzled by this and came up with all sorts of reasons that several months could have passed without a death. Of course the most likely explanation is simply that the records for those months have gotten lost in the century since they were initially recorded.
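
Gaps like these are easy to check for in code once the dates of death are transcribed. The sketch below counts deaths per calendar month and lists the months with none. The function name `empty_months` and the sample dates are hypothetical and invented for the illustration; they are not from the death returns.

```python
from collections import Counter
from datetime import date


def empty_months(dates):
    """Return (year, month) pairs with no deaths between the first and last date."""
    if not dates:
        return []
    counts = Counter((d.year, d.month) for d in dates)
    start, end = min(dates), max(dates)
    gaps = []
    year, month = start.year, start.month
    # Walk month by month from the earliest record to the latest
    while (year, month) <= (end.year, end.month):
        if counts[(year, month)] == 0:
            gaps.append((year, month))
        month += 1
        if month > 12:
            month = 1
            year += 1
    return gaps


# Invented dates of death, for illustration only
sample_dates = [
    date(1888, 1, 5),
    date(1888, 1, 20),
    date(1888, 2, 11),
    date(1888, 5, 3),
    date(1888, 6, 17),
]
print(empty_months(sample_dates))
```

A list of empty months does not say why they are empty, but it is a quick prompt to go back and ask whether records are missing.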

All in all this pedagogical experiment was a great success. My students learned how digital sausage is made--the decisions that go into choosing what metadata to record and visualize, the challenges of working with handwritten 19th-century documents, the amount of painstaking work that goes into a data visualization.

4 comments:

Dr. Cynthia Annett said...

What a wonderful exercise! For many years I took my undergraduate field ecology course to the old cemetery in Lawrence KS and had them record data off headstones for a human demography exercise. This was a great place to do it because there is so much rich history to explore (even though we did it as a biology lesson). For example, students found the mass grave from Quantrill's Raid, and they discovered a small group of escaped slaves with handmade headstones, the Chinese railroad workers, etc. But what always hit them the hardest was the number of infants and young mothers who died in childbirth during the 19th century. That made demographic research "real" for them. I love what you are doing with records and Fusion Tables and will circulate it to my colleagues in biology as a great way to expand on our labs.

Larry Cebula said...

Thank you Cynthia. Cemeteries are a wonderful teaching resource, aren't they? When I worked in SW Missouri I regularly sent students out to rural cemeteries to do research and be consumed by chiggers. The cemeteries they explored (and there are 150 in Jasper County alone!) were not as interesting as yours in Lawrence but they too were struck with the deaths of children and young women.

Cemeteries also offer great digital teaching opportunities, because students can collect data, link it to online databases, and do all sorts of visualizations and mapping. Take one family from the cemetery and create a Google Earth field trip of their lives--from where the parents were born, where they moved, and where the kids wandered off to. So much good stuff.

Larry Cebula said...

Also--here is the cemeteries tag on this blog: http://northwesthistory.blogspot.com/search?q=cemetery

The best post is the second one down, "Deciphering a Mysterious Headstone."

Scott Wyatt said...

What a great exercise! You're right, of course, that this data retrieval shows migration (and immigration) patterns to the Inland Empire. I was interested to see the reference to the Chinese immigrant who passed away at age 35 in 1888. This would have made it likely (in my estimation) that he came through initially with the Northern Pacific Railroad crews in the early part of that decade. I just published an historical novel about the Chinese immigrant experiences building the NPRR through Spokane Falls, Eastern Washington, and North Idaho Territory. It was no easy life for these folks.