May 21, 2008
By Robert B. Townsend
Some recent observations in the blogosphere about the “conservative” nature of our disciplinary research practices, and an invitation to speak at the JSTOR publisher’s workshop last week, got me thinking about just how far we have traveled over the past 20 years.
As a point of comparison, I think it helps to cast our minds back to the digital dark ages of 1989. Back during my first try at graduate studies, a forward-thinking faculty member dragged my class into the library to learn a little about online databases. With the help of a librarian (because mere researchers were not allowed to touch the computers back then), we entered a few keywords for the paper I was working on. After a couple of minutes, the database came back with three citations. And I remember that seemed pretty cool, until I looked at the bottom of the page and saw that the price tag for this little search was $2.37. However neat that might be, I could not imagine that something that cost about 80 cents per citation would ever get much widespread use.
When I returned to graduate studies seven years ago, research practices seemed fundamentally transformed. Back in the 1980s, my first step in a research project would have been to go to the library and root around in paper bibliographies and indexes. With JSTOR, I can sit in my basement and cast a net over a much wider set of journals and see connections and common themes across a number of disciplines in an evening. And as a new study from Ithaka demonstrates, many historians are starting their research the same way.
At the same time, JSTOR also lets me dig deeper and more narrowly into my research. I am currently finishing a dissertation about the early development of the discipline. The journal literature in the discipline is a vital source for that research, but as you might imagine, the hundreds of articles and other materials in my 60-year period of study form a daunting wall of paper. By keyword searching that dense mass of text in JSTOR, I have been able to unlock a wealth of information that would have taken me months of painstaking reading to unearth. Through fairly simple text searches and tabulations, I can identify when particular buzzwords seemed to come into fashion. And I can also track the ebb and flow of particular markers in the discipline, such as when succeeding generations of historians stopped comparing one another to the big names of previous generations.
Perhaps more unexpected to me, JSTOR also opens up perspectives from scholars who considered themselves part of the discipline at the time but have since fallen out of the story. When I expand my searches to include journals in a number of related disciplines, the materials in JSTOR demonstrate that it took a while before history got marooned on its own disciplinary island.
Admittedly, my methods are pretty crude: generally I just develop a search precise enough to home in on the specific keywords or concepts I want to track, and then trace the relationships in which they appear or simply graph their appearance over time. This is facilitated by the high-quality, reliable metadata in JSTOR. An essential part of my frustration with Google Books was the appalling quality of its basic metadata, particularly the lack of reliable information on publication dates. That makes it impossible to use Google Books in a similar fashion, however useful it might be for finding particular books.
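For readers curious what such a tabulation might look like in practice, here is a minimal sketch in Python. It is only an illustration, not the procedure I actually followed: it assumes the articles have already been gathered as simple (publication year, text) pairs, and the function name `count_term_by_year` and the sample records are hypothetical, invented for the example.

```python
import re
from collections import Counter


def count_term_by_year(records, term):
    """Count how many records per year mention `term`.

    `records` is an iterable of (year, text) pairs (an assumed format).
    Matching is case-insensitive and respects word boundaries, so
    "science" does not match "scientific".
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    counts = Counter()
    for year, text in records:
        if pattern.search(text):
            counts[year] += 1
    # Return the years in chronological order for easy graphing.
    return dict(sorted(counts.items()))


# Hypothetical sample data, invented purely for illustration.
sample_records = [
    (1895, "The scientific method in the study of history."),
    (1895, "Notes on colonial charters."),
    (1910, "A scientific approach to archival sources."),
    (1910, "Scientific history and its critics."),
    (1925, "Social forces in American history."),
]

scientific_by_year = count_term_by_year(sample_records, "scientific")
print(scientific_by_year)
```

A count like this depends entirely on each record carrying a trustworthy publication year, which is why the quality of the metadata matters so much for this kind of work.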
And my method is not perfect—some of the changes turned up by my method can be arbitrary or particular to the journal. But they often provide a way of looking past individual members of the discipline to the larger group, opening up a social perspective on the discipline. Fortunately, there are other scholars who are far more proficient in text mining and statistics. They are already developing new and more sophisticated ways to dig into the information, and as they produce new models and new methods for researchers like me to follow, I expect this will open up exciting new areas of historical research.
So I remain quite optimistic about the way the discipline is integrating digital databases into its work. Perhaps one day, Google will get around to fixing the metadata in Google Books and make it a more useful resource for historical text mining. In the meantime, I wish the digitally advanced members of our discipline could pair their enthusiasm for these technologies with a little patience for their slower-moving colleagues.