Digital History As Data Transliteration

putting digital history in its place, and letting it move forward from there: a response to seth denbo.

Ben Schmidt’s “grid classifier,” from his whaling logs and digital history series.

A comment on Seth Denbo’s post, “Digital Dispatches: Data Storytelling and Historical Knowledge,” Perspectives on History, April 2015:

Seth —

Sorry, this turned into a long comment. I hope it is worth your while.

I like how you put pressure on the concept of data as currently understood in digital history. It makes far more sense to situate the idea of data in the context of the full range of sources that historians have always turned to in order to create interpretive stories about the past. Data are not just numerical forms that easily lend themselves to computational analysis; data are all modes of evidence from the past in whatever mediations they inhabit, whether it be on paper or in performance or in digital versions.

To think of digital data as one type of evidence (I’m drawing upon Gibbs and Owens’ great work on this in “The Hermeneutics of Data and Historical Writing,” in Writing History in the Digital Age, eds. Jack Dougherty and Kristen Nawrotzki) is to put the digital domain in its place and, in doing so, also to access its potential for enhancing historical discovery. It means considering digital history in continuity with longer-running historical practices rather than imagining the turn to the digital as some kind of radical break. There’s more professional payoff in pitching it as a radical break, of course, but ultimately less intellectual value.

To place digital history in the context of existing historical practices of analyzing data, sources, and evidence allows us to consider more creatively what we might do with computation as historians. The computer is not going to “do” history for us magically; for example, I don’t think it inherently links “big” data to “big” history to “big” policy impact, at least not necessarily, despite certain claims found in Armitage and Guldi’s History Manifesto. What it might do, to my mind, is something more basic, but also more important: it gives us an additional powerful tool for accessing and considering trends and patterns in the past, and doing so across many different levels of scale, scope, size, form, mode, and medium.

I actually think the intense focus on scale has led us all a bit astray. The more important aspect of digital historical tactics to me has to do with the potential ductility of digital data and code. Being able to transliterate our evidence among different “iterations”—or outputs—is the key to why digital history methods are worth pursuing. With the right tools and interfaces, melding together statistical, artistic, and other approaches, we can reveal new aspects of sources and evidence because of how data can be transposed from one form to another: a set of numbers to a chart, a map to a table, an image to sound. From these transliterations, we can begin to perceive details and overarching trends that we could not access previously.
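
As a minimal sketch of what such a transposition might look like in practice, the Python below takes one small series of numbers and renders it two ways: as a text bar chart and as a list of pitch frequencies that could be handed to a synthesizer. The yearly counts are invented purely for illustration, and the function names (`to_bar_chart`, `to_pitches`) and the 220 to 880 Hz range are my own assumptions, not drawn from any existing project.

```python
def to_bar_chart(series, width=20):
    """Render {label: value} as rows of '#' scaled to the largest value."""
    peak = max(series.values())
    rows = []
    for label, value in series.items():
        bar = "#" * round(width * value / peak)
        rows.append(f"{label} {bar} {value}")
    return rows


def to_pitches(series, low_hz=220.0, high_hz=880.0):
    """Map each value linearly onto a frequency range (lowest value -> low_hz)."""
    values = list(series.values())
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    return [low_hz + (v - lo) / span * (high_hz - low_hz) for v in values]


# Hypothetical counts, invented purely for illustration.
voyages = {"1840": 12, "1850": 30, "1860": 18}

for row in to_bar_chart(voyages):
    print(row)
print([round(hz) for hz in to_pitches(voyages)])
```

The same three numbers become a picture in one case and a rising and falling melody in the other. Whether the linear mapping to pitch is a convincing correlation between input and output is exactly the kind of choice that is open to debate.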

The work ahead is to continue to figure out which transpositions are convincing and which are not, and to develop, debate, and hone methods that we believe legitimately correlate one input of data with a different output or representation. There is a lot of experimentation and debate still to be had before any kind of established practice emerges and settles into place.

For instance, when the students in my Digital History course worked with Ben Schmidt’s whaling logs experiments (the very ones you mention), we were not that impressed with the visualizations or their capacity for storytelling. They are actually quite confusing, though we accepted that they were experimental rather than a fully developed project, and we are all big fans of Ben’s work. What pulled us in was not the visualizations but his contention that historians can make intelligent and creative use of statistical tactics to handle “messy” data. The whaling logs, he points out, are not neat or accurate representations of the past. They are full of gaps, inaccuracies, and questionable data (see Ben’s posts on his Sapping Attention blog: “Logbooks and the long history of digitization,” 12 October 2012; “Machine Learning at Sea,” 1 November 2012; and “Where are the individuals in data-driven narratives?,” 14 November 2012). So, according to Ben, one has to be statistically savvy in order to search for patterns within these data that might convincingly link evidence to argument. In being so, one is of course already making certain interpretive moves, which is fine, since there is never any purity in historical analysis.

Ben’s major historiographic intervention was to claim that the whaling logs reveal not a developing global network of capitalism in the nineteenth century, but rather a capitalist-industrial degradation of the environment through ongoing over-harvesting of fish from particular areas of the oceans. To harvest evidence of this himself, Ben employed machine learning using, in his words, “a two level application of K-nearest neighbor classification based on a training set tagged by a number of origin ports” (see “Reading digital sources: a case study in ship’s logs,” 15 November 2012 on Sapping Attention). What is crucial is how Ben devised and implemented this system of data analysis and whether we find his choices convincing. If one agrees with the ways he chose to sort and organize, curate and analyze his data (how he wielded machine learning, statistics, and certain educated guesses to organize his evidence into a story), one will find his argument compelling. If one questions his position that we must examine historical activity at an aggregate level rather than at the level of individual actors, or if one takes issue with his choices in how to wield his data to make this claim, then we are off and running with what I think will be a rich and rewarding digital historiography.
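
To make concrete where those choices enter, here is a minimal sketch of plain, single-level K-nearest neighbor classification in Python. It does not reproduce Ben’s two-level pipeline or his data: the features (a pair of numbers standing in for something like a voyage’s average latitude and longitude), the labels, and the function name `knn_classify` are all hypothetical, chosen only to show the shape of the technique.

```python
from collections import Counter
from math import dist


def knn_classify(point, training, k=3):
    """Label `point` by majority vote among its k nearest training examples."""
    nearest = sorted(training, key=lambda ex: dist(point, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical hand-tagged training set: (feature vector, label).
# The numbers and labels are invented, not taken from the whaling logs.
training = [
    ((41.0, -70.0), "whaler"),
    ((40.5, -69.0), "whaler"),
    ((39.8, -71.2), "whaler"),
    ((51.0, -3.0), "merchant"),
    ((50.2, -4.1), "merchant"),
    ((52.3, -2.5), "merchant"),
]

# Classify an untagged voyage by the labels of its three closest neighbors.
print(knn_classify((42.0, -68.5), training, k=3))
```

Every decision in even this toy version (which features to measure, how to define distance, what value of k to use, who tagged the training set and how) is an interpretive move, and each is open to the kind of historiographic scrutiny described above.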

All of these moves that Ben and other emerging digital historians are making are not about some simple, straightforward reach back to grasp the past with the digits of the digital; rather, they demand a continual and sophisticated awareness of how data need to be “massaged” in order to make sense of them and render them into intelligible historical stories. Digital data and computation present new possibilities for this “massaging” through multiple means (statistical inquiry, visualization, sonification, multimedia re-renderings, “deformance” tactics, etc.). One can do this statistically, as in Ben’s case. One can do it through spatial history and visualizations, as Richard White does when he turns seemingly inscrutable railroad freightage rate tables into “distortion visualizations” on maps and thereby unearths the political and cultural struggles between rural populist farmers and railroad companies (see “Seeing Space in Terms of Track Length and Cost of Shipping” on the Stanford Spatial History Project website; I’ve written more about it in “Space-Time Continuum: Richard White’s ‘The Spatial Turn,’” on the Issues in Digital History blog, 18 October 2013). One can do it in other ways too (for instance, I’m experimenting with the sonification of images to perceive new historical patterns within them; see Michael J. Kramer, “A Foreign Sound In Your Ear: Digital Image Sonification,” on Issues in Digital History, 18 February 2015). But the overarching point is that the debates should be about which tactics are convincing and why or why not, not about contrasting digital history as a whole to “analog” history.

We are, as you point out, digitally extending existing modes of “data-driven” storytelling. This means that when going digital, we should not think of ourselves as abandoning past avenues of wielding sources and evidence to make sense of the past; we should think of ourselves as embedded in and reaching out digitally from that past itself.

Thanks for the great post.

— Michael
