“Process is the new god…Digital Humanities mean iterative scholarship…It honors the quality of results; but it also honors the steps by means of which results are obtained as a form of publication of comparable value. Untapped gold mines of knowledge are to be found in the realm of process” (The Digital Humanities Manifesto 2.0, 5).
When I’m working on my project, I often think back to this quote. My project deals with doing the same thing over and over again. I’m working on a sentiment analysis of Marcus Tullius Cicero’s orations paired with a network analysis of his letters in the Ad Familiares. It involves hours of data entry, running the same tests in R with slight variations, and corpus preparation (otherwise known as me struggling to put spaces in between the paragraphs of the .txt files). “Iterative scholarship” is a nice way to put it; “torture” is another way.
However, Digital Humanities requires a dash of optimism and a handful of perseverance, an ability to squint and see the gold mines hidden “in the realm of process.” And indeed, within my first two weeks of being at the Digital Studios, I do feel like I’ve gleaned little, golden nuggets of knowledge.
While struggling with the spreadsheets for Gephi, I became acquainted with Cicero’s familiars and the world they all lived in. I’ve made many mistakes while putting these spreadsheets together. I mixed up the Catones and the Bruti often and realized that there was more than one Caesar. But through my errors and through the process in general, I’ve already gained insights into Cicero’s world without even running the data through Gephi yet. I’ve already seen and counted the people that Cicero writes about the most (although it comes as no surprise that he mentions Julius Caesar the most).
My most recent struggle has been with inserting breaks within the .txt files so that I can measure sentiment paragraph by paragraph. At first, I was doing this manually, which is just as tedious as it sounds. Then Matthew Butler, the Senior Developer here at the UIowa Digital Scholarship & Publishing Studio, guided me, and together we figured out a way to have the computer do the tedious work for us through Python. It was my first time grappling with Python, but I had a great guide who forged a path for me through the Python jungle in the realm of process. Not only did I become familiar with Python, but while I was manually inserting breaks, I also became further acquainted with my corpus. My files had been “lemmatized”: in short, every Latin word is, in theory, reduced to its dictionary form. For some reason, this lemmatizer changes first conjugation infinitives to the second person singular passive. For example, putare becomes putaris instead of puto. Having noticed this, I was able to adjust my sentiment lexicon accordingly and improve it.
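For the curious, here is a minimal sketch of how a break-inserting script could look. It is not the script from the Studio. It assumes that each paragraph already sits on its own line in the .txt file, which may not hold for every corpus, and the function names and the file name in the comments are hypothetical.

```python
def insert_paragraph_breaks(text):
    """Return the text with exactly one blank line between paragraphs.

    Assumption: every non-empty line of the input is one paragraph.
    """
    # Drop empty lines and stray whitespace, keeping one entry per paragraph.
    paragraphs = [line.strip() for line in text.splitlines() if line.strip()]
    # Rejoin with a blank line between paragraphs and a final newline.
    return "\n\n".join(paragraphs) + "\n"


def split_paragraphs(text):
    """Split text on blank lines so each paragraph can be scored separately."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


# Hypothetical usage on a single file (the file name is made up):
#
#   from pathlib import Path
#   path = Path("oration.txt")
#   cleaned = insert_paragraph_breaks(path.read_text(encoding="utf-8"))
#   path.write_text(cleaned, encoding="utf-8")
```

Once the breaks are in place, splitting on blank lines yields one chunk of text per paragraph, and each chunk can then be handed to whatever sentiment routine comes next.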
This iteration also occurs in traditional scholarship. We read and reread the same works repeatedly, poring over the same lines for close readings. Articles are written with the same argument but with minor tweaks. Then why does digital scholarship seem so different? It’s not. As I pass my time here, I realize that the gap between traditional and digital scholarship isn’t as wide as it first seemed. I’m spending much more time with the texts than I am with R and Gephi. Still, I understand that Digital Humanities can be intimidating. The realm of process can seem like a hopeless waste(of time)land to some. But new, golden discoveries, both personal and professional, lie in wait; you just have to persevere. And if you’re a Classicist and you can successfully struggle through Tacitus and Aristotle, you can also defeat the mighty Python and wrestle with Gephi.