My dissertation, “Messengers and Messages in Middle English Literature,” examines the under-explored role of messengers in fourteenth-century English romances, where they often prove to be crucial elements of the plot or interesting stand-ins for an authorial function. During my capstone experience, I will be pursuing corpus-linguistic investigations, with the help of the Digital Studio’s Nikki White and Matthew Butler, into keywords for messengers and messages and their collocations. This is a particularly challenging undertaking because Middle English has no spelling standard: any single keyword can appear in many different spellings, which makes the number of possible search terms positively daunting. My capstone project will work toward solving this problem and will eventually result in a dissertation chapter that demonstrates the value of a Digital Humanities approach to Middle English literature, complete with data visualizations and a discussion of process.
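To give a sense of the spelling problem, here is a minimal Python sketch of one possible approach: collapsing a hand-made list of variant spellings into a single search pattern and counting the hits. The variant list, the function names, and the sample sentence are all made up for illustration and are not drawn from the corpus; a real list would have to be compiled from a resource such as a dictionary entry.

```python
import re
from collections import Counter

# Hypothetical variant spellings, for illustration only.
VARIANTS = ["messager", "messagere", "messanger", "messenger", "messengere"]


def build_pattern(variants):
    """Combine variant spellings into one case-insensitive whole-word pattern."""
    # Try longer spellings first so a short variant cannot pre-empt a longer one.
    ordered = sorted(variants, key=len, reverse=True)
    alternatives = "|".join(re.escape(v) for v in ordered)
    # An optional trailing "s" catches simple plurals.
    return re.compile(rf"\b(?:{alternatives})s?\b", re.IGNORECASE)


def count_variants(text, pattern):
    """Count how often each matched spelling occurs, ignoring case."""
    return Counter(m.group(0).lower() for m in pattern.finditer(text))


# Invented sample sentence, not a quotation from any text.
sample = "The messager rode forth; a Messenger came, and two messangers after."
counts = count_variants(sample, build_pattern(VARIANTS))
print(counts)
```

A fixed list like this only finds the spellings someone has already thought of, which is exactly why the problem is daunting. It is a starting point rather than a solution.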
The first step in the process was to build a corpus of medieval texts on which I could conduct textual analysis. Because my dissertation focuses on medieval English romances, I decided to limit the corpus strictly to medieval texts written in Middle English, excluding those written in Latin, Anglo-Norman, Old French, and other languages of continental Europe. This decision had the added benefit of narrowing my prospective corpus down to a much more manageable number of texts.
The next challenge was to acquire the text files that would make up my corpus. Based on my preliminary research, the easiest route seemed to be assembling a collection of texts on HathiTrust. While most of the texts I needed were likely available there, curating a corpus one text at a time proved to be more time-consuming and less intuitive than I had originally imagined. Eventually, I realized that someone else had probably already done this work and might even be willing to share. Fortunately, I was right.
As luck would have it, the files I needed had already been assembled by one of the most well-known and longest-running digital humanities projects in medieval studies: the Middle English Dictionary project, hosted at the University of Michigan. Nikki White reached out to the project manager, Paul Schaffner, who provided a link to the entire corpus of Middle English texts. This proved to be a huge breakthrough for the project. Since then, I have been sorting and ‘cleaning’ the data in preparation for textual analysis. This has involved organizing the files and making sure that the corpus contains no duplicate texts. For instance, the files included several different versions of Geoffrey Chaucer’s The Canterbury Tales, which could have skewed the results of the analysis. To streamline the sorting and cleaning, Matthew Butler helped me write a script that pulls the metadata from the files and exports it to a CSV file that can be read in Microsoft Excel. I am now about to begin performing textual analysis on the corpus. More on that next time!
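For readers curious about the metadata step, here is a minimal Python sketch of the general idea: read a title and an author from each file, flag titles that appear more than once, and write the result as CSV. It is not the actual script. It assumes, purely for illustration, that each file is XML with `title` and `author` elements, and the function names, filenames, and sample records are all hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import Counter

FIELDS = ["filename", "title", "author", "possible_duplicate"]


def extract_metadata(xml_text, filename):
    """Pull title and author from one XML document (assumed tag names)."""
    root = ET.fromstring(xml_text)
    title = root.findtext(".//title", default="").strip()
    author = root.findtext(".//author", default="").strip()
    return {"filename": filename, "title": title, "author": author}


def flag_duplicates(rows):
    """Mark every row whose lower-cased title occurs more than once."""
    counts = Counter(row["title"].lower() for row in rows)
    for row in rows:
        row["possible_duplicate"] = counts[row["title"].lower()] > 1
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical stand-ins for corpus files; a real run would read from disk.
samples = {
    "ct_a.xml": "<text><header><title>The Canterbury Tales</title>"
                "<author>Geoffrey Chaucer</author></header></text>",
    "ct_b.xml": "<text><header><title>The Canterbury tales</title>"
                "<author>Chaucer, Geoffrey</author></header></text>",
    "other.xml": "<text><header><title>Sample Romance</title>"
                 "<author></author></header></text>",
}
rows = flag_duplicates([extract_metadata(x, name) for name, x in samples.items()])
print(to_csv(rows))
```

Matching on lower-cased titles is deliberately crude. It catches the two sample versions of The Canterbury Tales, but it would miss duplicates catalogued under different titles, so the flagged rows still need to be checked by hand in the spreadsheet.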