Hello blog readers! I’m Hayder, one of the summer fellows in the UIowa Digital Scholarship & Publishing Studio. I’m creating a website to educate people about a new fuel combustion process called Chemical Looping Combustion (CLC). CLC is a promising method of burning natural gas to produce energy. Instead of air, the process uses the lattice oxygen of metal oxides to combust the natural gas, which minimizes the formation of pollutant byproducts such as NO2, N2O, and NO, the gases that form when the reaction occurs in air (a mixture of N2 and O2). The CLC process is highly efficient with little to no side reactions. Minimizing the formation of these pollutant gases could help address global warming!
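For readers who want the chemistry at a glance, CLC is usually summarized as two coupled reactions, with the metal oxide (written generically here as MeO) shuttling oxygen between a fuel reactor and an air reactor. This is the generic textbook scheme, not necessarily the exact stoichiometry of the cobalt oxide system we study:

```latex
% Generic CLC scheme (illustrative; Me = metal, MeO = metal oxide)
\mathrm{CH_4} + 4\,\mathrm{MeO} \longrightarrow \mathrm{CO_2} + 2\,\mathrm{H_2O} + 4\,\mathrm{Me} \quad \text{(fuel reactor)}
2\,\mathrm{Me} + \mathrm{O_2} \longrightarrow 2\,\mathrm{MeO} \quad \text{(air reactor)}
```

Because the fuel never meets the nitrogen in air, the exhaust from the fuel reactor is essentially just CO2 and steam, which also makes the CO2 easy to capture.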
I have been working on this project since Spring 2014 as my main research topic for a PhD in Chemical and Biochemical Engineering. The project is funded by the NSF, and one of its objectives is to train the next generation of multidisciplinary scientists and to engage with the public: while advances such as CLC create opportunities for better living through chemistry, they also oblige us to educate people about them. Under this objective, we partnered with the Digital Scholarship & Publishing Studio to create a website presenting background information about pollution, the CLC process, and our research results in this field.
With the Studio’s help, I am also creating animated figures that can explain the outcomes of our investigation. Currently, I am working on an animated figure that explains the reduction mechanism of cobalt oxide in the CLC process. The figure below demonstrates the reduction mechanism, but without animation.
I’ve been called a “Luddite” for over a decade, mostly by myself to pre-empt the comment from others. It’s for good reason: I do, after all, make books by hand. In fact, at the University of Iowa Center for the Book, where I am an MFA candidate in Book Arts, I make them from scratch: writing my own work, forming sheets of paper, setting and printing movable type, and binding it all into a final book object with thread and needle, and perhaps some glue. So how could I find a foothold in the Digital Scholarship and Publishing Studio Summer Fellowship?
It’s true that I am not the most comfortable with technology. Do you know that my parents run the leading computer technology store in my home of Nassau, The Bahamas? Let’s just say the apple fell far, far away from that tree. But through my work at the UICB, I have gained a deeper understanding of how these crafts have shaped our digital world—even just considering the rich depth of book and type history in contemporary digital design language, these disparate realities intersect more than we think. Now more than ever, I’ve been contemplating how to use digital spaces to advance or re-examine or share literature.
In 2009, while finishing up my BFA in Writing at Pratt Institute and navigating the world of handmade books and independent press culture, I began my own small press for Caribbean literature called Poinciana Paper Press. Truthfully, two close friends, and Bahamian writers I admired, wanted to publish their work—one couldn’t find a literary magazine to accept their short story, and the other didn’t want to wait over a year for his poetry collection to see the light of day—and I offered to make limited editions of their work as handmade chapbooks. Thus began my press. I enjoy making beautiful and engaging books, but my true love remains the words that inspire these vessels. I know that I’m biased, but Caribbean literature has the most dynamic range of work out there in the literary landscape, and I want to make it more accessible.
I also love podcasts, which keep me company as I bind many books in the studio. Years ago, I entertained a light bulb moment that explored the possibility of a podcast for Caribbean literature—not in the form of a talk show, but more in the form of authors simply reading their work. “Ah,” I thought. “Maybe in another life where I have the time and skill set to record and edit audio with all that fancy and complicated equipment,” and promptly let the idea collect dust in my “one day” folder. When the opportunity for this fellowship came along, that light bulb flickered back on. What if, with the professional support and guidance of the people in the Digital Scholarship and Publishing Studio, I could actually gain these unknown skills and make this dream a reality? After all, I am exploring how to diversify my publishing platforms. I put my straw hat in the ring, and I am now thrilled to have the summer to produce my podcast for Caribbean literary voices, Tongues of the Ocean.
Hailing from the fractured physical landscape of the Caribbean and its diaspora, we have used digital spaces like online forums to sustain important literary exchanges with one another, building a dynamic community with a range of voices, histories, and experiences. While revered print publications in the region were negotiating how to move into the digital realm, an online-only literary magazine changed the game. Founded by Nicolette Bethel in 2009, tongues of the ocean brought together exciting new work by up-and-coming Caribbean authors in a very accessible way. Though it shared its last issue in 2014, the work of its contributors marks a provocative shift in voice and aesthetic, reflected in their full-length book collections a decade later. When I thought more seriously about making this podcast a reality, I approached Nicolette to ask if the name could live on in this new manifestation, and she graciously agreed. I am thrilled that she will be my first guest on the show, and overwhelmingly grateful to build the podcast upon this significant foundation.
This project is not without another important context. In the mid-20th century, the BBC’s groundbreaking Caribbean Voices broadcast made literature from the colonial Anglophone Caribbean space accessible to listeners far and wide. Giving Caribbean writers a platform through which to share their work, this program marked an important turning point in literary history in the region. I am certainly not the powerhouse of the BBC, and I don’t see myself as having any definitive authority over the Caribbean literary canon by any means; all I can hope for is that I too can successfully use a contemporary tool to share the current literary landscape of my home. Basically, my endgame is this: I just want people to hear Caribbean literature, and to fall in love with it like I do every time I open a book from the region.
So far I have been overwhelmed by the show of support from the Studio, the other fellows, and also from writers and creative thinkers in the Caribbean region. Besides Nicolette’s leap of faith, I also have to thank Holly Bynoe from the National Art Gallery of The Bahamas, Nicholas Laughlin from the Bocas Literary Festival, and Deborah Anzinger from NLS Kingston for their encouragement, critical feedback, vital guidance, and willingness to connect me with the tools and people I need to launch this project. Armed with these means, I spent the first part of this summer, ahead of the fellowship, thoughtfully fleshing out the mission and structure of the podcast and identifying how to problem-solve my anxieties (which all stem from my inexperience with audio equipment, recording standards, and editing). With Cydne Coleby working on branding, Liam Farmer on theme music, and Lisa Benjamin helping me navigate the legal precautions of this venture, the foundation of the podcast was coming together well before I started the fellowship.
After a week of weighing audio recording possibilities through the Studio, another DSPS fellow, the wonderful Mary Wise, brought to my attention a podcast recording studio on campus, helping a final piece of the foundation fall into place (and allaying a great portion of my audio-related anxiety). After a test run this week, and with branding coming together by the beginning of July, I should be poised to take the plunge with legit guests by mid-July, and I will report back on those successes (and inevitable challenges and learning curves) in my final blog post.
In recent years, digital humanists have been at the forefront of challenging data’s supposed neutrality. Lisa Gitelman and Virginia Jackson have suggested that the discourse of objectivity that often surrounds conversations about data-driven research is not only reductive, but also unlikely to encourage future scholarship and more rigorous debate. They suggest instead that data be thought of as “situated and historically specific,” and that we recognize that “it comes from somewhere and is the result of ongoing changes to . . . conditions that are at once material, social, and ethical” (4). Indeed, just as words encode strings of meaning which can be ambiguous and open to interpretation, so too are numbers and databases invested with a rhetorical significance that must be tested and scrutinized. Information and the means by which it is assembled, organized, and presented must not be thought of as self-evident or as a set of givens; rather, knowledge and the methods by which it is retrieved and made accessible should be open to interpretation.
Only recently have scholars begun to seriously explore the assumptions embedded in data visualization and graphical display. Johanna Drucker invites digital humanists to “take on the challenge of developing graphical expressions rooted in and appropriate to interpretative activity.” She takes issue with “realist” models of data visualization which appear to be motivated by the assumption that graphical displays and user interfaces show the phenomenon itself, rather than an interpretable representation of it. “Data,” she writes, “pass themselves off as mere descriptions of a priori conditions,” in turn foreclosing important discussions of ambiguity and uncertainty that could open up meaningful scholarly debate. Writing in 2011, Drucker felt that data visualizations concealed the very phenomena they were meant to expose. Meaningful insights were hidden behind an apparently “objective” digital representation, and more than five years later, I am left wondering the extent to which scholars of the digital have seriously confronted these problems.
At the same time, I’m equally interested in thinking about what existing visualizations can reveal about my own data. Rather than build a project from the ground up, how can I leverage, hijack, or appropriate the already-important work done by others in an effort to make it “fit” with my own data? Of course, such an open source or peer-to-peer mentality depends upon a number of important factors, not least among them the ability to negotiate permissions for modifying someone else’s code. If all goes to plan, though, fitting data into existing visualizations can actually reveal unknowns without sacrificing the kind of interrogations Drucker imagines as being so essential to digital humanities scholarship. At the very least, it has taught me how to creatively redeploy existing technologies and manipulate graphical representations for my own ends.
My project, Mapping Whitman’s Correspondence, is in many ways concerned with visualizing Whitman’s social network as it emerged in place and time. Importantly, the code I used to animate the trajectory of sent and received messages was adapted from a radically different project motivated by very different research objectives. University of Iowa professor Caglar Koylu actually wrote the code as part of his own scholarship on immigration patterns in Europe and the United States, and he was gracious enough to allow me to tinker with it in order to see what the code might reveal about my own data.
While my initial prototype utilized geocoding methods in order to visualize the flow of letters, much important information was nevertheless lost in the process. The correspondents themselves are most obviously absent from the visualization, and more nuanced data like the content of the letters, as well as frequent topics of conversation, were subordinated to an aesthetically pleasing though informationally limited animation.
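For readers curious what the geocoding step looks like in practice, here is a minimal sketch of the general technique using the open-source geopy library. It is an illustration, not the project’s actual pipeline, and the letter shown is a hypothetical example:

```python
# Minimal geocoding sketch: turn a letter's "from" and "to" place names
# into coordinates that a flow animation can draw between.
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="correspondence-mapping-demo")

# Hypothetical example; real records would come from the archive's metadata.
letter = {"from": "Camden, New Jersey", "to": "London, Ontario"}

for field in ("from", "to"):
    location = geolocator.geocode(letter[field])
    if location is not None:
        print(f"{field}: ({location.latitude:.4f}, {location.longitude:.4f})")
```

Each letter then reduces to an origin-destination pair with a date, which is presumably why migration-flow code could be adapted so readily: a sent letter and a migrating person have the same basic data shape.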
Only recently have I started to focus more of my attention on thematic concerns, such as the often unacknowledged intersection of the poet’s public and private writing activity. What types of connections would I find between his letters and his most famous work, like Leaves of Grass? Are the boundaries separating public and private as rigidly maintained as we often believe? To explore these questions further, I began identifying all letters which contained references to Whitman’s poetry, prose, or journalistic activity. The more I worked with these documents, and the more I sensed the interconnections between them, the clearer it became that a more interactive interface was essential for helping users navigate the often overwhelming amount of data contained within the correspondence network.
I opted for what’s known as a concept map. In this iteration, the object consists of three parts: a table of “names” in the center, a list of “themes” on the left, and a cluster of “perspectives” on the right. Initially, I positioned individual correspondents in the middle, and the nodes which extended on either side contained references to Whitman’s writings. At first, this made the most sense to me: the project is about the correspondence, so why not highlight the actual writers by situating them in the center of the “map”? After having met with digital humanities librarian Stephanie Blalock, though, it became clear to me that, while useful, what I had proposed was essentially a glorified finding aid. For example, users could very easily see what Whitman’s doctor friend, Richard Maurice Bucke, was most interested in talking about, but such discoveries revealed little else about the correspondence network.
What if I changed things up a bit? What would the concept map look like if I foregrounded gender as opposed to individual writers? What could this reveal about the nature of Whitman’s correspondence? Not only that, but what could this reveal about archival practice more generally, a potentially generative inquiry seeing as how all of my data is assembled from The Walt Whitman Archive. Perhaps I could begin to make inferences about the gendered nature of archival research, curation, and preservation in Whitman studies.
In the image above, you can see that I repositioned “themes” and “perspectives” in the middle of the map, with female correspondents now located on the left, and male correspondents on the right. While informative, doing so is not without its problems, and it is here where I return to Drucker’s observation regarding the assumptions embedded in visualizations themselves. Indeed, while the concept map does inherently preserve the notion of gender as a binary construct, it is also useful for conceptualizing the gendered identities that undergird epistolary activity. But whereas Drucker might consider digital repackaging to be fundamentally “at odds with humanistic method[s]” of interpretation and analysis, I see it as opening a dialogue where none might otherwise exist. If, as I believe, Whitman’s poetry and prose can be thought of as existing within a shared network of public and private activity, then it is important to consider the ways in which factors such as race, gender, and class contribute to such production.
Drucker, Johanna. “Humanities Approaches to Graphical Display.” Digital Humanities Quarterly, vol. 5, no. 1, 2011. Web. 3 July 2017.
Gitelman, Lisa, and Virginia Jackson. “Introduction.” “Raw Data” Is an Oxymoron. Cambridge: MIT Press, 2013. Print.
Arianna here! I’m another one of the summer fellows working in the Digital Scholarship and Publishing Studio. I’m also a dancer! I’m currently working towards finishing my MFA in Dance Performance this spring.
During my first couple weeks in the studio I’ve been exploring some digital media components to incorporate into my dance performance, including some digital interactivity and motion capture work. This requires familiarizing myself with the Kinect motion capture system, digital creative programs like Adobe After Effects and Premiere Pro, and, perhaps most importantly, a computer program called Isadora.
In short, Isadora acts as the control unit that drives visual manipulation components in a performance. Within the program, the user connects different modules in order to manipulate components like lighting, special effects, or video imagery that is played on a projection screen. The modules allow information to travel from source to source, and allow the user to transform an experience—similar to the way a stage manager would orchestrate the configuration of lighting, sound, and other effects in a theater experience. Here’s an example of a skeletal tracking sample patch within the Isadora program:
Now you’re probably thinking, “That looks like computer programming,” and you’re right. Your next thought is probably (and understandably), “Dance and computer programming? But why?…I don’t get it”. It’s true, these two mediums don’t seem to mix. But there is a growing and evolving following of this marriage of mediums that people have grown to call “digital dance making.”
Without question, the programming aspect of this work has been the most challenging. Maybe frustrating is a more accurate word? Okay, I’ll say it: it’s the worst.
Jokes aside, as someone who has very little experience with computer coding and programming, it’s really challenging to troubleshoot issues as they come up. Maybe the most surprising realization has been how many extra programs need to be installed just to get the skeletal tracking and motion capture programs running.
But I think I’m getting the hang of things. I’m most excited to start meshing my dance practice into the digital aspects I’ve been working on. I think the motion capture data from my improvisational dance scores is going to be incredible once mixed with the Adobe design components that I’ve been creating.
I’m pretty passionate about this idea of blending artistic mediums, and especially this idea of incorporating anything digital into artistic practice. One of my favorite debates is whether art imitates life, or life imitates art; regardless of the answer, our lives are certainly incredibly influenced by technology—how could this not be true of our art?
I have been tasked with writing an engaging and honest blog about my work as one of the Digital Scholarship and Publishing Studio fellows, but I have a problem. I rarely use the adjective “engaging” to describe my writing. I have, I think, managed to find a creative solution to this particular conundrum. Using something I love, the 1987 adventure movie The Princess Bride, I will review the three biggest challenges I have faced during my summer work at the Studio. I hope you enjoy!
**A Note About Links: When I mention a tool or program I use, I’ve linked a resource from a digital humanities scholar who provides an introduction or overview of that tool! Happy clicking!**
Method to My Madness: Which Way’s My Way?
My dissertation project examines American Indian civil and legal protest at earthworks and burial sites in the Midwest at the turn of the twentieth century. Pretty cool, right? Though I do not have many digital sources, I have a plethora of textual sources from professional and amateur archaeologists who observed a handful of American Indian people who camped near earthworks and burial sites. Using Google Forms and Excel, I transcribed correspondence from the papers of Ellison Orr, an amateur archaeologist active in Iowa from 1916 to 1951, and Charles Reuben Keyes, the first director of the Iowa Archaeological Survey (1921-1951). I parsed these letters for geographic data (the To and From addresses) and used categories I created to organize the correspondents. But, what to do with this data? What questions could I ask it? Though I am not a digital methods skeptic, I am not an expert on digital methods, and there are moments when I feel genuinely overwhelmed by the possibilities. Social network analysis, textual analysis, and mapping are just a few of the many methods that digital humanities scholars use with data sets like mine! Much like Fezzik, I found myself wondering, “Which way’s my way?”
Collaboration saved my project and my sanity. Nikki White and Rob Shepard, my Studio points of contact, reviewed my data and made some recommendations on the kinds of visualizations I could create. Their expertise has been invaluable. Unfortunately, the data I have would not make an analytically valuable social network visualization. But, I did have interesting geographic data and perhaps with some creativity and effort I could create a dynamic and interactive map!
Data Cleaning: Am I Going Mad?
Armed with a mapping method, I moved to cleaning my data. I used Excel pivot tables to move through my columns, and I opened and closed OpenRefine to clean my data. Even though I did not crowdsource my data, it was still quite messy. Some messiness was human error: capitalization, spelling, spacing, and the like. But I was driven to the brink of insanity when forced to grapple with inconsistencies that have nothing to do with spelling or capitalization. Rather, I had to make and document choices about the data in my spreadsheets, because the questions changed between when I started gathering the data (last year) and now. Column after column and row after row, I made decisions about each cell (a small sketch of this kind of normalization pass appears after the note below). My method of cleaning data is not new or revolutionary. Digital humanities scholars have been publishing excellent scholarship on working with and cleaning data for decades. Three excellent pieces from humanities scholars on the challenges of working with data are:
**If you are a digital novice and the term data makes you feel queasy, I would highly recommend Daniel Rosenberg’s “Data before the Fact”**
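For the curious, here is what a first normalization pass like the one described above might look like in code. This is a generic illustration with hypothetical column names, not my actual spreadsheet:

```python
# A tiny sketch of a data-cleaning pass, using pandas.
# Column names are hypothetical; a real sheet's layout may differ.
import pandas as pd

df = pd.read_csv("correspondence.csv")  # e.g., exported from Excel

# Fix the mechanical messiness: stray whitespace, inconsistent spacing.
for col in ["sender", "recipient", "from_town", "to_town"]:
    df[col] = df[col].str.strip().str.replace(r"\s+", " ", regex=True)

# Make documented, repeatable choices: e.g., collapse spelling variants.
df["from_town"] = df["from_town"].replace({"Mc Gregor": "McGregor",
                                           "mcgregor": "McGregor"})

df.to_csv("correspondence_clean.csv", index=False)
```

The judgment calls, of course, are exactly the part code cannot automate; the replacement table above is where those documented choices end up living.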
In Lieu of a Conclusion: Where I am Going Next
Armed with clean data, I am very excited to work on three visualizations that I hope will help me answer the research question at the heart of my dissertation. For the sake of brevity, I will only describe the first visualization I hope to create. Primary source research has revealed the activism of Emma Big Bear (1869-1968), a Ho-Chunk woman who lived near the mounds in northeastern Iowa from 1917 until her death in 1968.
Emma Big Bear returned to northeastern Iowa from the Winnebago Reservation in Nebraska around 1917. She and her husband Henry Holt lived in a wickiup, spoke only Ho-Chunk, and lived near the earthworks in northeastern Iowa for almost five decades. At the same time that Emma occupied these sites, amateur archaeologists opened earthworks and hunted for artifacts. I hope to visualize Emma’s campsites alongside a map of the sites that artifact collectors frequently targeted. Artifact collectors described the sites they excavated when they wrote to Ellison Orr (mentioned above). Did Emma and Henry ever cross paths with the most active amateur archaeologists in the region? Did she and Henry ever observe the excavations of amateur archaeologists like Ellison Orr, Dale Henning, or Paul Rowe? Using oral histories and amateur archaeologist correspondence, I hope to create an interactive public exhibit using ArcGIS Online based on my data.
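As a first step toward that exhibit, the data can be shaped into a simple point layer, since ArcGIS Online can import a CSV with latitude and longitude columns as a hosted layer. Here is a sketch; every name and coordinate below is an illustrative placeholder, not my research data:

```python
# Sketch: write a point layer as CSV for import into ArcGIS Online.
# All names and coordinates are illustrative placeholders.
import csv

sites = [
    # (label, latitude, longitude, category)
    ("Campsite A", 43.03, -91.18, "Big Bear campsite"),
    ("Mound group B", 43.09, -91.24, "collector-targeted site"),
]

with open("sites.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "lat", "lon", "category"])
    writer.writerows(sites)
```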
Wish me luck!
About Me: My name is Mary Wise and I am a PhD candidate in the history department. My dissertation examines the history of American Indian activism at earthworks and effigy mounds in the Midwest from 1890 to 1950.
“Process is the new god…Digital Humanities mean iterative scholarship…It honors the quality of results; but it also honors the steps by means of which results are obtained as a form of publication of comparable value. Untapped gold mines of knowledge are to be found in the realm of process” (The Digital Humanities Manifesto 2.0, 5).
When I’m working on my project, I often think back to this quote. My project deals with doing the same thing over and over again. I’m working on a sentiment analysis of Marcus Tullius Cicero’s orations paired with a network analysis of his letters in the Ad Familiares. It involves hours of data entry, running the same tests in R with slight variations, and corpus preparation (otherwise known as me struggling to put spaces in between the paragraphs of the .txt files). “Iterative scholarship” is a nice way to put it; “torture” is another way.
However, Digital Humanities requires a dash of optimism and a handful of perseverance, an ability to squint and see the gold mines hidden “in the realm of process.” And indeed, within my first two weeks at the Studio, I do feel like I’ve gleaned little, golden nuggets of knowledge.
While struggling with the spreadsheets for Gephi, I became acquainted with Cicero’s familiars and the world they all lived in. I’ve made many mistakes while putting these spreadsheets together. I mixed up the Catones and the Bruti often and realized that there was more than one Caesar. But through my errors and through the process in general, I’ve already gained insights into Cicero’s world without even running the data through Gephi yet. I’ve already seen and counted the people that Cicero writes about most often (although it comes as no surprise that Julius Caesar tops the list).
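For readers who haven’t met Gephi: it takes tabular input, typically a sheet of nodes and a sheet of edges. Here is a minimal sketch of the kind of counting and edge-building this involves; the letters and names are stand-ins, not my actual dataset:

```python
# Sketch: count mentions and build a Gephi-style edge list from letters.
# The sample data is illustrative, not the real Ad Familiares corpus.
import csv
from collections import Counter

letters = [
    {"recipient": "Tiro", "mentions": ["Caesar", "Brutus"]},
    {"recipient": "Paetus", "mentions": ["Caesar"]},
]

mention_counts = Counter(m for l in letters for m in l["mentions"])
print(mention_counts.most_common(3))    # who Cicero writes about most

# Edges: Cicero -> recipient, weighted by number of letters.
edge_weights = Counter(l["recipient"] for l in letters)
with open("edges.csv", "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["Source", "Target", "Weight"])  # headers Gephi recognizes
    for target, weight in edge_weights.items():
        w.writerow(["Cicero", target, weight])
```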
My most recent struggle has been with inserting breaks within the .txt files so that I can measure sentiment paragraph by paragraph. At first, I was doing this manually, which is just as tedious as it sounds. Then, with the guidance of Matthew Butler, the Senior Developer here at the UIowa Digital Scholarship & Publishing Studio, we managed to figure out a way to have the computer do the tedious work for us through Python. It was my first time grappling with Python, but I had a great guide who forged the path for me through the Python jungle in the realm of process. Not only did I become familiar with Python, but when I was manually inserting breaks, I also became further acquainted with my corpus. My files had been “lemmatized,” or, in short, all of the Latin words were reverted back to their stems (in theory). For some reason, this lemmatizer changes first conjugation infinitives to the second person singular passive: for example, putare becomes putaris instead of puto. I was able to make changes to my sentiment lexicon accordingly and improve it.
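Our actual script isn’t reproduced here, but the core of it is simple enough to sketch. This is a guess at the general shape, with hypothetical file names, assuming each paragraph sits on its own line:

```python
# Sketch: add a blank line after each paragraph so sentiment can be
# measured paragraph by paragraph. File names are hypothetical, and
# the input is assumed to have one paragraph per line.
with open("oration.txt", encoding="utf-8") as f:
    paragraphs = [line.rstrip() for line in f]

with open("oration_split.txt", "w", encoding="utf-8") as f:
    for p in paragraphs:
        f.write(p + "\n")
        if p:                # after each non-empty paragraph,
            f.write("\n")    # insert the blank line the R reader splits on
```

Ten lines of Python versus hours of manual editing: the realm of process occasionally pays out quickly.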
This iteration also occurs in traditional scholarship. We read and reread the same works repeatedly, poring over the same lines for close readings. Articles are written with the same argument but with minor tweaks. Then why does digital scholarship seem so different? It’s not. As I pass my time here, I realize that the gap between traditional and digital scholarship isn’t as wide as it first seemed. I’m spending much more time with the texts than I am with R and Gephi. Furthermore, I understand that Digital Humanities can be intimidating. The realm of process can seem like a hopeless waste(of time)land to some. But new, golden discoveries, both personal and professional, lie in wait; you just have to persevere. And if you’re a Classicist and you can successfully struggle through Tacitus and Aristotle, you can also defeat the mighty Python and wrestle with Gephi.
Hello blog readers! I’m Andrea, one of the fellows in the UIowa Digital Scholarship & Publishing Studio this summer. I’m in the middle of the MFA in Literary Translation Program at UIowa, and I translate from French to English. This summer I’m working on a website that includes some translations with digital features, inviting the reader to think like a translator and explore how translators make decisions. For this post, I’m going to throw out some ideas about Machine Translation that I’ve been mulling over.
I could conduct research and practice digital translation for the next ten years and still not be done, and the research from my first year would become obsolete by the end of the decade. This may seem daunting, but for me, a young and naïve 23-year-old academic who still doesn’t drink caffeine, the potential of this field is exciting.
I’ve been reading a lot of theory recently about machine translation and technology, specifically Google Translate, which has gained dominance in the last five years. I love the ways these articles and essays both underscore and undermine traditional translation theory. Translation theory is thus incredibly dynamic, constantly altered by the technologies that necessitate and facilitate its existence.
The field of translation is growing, and so is the fear that one day we translators will be completely replaced by Translation Machines. If you’re not scared yet, read about Google Translate’s new “Neural Machine,” which debuted last fall and is gaining traction in new languages:
A brief quote that explains how it works:
“The team trained its system on hundreds of hours of Spanish audio with corresponding English text. In each case, it used several layers of neural networks – computer systems loosely modelled on the human brain – to match sections of the spoken Spanish with the written translation. To do this, it analysed the waveform of the Spanish audio to learn which parts seemed to correspond with which chunks of written English. When it was then asked to translate, each neural layer used this knowledge to manipulate the audio waveform until it was turned into the corresponding section of written English.”
In other words, it’s translating “wave-to-wave” instead of “word-to-word.” This is translation on a neural level, not just a transcription out of bilingual dictionaries. The emphasis is on accuracy of meaning without the obstacle of the words themselves.
I entered this program at UIowa because it seemed like a perfect fit for my interests in French literature. But will the skills I gain here be superseded by Machine Translation? Will my career be safe (if it was even safe in the first place)? Should I consider pursuing other fields?
Something I like to tell myself: “Machines will never be able to fully grasp a literary text’s meaning, its artistry, or its historical, cultural, and etymological context.”
“Maybe, maybe not,” I cynically respond. Imagine a Google Translate of the future that allows settings for input and output. Translate one of Basho’s haikus into “teenage German vernacular.” Translate Romeo and Juliet into “Latin, in the style of Cicero.” Translate Kafka’s Metamorphosis into English using predominantly Anglo-Saxon words. What if one day Machine Translation can alter its algorithms to include stylistic input and output? I don’t think this is impossible. After all, Google Translate learns from its usage: the more translations that are run through it, the stronger it gets, and the smarter its human users get, the smarter it gets. With the right upkeep, it will always have more memory, more resources, and more algorithms than its human counterparts.
Many translators, myself included, sometimes use Google Translate for reference: after getting a rough draft of an excerpt, we work from there to imbue the text with the elements “lost in translation.” If Machine Translation can help us translate more efficiently, without completely taking over our careers, we should celebrate its consequence: the increase of world literature available in English. As translators, isn’t that our ultimate goal?
So what does the future look like for translators? Will we become obsolete (and if so, when)? Will we have to pursue careers managing and coding Translation Machines? I can’t say. For now, Google Translate is unable to capture style, rhyme, and rhythm; it can only reproduce meaning. For now, Translation Machines are unable to adequately compute the artistic qualities of texts. For now, human translation prevails.
Maybe one day Translation Machines will be able to interpret a text’s beauty. Until then, we need tech-savvy human translators working on world literature. Until then, we need innovative and passionate translators to do what Machines cannot. Until then, I’ll be translating.
This summer the Studio will pilot a new fellowship program with the help of the University of Iowa Graduate College and the Studio Steering Committee. Nine current graduate students have been named Summer Studio Fellows. The students will soon take part in an 8-week course that provides mentored digital scholarship experience, as well as training in skills and tools they might use as they pursue innovative ways of thinking about and sharing their creative endeavors. Below you can read more about the new fellows and descriptions of their proposed projects.
Hayder Alalwan, PhD student, Chemical and Biochemical Engineering Department Currently working on a PhD in the Chemical and Biochemical Engineering department, Hayder Alalwan will continue work on a project started in the Spring of 2014. He will explore the creation of a website to publicly share information on chemical looping combustion (CLC). That process uses the lattice oxygen molecules of metal oxides to combust the gas, instead of air, which minimizes the formation of pollutant byproducts such as NO2, N2O, or NO, which form when the reaction occurs in air (a mixture of N2 and O2). In addition, the CLC process is highly efficient with little to no side reactions. Hayder’s work will help bring his research findings to a broader public as part of his work in science communication.
Alexander Ashland, PhD student, English Department Alexander Ashland plans to expand on his work of Mapping Whitman’s Correspondence, integrating new data into an existing database, dedicating time to revisiting the existing prototype, and exploring the possibilities for implementing crucial features, such as search functionality, timescale manipulation, dynamic proportional symbols, and filterable keywords. Ashland’s current data has been gathered from the Civil War, Reconstruction (1867-1876), Post-Reconstruction (1877-1887), and Old Age (1888-1892) eras.
Sonia Farmer, MFA student, Center for the Book Sonia Farmer plans to launch a podcast that shares the rich world of Caribbean literature. The podcast will provide Caribbean writers with a platform to share their writing, and grant people easy access to a multitude of voices. Farmer comes to us from the UI’s Center for the Book to hone her digital editing skills and develop the platform.
Andrea Lakiotis, MFA student, Literary Translation Program Andrea Lakiotis will explore online digital publishing while engaging with translation theory and practice. She brings experience in digitizing data, mapping, and code to the digital translation work she will be doing with the Studio.
Caitlin Marley, PhD student, Classics Department Classics student Caitlin Marley plans to analyze Marcus Tullius Cicero’s corpus computationally, drawing on his orations and his social network. With this information she will map the “emotional plot” of the orations as well as his networks across space and time.
Ben J. Miller, PhD student, Psychological and Quantitative Foundations Department Ben J. Miller studies the educational needs of pediatric patients and their families. Efficient and effective education plays a large part in their care. This summer, Ben will refine his digital design skills in service of educating parents on using distraction to help their children cope during painful medical procedures. Ben is designing an infographic for use in pediatric waiting rooms that demonstrates how parents can harness the power of their smartphones and tablets for distraction.
Arianna Russ, MFA student, Dance Department As an MFA student in Dance Performance, Arianna Russ explores the integration of digital media into her artistic work. In collaboration with Dance and Theatre Arts Assistant Professor Dan Fine, Arianna will deepen her understanding of motion capture and digital artistic practice.
Katherine Wetzel, PhD student, English Department As a doctoral candidate in the Department of English, Katherine Wetzel plans to continue her work on Met-Memory, a project she is currently constructing as part of the Studio Scholars Initiative. This project examines the tensions within local, national, and global expressions of Britishness as they occur in late-Victorian literature. The summer fellowship will also provide her with opportunities to explore the place of theory within the digital humanities.
Mary Wise, PhD student, History Department A PhD candidate in the History Department, Mary Wise plans to construct an interactive and publicly accessible map that examines the American Indian earthwork excavations in the Upper Midwest between 1890 and 1930. With training and support from Studio staff, she sees this project leading to the creation of an all-digital history dissertation.
In a blog post last week, I addressed Endangered Data Week and the history of political parties hiding, removing, or altogether abolishing public access to government documents. However, my post wasn’t alone in trying to shed light on this serious issue. In schools, universities, libraries, and classrooms across the world, hundreds of concerned people came together to bring awareness to the issue of endangered and disappearing data. And while Endangered Data Week is now over, the threat is not. So this week, I teamed up with the Digital Scholarship & Publishing Studio to highlight some of the excellent work currently being done by digital humanists and to provide some advice on how to get involved.
First, I visited with Tom Keegan, Head of the Digital Scholarship & Publishing Studio, and Matt Butler, the Studio’s Senior Developer, to discuss the services offered by university libraries to keep scholarly data safe. They stressed the importance of digital institutional repositories in helping scholars maintain their own data and make it accessible to others free of charge. The University of Iowa’s institutional repository, Iowa Research Online, houses an array of faculty, graduate, and undergraduate work. Librarians work closely with faculty, staff, and students to ensure these materials are properly archived and made available according to agreed-upon standards. As I have pointed out before, non-university repositories like Academia.edu are for-profit and will indeed use your data to make money.
Profit is a big factor to consider when thinking about where to put data. As Eric Kansa, founder of Open Context, emphasized to me: “We need to maintain nonprofit (civil society) infrastructure to help maintain data (and backup internationally) during political crises. Organizations like the Internet Archive, and other libraries (including university libraries) are critical, because they have the expertise and infrastructure needed to maintain public records.” Kansa rightfully points out that libraries are integral to this fight, but notes that individuals need to know more about the vulnerability of data as well.
So, what do we do about all the government data (e.g. climate data) that is currently being pulled from government websites? This was just one question addressed by the group behind the formation of Endangered Data Week. Like most DH projects, EDW was forged by proactive academics who wanted to make a difference by using the biggest megaphone in the world: The Web. Michigan State University professor and digital humanist Brandon Locke, in collaboration with Jason A. Heppler, Bethany Nowviskie, and Wayne Graham, designed EDW on the model provided by Banned Books Week and Open Access Week. From there they brought the project to the Digital Library Federation‘s new interest group on Government Records Transparency/Accountability, directed by Rachel Mattson.
In order to find out more about this initiative and the problems it addresses, I spoke to Bethany Nowviskie, Director of the Digital Library Federation (DLF) at CLIR and a Research Associate Professor of Digital Humanities at UVa. Prof. Nowviskie was kind enough to answer a number of questions I had about endangered data and how to get more involved in the fight to save it:
SB: Who owns federal data? In other words, should data be available to us because we pay taxes and fund data-producing institutions like HUD? The EPA? Why is the Executive in control of so much of this open data?
BN: Except where issues of personal privacy and cultural sensitivity are involved, data collected or produced by taxpayer-funded agencies of the federal government should be openly available to everyone. It’s a matter of transparency for the health of the republic — sunlight being, as they say, the best disinfectant — and of accountability of the government to its people. These are our datasets, and we should have the ability to analyze and build on them — using them to understand our world better, as it is, and to be able to *make it better.*
SB: How do we create a more centralized, non-profit infrastructure that can maintain data during political crises?
BN: Most pieces of our needed infrastructure are already in place. We call them libraries. The DLF will join a large number of allied groups in early May, convened by DataRefuge (our Endangered Data Week partner) and the Association of Research Libraries, to discuss a new “Libraries+ Network” that takes on exactly this issue: https://libraries.network/about/. Some questions that will motivate us: how can we create greater coherence among the many governmental, non-profit, and even commercial groups with longstanding commitments and expertise in particular areas of the data preservation enterprise? Might we re-energize and re-imagine something like the Federal Depository Library Program for the digital age? What would it take for governmental agencies to implement data management plans for the full lifecycle of their information, just as researchers who receive federal funds are now typically required to do?
SB: What can regular non-specialists do to contribute?
BN: This is one reason DLF jumped at the chance to support grassroots efforts to organize the first annual Endangered Data Week. The goals expressed and audiences implied in our capsule summary (“raising awareness of threats to publicly available data; exploring the power dynamics of data creation, sharing, and retention; and teaching ways to make endangered data more accessible and secure”) go far beyond the professional research data management and data stewardship community. Probably the most useful thing a non-specialist can do is to educate herself on the issues and to represent to her representatives the value of open data legislation and of the advances in open government we saw under the Obama administration. We also need to urge follow-through on past bipartisan commitments in this sphere, such as the OPEN Government Data Act: https://www.datacoalition.org/open-government-data-act/
SB: Can you give some examples of digital projects or initiatives that depend on federal data to reveal racial inequity (e.g. redlining projects), bias, or certain dangers (e.g. lead poisoning)?
BN: Well, FOIA requests played an important role [in the Flint water crisis]— as they have done in Title IX enforcement on college campuses. In this sphere, I also think it’s worth mentioning that identical bills were recently introduced in both the House and Senate that would prohibit federal funds from being “used to design, build, maintain, utilize, or provide access to a Federal database of geospatial information on community racial disparities or disparities in access to affordable housing.” [House Bill, Senate Bill]. They went nowhere, and were ostensibly meant to “protect local zoning decisions,” but *what is up with that?* This is the kind of thing that should energize non-specialist readers.
SB: How can we have trust in the integrity of datasets that have been given over to private institutions or saved by non-federal entities? In other words, who will hold the “control” copy (e.g. like a seed bank) that can assure us that datasets that have been saved were not then tampered with?
BN: So, there’s a huge professional community — many of them are DLF members or members of the National Digital Stewardship Alliance which we host — whose whole focus is on questions like this, and there are excellent protocols and procedures for ensuring data integrity. I’m not familiar enough with the ins and outs to give you a good quote, but it’s not a new problem, for sure, and methods for auditing and certifying digital repositories and verifying the integrity and security of the data within them are well established. As always, matters of policy, funding, and the professional development and nurturing of the communities who do the work are a bigger challenge than the technology!
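One concrete building block of the protocols Nowviskie mentions is fixity checking: record a cryptographic hash of a file when it enters a repository, then periodically recompute and compare it. Here is a minimal illustration of the idea, not any particular repository’s implementation:

```python
# Minimal fixity check: hash a file at ingest, re-hash later, compare.
import hashlib

def sha256_of(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

recorded = sha256_of("dataset.csv")       # stored when the data is deposited
# ... years pass; copies migrate across servers and institutions ...
assert sha256_of("dataset.csv") == recorded, "data altered or corrupted"
```

If even one bit of the file changes, the hash changes, so honest copies can vouch for one another; that is part of what lets a distributed network of repositories stand in for a single “control” copy.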
Bethany’s comments above echo what others on campuses across the US are saying: data is a resource. Like water or electricity, access to it ought not be taken for granted. We must continue to be vigilant in the face of lazy and aggressive attitudes alike. Libraries and library associations remain a big part of the fight to preserve this data, but all of us can play a part by being more aware, spreading the word, and getting involved in the movement.
As you may know, April is national poetry month, an annual series of events by the Academy of American Poets to help support the appreciation of American poetry. If you’re looking for great book-length collections of poems, you might be interested in the Iowa Poetry Prize winners. Many of the previous years’ winners are made available in PDF form at Iowa Research Online. What you may not know is that April is also National Poetry Generation Month, an annual tradition where programmers and creative coders spend the month writing code that generates poetry.
In honor of this time of year, I thought I’d take a look at the Iowa Poetry Prize winners through code. There are many methods for analyzing and generating natural language, but one system that has received a lot of attention recently is neural networks. A neural network is a large collection of artificial neurons based very loosely on a biological brain. These neurons exist in layers that perform statistical calculations and affect the state of other connected neurons. The approach differs from other computational models in that no knowledge is hard-coded and controlled by elaborate conditional statements (if this then that). Rather, neural networks learn to solve tasks by observing data and deriving functions that produce sensible outputs for new data they have never seen before. The uses for such a system include image and speech recognition, classification problems, and many forms of prediction and decision making. For example, a neural net could be trained to detect images of cats by observing tens of thousands of labeled images of cats. Google has recently launched a new project that uses this technique to match your doodles with professional drawings.
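To make “layers of neurons” a little less abstract, here is a toy forward pass in Python with NumPy: two layers, each taking a weighted sum of its inputs and applying a nonlinearity. In a real network the weights are learned from data; here they are random stand-ins:

```python
# Toy two-layer neural network forward pass. Weights are random
# stand-ins; training would adjust them to reduce error on examples.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)                         # an input vector of features

W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)  # layer 1: 4 inputs -> 8 neurons
W2, b2 = rng.normal(size=(2, 8)), np.zeros(2)  # layer 2: 8 -> 2 output scores

hidden = np.tanh(W1 @ x + b1)   # each neuron: weighted sum + nonlinearity
logits = W2 @ hidden + b2
probs = np.exp(logits) / np.exp(logits).sum()  # softmax: scores -> probabilities
print(probs)                    # e.g., P(cat) vs. P(not cat)
```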
What happens when we train an artificial intelligence to write the English language having only read Iowa Poetry Prize winners? Let’s find out!
To start, I downloaded all of the IPP winners from Iowa Research Online, extracted the poems as plain text, and concatenated them all into a single text file named poems.txt. This served as the training set. Next, I set up this Torch-based Docker container implementation of a recurrent neural network based on work by Andrej Karpathy and Justin Johnson. It was tempting to spin up a Google cloud VM with an attached GPU, since these types of machine learning tasks are sped up greatly by running on a graphics processing unit with CUDA, but it’s also quite expensive at 75 cents per hour. Once I had it working, I started the preprocessing and training, which took about 16 hours to complete.
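The corpus-building step is the least glamorous part. It amounts to something like the following sketch (the paths are hypothetical, and the PDF-to-text extraction happened in a separate step):

```python
# Sketch: concatenate the extracted poem text files into one training file.
# Paths are hypothetical; PDFs were converted to plain text beforehand.
from pathlib import Path

parts = sorted(Path("extracted").glob("*.txt"))
Path("poems.txt").write_text(
    "\n\n".join(p.read_text(encoding="utf-8") for p in parts),
    encoding="utf-8",
)
print(f"combined {len(parts)} collections into poems.txt")
```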
After a lot of experimentation to create some useful training models and keep the network from overfitting and underfitting the data, I had something that was acceptable and so began sampling output. One parameter of sampling that was fun to play with was the “temperature” of the sample. A lower temperature produced output that was much more predictable and less error-prone, while a higher temperature was much more inventive but riddled with mistakes. I decided to split the difference and start at 0.5. Here’s the first poem.
Speritas Of The Stars
Morning comes of the sun
to the thin world is a star of her light.
The sheet and the body of parts
of the flame is a light, the body
sees of the wars beautiful on the street.
The sun, the stars of the sound, and desire,
and a man could love the streets.
The single shiller of light,
and the single stranger falls countal.
Father and she were the sutters of the body
instraining to the complete
window of light, still.
You’ll notice a few words in this poem that don’t actually exist in English. That’s because this RNN operates at the character level, not the word level. It has to learn, from scratch, how to write English. It starts with random strings of letters and slowly, after many iterations, learns about spaces, proper punctuation, and finally readable words. The higher the sampling temperature, the more invented words.
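Under the hood, temperature is just a rescaling of the model’s scores before the next character is sampled. Here is a minimal sketch of the idea (an illustration, not torch-rnn’s actual code):

```python
# Temperature-scaled sampling: low T -> safe, predictable characters;
# high T -> adventurous, error-prone ones. Illustrative only.
import numpy as np

def sample_char(logits: np.ndarray, temperature: float, rng) -> int:
    scaled = logits / temperature           # T < 1 sharpens, T > 1 flattens
    probs = np.exp(scaled - scaled.max())   # softmax, shifted for stability
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)  # index of the next character

rng = np.random.default_rng(42)
logits = np.array([2.0, 1.0, 0.2])          # model's scores for 3 characters
print(sample_char(logits, 0.5, rng))        # usually the top character
print(sample_char(logits, 1.5, rng))        # wanders far more often
```

With that in mind, let’s look at a “hot” poem.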
Pelies, One Yighter
The shadows just plance croved
I am one
its funlet from the wind
staskaccus, gring of detches of hearts face eashog
what wing to the streed in the resert of change, a glince
She read on his fill bathered, a hand the
with beautiful, casty, stery, kooms, in one father
something the mouth cold leaves.
A night and no one is a woman; you green her
My spere would must not the look teering mower
I see itselfor.
At that sign they thought the remelled the mum,
but like an wait they mite of ammiral
after things of the body
which children would love
the forest flowers and hark a path.
The shawr rate in a ruched parts in humstily
his poom her as of the trabs conterlity.
Much more Jabberwockyesque. If we ease up just a little on this, we get
A Badicar Flower
The watcher blue says
they would have shapes,
the night dreaming,
a painted nother
tricks me, the wind,
the dayed from the boging feeling
of the histance in his everyness.
What do you think — poetry prize worthy? While writing poetry is fun, there are, of course, practical applications too. I’m currently working with faculty member Mariola Espinosa on a HathiTrust project called Fighting Fever in the Caribbean: Medicine and Empire, 1650-1902. We have 9.3 million pages of medical journals and need to find references to yellow fever in multiple languages. A trained neural network could look through these quickly and find references that a human might miss. I’m also working on another project with Heidi Renee Aijala looking for references to coal smoke in Victorian literature. Perhaps a neural net could be trained to look for non-keyword references.
While I’m probably not going to put a poet out of work any time soon, you can imagine many real-world uses. There is a tremendous potential for neural networks and other types of machine learning to caption images, transcribe handwriting, translate documents, understand the spoken word, and play chess at the international master level. Perhaps someday it might also write a meaningful poem.