“Digital Humanities Across Borders” (DLCL 204, cross-listed with Comp Lit and English) kicked off last week. As of today, there are seven registered students and three intrepid auditors (including two area studies librarian colleagues).
The Digital Humanities Summer Institute workshops I’ve taught before have a similar number of classroom hours, but packing them into a single week produces a very different dynamic. My DHSI classes focused on one specific, technical topic (Drupal for DH projects), which led to a very different experience of both class preparation and teaching. When it comes to building Drupal sites, there’s a clear set of sequential steps. There’s a set of functionality that is relevant for most humanists, and that’s what we cover. In the afternoons, the students work on their own projects, because we only have five days to build something and giving homework isn’t particularly feasible. I wrote all the course materials and came up with the pacing for the syllabus myself, but it wasn’t difficult to make it all fall into place.
Shortly after starting my current job as the Academic Technology Specialist in the Division of Literatures, Cultures, and Languages (DLCL), I offered to teach. Initially I threw out the idea of covering the medieval Slavic literature course, since the Slavic department is short on medievalists and that used to be my disciplinary area. But it was hard to argue when Dan Edelstein, the DLCL chair, suggested that I perhaps instead teach something that could draw a bigger enrollment. We agreed I’d do “a digital humanities course”, and left it at that.
Being based in the DLCL, it made sense to focus the class on textual, non-English digital humanities. I’ve been surprised (though perhaps I shouldn’t be) at the pervasiveness of English in the pedagogical materials for digital humanities tools and methods. Even in #dariahTeach: while the lesson on TEI has been translated into Hungarian and French, the exercise still involves encoding a poem by William Butler Yeats. In “Digital Humanities Across Borders”, the students bring their own texts, in any non-English language that they can read. We’ve got Chinese (both traditional and simplified characters), Japanese, Spanish, Portuguese, Russian, German, and Italian. I’m looking for tools and software that claim to work for each of these languages, so the students can try out a range of methods, including topic modeling, network analysis, counting words, and various approaches to NLP.
Familiarity with tools is one step towards “doing digital humanities”, but it felt deeply unfair to throw tools at students and leave them to figure out on their own what it means to be a modern languages person who “does digital things”. Between early exposure to the ADHO DH conference (DH 2007 was held at the University of Illinois at Urbana-Champaign, just as I started the MLIS program there) and the “listening tour” phase of Project Bamboo, I had many opportunities for acculturation to digital humanities at the beginning of my career. Beyond just knowing what kinds of questions to ask using digital tools, and how to operate those tools, I want my students this quarter to have a sense of orientation around how (and to what extent) DH is “done” in their fields. I want them to have an idea of how they can position their research in their field, in whatever “DH group” exists within their field, and in the larger national and international DH communities (ACH, ADHO, and any DH organization centered on their language or country of specialization).
There’s no shortage of readings that one could assign for this class, and I’m very grateful to Molly des Jardin for sharing her East Asian Digital Humanities syllabus. That said, these are students who have gotten, and/or will get, plenty of practice reading and writing academic-style prose. I want to do something different here. Although I’m updating the syllabus with pointers to additional readings on all the topics we’re covering (and the syllabus and other class materials are available on GitHub for anyone to use), the students don’t have to do any reading to prepare for class. Instead, I’m asking them to spend the time they would otherwise spend on readings experimenting with different tools and methods, applied to the major text they’re working with in their language. I’m hoping that by the end of the course, they’ll be as comfortable experimenting with tools as they are opening a book, or at least comfortable enough to work their way through error messages without giving up.
The first two class sessions covered introductions (of ourselves, the course, and that classic question, what even is digital humanities), some of the kinds of questions we can answer using digital tools and methods, and a few examples of how people with backgrounds in non-English languages and literatures have positioned their work within the landscape of digital humanities.
Today we had our first hands-on session, in which we tackled OCR. It was a timely topic given the recent release of David Smith and Ryan Cordell’s OCR report, “A Research Agenda for Historical and Multilingual Optical Character Recognition”, accompanied by Ryan Cordell’s “Why You (A Humanist) Should Care About OCR”, which took care of a large part of my class prep for today. I assigned installing and experimenting with ABBYY FineReader (which has a reasonably friendly UI) as homework, and together we dove into installing ImageMagick and Tesseract, starting with locating the Mac terminal or Windows command prompt. While only one student left the class today with OCR’d text, many others came close, extracting Tesseract-compatible PNGs from PDFs after working through one error message after another. “Congratulations, you’re doing digital humanities!” I tried to reassure them as they were copying (or retyping) error messages from the command line and wading into Google’s results. We’ll pick this up again on Thursday.
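For readers who want to try the same PDF-to-PNG-to-text workflow, here is a minimal sketch in Python. It is not the exact set of commands used in class. It assumes ImageMagick (with Ghostscript, which ImageMagick needs in order to read PDFs) and Tesseract are already installed and on the PATH, and that the Tesseract language pack for your text is installed. The function names, the `page-%03d.png` naming pattern, and the 300 dpi default are illustrative choices, not anything prescribed by either tool.

```python
import shutil
import subprocess
from pathlib import Path


def pdf_to_png_command(pdf_path, out_dir, density=300):
    """Build an ImageMagick command that renders each PDF page as a PNG."""
    # ImageMagick 7 provides a `magick` binary; older installs use `convert`.
    binary = "magick" if shutil.which("magick") else "convert"
    # %03d is replaced by the page number: page-000.png, page-001.png, ...
    pattern = str(Path(out_dir) / "page-%03d.png")
    # -density sets the rendering resolution in dpi (assumed default: 300).
    return [binary, "-density", str(density), str(pdf_path), pattern]


def tesseract_command(png_path, lang):
    """Build a Tesseract command for one image.

    `lang` is a Tesseract language code such as "rus", "jpn",
    "chi_sim", "chi_tra", "spa", "por", "deu", or "ita".
    """
    png_path = Path(png_path)
    # Tesseract takes an output *base* name and appends ".txt" itself.
    out_base = png_path.with_suffix("")
    return ["tesseract", str(png_path), str(out_base), "-l", lang]


def ocr_pdf(pdf_path, out_dir, lang):
    """Render a PDF to PNGs, then OCR each page (hypothetical helper)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # check=True raises an error if either tool exits with a failure code.
    subprocess.run(pdf_to_png_command(pdf_path, out_dir), check=True)
    for png in sorted(out_dir.glob("page-*.png")):
        subprocess.run(tesseract_command(png, lang), check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be
    # inspected or pasted into a terminal by hand.
    print(" ".join(pdf_to_png_command("book.pdf", "pages")))
    print(" ".join(tesseract_command("page-000.png", "rus")))
```

Building each command as a list of arguments, rather than as one long string, sidesteps most of the quoting differences between the Mac terminal and the Windows command prompt.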
There was an amazing moment of synchronicity. Melissa Hosek, one of this year’s Digital Humanities Fellows, was able to use what she’d learned about the Windows command prompt at a recent DH Fellows meeting to help her fellow Windows-using classmates when the problems they were running into (including basic syntax for the command prompt) had me stumped. As I ran from one Mac-using student to another, and saw the Windows users all sitting around one table, working together across the linguistic divides between their operating systems (Korean, Portuguese, English), it may have been the first time I’ve ever seen such a vivid simultaneous embodiment of every one of the “values” Lisa Spiro proposed for digital humanities: openness, collaboration, collegiality and connectedness, experimentation, diversity.
It was an adrenaline rush to try to cover installation and use of command-line tools in under an hour, across two operating systems and seven languages. Scampering from one end of the classroom to the other and back again, I had a very visceral memory of my exhaustion at the end of every day the first year I taught Drupal at DHSI without a co-instructor.
But I also had a humbling moment helping one student debug issues with ImageMagick. I wasn’t sure they’d typed the command right, so I asked if I could try it on their laptop. I put my fingers on the keyboard in the places they’ve been trained to go since I was 6 years old, and confidently pounded out the command… and then I looked at the screen, and saw it was all wrong. Every forward-slash I had typed was a hyphen, and there was no hyphen. The Latin characters on the keyboard had drawn me in, and I hadn’t realized that they were arranged slightly differently, especially the punctuation. Typing the commands yourself isn’t good pedagogy, and neither is making snap judgements that “close enough” is anywhere close to right. Most of the materials my students are working with are in languages I can’t actually read (despite recognizing an Indo-European root here, a logograph there). By the end of the quarter, they’ll be a lot closer to knowing what I know about DH tools and communities than I will be to reading many of their texts without assistance. My contribution here is a variety of technical and social approaches to how one might do digital humanities. The questions of why to do digital humanities, and what to do it with, and the evaluation of how well any of these particular approaches work for a specific language or genre are places where I have to defer to the students and their language and disciplinary knowledge.
Today I learned that ₩ (the won sign) is used to represent the path separator (\) on Korean Windows in the command prompt. With some help from Google and the Internet Archive, I was able to find out some more about the history of that convention, and it all comes down to Unicode, which we’ll be covering with a special guest lecture by Debbie Anderson from the Script Encoding Initiative at UC Berkeley next week. I can’t wait to see what else I learn this quarter.
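A small Python sketch makes the won-sign puzzle a little more concrete. It assumes that Python's built-in `cp949` codec is a fair stand-in for the Korean Windows code page, and the helper name `describe` is made up for illustration. The point it shows is that the path separator is still stored as the same byte as the backslash (0x5C). Unicode has a separate code point for the won sign, so what appears in the command prompt is, as far as I understand the convention, a matter of how the font draws that byte rather than a different character in the path.

```python
import unicodedata

BACKSLASH = "\\"     # the path separator as Windows stores it
WON_SIGN = "\u20a9"  # the dedicated Unicode won sign


def describe(ch):
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    print(describe(BACKSLASH))
    print(describe(WON_SIGN))
    # In Python's cp949 codec (assumed here to mirror the Korean Windows
    # code page), the backslash is still the single byte 0x5C.
    print(BACKSLASH.encode("cp949"))
```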