DLCL ATS round-up, winter & spring 2022

Quinn Dombrowski, June 15, 2022

In almost four years at Stanford, I've never missed one of my quarterly round-ups, but March 2022 was unlike any other month of my life. When Russia invaded Ukraine at the end of February, I ended up getting together with Anna Kijas from Tufts and Sebastian Majstorovic from the Austrian Center for Digital Humanities and Cultural Heritage to co-found Saving Ukrainian Cultural Heritage Online (SUCHO), a rapid-response DH effort that's brought together over 1,300 volunteers from around the world to archive Ukrainian cultural heritage websites. Especially for the first six weeks, it consumed my life day and night; writing up winter quarter wasn't a priority. But summer starts this week, and before I get too caught up in summer plans, it's time for a look back over the last half-year.

Classes

Winter quarter started with the worst COVID surge so far, though if we had better testing we'd probably be there again right now in the Bay Area. Between daycare closures, then getting COVID myself, the quarter was well underway by the time I was on campus regularly, and the DH independent study I was teaching, with three students, was comfortably settled into a series of remote coworking sessions. During winter quarter, I was also supporting visiting Slavic faculty member Kat Hill's "Russia in Color" class, which had a DH project component. It was an ongoing source of randomness and whimsy to see the student meetings show up on my calendar, never knowing what language, what medium, or what angle on color the student would be interested in exploring. I learned some things about how the library's digitization workflow works when a student needed a Winnie the Pooh DVD digitized to compare the color to what was used in the Soviet "Vinni-Pukh". I also learned about teachable machines, when Eric Kim was interested in training a "horse/not-a-horse" model for processing manuscript images. And I learned some things about different color modes when working with images computationally, thanks to Georgii Korotkov's final project. The class final presentations were held a few days after SUCHO started, but I made it in-person for the first set, and virtually for the second. I'm hoping to share some of the students' work (with permission) over the summer, but sadly, it's not a class that's likely to repeat, as Kat left the University after winter quarter. My days of working on computational analysis of Russian periodicals may be over for now, but I'll continue to keep an eye on work going on in computational periodical space elsewhere in DH.

During spring quarter, Victoria Zurita joined me once again for the DH Practicum independent study. I'm happy to include her project, on Rubén Darío's archives, as part of our revamped DH project gallery on the new DH at Stanford website.

Existing Projects

Things are coming together for the Global Medieval Sourcebook despite a few delays, but we'll absolutely be wrapping everything up for the relaunch this summer. In conversations on Twitter about migration from database-backed CMSes to static sites, I've been happy to have GMS (and its documentation) to point to as an example of a Drupal-to-Jekyll conversion.

Things have gotten complicated with the various departmental-support websites that I've built and continue to support as Stanford Web Services is retiring their D7-based Stanford Sites in favor of a more recent version with much less functionality, by the end of June. I built a lot of sites using that platform because it was free and centrally-maintained, and it's been a bit of a scramble to try to find alternatives for everything in a relatively short time. Most likely, there'll be a number of sites that need to go onto paid medium-term hosting, and/or be rebuilt with more limited functionality on the new system.

Transkribus has taken a back seat over the last two quarters; the honors thesis idea didn't quite take off. But discussions with the latest cohorts of grad students have again ignited some possibilities I'll pursue this summer, and developments in Arabic and Ottoman Turkish models have made me optimistic about possibilities in that area.

We had a long dry spell with the Data-Sitters Club. We began DSC #13: Goodbye Friends, Goodbye in November, but it was hard for everyone to make the time, especially in these times, to work through emotionally-taxing writing. Still, we managed to get it done in time for the DH Unbound 2022 conference, when we published two new books in one week: our wrenching tribute to DH friends who recently died young, and DSC #14: Hello, DMCA Exemption, explaining what the new exemption to the DMCA for text and data mining does (and doesn't) mean for DH scholars. That book also formed the basis for a roundtable at DH Unbound 2022 with Rachael Samberg, Erik Stallman, and Lauren Tilton.

Masha Gorshkova, Steele Douris, and I got a proposal for Multilingual Harry Potter fanfic accepted to the DH Unbound conference. The time I'd hoped to work on the project between getting the proposal accepted and the conference itself mostly went to SUCHO, but we were able to share some preliminary new results related to how people wrote fanfic during the great fanfic-writing surge of 2020 (which happened after our original data collection in 2019). Long story short: Italian Harry Potter fanfic writers vented their angst, while Russian Harry Potter fanfic writers went for happy endings. Seeing those kinds of striking differences was enough to inspire me to add the project to my summer to-do list, in hopes of finally writing something up.

Animal Crossing: New Digital Humanities has been on hold since last year, first due to scheduling challenges, and then because of SUCHO. But we're planning on bringing it back this summer, including giving a talk for a Japanese DH lecture series in Animal Crossing, about SUCHO.

I was hoping to reopen the Textile Makerspace in winter with a good air purifier, but it took a while before I was regularly on campus... which didn't last long before I was off campus for a while because of SUCHO. But more than a few students reached out to me over the last couple months, and I'm feeling good about the summer, especially with some volunteers helping keep it open on days I'm not around. An influx of new equipment thanks to an anonymous donor and a retro knitting machine will give us something mechanical (and pre-digital) to explore.

I took the occasion of Day of DH to put together a little survey of how people are feeling -- especially about virtual/hybrid conferences and other accommodations -- after two years of the pandemic, as a way to get back to DH-WoGeM. This Women and Gender Minorities in DH group was approved shortly after the pandemic started, and ironically, ongoing childcare disruptions have meant I haven't been able to do much with it. But there's a few people interested in lending a hand with coordinating it, and I'm hoping to get things back on track in the summer.

Some of the work I did in the fall with Cécile Alduy on French far-right political rhetoric turned into a book... that also turned up on my desk one day! It's always a joy to see this work turn into something meaningful out in the world.

Corpus-building took a few tentative steps forward early in the year, as I continued to work on amassing book lists across different languages, but I didn't get nearly as far as I hoped to by the end of the academic year. The project suffered a big setback when I explored with Sarah Sussman what it would take to get the equivalent of the NYT bestseller list data for French and was quoted a ludicrous number. I did give a talk in May as part of the CIFNAL (Collaborative Initiative for French Language Collections) Speaker Series about the need for such corpora and tools to work with them, where I got to preview a forthcoming Data-Sitters Club Multilingual Mystery about the limitations of the French spaCy NLP model compared to English.

The corpus-building work is partially a response to nice things they have in English, including a corpus (both text and scans) of women's magazines, that formed the basis for a LitLab project on domestic appliances. I ended up being a fly on the wall for the project, but it was really interesting to see what kinds of approaches they were able to try through a combination of querying metadata and parsing the texts. Some new projects have taken me into comparative spaces that have given me an excuse to work more with English, and enjoy the comparative ease afforded by nice metadata.

New projects

This spring, I started working with Adrian Daub on some data for his upcoming book on cancel culture as a moral panic in France and Germany. The New York Times data set has been useful for points of comparison (especially around "political correctness" in the 90's and into the 21st century), and I've gotten to explore the landscape of newspaper access across languages and countries. Suffice it to say, research using German newspapers (which are typically behind paywalls) isn't easy, and while there are news aggregator sites, pay-per-article pricing and convoluted (i.e. non-scrapable) code take that off the table. We've got a few more options to try, though, including working with the Internet Archive; I'm looking forward to writing more about that this summer.

A relatively new project that took some time to get started due to negotiations over the terms for accessing and disseminating the materials is with Fatoumata Seck, on a Maoist cultural movement called Le Front Culturel Sénégalais dating to the 1970's. The songs, poems, translations, and other materials will be accessioned to Stanford Digital Repository for long-term storage, and will also be presented on a website that may incorporate maps, timelines, and network visualizations tracing the history of the movement, depending on the resources available.

As part of the DLCL's efforts to improve transparency about graduate placement, I'm working with some internal data to present aggregate visualization summaries of career trajectories. We had an idea to look at the corresponding dissertations, and with some help from Andrew Berger in DLSS, I now have (non-consumptive, computational use) access to almost all the dissertations from my department in the last 20 years. This summer I'm interested in seeing what I can get out of them, and syncing that up with data like dissertation committees.

Of course, the metaphorical elephant in the room on my "new projects" list is SUCHO. Even locally, it's been more than just me: around 20 Stanford grad students and library staff have helped out, at one time or another, in different capacities. On the DLCL side, Yuliya Ilchuk has been there for all our translation needs in communicating with cultural heritage workers on the ground in Ukraine, and is incorporating work on SUCHO into her next book project on memory. Slavic grad student Georgii Korotkov made a huge contribution to the project by reading through all the documentation for a post-Soviet library system called Irbis, and writing a scraper to be able to capture the library catalogs from Ukrainian libraries that use that software. On the library side, special thanks to Ed Summers and Laura Wrubel in the DLSS group (Laura has been babysitting my Twitter scraping for three months now, rebooting it every time it collapses), and to Simon Wiles for everything from debugging volunteers' web crawling to taking on large crawls himself, to now building the public-facing gallery (and internal-facing data entry interface) for our more recent meme collection work for SUCHO. Simon has also been the developer for an adjacent project to SUCHO, called ForFutureUse. Developed in partnership with a number of Slavic scholars and librarians, ForFutureUse is a way to provide free storage (and a publication venue, if desired) for dissident creatives in Russia and Belarus, as well as anyone displaced in Ukraine. There is much more to be said about SUCHO, and writing some of it up -- including in an open-access Handbook of Emergency Web Archiving -- is on my to-do list this summer.

Writing

I haven't written much this year so far, though I have a few pieces I've ushered through a couple rounds of edits and tweaks in preparation for publication later this year.

I did get a proposal about SUCHO accepted to an upcoming edited volume about DH in libraries, edited by Glen Layne-Worthey and Isabel Galina. It'll be a collaborative effort with a handful of SUCHO volunteers.

Talks and Events

Winter was relatively quiet, with just two talks. At the Australian National University's "Digital Approaches to Multilingual Text Analysis", a talk called "Non-English DH is Not a Thing" argued that #MultilingualDH is a rhetorical device that has pragmatic utility in some contexts (e.g. "please pay attention to language here"), but once you get into practical implementation, there's no single "multilingual DH" that you can support -- much of the work that needs to happen is language-specific, and has to be done with thought and care as to which languages you actually want to support. In late February, right before Russia invaded Ukraine, Nichole Nomura and I gave the DHARTI conference a sneak preview of the Young Readers Database of Literature (YRDL), which had its proper birthday in late April at a LitLab talk.

Since then, though, there's been a lot of SUCHO talks -- including a lot of talking to the media about the project, which was pretty new for me. In April, I was invited to give a talk about SUCHO at the Fiesole Retreat in Athens, Greece, which was my first time at an in-person event since the pandemic. Other venues have included the Internet Archive's Library as Laboratory talk series, an IFAR panel, Charleston In-Between, and the IIPC Web Archiving 2022 conference. In the midst of all this have been a keynote for Day of DH 2022 at the University of North Texas ("Taking Fun Seriously" -- which turned into a SUCHO talk), and a CUDAN Open Lab Lecture for Tallinn University. I gave them a title in the fall ("Coding, Childcare, and Badly-Behaved Tools: Adventures in Multilingual 'Data-Sitting'"), and couldn't have imagined SUCHO then or how it would fit into that picture.

Other Things

This year really did not work out the way I planned with the Slavic DH working group -- which is to say, between COVID and the war and generally the state of things, it never quite got off the ground. We reframed it in our proposal for a new research unit in the fall, more around the SUCHO work, also with the goal of organizing some kind of SUCHO event at Stanford around the first anniversary of Russia's invasion, with additional support from the library.

During the winter, the Library hired Peter Leonard as the new Assistant University Librarian for Research Data Services (the group I report through in the Library). We also went through the hiring process for a new Academic Technology Specialist for History -- same job as me, in a different department. I'm looking forward to being able to share good news soon on that front.

Looking ahead, there are some conversations in play between SUCHO, Stanford Libraries, and UNESCO about roles, partners, and goals for SUCHO in the shorter-term, and a National Digital Library of Ukraine in the longer term. These discussions tend to move slowly, as does our advocacy work, coming out of SUCHO, for more proactive web archiving of cultural heritage materials around the globe. Much as I never thought of web archiving as really "my thing" before this, I suspect this is a thread of work I've now picked up for the long haul.

There's a lot in flux right now: a new DLCL chair, new faculty member in Slavic, new AUL for Research Data Services, and numerous other departures, arrivals, and reorganizations. Perhaps by the fall some of the larger context for what I'm doing here will be clearer. In the meantime, there's work to be done.