DLCL ATS round-up, fall 2023

This fall, I got my first experience teaching a large class, helped launch a major new Unicode project, and got excited about the possibility of weaving as a medium for data visualization.

Textile Makerspace

It's been a great fall at the Textile Makerspace, which I incorporated into the Future Text: AI and Literatures, Cultures, and Languages class (more on that below). We've been having regular hours thanks to student staff, and have picked up a few additional volunteers and staff for the winter who will help out with more hours.

I've declared this year to be the Year of the Loom, and we've picked up a few new ones: a folding rigid heddle loom, an inklette loom, and a giant inkle loom that doubles as a warp board for the standing loom that replaced my desk earlier this year. We finally replaced the steel heddles on the standing loom with texsolv that doesn't chew up the warp threads, and it was such a joy to get it warped and ready for use. I've got a student next quarter interested in using it for the DH Practicum.

I've been learning how to use these looms, too, and have mostly been working with the rigid heddle loom. I've put together weavings based on Slack messages in the CIDR channel (my group in the library), Star Trek novels, the AI class data, and Gideon the Ninth by Tamsyn Muir. It's quickly becoming my favorite craft, though it comes with risks like staying up far too late when I'm working on one.

In keeping with the loom theme, I've had the chance to get to know some other weavers on campus, and borrow a table loom from Prof. Hideo Mabuchi in Physics (because, of course, at Stanford there's a stash of table looms in medium-term storage in a physics professor's office). I also put in a Making@Stanford equipment grant for some of the folding rigid heddle looms that could be checked out through the library, but I'm still waiting to hear back.

I also got to meet Owen Hipwell from the D-School, who's their new makerspace person and who is setting up a new letterpress and other print (Risograph!) studio on campus, with public open hours planned. This is going to make for great collaborations, since they have the kind of space that supports screenprinting -- and what's more, print studios seem to be having a moment in DH circles more broadly.

Classes

DLCL 103: Future Text: AI and Literatures, Cultures, and Languages, which I taught with Laura Wittman and with Eric Kim and Andrew Nepomuceno as TAs, was the biggest thing I was working on all fall. I've never had anything close to a 60-person class, or a topic that was changing so quickly. We had put a fair bit of thought into the weekly readings and assignments, but hardly a week passed without us radically overhauling the readings because of breaking news or new writing that we'd come across. Discovering that the author of one of those readings was a Stanford undergrad, and another was someone whose kids had gone to the same daycare as mine, meant we got to have special guest visitors. Towards the beginning of the quarter, we took a field trip to the Making Global Computing exhibit at Green Library and the Textile Makerspace, and I was happy that a few students took us up on the offer of doing a creative project with a textile angle.

About 50 of the students in the class were CS juniors and seniors taking it for their humanities core distribution requirement, which made for very different vibes than I'm used to. I've got more thoughts about the class and the final projects to write up at the beginning of next quarter, but one way I've been coming to grips with the class has been through my own final project: a weaving of the class data. I'm looking forward to finishing the second part of the weaving over the break; the data was only finished as of yesterday, with submission of final grades. The final syllabus is on GitHub.

Existing Projects

This fall we published a Data-Sitters Club book I've been excited about for a while: "Xanda Rescues the Topic Modeling Disaster". Topic modeling is often thrown around as a method, but I never loved my results when I tried it, and over the course of this book I came to understand it a lot better thanks to Xanda Schofield. I've also got a book on environments half drafted, and the Data-Sitters Club is ready to start 2024 with a new fun sub-series.

There are several other projects that have continued to be on hold, either because they're not yet ready to move forward (Senegalese Countercultural Movement, Global Medieval Sourcebook, The Futurist Archive), or because the AI class ate enough of my time and organizational energy that I couldn't get the meetings together to wrap things up (Harry Potter multilingual fanfic). I also haven't made much progress on the multilingual DH working group activities or Jewish cookbooks. But the latest Data-Sitters Club book has given me some good ideas for how to wrangle the DLCL dissertations. I also did some accessibility remediation work on French revolutionary data, which I haven't touched in a while.

At the same time, it's been a big quarter for SILICON (Stanford Initiative on Language Inclusion and Conservation in Old and New media). We had a soft launch event, Encode/Include, in October, and a two-day conference, Face/Interface, at the beginning of December. We got a project manager position approved and posted, and have begun reviewing the applications. This fall has included a team crash course on fonts, text input UX, and Unicode, and an incredibly useful group discussion with a panel of experts before the Encode/Include event where we tried to identify a set of significant stumbling blocks for improving the text stack for digitally-disadvantaged languages. Out of those conversations, we've laid the groundwork for a summer intern program, as well as a series of proof-of-concept activities between now and then.

The Browsertrix Cloud pilot came to fruition this fall, with funding from DLSS. I'm really excited to have Browsertrix Cloud in our toolkit, at least for capturing digital scholarship projects. The old Global Medieval Sourcebook site is on my list as a test for how this might work.

I've been working with Annie Lamar and Brad Rittenhouse on more material for the HPC for Humanists resource, and we're looking towards doing a workshop on how to scale up some common form of DH analysis using the HPC cluster sometime in the winter or spring.

My role as ACH representative to ADHO has gotten a little bigger than I anticipated, since I've also had to be involved with program committee and local organizer discussions around the DH 2024 conference in Washington, DC. ADHO got its first official Code of Conduct approved and posted this fall. On the ACH side, I've mostly been trying to facilitate conversations around publications, and helping wrangle some of the infrastructure, including getting the ACH members site migrated to the ADHO server with help from my colleague and ADHO infrastructure chair Simon Wiles.

New projects

This quarter I've been working with Tania Flores on getting her database of Flamenco letras set up for data entry. At Simon's suggestion, we're using PocketBase, which we can host on Stanford Domains, and which provides a relatively intuitive interface for both data entry and tweaking the data model we iterated on together. It lets us kick the can down the road when it comes to the web interface for the project: we could potentially create a static site based on the data, or migrate the data to some kind of content management system, and/or just export the data for analysis and visualization and archiving.

I worked with Ostap Kin on visualizing his data on Ukrainian translations of Dovid Hofshteyn's poetry, and we talked about digitizing the texts themselves for future text analysis. AJ Naddaff also has a corpus of texts in Arabic he'd like to digitize, as does newly-appointed faculty member Andrei Pesic, so it looks like OCR/HTR will be back in my life in a major way in the near future. Planning out my teaching for the next year or two also led me to return to the project of getting a computationally-usable version of the different departmental reading lists, and I made some progress with the Russian list. I expect I'll be back to book scanning in earnest in the winter, with an upgraded replacement for my 2020 book scanner that should be much better for books with glossy pages.

Writing

What We Teach When We Teach DH, ed. Brian Croxall and Diane Jakacki, just came out, with a chapter co-authored by my colleague Alix Keener, "Sharing Authority in Collaborative Digital Humanities Pedagogy: Library Workers’ Perspectives", and "Bringing Languages into the DH Classroom" by me. My chapter needs a contextualizing blog post, but there'll be time for that in the winter.

A piece I'm much happier with is "The Librarian, The Computer, The Android, and Big Data" (on the depiction of data-work in a corpus of Star Trek novels), co-authored with Nichole Nomura, which recently came out in the Vectors journal. I ended up assigning it in the AI class to complement a piece low-key complaining about archivists' lack of enthusiasm for an AI project.

I didn't manage to write a blog post every two weeks like I intended (as prompted by Brandon Walsh from the UVA Scholars' Lab), but I managed more than one per month. There was a write-up of an event at the Internet Archive where I presented with DLCL grad student Alyssa Virker. I wrote up my own AI class final project proposal for a weaving with the class data. I updated my popular post about how to write an ADHO conference proposal. And finally, I reflected on the joy of geeking out at the SILICON Face/Interface conference.

I've got a few new things early in the pipeline, including a piece on Unicode for a special issue of the Journal of Electronic Publishing (JEP) on multilingualism, where I'll be drawing on the Unicode Archives and co-authoring it with Manish Goregaokar and Ben Joeng (Yang) from Unicode. There's also a response to an edited volume from a 2020 multilingual DH event, and probably another piece on failure in the near future. The Data-Sitters Club also submitted an abstract to another JEP special issue "on gathering".

Talks and Events

This fall was light on talks and events -- happily, since the task of delivering a 90-minute lecture weekly in the AI class took up a lot of mental space and energy.

I did submit a couple of things to the DH 2024 conference: a "mini-conference" (alternate format) for #DHmakes with Claudia Berger and others, and another "mini-conference" to organize a #DHRPG play-through at the conference. I was also added to a workshop on teaching NLP organized by colleagues at the Princeton Center for DH.

Other Things

Yesterday, ACH announced its exit from Twitter. I've personally been off it since its acquisition last fall, but this quarter I completely deleted and relinquished my account. Bluesky may not have all the people doing all the things (looking at you, AI people), but it has become a very solid replacement for DH Twitter (and many other flavors of Twitter I cared about). The number of requests on the DH Bluesky invite code form dropped off considerably over the fall, but the form is still open and being checked occasionally. Please fill it out if you'd like to join!