Digital Humanities and Data Science

I'm proud to announce that Stanford University Library will be bringing on Scott Weingart as a data scientist to help support digital humanities scholarship here at Stanford. Scott is well known in the digital humanities community for his work on information visualization MOOCs, his courses on network analysis, his editing of the Journal of Digital Humanities issue focused on topic modeling, and his work alongside other DH scholars to create The Historian's Macroscope, an online text that provides an exhaustive introduction to the particular flavor of digital humanities that involves bringing a computational lens to traditional humanities research questions. Whatever name anyone gave to Scott's position, it's obvious that his support would be welcomed by digital humanities scholars. So, why data science, and not something a bit less science-sounding, like, say, "digital humanities specialist"?

If you're not familiar with data science, then you're not alone. Despite its increasing popularity here in Silicon Valley, it's still not clear if data science is a discipline or a particular set of methodologies, or just a big tent in which to lump a variety of practitioners who take various computational approaches. One of the jokes about data science is that a data scientist "knows more programming than a statistician and is better at stats than a programmer." In other words, data science seems to be what they call digital humanities on the other side of campus.

And while part of Scott's job here is to flesh out what and where data science is in relation to digital humanities, there are very good, practical reasons to bring in an expert in network analysis and information visualization, and even more reasons to give that person a job title that reflects that expertise and focus. Stanford is probably the most vibrant place for digital humanities right now, with multiple projects and individual faculty pursuing a variety of research agendas that involve spatial analysis, text analysis and network analysis directed at traditional humanities scholarship. And practically every one of these projects has some kind of network component, whether that network represents Imperial Roman transportation, character interactions in a play, genealogy, knowledge creation in post-processual archaeology, or a "more traditional" social network analysis of Benjamin Franklin.

I have argued, and will continue to argue, that the use of information visualization and network analysis by interlopers--like digital humanities scholars--is healthy and completely justified. This is why we're working with tech industry professionals (many self-identified as "data scientists") to try to foster collaboration on digital humanities projects with folks outside the academy. Still, there is a practical need to better inform these methods and research with reference to more traditional network and information science. It's not just a practical need but a professional one, and faculty collaborators in those fields are not well placed to meet it, because they're more focused on their own research agendas than on educating their interdisciplinary partners. That's why we've hired a data scientist to actively support a variety of these projects and identify common types of objects, agendas, and methods in digital humanities approaches to network analysis and information visualization here at Stanford.

The bad news is that we'll only have Scott here at Stanford for the summer. Moving forward, though, I hope to establish the very real stake that digital humanities has in data science, and to show how data science can further support the research agendas of digital humanities scholars.