Published 18th January 2022
Can you describe your background and your current role?
I am a digital scholarship librarian at Haverford College.
Most of my work is about identifying a research problem or an idea, and building a community around that where people can contribute in different ways. It often starts from a particular collection of materials or particular research interest, but more often than not, it is an ongoing dialogue amongst a community of people.
Much of what I do is manage and train our student developers. And because I’m working with students from STEM fields (science, technology, engineering and maths) as well as students from the humanities and social sciences, a lot of what I do is to help the students find where they can contribute to a project, and what’s relevant to their professional interests or their educational interests.
So one student will be working on training a computer vision model, another will be annotating, another will be writing content. And there’s this whole range of ways in which the students contribute and connect those individual interests with the project goals. As a program, we also make sure that we’re doing new things and exploring new avenues to keep the program innovative.
Could you give me an example of a digital archive tool?
One of the things I do quite often is work with computer vision models, where we use the models to make sense of the visual aspects of a collection of documents, or photographs. There are pre-trained models, from Google or Facebook, and they will tell you, is this image safe for work? Does it include any landmarks? But that’s really, totally irrelevant for most academic research, so a lot of what I do is to work with scholars to identify what they would want the machine to see in these images.
I had a student that was interested in finding documents that were signed with a thumbprint and using that as a proxy for literacy. For the machine, that’s very easy to do, to train it and say, this is what a thumbprint looks like, find it in all of this collection of tens of thousands of documents. Suddenly you’ve got a search for that visual attribute.
Another thing that I do a lot of technology-wise is natural language processing. It’s a similar kind of issue: if you’re studying literature, for example, then the pre-trained models that you’ll get from industry aren’t very helpful; they’re not interested in the same kind of things that a literary scholar would be interested in, so you need to train your own models. I’m part of an NEH-funded project where we’re not only training models for the study of literature and historical sources, but also for languages that are currently not supported, like Ottoman Turkish, or Old Kannada, or Classical Chinese.
Right now there’s this wall that certain scholars hit when they find that the computational tools for the languages that they work in just don’t exist, so they hit a dead end. But we’re trying to create workflows so that people can create the tools they need for their own research, whatever the language.
Would you call yourself an RSE?
I would like to be able to… I think it’s much more common in the UK and it has a broader meaning there. I think, for me, I’m a little intimidated by the term “engineer”; there’s a lot of imposter fear amongst people who don’t have formal training in computer science. I’m always kind of amazed when the people I work with like my code!
But I think even coming in without a formal degree in it, I would say that engineering is very much what I do. For example, much of my job is assessing what the different potential solutions and workflows would be, and finding the right tool or the right language or the right process to achieve a task. So I think what I do is engineering.
When did you first hear the term “RSE”?
I first heard about it on Twitter approximately two years ago. I then joined the US RSE Slack space at about the same time, so it’s probably around when that group was forming.
What is your favourite thing about your work and being an RSE?
I think my favourite thing is that I get to work a lot with people who are really good at embracing fun. So much of academia is very serious; it feels like we always need to establish the gravity of our work, and its impact, etc. so we don’t often let ourselves just enjoy the work! Yet, that’s often why we’re in that field because we love being in a certain place surrounded by old books and documents and things.
I also think that the space is there to just reach out to a friend who may be at a different university (I often work with people in Russia) and just say, “Hey, you have this amazing collection of crowdsourced diaries, let’s get together and geocode them all and see what we get”.
You also get to tinker with things, and that tinkering is actually relevant to my job. There’s a creativity to it that doesn’t get accessed often in academic work.
And what is your least favourite?
I think more than anything, it’s probably just that it’s so alien to most people in academia; I always have to explain what it is that I do. Often I say “digital scholarship” or even “digital humanities” but no one quite knows what that means. So it’s just a struggle to establish what it is that I do. People hear “librarian” and they assume I file books, and you know, that’s certainly something I can do, but I can also train machine learning models.
We’ve already mentioned the US RSE group but do you have any interactions with the RSE Society and other parts of the RSE community?
The Turing Institute had a summer school recently, so I attended that, but that’s it so far. It’s clearly an important community, but I haven’t quite found my place in it yet.
Do you see yourself as an academic, researcher, software engineer, technician…? All of it? Something else? A mix of one or two terms?
Alison Langmead has an article about role-based collaboration that I find really helpful, where basically there’s a technical stakeholder and that technical stakeholder has their own research agenda and their own professional interests. In a project, the humanities or science researcher will have their particular research agenda, but in joining a project I have my own agenda as well, and so in that regard, I’m also a researcher.
There are projects in which I’m employed to build a web application or database or whatever it is that they need. But by and large, I focus my efforts on things that advance my research agenda, or at least the things that I’m interested in learning at the time.
I’d say I’m motivated by my own agenda: either I’m learning something, or we’re researching something. So researcher is probably a good term.
What do you see as your most likely future career path from here? And what would be your ideal career path?
I’m at a small college with two DS (digital scholarship) librarians and we’re part of a consortium, where I have several other colleagues. We work primarily with student developers who create remarkable projects. But I also see the appeal of having a dedicated team of full-time developers working on a project. Stanford’s Center for Interdisciplinary Digital Research (CIDR) is a good example. Working in a team could be really appealing, especially one where you have a designer, for example. I’m not qualified to be a designer, but I work a lot with the Center for Digital Humanities at Princeton; they have an amazing designer, and their projects are just gorgeous. I would never be able to do that. I think that there are certain skill sets, and you can’t be expected to have all of them. So I’d say working with other specialists who have knowledge that I don’t have is important.
I also think working remotely brings a lot of flexibility. This last year and a half, I worked remotely with my students and my colleagues, and it was seamless, almost. It’ll be nice to see them in person, but the reality is it didn’t interrupt our work much at all.
Is there anything else you’d like to add about your future career path?
I think I’d like a role where I can really follow ideas. I really dislike maintenance, and a lot of the work I’ve been doing is making our projects easier to maintain. I’ve been identifying common use cases and building reusable templates. I’m also trying to simplify and be more consistent in the technologies that we use. So I’ve been doing things like saying, let’s just annotate some text, and we’ll make a little Streamlit app as a prototype and see: is this accomplishing what you want? And doing that, we’re sort of focusing on the data, and the idea that we’ll start with data and we’ll end with data. That’s something that we can preserve, and it doesn’t require this whole array of virtual machines running all the time and breaking.
In your view, how could RSEs be better supported in their work? What do you need? What is missing?
Well, I certainly appreciate owning this as a space, or as a group; I think that group identity is really important. There’s code4lib for people who work in libraries and write code, and I saw them as potentially my group, but they’re more sort of library systems people, and I’m in more of a research role. I think that the RSE title is a better fit for what I do, and for the group that I would want to be a part of, where I would find peers and colleagues that I can learn from.
The term “engineer” carries some prestige and communicates the value of our work. It can be hard to see that I would make twice as much if I was in industry. So I think addressing that question is important. There needs to be some clear value and benefit that academic work brings that offsets the differences in salary. We need to clearly articulate that value or people will continue to be tempted by post-academic jobs. I’ve seen five or six people recently leaving libraries for industry, for a whole variety of roles in developer support, developer advocacy, writing documentation, and things like that. They’re still coming to conferences and keeping a foot in the field, so there’s continuity, and there are needs that academia continues to meet even when colleagues take on roles in industry. Ideally, there’d be more of a revolving door and you could go from industry into academia and back. If that was more seamless, that would be amazing.
Which question did we not ask you which we should have (and what is the answer)?
A lot of what motivates me is keeping a foot in the door in my previous field. So I might ask if there is a connection between your original disciplinary training and the kind of work that you continue to do as an RSE?
For me, it seems to give meaning to a lot of the projects that I work on, and I can’t quite explain why. But I don’t think we’re unusual in that regard, and it does feel like that is something that distinguishes an RSE from an industry engineer. The nature of what we’re doing is somehow really important. Most tech companies pride themselves on their values and ability to provide positive change for people’s lives, so there’s probably something distinct about working on pure research, cultural heritage, or community-engaged projects that gives meaning to the work in a unique way. I’d love to find a good word for it because it seems important.