About two months ago the Journal of Library Administration published an issue devoted entirely to Digital Humanities in libraries. Enthusiastically I printed all the open access articles (I know, not very environmentally conscious of me…) and put them on my desk. As often happens with papers on desks, they’ve been lying there ever since. That changed this morning, when I took out the stack and started reading. And I loved it! In article after article I used my highlighter to mark sentences and paragraphs that sounded all too familiar to me, working as I do in a research library with an interest in Digital Humanities.
The KB started to look at Digital Humanities (DH) as a topic not so long ago, but has been involved with DH-related projects for quite some time, although they were not called DH at the time. Examples are the CATCH projects that started in 2004, and since the beginning of our digitising days our material has been used by a variety of people and institutions. However, the KB is in a special position when it comes to doing DH research. We are the national library of the Netherlands and are thus not connected to a specific university or research institution. This means that we do not employ our own researchers. We do have a Research department, but most of the people here do not dive into our content themselves; they do research to ensure that the public can.
Although the JLA issue I read only discusses university libraries and their associated researchers, this does not mean that the articles by, for example, Miriam Posner or Bethany Nowviskie are not relevant for us. The often-mentioned lack of flexibility that is apparently inherent to a library also exists here, and the desire to publish something only once it is perfect is something I too can relate to. Working with digitised material is never perfect. The software is not perfect, so how can the outcomes be? Nonetheless, the KB chose to show these imperfections in our OCR by opening up the texts to the public, including all mistakes and an estimate of accuracy.
Being a research institute with a large digital corpus that we are more than happy to share, but without our own researchers (apart from the occasional research fellow), the KB not only faces the challenges of the university libraries as mentioned by Miriam Posner in her article (i.e. inflexibility, lack of time, authority, and incentive, overcautiousness, etc.), but I believe another crucial element can be added to this list: no affiliated researchers. Until not so long ago, when a researcher wanted to use (a section of) our digitised sets, he or she would find someone within the library who could help them get it. There was no official route to obtain the data and no single contact person for a specific set, so people might leave the KB with hard disks full of images, or track down one of our employees at a conference and badger them until they got an e-mail with instructions on how to harvest a collection. Luckily, this has changed with the creation of the Data Services team.
The Data Services team are the go-to people when it comes to our digital sets. They have taken on the responsibility of advertising our datasets on our website (unfortunately only in Dutch), at events and conferences, and in Dutch open data communities such as Open Data Nederland. We hope these efforts will lead to interesting uses of our data and perhaps even some enrichments that we might implement in the future (OCR correction, anyone?). But how can we be sure that our data does indeed get used and that we reach the people who might be interested? And how do we know whether our methods are in fact what they are looking for?
I imagine this issue is easier to solve when you can simply walk to the other side of the building, knock on some doors and talk to researchers whose interests you know because they teach Data Mining at your university. Unfortunately, we are not in that position: the only researchers we know are the people who have asked for our data and those who will come to our KB Lab (currently a work in progress). Now that the instructions for harvesting our sets are available on the website, fewer and fewer people will probably contact us to ask for data, leaving us more in the dark about what interesting things are happening with the digitised Early Dutch Books Online or the ANP radio bulletins.
So, how do we get and stay in touch with interested parties that might contribute to the enrichment of our collections? How can we be sure that what we are doing is in fact what researchers need? How much can we, and do we want to, adapt our methods to fit the needs of researchers? For example, do we want to offer every possible data format if there is a demand for it, or is that something scholars might be able to tackle themselves? (Solutions and ultimate answers are of course always welcome in the comment section below!)
We are undertaking several activities to try to find answers, and also our place in the wonderful world of Digital Humanities. The establishment of our own KB Lab, where we will work with scholars who wish to do something with our data, is one such activity. Another is the poster session that we will present together with the BL Labs project at the DH2013 conference this summer. Our aim there is to talk to different researchers about our collections and their ways of working: what types of collections they would like, what data formats they would love to see, and what they would like to do with our data. So, if you’re around in Nebraska, please come and find me at the posters and let’s talk this through!