Dianne Dietrich joined LTS Metadata Services in July 2008. A recent graduate of the University of Michigan’s School of Information, Dianne became the first Research Data Specialist in LTS. In this piece, she talks about her position and how it relates to new and traditional library services.
As the Research Data & Metadata Librarian, my LTS position is technically a new position, but what I do is not novel for libraries, or even for Cornell. For just over two years, Mann Library has had a Research Data & Environmental Sciences Librarian (Gail Steinhart) who works with researchers and faculty to publish environmental and ecology-related datasets. Many faculty, from Astronomy to History to Sociology, are creating tables and files of data for their research that they want to organize, preserve, or share with others. Libraries have had a great deal of experience organizing, preserving, and providing access to print and electronic materials over the years, and now we're exploring new ways of applying this knowledge to digital research data.
My friend from library school asked me to explain a current project at my new job. I responded:
“Well, right now I'm working with a database of bird data. It contains data about bird behavior collected by a group of researchers over the course of ten years, and it has almost fifty tables in it. My job is to figure out what exactly is in it, so that I can work on cleaning it up—like standardizing place names, and formatting for numbers and dates. Then, I can work on helping to describe it with metadata so that other research teams can potentially use it in the future.”
My friend, an archivist, immediately noticed parallels between what I do and what she does. She noted that she often sorts through boxes of personal papers, figuring out who and what is mentioned, organizing the papers into logical units, and describing them with just enough detail so that others can use them for their research. Even though I work with digital data and my friend works with boxes of aging paper, there are clearly important commonalities in our work.
These connections extend to the work done in technical services as well. In LTS, many of us work with complex items, such as books in other languages, books that are part of a larger series, and items with special formats, like DVDs and maps. Merely providing titles and authors for these items isn't enough for our patrons to find them and use them. We also know that certain attributes are important for certain types of items and not for others. I keep these same principles in mind as I work with research data. In my work, I need to pick a metadata scheme that will allow me to sufficiently describe the research data I'm working with. I also need to be mindful of the types of files I'm handling, especially if the data is going to be shared with others.
When I am thinking about research data, one of my main focuses is organization. As librarians, if we did not pay attention to how we organize our materials, none of our users would ever be able to find a book or download an article from an e-journal again. Many of our Cornell researchers will never have a spreadsheet with as many columns as we have books at Olin, but their data is certainly complex and requires careful organization if it is not to become unwieldy. One of my responsibilities is working with researchers to help organize their data output so they can manage it more easily and effectively.
While my work does share important similarities with the work done at LTS, my work involves a great deal of exploration in areas beyond the library. This summer, I attended two workshops on cyberinfrastructure applications for working scientists, including one on the National Virtual Observatory (NVO) for astronomy data. In the NVO, a community of scientists has put forth standards for describing and transmitting astronomy data and images over the Web so that other scientists, regardless of their proximity to a physical observatory, can collect, analyze, and use the data. Contributing data to the NVO requires its owner to create proper metadata for the dataset, which might be an area where libraries can help.
Working with digital data, though, poses new challenges. Unlike books, computer files run the risk of becoming obsolete, and their content can be permanently lost. Working with research data means I must know how to work with proprietary file formats and recommend alternative file formats to minimize this risk of data loss. Digital files cannot be abandoned on a hard drive the way books can be left alone on a shelf. The long-term preservation of digital data requires plans for backups and migrations. In my new role, I need to be mindful of all of these concerns when I'm working with digital data.
I have always had a love of computers and of using technology in constructive ways. My new role here at Cornell allows me to focus on what's important to me as a librarian, connecting people with information, while letting me explore the technological issues surrounding new types of resources, like digital research data. While what I do every day may be slightly different from what most people at LTS do, we are ultimately still connected by the same guiding principles.