This is the first of my regular round-ups of what’s happening in the world of resource discovery. Twice a month I’ll be sharing what I’ve found during my internet travels and also highlighting things that have caught my eye under the #UKDiscovery Twitter hashtag. You can also see the latest tweets from that hashtag compiled into an eye-pleasing PDF format (created via the FiveFilters PDF newspaper maker).
Firstly, I want to share the output of the JISC Activity Data Synthesis project which I was involved with last year. The project website was published in November 2011, which already seems like a lifetime ago, but hopefully the collective wisdom gathered together there will be useful for some time to come.
The JISC Activity Data programme was a collection of nine projects which, although not directly part of the Discovery initiative, covered some relevant terrain – particularly around issues of licensing, metadata and open data. Other strong themes that emerged during the course of the programme were ‘big data’ (particularly so for the Exposing VLE Activity Data project), data storage and data visualisation. If you’re interested in getting to grips with data visualisation then the online talk that Tony Hirst kindly did for us as part of our virtual exchange sessions is well worth a watch. Five of the projects were focused on library activity data so they are worth exploring if that’s the domain you’re involved with: AEIOU, LIDP, RISE, SALT and OpenURL.
Now onto the highlights of things I’ve come across over the past few weeks:
- The Bodleian Libraries project What’s the Score? is Google-funded and uses the Galaxy Zoo crowdsourcing platform. Volunteers are invited to help make the Bodleian’s digitised music score collection more accessible online by adding descriptive metadata to its records. So far 21% of the collection has been catalogued.
- The Bodleian have also been involved with a geodata project that has mapped 32,000 letters (from the Electronic Enlightenment database) to their relevant locations across the world. Sadly it’s a subscription-only service so I couldn’t take a look at the letters themselves, but it struck me that it could work brilliantly as a Google Maps mashup if the data were ever made open.
- We’ve come a long way in the world of online search since 1999, but when I stumbled across Andrew Gordon’s PhD thesis, ‘The Design of Knowledge-rich Browsing Interfaces for Retrieval in Digital Libraries’ [pdf], I couldn’t help but think that it looked like it could’ve been written yesterday. Have a look and judge for yourself.
- The importance of context and presentation for gaining value from ‘big data’ has been a hot topic in recent weeks, first in this short piece by Bryce Roberts, which, along with Dennis Berman’s piece for the Wall Street Journal, is a timely reminder that we might not necessarily be best placed to identify that context. All of this is an interesting backdrop to the Digging into Data Challenge projects, which were announced a couple of weeks ago. And, as a postscript, I was fascinated to see an Occupy Wall Street activist in the US tweeting about the National Centre for Text Mining (NaCTeM) batch request service last week, suggesting that it’s not just ‘big data’ that has the potential to be used in unforeseen ways but also the services around ‘big data’.