Discovery Licensing Clinic

May 23, 2012

photo credit: Ed Bremner

The first Discovery Licensing Clinic brought together representatives from a number of different libraries, archives and museums to spend a day considering practical responses to the Discovery open licensing principles and getting practical guidance from the assembled experts. It was an opportunity to identify issues and discuss the range of tactics that institutions might adopt in scoping metadata releases and making the associated licensing decisions.

Our panel of experts on the day consisted of Francis Davey (Barrister), Naomi Korn (Copyright Consultant), Paul Miller (Cloud of Data) and Chris Banks (University Librarian & Director, Library, Special Collections & Museums, University of Aberdeen).

Chris Banks has written a blogpost reflecting on the day and her presentation slides can be viewed below:

The issues around licensing open metadata do represent a significant hurdle for institutions but none of those issues are insurmountable. Our hope is that licensing clinics such as this one, and the ones we plan to run in the future, will give managers and decision makers the knowledge they need to progress the open metadata agenda within their organisation.


Highlights from the Content and Discovery Joint Programme event

May 22, 2012

On the 23rd April colleagues from projects across the Discovery, JISC Content, JISC OER and Emerging Opportunities programmes gathered in Birmingham to share knowledge and identify shared challenges and key agendas that need to be progressed. As is often the way with these types of events, the discussions that took place over a day and a half were as useful to those running the event as they were for the delegates attending. The notes below represent just a handful of my highlights.

Joy Palmer presented on behalf of the Discovery Programme and gave a compelling overview of the challenges and aspirations we share around the discovery of content. She highlighted how, as the RDTF work was translated into the Discovery initiative, it became clear that we needed to talk in terms of an ecosystem as opposed to an ‘infrastructure’ because the latter suggested that the initiative was aiming to impose an overarching infrastructure model over the entire museums, libraries and archives (and JISC) discovery space.

“To a large degree, what today is about is determining to what degree we can operate as a healthy and thriving ecosystem, where components of our content or applications interact as a system, linked together by the flow of data and transactions.”

But as Joy stated, this is not to oversimplify matters. Her talk touched on the many apparently competing theories about how to enable discovery in the dataspace, highlighting the complexity we’re all confronting as we make decisions about the discovery and use of our data: Big Data and The Cloud, Paradata, Linked Data, Microdata, and the ‘return’ of Structured Data.

But in terms of our shared goals to have our content discoverable or useable via the web, she explained it is the tactic of opening up data that is relevant to us all, even if our challenges in achieving ‘openness’ differ.

The slides from Joy’s presentation are available to view on Slideshare:

Discovery: Towards a (meta)data ecology for education and research

In the afternoon I facilitated Andy McGregor and David Kay’s session on business cases where the participants obligingly contributed to David’s mapping exercises.

There were some interesting discussions around the participants’ experience of writing business cases, including useful suggestions for getting the most out of building a business case:

  • Predicting and measuring benefit are key challenges to overcome but we can do that by using the data at our disposal to create a convincing narrative. However it’s not about manipulating that data and making up stories retrospectively, we need to put energy into building robust analytics that help communicate our story clearly and convincingly.
  • Filling out a business case template shouldn’t be an activity that only happens in order to secure funding or other resources – it can be very useful to repeat the process throughout the project in order to track any changes in its course.

The following links may be useful if you are interested in building robust business cases:

In the plenary session on day two the conversations centred around a number of discussion points:

  • Terms such as ‘microdata’ (machine-readable semantic tagging of webpage content) and ‘paradata’ (usage analytics or contextual information about data/metadata) were new to some of the participants and this prompted a discussion around the seemingly unavoidable challenge of jargon that we face within the Discovery arena. One suggestion was that instead of working to define a stronger vocabulary that is understood by all, perhaps we should be identifying stronger metaphors which everyone can relate to; metaphors that communicate the vision of what we are working towards and help everyone understand how they can get involved with delivering that vision within their own context.
  • We should be stepping outside of the sector to see the potential for emerging areas of activity (e.g. paradata). Looking to those sectors who are ahead of the game saves the library, museum and archives sectors having to try and work from a blank page. We also need to identify where our sectors are ahead and recognise how those advantages leave us well positioned to make significant progress.
  • Projects would benefit from a system of ‘evaluation buddies’ from within their programme to help uncover evidence of project impact and then share this evidence, together with highlighting any awards and recognition won by projects. This will help institutions build their internal business cases for bidding to run and then embed JISC projects in the future. There was also the suggestion that JISC could usefully build a collection of the major use cases (in a similar way to the Open Bibliographic Data Guide) together with short case studies that demonstrate the institutional impact.
  • Across the two days there were mentions of ‘microdata’ (machine-readable semantic tagging of webpage content), ‘big data’ (i.e. high volume) and ‘heavy data’ (data which ‘stretches current infrastructure or tools due to its size or bulk’), but the argument was made that the primary objective should be to produce ‘simple data’ (data that is both simple to produce and simple to consume).
  • There was recognition that aggregation is an art not a science and that current data standards are a) opinion, not fact and b) open to interpretation. High quality data is key to producing usable datasets but there was a question about how that quality can be defined. One suggestion was that data clean-up is a highly specialist service that should be decoupled, as per the government’s view with regard to open data.

Some key takeaway points for the Discovery programme:

  • Information about the Discovery programme, its projects and the underlying principles should be in a format that is ‘reframeable’, making it easy for interested parties to access information on their terms and cascade that information to their own audience or stakeholders.
  • Identifying and highlighting the tangible benefits of the Discovery Principles enables supporters of those principles to embark on fruitful conversations with colleagues in their institutions.
  • There is huge benefit in sharing the learning and challenges from within, and without, the Discovery programme.  An ongoing process of synthesis, re-synthesis and distillation will extract maximum value from the activity taking place across the Discovery initiative.
  • The quality of metadata is key to the success of Discovery initiatives – we need to explore how high quality metadata is defined and ensured.

Community Feedback

April 10, 2012

“Anything that helps people to make more meaningful use of resources is a good thing”

Veronica Adamson and Jane Plenderleith report on recent interviews.

Since March, we’ve carried out a series of interviews with leaders and managers in the library, archive and museum (LAM) community about what open data means for their users and communities. Discussions focused on benefits, issues and challenges for institutions, collections and users in this space. Some interesting and thought-provoking views have emerged, providing much food for thought on the development of the RDTF vision.

Here are some key points emerging from our discussions:

  • Supporting open data – the LAM community is keen that resources are available to a wide community of users and contribute as much as possible to the furthering of knowledge
  • Simplifying access – there is strong support for systems which help users easily to discover resources and avoid the confusion caused by a multiplicity of disparate datasets
  • Communication – to these ends, LAM professionals need accessible language, and clear evidence of the benefit of open data aggregation, aligned with institutional priorities
  • Local examples – networks of libraries, museums and archives are already sharing data and developing local solutions to metadata challenges relating to standards, purpose and nomenclature
  • High quality aggregation – we need to move beyond small-scale initiatives providing partial answers, which then sit on websites gathering digital dust
  • Special Collections as the archives of the future – as more and more published material is available digitally, the role of the library is as custodian of unique collections, so data relating to these collections is an invaluable national resource.

Our thanks to all those who have been involved in this process so far. If you’d like to share your thoughts, aspirations, plans or reservations about these matters in our forthcoming round of interviews, please get in touch via info@discovery.ac.uk.


Warwick workshop prioritises resource discovery

March 29, 2012

In January 2012, JISC and SCONUL convened a workshop for Library Directors and Senior Managers to review the evolving requirements for institutional Library Management Systems (LMS), referenced as Domain 3 in the 2009 SCONUL report to HEFCE.  Entitled ‘The Squeezed Middle’, the workshop focused on the key service developments impacting the LMS footprint, given evolving approaches in Resource Discovery (Domain 2) and shared service developments in the management of subscription resources (Domain 1).

After considering a business modeling framework presented by Lorcan Dempsey and a number of future scenarios set in the year 2020, the workshop reviewed a catalogue of over 60 potential library service and institutional knowledge management objectives. The group evaluated them in terms of desirability, feasibility and their potential to act as drivers of mission critical change.

It was striking that the Discovery agenda represented a very high proportion of the items ranked as high priority looking to 2020. It was also noted that above campus initiatives (such as shared cataloguing and records improvement) and services (such as resource discovery aggregations) can act as catalysts for reviewing workflows (both user and librarian) and reappraising library team skills.

The highest ranked Discovery related targets were as follows:

  • 31 – Provide 1-stop search across all asset types
  • 32 – Publish open linked catalogue metadata
  • 33 – Expose the collection to other search mechanisms
  • 34 – Emphasise exposure of special collections
  • 35 – Integrate LMS & VLE resources, including reading lists
  • 43 – Curate local learning resources, including OERs
  • 44 – Drive the value of reading lists

Medium priority Discovery related targets were:

  • 36 – Provide recommender and associated ‘social’ services
  • 45 – Curate institutional research data
  • 46 – Expose the institutional repository
  • 47 – Expose the university archives

The headline priorities included:

  • Provide 1-stop search across the range of Teaching, Learning and Research asset types that are authored and collected within institutions
  • Integrate reading lists effectively with the discovery of and access to library, VLE and repository resources
  • Establish sustainable curation, workflow management and exposure for all digital scholarly assets – including local learning resources, OERs and research data
  • Although it was not on the original list, delegates added the potential for a persistent personal interface to assets, typically through bookmarking; the metaphor of a personal e-shelf was regarded as attractive.

Other challenges such as re-thinking the user access points for resource discovery or collaboration on adoption of widely used authorities and vocabularies were regarded as less critical, though not unimportant. The abandonment of the traditional LMS OPAC received a low vote on the basis that this will be an outcome of success in these broader ambitions. Whilst enhancing the discoverability of university museum assets received a low average vote, it was highly scored by those institutions with their own museum collection.

So Discovery featured highly for library management both as an end in itself and as a catalyst for changing processes and practice, relationships and responsibilities. However, we must also reflect on whether this professional and user-centred aspiration relates to a destination at which we will one day arrive or perhaps may be better viewed as an essential element in the continuous evolution of the academy.


w/c 12 Mar 2012 – Discovery News Roundup

March 16, 2012

Here’s my round up of news from the world of Discovery and beyond over the past couple of weeks. As with my previous posts, many of the items were gleaned from the #ukdiscovery twitter hashtag which you can dip into whenever you like by opening up this FiveFilters ‘newspaper’ pdf.

First of all, some news from the Discovery initiative – There is an opportunity to attend the free Licensing Clinic that the Discovery project is running on Wednesday 9th May in Birmingham. This practical roundtable event is aimed at managers and decision makers in libraries, archives and museums and there will be the following experts on hand to help guide you through your institution’s particular open metadata licensing challenges: Francis Davey (Barrister), Naomi Korn (Copyright Consultant), Paul Miller (Cloud of Data). Please note that places at this event are strictly limited to 15 delegates so you’re advised to book sooner rather than later and you can do that by signing up via the Eventbrite registration page.

In recent weeks I’ve seen a few articles relating to the need for skills development in the area of ‘data wrangling’/’data management’:

Those articles left me wondering whether there are specific skills needed for dealing with and managing open metadata which we should be identifying and highlighting? On a related note, I saw a short conversation regarding Linked Data on Twitter that I think a lot of people will relate to and which could be equally applied to any of the areas touched on by the Discovery initiative – To summarise, the main point of the conversation was that [people] have no trouble understanding what terms such as Linked Data mean while they are being explained to them but that knowledge is hard to retain and quickly loses definition when you walk away and/or try to explain it to anyone else.

Resources such as the Open Metadata Handbook are undoubtedly a useful touchstone people can keep returning to when they need a refresher but what else needs to be in place to ensure that knowledge about open metadata is discovered, shared and becomes embedded within staff skillsets?

One of the aims of the Discovery initiative is to raise awareness of open metadata and if you’d like to help us do that then you can either:

Some other links of interest from the wider world of data:

Lastly, I’ve started exploring how I can use Delicious to share other items of interest that I pick up during my travels across the webosphere – To that end I’ve started using Packrati.us to auto-bookmark my Twitter favourites and shared hyperlinks in Delicious and have also created a #UKDiscovery ‘stack’ where I’ve started sharing any of my bookmarks that seem particularly pertinent to the Discovery initiative.


Open Data – The Missing Link?

March 12, 2012

Ken Chad positions Discovery in the context of global and national thinking

In March 2011 the first issue of Google’s Think Quarterly[1] online magazine was dedicated to data. Nigel Shadbolt of the University of Southampton writes that one of the key responses to the 21st century demand for information is open data. The data.gov.uk website and the influence of Shadbolt alongside Sir Tim Berners-Lee has positioned UK government as one of the leaders in open data[2].

However, despite the increased recognition of Shadbolt’s argument that “open data provides a platform on which innovation and value can flourish”, more needs to be done. This is certainly the case with libraries, museums and archives. Discovery Chair, Prof. David Baker, emphasises that by opening up more data for reuse “we can better serve UK educators and researchers to excel in their work by increasing access to, and visibility of, relevant content”.

If we are to achieve the ambition of the Discovery initiative for a sustainable ‘metadata ecology’, two broad issues need to be addressed. The first is around making a clear business case. Key figures like Shadbolt and Berners-Lee have done much to clarify and advocate the broader business case especially for government data. However more remains to be done to help heads of libraries, museums and archives articulate the particular business case for their organisations – as Discovery is undertaking to do.

Secondly, a commitment to licensing open metadata will be vital. It is encouraging that this is central to a number of current projects in libraries, museums and archives with the British Library[3] amongst those leading the way. At the same time Discovery is providing case studies and tools such as the Open Bibliographic Data Guide to support managers, practitioners and developers.

References

  1. http://thinkquarterly.co.uk/
  2. http://www.guardian.co.uk/news/datablog/2010/jan/21/timbernerslee-government-data
  3. http://www.bl.uk/bibliographic/datafree.html

w/c 27 Feb 2012 – Discovery News Roundup

March 4, 2012

Here’s my round up of news from the world of Discovery and beyond over the past few weeks. As with previous posts, many of the items were gleaned from the #ukdiscovery twitter hashtag which you can dip into whenever you like by opening up this FiveFilters ‘newspaper’ pdf [update: URL fixed].

Last week the Discovery team published Issue 6 of the Discovery Newsletter which included the following articles among others:

  • an article on how the Copac Collections Management Tool project is aiming to help collections managers.
  • an introduction to ‘Will’s World’ – one of the JISC-funded large-scale exemplar projects.
  • an invitation for supply chain organisations such as system vendors and publishers to engage with the Discovery initiative.

If you’d like to receive future newsletters by email you simply need to drop us a line at rdtf-discovery@sero.co.uk and you’ll be added to the distribution list.

It was interesting to read Harvard’s announcement of the changes they will be undergoing in order to unify their 73 (!) libraries. Much of the announcement concentrated on structural changes but this sentence caught my eye and it seems to suggest that some game changing LIS developments could be in the offing: “The changes will position the Library to lead in scholarly communication and open access, to design next generation search and discovery services, and to accelerate digitization and digital preservation.”

Of course Harvard’s Library Lab team are already involved in designing next generation search and discovery services as part of the Digital Public Library of America (DPLA) Beta Sprint initiative – the scale of the data they’re dealing with is pretty impressive but it was the live demo of their “pre-alpha” ShelfLife/LibraryCloud system that took my breath away and got me thinking about new possibilities for discovery interfaces.

When I first read this short blogpost from the Louie B. Nunn Center for Oral History, University of Kentucky I initially dismissed it as not quite newsworthy enough to include in this digest … but I kept thinking about the story after I had clicked away from it.  It seems to me that the ‘Oral History Metadata Synchronizer’ (OHMS) tool that they’ve developed with their digital library division has huge potential for improving the visibility of audio collections and connecting them to other relevant resources. The story of how the Nunn Center have used OHMS to preserve and share interviews with survivors of the Haiti earthquake is a moving reminder that metadata is (at the risk of getting poetic and misty eyed) more than sterile information, and the discovery it enables is human as much as it is digital.

Staying on the subject of audio collections, the Music Library Association is currently working on a final version of their Music Discovery Requirements document and is inviting thoughts and suggestions. This presentation by Nara Newcomer provides useful background on the aim of the Music Discovery Requirements document.

The Discovery programme is particularly focused on the business case for adopting open metadata so it was interesting to read this white paper from Nielsen which reports on the effect of supplying (or not supplying) metadata within the book industry. One of the key conclusions reads: “Overall we see clear indications that supplying a set of full enhanced metadata for product records helps to maximise sales, and that this relationship between enhanced metadata and sales is even stronger for the online retail sector.” Of course UK university libraries are not in the business of book retail and this report could simply serve to make publishers more commercially protective over the metadata they create but all the same it is good to have some high profile research published in this area. It’s a pity that they don’t separate out enhanced metadata from the provision of cover images in their analysis – from research I’ve been involved in previously I suspect there might be some interesting findings that remain hidden by the approach they’ve taken.

Europeana have published data for 2.4 million items under an open metadata licence as part of their Linked Open Data pilot. The data is provided by eight national libraries and a number of cultural heritage organisations (including some from the UK) and there’s also a convincing animation on the ‘what and why’ of linked data which, pleasingly, keeps the end user at the forefront of the discussion. Europeana also launched the ‘European Library Standards Handbook’ which is their guide for libraries who are providing content to data aggregators – it includes a legal overview as well as a technical guide. If you are interested in linked open data then you might want to follow the University of Bristol’s ‘Bricolage’ project which is JISC-funded and will be publishing catalogue metadata from their Penguin Archive and Geology Museum collections.

Earlier this week I found myself having one of those ‘am I the only person not at this event?’ moments as my Twitterstream gradually filled up with all manner of interesting and diverting tweets from the OCLC EMEA Regional Council Annual Meeting.  Owen Stephens captured some of the knowledge that was shared around the topic of APIs in his blogposts written on the day. One of the sessions that seemed to be particularly well received was Alison Cullingford’s presentation on recent survey findings from the RLUK Unique and Distinct Collections project so it will be interesting to read the report when it is published. The meeting also brought news that an open data commons licence is being considered for WorldCat:

WorldCat: open data commons licence is being considered and will be discussed with OCLC membership through Global Council #EMEARC

— Simon Bains (@simonjbains) February 29, 2012

I will not pretend to be an expert but these guides that the Archives Hub have added to their website look very useful for anyone who is interested in accessing Archives Hub data using SRU and OAI-PMH interfaces.

I’ll finish up by sharing some interesting news in the wider world of open data and metadata:

  • The JISC Managing Research Data Programme is doing some heavy lifting in terms of building a registry of metadata standards  (for UK university research datasets) – I’m sure they would be pleased to hear from you if you have any insights you’d like to share with them.
  • The Government’s call for input to their consultation on “open standards for software interoperability, data and document formats” is ongoing and it doesn’t close until 3 May so there’s plenty of time left to think about what the direct and indirect supply chain ripples might be.
  • In my last news digest I mentioned that ‘big data’ suddenly seemed to be everywhere – This week Nick Edouard’s reflective post over on the BuzzData blog struck a chord with me, particularly his point that “Open-data initiatives are good for many reasons, not least because they can radically improve internal data-sharing.” Often the discussion around open data tends towards a leap of faith/altruistic model but keeping focused on the ‘what’s in it for us?’ question seems a surer way of securing the internal resources needed to release data in the first place.

In closing, a couple of blogposts I’ve read recently have got me thinking about the importance of identifying a vision that other people can quickly understand and get behind:

I think that the Discovery vision packs a similar punch but perhaps it could be more emotive?: “[Our vision] is about making resources more discoverable both by people and machines.” Is that a vision which speaks to you? Have you found the words to succinctly describe your institution’s vision for resource discovery? Please do share your thoughts in the comments below.


w/c 6 Feb 2012 – Discovery News Round-up

February 9, 2012

Here’s my round up of news from the world of Discovery and beyond over the past few weeks. Many of the items were gleaned from the #ukdiscovery twitter hashtag which you can dip into whenever you like by opening up this FiveFilters ‘newspaper’ pdf that I generated.

Last week Joy Palmer shared plans for the next phase of guidance materials and workshops here on the Discovery blog and is looking for your feedback on the outlined approach so please do wade in and let us know what you think. And bonus points for anyone who can suggest a better title for the event than “‘Un’developer hands-on development event”. The best I can come up with is ‘Can’t Code, Won’t Code’ so the field is wide open.

The National Information Standards Organization (NISO) are currently inviting public comment on the working group recommendations that have come out of the joint NISO and NFAIS (the National Federation of Advanced Information Services) project to develop Recommended Practice on Online Supplemental Journal Article Materials. The main aim of the project is to improve the ‘discoverability and findability’ of journal supplemental materials for librarians and would-be readers by establishing and maintaining links to the related article. The comment period runs until 29th February and, although the recommendations are aimed mainly at publishers, they are also interested in feedback from the wider scholarly community. [via @simonhodson99]

One of the key NISO/NFAIS recommendations is around consistency and, interestingly, this was also one of the key discussion points raised during recent focus groups run by the JISC/AHRC-funded Open Access e-Books research project (OAPEN-UK). So far the project have heard from humanities and social sciences (HSS) monograph publishers, authors/readers and institutional representatives and next week they are running focus groups for research funders, e-book aggregators and learned societies. Incidentally, if you are interested in taking part in one of those focus groups then further details can be found on their Events page. [via @publishersrcly]

A couple of weeks ago it seemed to be ‘Big Data’ week on my twitter stream – all and sundry were tweeting about it and it wasn’t just the data geeks any more. It certainly seemed to suggest, as reported in this Museum Geek post, that “the era of Big Data has begun” but it struck me that the conversation around big data seems to be moving on from mostly logistical or functional discussions about gathering, storing, sharing and making use of data to a realisation that generating and circulating more data doesn’t solve anything on its own (see GigaOm’s article which likens it to virtual landfill via @paulmiller). In the world of building websites there’s a saying that ‘content is king’ but in the world of data it would appear that ‘content + context = king and queen’. Which had me pondering whether the Discovery initiative could usefully consider establishing Open Paradata Guidelines to sit alongside our Open Metadata Principles. And coming from a humanities background myself I found Michael Kramer’s assertion that “data is always already meta-data” an interesting point to mull over.

The Data Catalogs website, which was launched last summer, aims to be “the most comprehensive list of open data catalogs in the world”. I’m sure it’s relatively early days yet but there are already 212 catalogues listed and the list of experts involved in the website is impressive. It looks like it will grow into a useful centralised resource, particularly if a more advanced search is added, but I noticed that not all of the entries state what their metadata licence is – it seems to me that there’s an opportunity to improve consistency and clarity by making that a mandatory field. What did impress/surprise me though is that any visitor to the website can improve a record simply by clicking on the ‘Please help improve this page by adding more information’ link at the bottom of the record and editing the fields that appear [via @rufuspollock]. If you are interested in the issues around licensing open data then Naomi Korn and Professor Charles Oppenheim’s practical guide is worth a read.

And finally, a few items of interest from the wider world of Discovery:

  • This article about book mashups on the Programmable Web ‘API News’ blog got me thinking about countless possibilities for making library and museum and gallery collections more visible and connected in new ways. Then this morning someone tweeted about the strangely hypnotic Flight Radar website and I wondered if one day I might find myself gazing at a map that shows books flying overhead as they wend their way from place to place as inter-library loans.
  • March is looking set to be Culture Hack Month, with events taking place on both sides of the Pennines. Hack for Culture takes place on the 3rd and 4th March in Liverpool and is bringing interested parties together “to explore the possibilities offered by joint experimentation with a wide variety of hidden cultural data sets”. The 24 hour-long CultureCode Hack takes place towards the end of March in Newcastle and will give cultural and arts organisations with open data the opportunity to work with developers and designers to create something new. You can take a peek at the hacks that were developed at the Culture Hack North event in Leeds last year to get an idea of what can be produced in such a short amount of time.

New Discovery open metadata projects

February 3, 2012

Five new Discovery projects started this week. They are all focused on the creation and release of open metadata from libraries, museums and archives in line with the Discovery open metadata and technical principles.

The projects are:

  • Bricolage – will publish catalogue metadata as Linked Open Data for two of its most significant collections: the Penguin Archive, a comprehensive collection of the publisher’s papers and books; and the Geology Museum, a 100,000 specimen collection housing many unique and irreplaceable resources. University of Bristol
  • Open Education Metadata UK – will publish metadata sourced from four significant UK education collections as Open Data in a variety of formats, for anyone to reuse as linked data in their own applications. In addition, subsets of two collections which have high latent potential for linked data will be catalogued. Institute of Education
  • Open Book – will release open metadata for the Fitzwilliam’s Designated Collection (over 150,000 records) and linked open data for the internationally important collection of illuminated manuscripts in the Fitzwilliam Museum (approximately 500 manuscript records). The Fitzwilliam Museum, University of Cambridge
  • Music Collections at Cardiff University: Advancing the Resource – focuses on a collection of manuscript and printed music from the eighteenth and nineteenth centuries, a resource of nearly 3000 items largely unknown to the wider scholarly community. This project will catalogue the material online, and make the data available through the Archives Hub and COPAC, as well as RISM (UK) (Répertoire International des Sources Musicales). Cardiff University
  • Trenches to Triples – will provide Linked Data markup to 200 collection level descriptions and 6,000 item level catalogue entries relating to the First World War from the Liddell Hart Centre for Military Archives and will also provide a demonstrator for using Linked Data to make appropriate connections between image databases, Serving Soldier, and detailed catalogues. King’s College London

The projects are just getting started but will all have blogs which will record their progress. Look out for further information on the projects via the Discovery site. All of the learning and outputs from these projects will be summarised on the Discovery website to ensure that others can benefit from what the projects learn and produce.

I have written an overview of all the current Discovery work on the JISC website.


Five Reasons To Be Cheerful

January 20, 2012

Five reasons to be cheerful about the Discovery Service Projects

David Kay, working with the Mimas Discovery team

So, what’s new? Another year, another round of projects – the second phase of the Discovery initiative.

Whilst it would be naïve to trumpet progress or to estimate distance travelled at this stage, I confess to being enthused by the discussions taking place at the kick off workshop in Birmingham on 11 January. You’ll find initial introductions to all the projects mentioned in this post here.

The meeting brought together 10 of the 11 projects linked to the JISC 13/11 call for Discovery Services, the Cambridge / Lincoln CLOCK project being the only absentee. So let’s start right there for the first of five observations in this post …

The CLOCK collaboration emerged directly from a fruitful dialogue about the practical value of open catalogue data in Phase 1 (check out the COMET and JEROME precursors). Likewise the Open Bibliography project, championed by the inventive Mark MacGillivray, continues powerful work started in the JISC Expo programme with 30m openly licensed records already in the bag – check out their demonstrator.

Observation 1 – Thinking shared and experience gained within the Discovery initiative is maturing into a powerful community tool.

And lest anyone should suggest that all the running is being made by libraries, up steps the AIM25 archival consortium with ‘Step Change’, working to apply the linked data based indexing productivity endorsed by archivists in Phase 1 to the widely used CALM cataloguing application. Meanwhile, in the world of museums, Contextual Wrappers 2 (led by the Cambridge Fitzwilliam museum and Collections Trust, working with Knowledge Integration) plans to extend its collection descriptions model across the HE Museums sector, informed by a grounded ‘market’ survey.  We should also highlight the efforts of Search25 (the M25 library consortium) and ServiceCore (the OU project harvesting dozens of Open Access repositories) to ensure their services address community needs.

Observation 2 – Responding to practitioner and community opinion is at the heart of Discovery aggregator thinking.

Discovery is not about a single model that fits all. However, the growing interest in Linked Open Data as an approach with a future is significant. This ranges from the Bodleian recognizing it as a vehicle for breaking down the silos that divide their own collections (the Digital.Bodleian project) to museums across the North East using linked data and supporting vocabularies in the Cutting Edge project to enable cross-searching of collections to meet the needs of very different types of users from schools to researchers. AIM25 Step Change shares the same confidence.

Observation 3 – There is a measured expectation that linked data can yield practical value for highly focused local services, as well as delivering in grand ‘web scale’ settings.

It is particularly interesting how the value of place and other geographic information is being leveraged in a variety of ways within the linked data model. Pelagios 2, involving Southampton and the OU with a range of international partners, is linking data to place to assist in cataloguing, annotation, search and visualization of ancient objects. Fast forward a couple of millennia and the DiscoverEDINA project is using an automated Geotagger to expose place metadata embedded in digital media files. The links of AIM25 Step Change to Historypin address the same theme.

Observation 4 – The adoption of common vocabularies seems central to making the most of key access points across the ‘web of data’ – and place looks like the early candidate for generating critical mass.

The afternoon sessions focused on the objectives of the Discovery initiative under the themes ‘Terms of Use’, ‘Data’ and ‘Interfaces’ and the underlying quest for service sustainability. On behalf of the Mimas-led Discovery team, Owen Stephens set out 12 practical measures of quality implementation, whilst recognizing that no single project will address every measure.

Terms of Use

1 – Adopting open licensing

2 – Requiring clear, reasonable terms and conditions

Data

3 – Using easily understood data models

4 – Deploying persistent identifiers

5 – Establishing data relationships by re-using authoritative identifiers

Interfaces

6 – Providing clear mechanisms for accessing APIs

7 – Documenting APIs

8 – Adopting widely understood data formats

Service

9 – Ensuring data is sustainable

10 – Ensuring services are supported

11 – Using your own APIs

12 – Collecting data to measure use

Observation 5 – Whilst there is still much work to be done, Discovery is moving from abstract principles to tangible measures of practical implementation.
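
For anyone wanting to keep track of which of the twelve measures their own project addresses, the list lends itself to a simple checklist. What follows is only a rough sketch in Python, not something the Discovery team has published: the grouping mirrors the four headings above, while the function name `outstanding_measures` and the example project are hypothetical and purely illustrative.

```python
# The twelve measures, grouped under the four headings used in the post.
MEASURES = {
    "Terms of Use": [
        "Adopting open licensing",
        "Requiring clear, reasonable terms and conditions",
    ],
    "Data": [
        "Using easily understood data models",
        "Deploying persistent identifiers",
        "Establishing data relationships by re-using authoritative identifiers",
    ],
    "Interfaces": [
        "Providing clear mechanisms for accessing APIs",
        "Documenting APIs",
        "Adopting widely understood data formats",
    ],
    "Service": [
        "Ensuring data is sustainable",
        "Ensuring services are supported",
        "Using your own APIs",
        "Collecting data to measure use",
    ],
}


def outstanding_measures(addressed):
    """Return, per heading, the measures not in the `addressed` set."""
    return {
        heading: [m for m in measures if m not in addressed]
        for heading, measures in MEASURES.items()
    }


# Hypothetical project, invented for illustration only.
example_addressed = {
    "Adopting open licensing",
    "Deploying persistent identifiers",
    "Documenting APIs",
}

gaps = outstanding_measures(example_addressed)
for heading, missing in gaps.items():
    print(f"{heading}: {len(missing)} still to address")
```

Since no single project is expected to address every measure, a list of gaps like this is better read as a prompt for discussion than as a score.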

As you can tell, I think the plans and ambitions of these Phase 2 projects are indicative of healthy developments and increasing maturity in the wider Discovery initiative. And this is where the Discovery team led by Mimas has a vital role in supporting practical implementation beyond these institutions through case studies, guidance materials and targeted workshops … watch this space!