Monday 29 October 2012

ELO project stakeholder assessment now completed

motion gears by ralphbijker on Flickr
The Enhancing Linnean Online project has completed its survey of required enhancements, and the report on its findings has now been agreed by the project team. These initial stakeholder assessments, based on face-to-face interviews and online feedback, provide an empirical basis for the planned development and enhancement of the system, and are now being adapted into a specification for the development work.

Details of the full report will ultimately be included in the project's final report. Here is a summary of the key enhancements, which will deliver improvements in the following areas:

1. Interoperability
  • Ensure that direct specimen access by URL, a RESTful search API, and linked data endpoints are implemented effectively and intuitively
  • Publish details and examples of the searching and linking APIs in online system documentation
  • Identify external systems to link to, and use embedded linked data (RDFa) as the basis of linking
2. User Experience
  • Upgrade repository software to EPrints 3.3 (for improved linked data, search and other functionality)
  • Redesign Item Page template (implementing design and features described in the report)
  • Implement a number of requested UI enhancements (listed in report Appendix)
3. Metadata implementation
  • Redesign metadata schema, and convert existing metadata, using a core set of common elements and using EPrints "Item Type" to implement each collection's metadata as a sub-class.
  • Implement metadata as embedded linked data in Item Page (using Dublin Core and Darwin Core mappings defined in report Appendix)
4. Licensing
  • Allow lower definition images to be used on the site: automatically embed a statement of LSL ownership in such images, and permit private/educational reuse under the terms of a non-commercial licence
  • Enhance the Administrator interface to enable the Society’s librarians to access/request full-size copies of the images, and record permissions granted/actions taken for reuse.
5. Revenue generation
  • No specific actions with regard to revenue generation; however, if time permits, we will investigate the feasibility of trialling an on-demand printing API, such as that of MOO, and report results.
6. Digital preservation strategy
  • Review naming conventions and representations for image files in metadata during accessioning and archiving
  • Improve integration of records processing with existing service via EPrints interface
  • Trial EPrints file format validation tools to assess feasibility (with regard to scalability and architecture)
  • Review and enhance procedures for ongoing planning and scheduling of accessioning
7. Educational reuse of the materials
  • No specific recommendations (enhancements set out in Interoperability, User Experience, Metadata and Licensing will facilitate reuse), but educational use cases to be included in post-implementation testing and reviewing.
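
As a rough illustration of the metadata enhancement above, here is a minimal sketch of how a specimen record might be mapped onto Dublin Core and Darwin Core terms before being embedded in an Item Page. The internal field names, the sample record and the mapping table itself are all hypothetical and exist only for this example; the actual mappings are the ones defined in the report Appendix. Only the term names (dcterms:title, dcterms:rights, dwc:catalogNumber, dwc:scientificName, dwc:recordedBy) come from the published vocabularies.

```python
# Hypothetical mapping from internal field names to Dublin Core (dcterms)
# and Darwin Core (dwc) terms. The project's real mapping is defined in
# the report Appendix; this table is an assumption for illustration only.
FIELD_TO_TERM = {
    "title": "dcterms:title",
    "rights": "dcterms:rights",
    "specimen_id": "dwc:catalogNumber",
    "taxon_name": "dwc:scientificName",
    "collector": "dwc:recordedBy",
}


def map_record(record):
    """Split a record into mapped terms and a list of fields with no mapping."""
    mapped = {}
    unmapped = []
    for field, value in record.items():
        term = FIELD_TO_TERM.get(field)
        if term is None:
            # Keep track of fields that the schema redesign still has to cover
            unmapped.append(field)
        else:
            mapped[term] = value
    return mapped, unmapped


# Invented sample record, not taken from the Linnean Online system
example = {
    "title": "Herbarium sheet",
    "specimen_id": "EXAMPLE-0001",
    "taxon_name": "Delphinium",
    "shelf_mark": "n/a",
}
mapped, unmapped = map_record(example)
```

Listing the unmapped fields separately is a deliberate choice here: when converting existing metadata, it shows at a glance which collection-specific elements fall outside the common core.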

Monday 26 March 2012

The AIDA metrics

We've mentioned so far Beagrie's metrics for measuring improvements to the management of academic research data, and the Ithaka metrics for measuring improvements to delivery of content, particularly with regard to the operations of an organisation's business model.

A third possibility is making use of UoL's AIDA toolkit, a desk-based assessment method which has gone through many iterations and possible applications. Over time, we've shown how it could be used for digital assets, records management, and even research data (although admittedly it has never been used in anger in those situations). AIDA doesn't intend to measure assets, but instead measures the capability of the Institution (or the owning Organisation) to preserve its own digital resources.

In July 2011 we produced a detailed reworking of AIDA that could specifically be used for research data. This was part of the JISC-funded IDMP project and the intention was that AIDA could feed into the DCC's online assessment tool, CARDIO. The detail of the reworked AIDA was assisted greatly by the expertise of numerous external consultants, recruited from a wide range of international locations and skillsets. They fine-tuned the wording of the AIDA assessment statements to make it into a benchmarking tool with great potential.

AIDA is predicated on the notion of "continuous improvement", and expresses its benchmarking with an adapted version of the "Five Stages" model which was originally invented and developed at Cornell University by Anne Kenney and Nancy McGovern. It also uses their "Three Legs" framework to ensure that the three mainstays of digital preservation (i.e. Organisation, Technology and Resources) are properly investigated.

We think there may be some scope for applying AIDA to JISC ELO, mainly as an analysis tool or knowledge base for measuring the results of responses to questionnaires and surveys. It could assess broadly whether the Linnean Online service finds itself at a Stage Two or Stage Three. We could subsequently measure whether the enhancements, once implemented, have moved the service forward to a Stage Four or Stage Five.

This could be done with a little tweaking of the wording of the current iteration of AIDA, and through selective / partial application of its benchmarks. We think it would be a good fit for the ELO project strands which discuss Metadata, Licensing, and Preservation Policy - all of which are expressed in the Organisation leg of AIDA. The Resources leg of AIDA could be tweaked to measure improvements in the area of ELO's Revenue Generation. One of the most salient features of AIDA is its flexibility.
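
To make the before-and-after idea concrete, here is a minimal sketch of how stage assessments might be compared across the three legs. It assumes, purely for illustration, that each leg is summarised as a single stage from 1 to 5; the stage values shown are invented and are not an assessment of Linnean Online.

```python
LEGS = ("Organisation", "Technology", "Resources")


def stage_changes(before, after):
    """Return the change in stage (after minus before) for each leg."""
    changes = {}
    for leg in LEGS:
        for stages in (before, after):
            # The Five Stages model only allows stages 1 to 5
            if not 1 <= stages[leg] <= 5:
                raise ValueError(f"stage for {leg} must be between 1 and 5")
        changes[leg] = after[leg] - before[leg]
    return changes


# Invented values: one assessment before the enhancements, one after
before = {"Organisation": 2, "Technology": 3, "Resources": 2}
after = {"Organisation": 4, "Technology": 4, "Resources": 2}
changes = stage_changes(before, after)
```

A zero for a leg would simply mean that no movement between stages was recorded there, which is itself a useful finding for a partial application of the benchmarks.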

Versions of the adapted AIDA toolkit can be found via the project blog, although the improved CARDIO version has not been published as yet.

Thursday 22 March 2012

The Ithaka metrics

In our last post, we considered whether the Beagrie metrics are going to work for this project. This time, we'll look at another JISC-related initiative, the Ithaka study on sustainability (Sustaining Digital Resources: An On-the-Ground View of Projects Today) from July 2009.

Beagrie's metrics were of course directed at the HFE sector, and the main beneficiaries in his report are Universities, researchers, staff, and students who benefit from improved scholarly access. Conversely, Ithaka takes the view that an organisation really needs a business model to underpin long-term access to its digital content, and manage preservation of that content. They undertook 12 case studies examining such business models in various European organisations, and identified numerous key factors for success and sustainability.

The subjects of these case studies were not commercially-oriented businesses as such, but Ithaka takes a no-nonsense view of what "sustainability" means in a digital context: it means that, whatever you do, you need to cover your operating costs. One of the report's chief interests, then, is discovering what your revenue-generating strategy is going to be. They identify metrics for success, but it's clear that what they mean by "success" is the financial success of the resource and revenue model, and that is what is being measured.

The metrics proposed by Ithaka are very practical and tend to deal with tangibles. Broadly I see three themes to the metrics:

1. Quantitative metrics which apply to the content
  • Amount of content made available
  • Usage statistics for the website
2. Quantitative metrics which apply to the revenue model
  • Amount of budget expected to be generated by revenue strategies
  • Numbers of subscriptions raised, against the costs of generating them
  • Numbers of sales made, against the costs of generating them
3. Intangible metrics
  • Proving the value and effectiveness of a project to the host institution
  • Proving the value and effectiveness of a project to stakeholders and beneficiaries

How would these work for our project? My sense is that (1) ought to be easy enough to establish, particularly if we apply our before-and-after method here and compile some benchmark statistics (e.g. figures from the Linnean weblogs) at an early stage, which can be revisited in a few years.
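
A before-and-after comparison of this kind could be as simple as the sketch below, which works out the percentage change for each benchmark statistic. The metric names and figures are invented for the example and are not taken from the Linnean weblogs.

```python
def percent_change(baseline, followup):
    """Percentage change from the baseline figure to the follow-up figure."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (followup - baseline) * 100.0 / baseline


# Invented figures: a benchmark taken early on, and a later revisit
baseline = {"page_views": 12000, "unique_visitors": 3000}
followup = {"page_views": 15000, "unique_visitors": 3300}

report = {
    name: percent_change(baseline[name], followup[name])
    for name in baseline
}
```

The same function would serve for the "amount of content made available" metric, provided the baseline count is recorded before the enhancements go live.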

As to (2), revenue generation is something we have explicitly outlined in our bid. Since the project is predicated on repository enhancements, we intend to develop these enhancements in line with existing revenue models proposed to us by the Linnean staff. Our thinking at this time is that the digitised content can be turned into an income stream by imaginative and innovative strategies for reuse of images and other digital content, which might involve licensing. As yet we haven't discussed plans for a subscription service, or direct sales of content.

(3) is an interesting one. The immediate metric we're thinking of applying here is how the enhanced repository features will improve the user experience. I'm also expecting that when we interview stakeholders in more detail, they can provide more wide-ranging views about "value and effectiveness", connected with their research and scholarship. These intangibles amount to much more than just ease of navigation or speed of download, and they ought to be translatable into something of value which we can measure.

But maybe we can also look again at the host institution, and find examples of organisational goals and policies at Linnean that we could align with the enhancement programme, with a view to indicating how each enhancement can assist with a specific goal of the organisation. As Ithaka found, however, this approach works better with a large memory institution like TNA, which happens to work under a civil service structure with key performance indicators and very strong institutional targets.

In all, the Ithaka model looks like it can work well for this project, provided we can promote the idea of a "business model" to Linnean without sounding like we're planning some form of corporate takeover!

Tuesday 13 March 2012

A Welcome from the Linnean Society of London


The Linnean Society of London is delighted to be part of the JISC-funded project “Enhancing the Linnean Collections Online” in partnership with ULCC. As the Deputy Librarian handling many of the day-to-day enquiries about the Online Collections, I am particularly pleased to be involved together with the Librarian and our IT Consultant.

This steadily growing online resource is of great importance for scientists and researchers world-wide. The Society is especially keen to conduct a formal user and stakeholder needs assessment and feedback exercise. We are interested in how we can improve ease of use and navigation for our regular users, as well as promote and tailor the Linnean Online Collections to new user groups – especially in an educational context.

The Linnaean Collections include the specimens of plants, fish, shells and insects acquired from the widow of Carl Linnaeus in 1784 by Sir James Edward Smith, founder and first President of the Linnean Society. They also include the library of Linnaeus, as well as his letters and manuscripts.

In his publications, Carl Linnaeus (1707-1778) provided a concise, usable classification system of all the world's plants and animals as then known. Some of his works in particular have also been accepted by international agreement as the official starting point for modern nomenclature.

This confers a high scientific importance on the specimens used by Linnaeus for their preparation, many of which are now treasured by the Linnean Society.
These collections are one of the foundation stones of modern Biology.

Apart from their scientific merits, the specimens are also beautiful and fascinating. Marvel at the giant horns of a Hercules Beetle. Admire the deep blue colour of a Delphinium flower collected and pressed over 200 years ago. Discover the eerie Death's-head Hawk Moth (Acherontia atropos L.; image © The Linnean Society of London) made famous through the book and film "The Silence of the Lambs". All available online in zoomable, high-resolution images.



In addition to the collections already online, we look forward to adding other important source material such as Linnaeus’ Annotated Library and James Edward Smith’s Herbarium and correspondence.

“Nomina si nescis, perit et cognitio rerum” - “If you do not know the names of things, the knowledge of them is lost too” (Carl Linnaeus, from Philosophia botanica (1751) p.158 under VII Nomina)
Following in Linnaeus’ footsteps, we are undertaking this project to organise knowledge in the best possible way, so that it can be used and preserved for many centuries to come.

Friday 9 March 2012

Beagrie's Metrics

We're aiming at delivering a set of enhancements to Linnean, but how will we know if they worked? One of the aims of the ELO project is to measure the results of the programme of enhancements in terms of tangible benefits to Linnean and its stakeholders. We're thinking about a framework that will enable us to measure the results of this before-and-after process.

Our thinking at the moment is that we could adapt and make use of the Beagrie metrics published in Benefits from the infrastructure projects in the JISC managing research data programme, which were devised for measuring the value of research data to an HEI.

The Institutions that Beagrie worked with were asked about how their lives would improve if their research data was better managed. Data management planning is a wide-ranging process that includes preservation as one of the outcomes. Those consulted were very strong at coming up with lists of potential benefits. But it was slightly harder for them to come up with reliable means of measuring those benefits.

Even so, the report came up with a very credible list. It was organised under the names of the stakeholders who would benefit the most. A little tinkering with that table allows us to put Linnean at the top of the list as the main beneficiary. We also know Linnean has researchers, and that they are concerned with scholarly access. This suggests a framework like the one below might work for us.

Benefits Metrics for Linnean
  • New research grant income
  • Number of research dataset publications generated
  • Number of research papers
  • Improvements over time in benchmark results
  • Cost savings/efficiencies
  • Re-use of infrastructure in new projects

Benefits Metrics for researchers
  • Increase in grant income/success rates
  • Increased visibility of research through data citation
  • Average time saved
  • Percentage improvement in range/effectiveness of research tool/software

Benefits Metrics for Scholarly Communication and Access
  • Number of citations to datasets in research articles
  • Number of citations to specific methods for research
  • Percentage increase in user communities
  • Number of service level agreements for nationally important datasets

The Institutions in the report go on to give specific instances of how these metrics apply in their case. For instance, for the "Average Time Saved" metric the Sudamih project reported:

"In an attempt to measure benefit 1 (time saved by researchers by locating and retrieving relevant research notes and information more rapidly) Sudamih asked course attendees to estimate how much of their time spent writing up their research outputs is actually spent looking for notes/files/data that they know they already have and wish to refer to. The average was 18%, although in some instances it was substantially more, especially amongst those who had already spent many years engaged in research (and presumably therefore had more material to sift through). This would indicate that there is at least considerable scope to save time (and improve research efficiency) by offering training that over the long term could improve information management practices."

However, the report is also clear that any form of enhancements (technical, administrative, cultural) can take some time to bed down before their benefits are even visible, let alone become measurable. "Measuring benefits therefore might be best undertaken over a longer time-scale", is one possible conclusion. That is a caveat we'll have to bear in mind, but it doesn't preclude us devising our own bespoke set of metrics.

Wednesday 29 February 2012

Hello World!

Welcome to the Project Blog for the JISC-funded Enhancing Linnean Online project.

The project runs from Feb 1st 2012 to November 30th 2012. The kick-off meeting took place on Feb 27th at The Linnean Society of London in Piccadilly - details to follow.

The blog will report on activities and issues identified throughout the project, including the initial stakeholder assessments, enhancements to the Linnean Online systems, and the evaluation of the outcomes. We also hope to include information about the project team, interesting items and information about the collections themselves.