BiblioSight News

Integrating the Web of Science web-services API into the Leeds Met Repository

Posts Tagged ‘Scrum’

Project meeting – minutes

Posted by Nick on November 18, 2009

Present: Peter Douglas, Wendy Luker, Arthur Sargeant, Mike Taylor, Babita Bhogal, Nick Sheppard

1. Apologies

Sue Rooke

2. Minutes from last meeting and actions

As emphasised at the last meeting, it has not been possible, within our timescale, to engage a suitable academic replacement after Phil Jones left the institution earlier in the project and it is now anticipated that academic staff / researchers will be involved in evaluating the outcomes of the project beyond the formal end of jiscri. WL/NS do now have a meeting scheduled (30th November 2009) with Professor Richard Light, the recently appointed Chair of the Carnegie Research Institute, to discuss Bibliosight and the wider repository infrastructure.

NS/PD have done some work on clarifying use cases – see item 4.

Transformation of XML from WoS to LOM format for ingest into intraLibrary: see http://bibliosightnews.wordpress.com/2009/11/16/mapping-fields-from-wos-api-lom/ – more work still needs to be done in this area. (Action – NS/MT)

AS has updated the schematic diagram to clarify what will be achieved by the end of November. See – http://bibliosightnews.wordpress.com/2009/11/13/332/

NS to contribute project management post to blog on day to day work – ongoing – NS to action ASAP.

PD has contributed a blog post on technical standards used in Bibliosight – http://bibliosightnews.wordpress.com/2009/11/17/the-role-of-standards-in-bibliosight/

3. Update on development of desk-top application

As emphasised at the last meeting, three discrete functional requirements of the desktop application (from now on referred to as Bib App) have been clearly identified:

• Retrieve records from WoS as XML
• Perform an appropriate XSLT transformation to LOM format suitable for ingest to intraLibrary
• Deposit LOM records into intraLibrary using SWORD

MT has been working primarily on stages 1 and 2 and has adopted a pragmatic approach, treating them as two discrete tasks before attempting to integrate the functionality in a single user interface. He has a desktop client that will take XML and perform an XSLT transformation so, once we have clarified the LOM format we require (see http://bibliosightnews.wordpress.com/2009/11/16/mapping-fields-from-wos-api-lom/), it should be relatively straightforward to plug into the WoS API to retrieve XML from the Web of Science, which can then be transformed into appropriate LOM.
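As a rough illustration of the transform step, here is a minimal sketch in Python. It is not the Bib App code: the Python standard library has no XSLT processor, so xml.etree.ElementTree stands in for the stylesheet, and the WoS element names (REC, item_title, author) and the sample record are invented for illustration. Only the LOM namespace and the general/title/string nesting are taken from the IEEE LOM XML binding.

```python
import xml.etree.ElementTree as ET

LOM_NS = "http://ltsc.ieee.org/xsd/LOM"  # IEEE LOM XML binding namespace

# Hypothetical WoS-style record; the real element names may well differ.
SAMPLE_WOS = """<records>
  <REC>
    <item_title>Repositories and research metrics</item_title>
    <author>Sheppard, N</author>
    <author>Taylor, M</author>
  </REC>
</records>"""


def lom_tag(name):
    """Return a namespace-qualified LOM tag name for ElementTree."""
    return f"{{{LOM_NS}}}{name}"


def wos_record_to_lom(rec):
    """Map one (assumed) WoS record element to a minimal LOM element."""
    lom = ET.Element(lom_tag("lom"))
    general = ET.SubElement(lom, lom_tag("general"))
    title = ET.SubElement(general, lom_tag("title"))
    string = ET.SubElement(title, lom_tag("string"), {"language": "en"})
    string.text = (rec.findtext("item_title") or "").strip()
    # Each author becomes a lifeCycle/contribute entry with role "author".
    # Simplification: LOM normally stores the entity as a vCard, not a bare name.
    life_cycle = ET.SubElement(lom, lom_tag("lifeCycle"))
    for author in rec.findall("author"):
        contribute = ET.SubElement(life_cycle, lom_tag("contribute"))
        role = ET.SubElement(contribute, lom_tag("role"))
        ET.SubElement(role, lom_tag("value")).text = "author"
        entity = ET.SubElement(contribute, lom_tag("entity"))
        entity.text = (author.text or "").strip()
    return lom


if __name__ == "__main__":
    ET.register_namespace("", LOM_NS)  # serialise LOM as the default namespace
    for rec in ET.fromstring(SAMPLE_WOS).findall("REC"):
        print(ET.tostring(wos_record_to_lom(rec), encoding="unicode"))
```

An XSLT stylesheet would express the same mapping declaratively; the point here is only that each LOM field needs an agreed source field in the WoS response before the transform can be written.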

Deposit of the LOM into intraLibrary via SWORD should also be fairly straightforward – see – http://bibliosightnews.wordpress.com/2009/11/17/the-role-of-standards-in-bibliosight/ – however, in order to generate clean, consistent LOM, there are still a number of issues to be resolved.
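For anyone unfamiliar with SWORD, a deposit is essentially an HTTP POST of the package to a collection URI. The sketch below only builds such a request and does not send it. The collection URI and packaging identifier are placeholders, the header names follow the SWORD 1.x profile as I understand it, and whether intraLibrary expects exactly these headers is an assumption that would need checking against its documentation.

```python
import base64
import urllib.request

# Placeholder values: the real collection URI and packaging identifier
# would come from the repository's SWORD service document.
COLLECTION_URI = "https://repository.example.org/sword/deposit/research"
PACKAGING = "http://example.org/packaging/lom"


def build_deposit_headers(filename, username, password, dry_run=True):
    """Build headers for a SWORD-style deposit using HTTP Basic auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {
        "Content-Type": "application/xml",
        "Content-Disposition": f"filename={filename}",
        "X-Packaging": PACKAGING,
        # X-No-Op asks the server to validate without storing (a dry run).
        "X-No-Op": "true" if dry_run else "false",
        "Authorization": f"Basic {token}",
    }


def build_deposit_request(lom_xml, filename, username, password, dry_run=True):
    """Wrap a LOM document in a POST request; nothing is sent here."""
    headers = build_deposit_headers(filename, username, password, dry_run)
    return urllib.request.Request(
        COLLECTION_URI,
        data=lom_xml.encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the request to urllib.request.urlopen would actually send it; keeping the dry-run flag on by default seems a sensible precaution while the LOM format is still being settled.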

From a technical perspective, Mike is not a Java programmer* and is working very hard to master the language in order to implement an integrated UI that can unify these three discrete functional areas – the precise functionality of the Bib App will also be informed by developing use cases – see item 4 below.

*The WoS API is Java based which perhaps makes it less accessible than it could be – it may be that JISC wish to make recommendations to Thomson Reuters and others regarding the development of open web services APIs. See – http://blogs.ukoln.ac.uk/good-apis-jisc/

Action: NS/MT to continue to investigate issues around three functional areas

Action: MT to continue developing Bib App – development will necessarily take us beyond the formal end of jiscri projects at the end of November

4. Update on use cases

PD/NS have summarised our three use cases in some detail which need writing up in full ASAP (Nick to action).

Particular issues that were identified include:

• In light of progress through the project, UC narratives need to be updated from the now outdated drafts proposed in the original bid
• UCs need to be fully itemised with an ‘actor’ clearly identified for each success scenario
• More thought needs to be given to extensions to each UC

There was particular discussion around UC_2 which centres on targeted communications to researchers to encourage deposit of an appropriate author-produced version of a recently published/cited article. It is clear that such a use case will need to identify individual publishers’ copyright policies around deposit in an IR; if they do permit deposit, what restrictions / conditions do they impose? For example, a very common restriction is in the form of a 12/18 month embargo that would need to be incorporated into the workflow.
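To make the embargo point concrete, here is a minimal sketch of how a workflow might work out the earliest date a full text could be made available. The function names are hypothetical and the rule (publication date plus a whole number of months) is an assumption; real publisher policies may count from a different date.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` after `start`, clamped to the month's last day."""
    index = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def embargo_lifted(published, embargo_months, today):
    """True once `today` has reached the end of the embargo period."""
    return today >= add_months(published, embargo_months)
```

For an article published on 18th November 2009, add_months(date(2009, 11, 18), 12) gives 18th November 2010 and an 18 month embargo gives 18th May 2011, so a targeted email could quote the relevant date to the researcher.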

Action: NS to explore use cases in more detail and write up in full.

5. JournalTOCsAPI workshop – 20th November 2009 – Nick attending

NS is attending a workshop being run by the JournalTOCsAPI project on Friday 20th November and has been invited to give a 15 minute presentation on Bibliosight.

The workshop has two main objectives:

1. To learn the techniques/methodologies that professionals managing repositories use to identify new content for their repositories and the potential benefits as well as the shortcomings that they have identified in the JournalTOCsAPI

2. To give an opportunity to repository managers and API developers to learn the thoughts of experts in institutional repositories for efficiently integrating and reusing up-to-date journal TOC RSS feeds within repository systems and forward looking research information systems.

Action: NS to attend and participate as required

6. Project management tasks – project evaluation

The project management task to be addressed on the blog will be project evaluation.

Action: NS/WL to liaise and post on project evaluation

7. Formal end of project

The formal end of the project in line with the jiscri programme is the end of November 2009 by which time we are confident we will have a detailed proof of concept for Bibliosight that is well documented on the blog. However, there is still a considerable amount to be done to implement a fully functional Bib App which is a valuable outcome for the institution and the sector; work will therefore be ongoing beyond the end of the jiscri project, internal resources allowing.

8. A.O.B.

None

Posted in Bibliosight | 1 Comment »

Project meeting number 5: Draft agenda

Posted by Nick on November 16, 2009

Date of meeting:  Tuesday 17th November 2009

1. Apologies

2. Minutes from last meeting and actions

3. Update on development of desk-top application

4. Update on use cases

  • Identify new research in WoS on a regular basis (daily/weekly/monthly); retrieve available metadata associated with records – add to intraLibrary
  • Identify new research in WoS on a regular basis (daily/weekly/monthly); check copyright/SHERPA-RoMEO; generate targeted email

5. JournalTOCsAPI workshop – 20th November 2009 – Nick attending

6. Project management tasks – project evaluation

7. Formal end of project

8. A.O.B.

Posted in Agenda | 1 Comment »

Project meeting – minutes

Posted by Nick on October 1, 2009

(Date of meeting 29th September 2009)

Present:  Peter Douglas, Wendy Luker, Arthur Sargeant, Mike Taylor, Babita Bhogal, Sue Rooke, Nick Sheppard

1.  Apologies

No apologies

2.  Team membership

Thank you to Sue Rooke who has agreed to join the Bibliosight project team; Sue is a research administrator in the Faculty of Health and has already been involved in repository development, contributing to developing workflows and providing feedback on the Open Search interface.  We hope that Sue will contribute, in particular, to use case development.

The team is still lacking a representative from the academic community and we are currently waiting for a reply to recent correspondence. WL is attending the research sub-committee on Monday 5th October and may raise the issue there if necessary.

Action:  WL/NS to pursue academic contact(s) for a representative to sit on the project team

3.  Progress since last meeting

• API

We have now received the updated documentation from Thomson Reuters and Mike has submitted a query to the API and received an appropriate response in XML. Thomson Reuters’ FAQ gives a full summary of the data fields that can be queried by the service and the data elements that can be returned, and these appear to be in line with this XML response.

We are therefore able to formally reduce the associated risk back to low:

Risk: API unsuitable for project deliverables
Probability: Low (elevated to Medium, 1st September 2009; reduced back to Low, 29th September 2009)
Impact: High
Action to Prevent/Manage Risk: Feedback from Thomson Reuters indicates proposal technically feasible. Problems with API/documentation have been mitigated by release of new documentation from Thomson Reuters (29th September 2009).

N.B.  The wording of the documentation appears to suggest that it is only possible to return 100 records with a single query using the API – NS to clarify with Thomson Reuters.  If this is the case, the practical implications  are limited in the case of Leeds Metropolitan University which publishes a relatively small amount of research but would be considerable for an institution with a greater research output.
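If the limit does turn out to apply per query, one way round it would be to page through the result set in batches. This is only a sketch: it assumes the service lets a query specify a starting record and a count, which has not yet been confirmed with Thomson Reuters, and the function name is hypothetical.

```python
def page_windows(total, page_size=100):
    """Yield (first_record, count) pairs, 1-based, covering `total` records."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    first = 1
    while first <= total:
        # The final window may be shorter than page_size.
        count = min(page_size, total - first + 1)
        yield first, count
        first += count
```

A result set of 1503 records would need 16 requests under this scheme, the last one fetching only records 1501 to 1503, which is manageable for us but would add up quickly for a larger institution.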

Action:  NS to clarify 100 record limit with TR

Action:  MT to continue appropriate* implementation of API

* Hopefully what is “appropriate” will evolve over the coming weeks!

• Use cases

Technical difficulties have contributed to a lack of conceptual clarity amongst the project team and there was considerable discussion around precisely what data Bibliosight will now seek to retrieve from WoS using the API and what we will aim to achieve with that data.

The bid outlined several use case narratives, which focussed on an alert service for researchers and/or repository administrators to encourage the deposit of an appropriate full text in the repository, and perhaps neglected the obvious administrative use case whereby metadata from WoS is pulled directly into intraLibrary.

N.B.  An important use case was also the extraction of citation metrics that would potentially inform the REF – we are not yet clear how this would be achieved but we understand it will rely on the Article Match Retrieval service.

Of course we also want to produce outputs that are of use to the wider community rather than just to users of our specific repository software and this reflects the considerations of the Readiness for REF project which also hopes to enable UK repositories to make effective and efficient use of the WoS API (as part of a much broader project) and is focussing on EPrints, DSpace and Fedora as the most well established OA research repository platforms.  R4R raises several pertinent questions, many of which also arose independently and in a similar form during our own discussion:

  • What are the different workflows relevant to (i) backfilling a repository with a one-off download and (ii) ongoing use of WoSAPI to populate a repository?
  • What uses might records downloaded from WoSAPI be put to?
  • How might the workflows be designed to enable other datastreams also to help populate the repository (eg from UK PubMedCentral, arXiv, or sources that better serve the arts, humanities and social sciences)?
  • What workflows might be able to handle facts such as that the WoS record will become available some time after the paper is published, whereas deposit into the repository may happen earlier than that?
  • What methods might be helpful in addressing the inevitable questions of duplicate records, or ambiguous relations with existing records?
  • Are there implications for a repository’s mission and reputation if the balance of content it holds is rapidly changed by a large number of WoS-derived records?
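On the question of duplicate records, one simple starting point might be to compare incoming records against existing ones using a normalised title plus year. This is a sketch of that idea only, not something the project has decided on; the record structure (dictionaries with title and year keys) and the function names are assumptions, and a real workflow would probably prefer a stable identifier such as a DOI where one exists.

```python
import re
import unicodedata


def match_key(title, year):
    """Reduce a title to lowercase ASCII words so trivial variants compare equal."""
    folded = unicodedata.normalize("NFKD", title)
    ascii_only = folded.encode("ascii", "ignore").decode("ascii")
    words = re.findall(r"[a-z0-9]+", ascii_only.lower())
    return " ".join(words), int(year)


def split_incoming(existing, incoming):
    """Separate incoming records into new ones and possible duplicates."""
    known = {match_key(r["title"], r["year"]) for r in existing}
    new, possible_duplicates = [], []
    for record in incoming:
        key = match_key(record["title"], record["year"])
        (possible_duplicates if key in known else new).append(record)
    return new, possible_duplicates
```

Anything flagged as a possible duplicate would still need a human check, since two different papers can share a title and year.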

Use cases may also be informed by the JournalTOCsAPI project (see item 5 below) who also explored similar issues in a recent post.

One practical consideration from a technical perspective, which will have a bearing on developing use cases, is the best method of extracting comprehensive records from institution “X”. The most appropriate field to query seems to be the address field, but it is not clear how consistent the institutional address in this field will be. For example, early experimentation has found that “leeds metropolitan university” only returns 201 records; using a wildcard in the form “leeds met*”, however, returns 1503 records (test conducted 29th September 2009).  This was an issue flagged to follow up with Thomson Reuters reps on Wednesday 30th September (see item 4; post to follow).
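The difference between the two queries is easier to see with a toy example. The addresses below are invented, and Python's fnmatch wildcards are only a stand-in for whatever matching rules WoS actually applies to the address field, so treat this as an illustration of the problem rather than a model of the service.

```python
from fnmatch import fnmatchcase

# Invented address strings showing how one institution might be abbreviated.
ADDRESSES = [
    "Leeds Metropolitan University, Sch Hlth, Leeds LS1 3HE, England",
    "Leeds Metropolitan Univ, Carnegie Fac, Leeds, England",
    "Leeds Met Univ, Fac Hlth, Leeds, England",
    "Univ Leeds, Sch Med, Leeds, England",
]


def matching(pattern, addresses=ADDRESSES):
    """Return addresses containing `pattern`, case-insensitively; * is a wildcard."""
    wrapped = f"*{pattern.lower()}*"
    return [a for a in addresses if fnmatchcase(a.lower(), wrapped)]
```

Here matching("leeds metropolitan university") finds only the first address, while matching("leeds met*") finds the first three. A broader pattern catches more abbreviations, but it could equally pull in addresses that merely start the same way, so the results would need checking.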

In terms of the practicalities of actually getting records from WoS into intraLibrary once they have been harvested, Peter did indicate that it should be possible to upload suitable XML records into intraLibrary though this will need to be in LOM format, meaning that we may need to perform an XSLT transformation to convert data retrieved from WoS into a suitable format.  Also, Peter is uncertain whether XML that can be imported in this way will also include the LOM extensions we are using to accommodate bibliographic information and will need to speak to his technical colleagues at Intrallect to clarify.

Note:  There was also discussion around appropriate integration with SFX, our OpenURL resolver, as a possible means of identifying a published URL for WoS records – this is an area that has scope implications both for Bibliosight and the remit of the Leeds Metropolitan University repository itself; beyond an Open Access repository of research (i.e. to also comprise citation only records).  This is an area that may need to be explored in more detail later in the project.

Action:  PD to clarify re upload of XML to intraLibrary including LOM extensions

Action:  NS/BB/SR to meet with another member of the URO to clarify potential use cases (meeting on Thursday 1st October)

Action:  All team members to contribute to ongoing discussion on the blog.

• Project reporting – blog; tags specified by JISC

It was agreed that the specific subject for blog posts this month will be ‘Technical standards’ – Peter agreed to contribute a post before the next meeting.

Action: PD to contribute a blog post on ‘technical standards’.

Action: All team members to contribute to ongoing discussion on the blog.

4.  Visit by Thomson Reuters reps on Wednesday 30th September

Mike and I met with Jon and Gareth from TR on Wednesday 30th (yesterday); they were able to clarify several issues for us – separate post to follow.

5. Review of JournalTOCsAPI – http://www.journaltocs.hw.ac.uk/index.php?action=api

During the meeting, I gave a quick overview of the recently released JournalTOCsAPI at http://www.journaltocs.hw.ac.uk/index.php?action=api with a view to demystifying the concept of an API for the less technical amongst us and also potentially giving the more technical a developmental steer.  Currently, queries need to be submitted to the API by URL and are returned as an RSS feed which includes as much metadata as in the original TOC feed, depending on the quality of the original record. Comparable to Bibliosight in many respects, this project perhaps has greater flexibility regarding the metadata it is able to query and return; it is, after all, building an API from the ground up that will query an openly accessible data source. However, it is likely that the quality of the data may not be as consistent as WoS; there may be fields missing, for example.
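For the less technical, this is roughly what consuming such a feed looks like. The sketch parses a made-up RSS 2.0 snippet rather than calling the JournalTOCsAPI itself, so the feed content, the list of fields and the function names are all assumptions; the real feed may use a different RSS flavour and richer metadata.

```python
import xml.etree.ElementTree as ET

# Made-up feed for illustration; the second item has no description.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Journal</title>
    <item>
      <title>Article one</title>
      <link>http://example.org/1</link>
      <description>Abstract text</description>
    </item>
    <item>
      <title>Article two</title>
      <link>http://example.org/2</link>
    </item>
  </channel>
</rss>"""

FIELDS = ("title", "link", "description")


def parse_items(feed_xml):
    """Return one dict per feed item; absent fields come back as None."""
    root = ET.fromstring(feed_xml)
    return [{f: item.findtext(f) for f in FIELDS} for item in root.iter("item")]


def missing_fields(record):
    """List the expected fields that are absent or empty in a record."""
    return [f for f in FIELDS if not record.get(f)]
```

Running missing_fields over each parsed item is a cheap way of measuring how consistent a feed is before relying on it to populate a repository.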

It has also been informative to engage with another, similar project as a ‘user’ and we discussed how Bibliosight might also engage with JournalTOCsAPI community of users and agreed that it is a valuable opportunity to solicit the opinion of repository managers from other institutions using different software platforms.

Action:  NS to continue engaging with JournalTOCsAPI as a ‘user’

Action:  NS to send an email that can be forwarded to JournalTOCsAPI community of users as suggested in recent correspondence from Lisa Rogers

6.  Article Match Retrieval & Researcher ID

These were only touched upon briefly in the meeting and flagged to follow up with Thomson Reuters reps on Wednesday 30th September (see item 4; post to follow).

7.  A.O.B.

None

8.  Date of next meeting

20th October 2009 – 11:30 am

Posted in Bibliosight, Progress post, SCRUM minutes | 2 Comments »

Project meeting number 1: Draft agenda

Posted by Nick on July 9, 2009

Date of meeting:  Monday 13th July 2009

1.  Apologies

2.  Project overview

3.  Project management and meetings

  • Team and roles
  • Project reporting; blog
  • Technical

4.  Licensing

5.  A.O.B.

6.  Date of next meeting

Posted in Agenda | 1 Comment »

Quickstep into rapid innovation project management

Posted by Nick on June 17, 2009

As I’m on annual leave for 10 days from tomorrow, I’ve been trying to set up the first couple of our monthly project meetings before I go – past experience has taught me that getting your project team all together in one room is easier said than done; the whole point of #jiscri, of course, is that the approach is agile and light and I’ll definitely be making as much use of Web 2.0 as possible to communicate with the project team but there really is no substitute for a good old fashioned face to face meeting with luke-warm tea and biscuits.

I haven’t yet seen the project documentation but Wendy has had a preliminary discussion with the programme manager, Andy McGregor, who has indicated that the blog should be the primary mechanism for reporting on our project – in lieu of a formal final project report; there will be specific areas we need to address in our blog posts and I’m looking forward to learning more about this aspect of the programme.  Andy also referred to a couple of other #jiscri projects that it would be useful for us to liaise with; R4R (Readiness for REF) at King’s College – http://www.kcl.ac.uk/iss/cerch/projects/portfolio/r4r.html – and one building an API for TicTocs (don’t quote me on that – need to learn more).

So.  Our first “scrum” is scheduled for Monday 13th July; in the first instance, we should all aim to gather information independently ahead of the scrum which will necessarily be focussed on planning to scope the likely project trajectory – we can think about technical developments in more detail at that meeting.

For the record and by initial only (until they introduce themselves here I hope!) our scrum is:

NS, WL, AS, MT, BB, KB, PD, PJ

Posted in Bibliosight | 1 Comment »

First post

Posted by Nick on June 12, 2009

I had a preliminary meeting with Wendy and Arthur this morning about getting Bibliosight underway.  Our intention is to follow the Scrum methodology – http://en.wikipedia.org/wiki/Scrum_(development) – recommended by JISC for rapid innovation projects and conduct 5 development cycles over the 6 month period of the project.

Arthur and I are speaking with web-developer-Mike on Monday to get his preliminary perspectives on the technical implementation of the Web of Science API and it’s probably also necessary to have a technical discussion with Intrallect sooner rather than later – especially in view of recent discussions around potential research specific developments to intraLibrary – there are almost certainly synergies with some of the developments mooted around Symplectic:

  • the ability for metadata ingested from external systems (e.g. Symplectic, or Web of Science) to be arranged in a way consistent with other records in intraLibrary
  • additional metadata fields to include number of citations – continuously updated (done in Symplectic)
  • Integrate with Symplectic Publications database to support bibliographic metadata transfer from Symplectic to intraLibrary / deposit of digital copy linked to metadata transfer

Such developments to intraLibrary are unlikely to be implemented during the timescale of the project, however, so we’ll need to consider how we can implement the API and derive useful functionality – with an eye to appropriate integration with intraLibrary in the future, perhaps.

JISC have also indicated that there are other projects doing similar things and would like us to work closely with them – Wendy is speaking with our programme manager in more detail next week.

In the spirit of openness, here is the successful Bibliosight bid in all its glory…

Posted in Bibliosight | 1 Comment »

 