BiblioSight News

Integrating the Web of Science web-services API into the Leeds Met Repository

Posts Tagged ‘1st sprint’

Project meeting number 2: Draft agenda

Posted by Nick on August 27, 2009

Date of meeting:  1st September 2009

1.  Apologies

2.  Progress since last meeting

  • API
  • SWOT analysis
  • Project reporting

3.  Liaison with other projects

4.  Use case development

5.  A.O.B.

6.  Date of next meeting

Posted in Agenda, Bibliosight

SWOT update

Posted by Nick on August 13, 2009

I’ve had a few responses to my request for input into a SWOT analysis which are summarised here along with my own analyses.

**Remember, project team: you can still contribute to the SWOT survey and you don’t have to fill out the whole form – any Strengths, Weaknesses, Opportunities or Threats you can think of are welcome throughout the project lifecycle.  N.B.  The survey has been modified to reflect PESTLE for external factors (I’ve missed out Environmental partly because I can’t think of any but mainly because I only have a basic, free PollDaddy account!)**

Internal factors:

People:

Strengths: Experienced and skilled programmer on team (Mike Taylor) / Experienced project team who have worked together previously / Good working relationship established with commercial partner / Strong buy in from senior staff within University.

Weaknesses:  One member of staff on the team is on a temporary contract; however, this does not expire until after the end of this project / lack of specific skill set required (Java) resulted in early difficulties in basic implementation of API.

Opportunities: Project team has access to necessary skill sets internally.

Threats: Other commitments within project team interfere with short project lifecycle.

Resources:

Strengths: Resources in place – no need to wait for any subscriptions, to go through any kind of purchasing process.

Threats: Actually not yet clear what the extent of the ‘free’ service will be from Thomson Reuters – there may, in fact, be additional subscription costs.

Innovation & Ideas:

Strengths: The ideas behind the project are timely, fit in with University objectives, and will be of value to the wider community / Bibliosight is one of 3 projects working in this area and this is an opportunity to share ideas and maybe even code so that we can get further along than we would on our own.

Weakness: Whilst the original project idea is a strength, other project teams are working in this area, and there is a risk that our work may be superseded either by the commercial developer (Thomson Reuters) or other projects in the sector.

Opportunities: Bibliosight represents a fertile area in current developments in research metrics and innovation & ideas should continue to evolve throughout the project.

Marketing:

Strengths: The project blog is set up, and is linked to by an existing and well visited blog, so the project should attract attention.  Ditto for Twitter. / There is a real advocacy benefit to be reaped if this service fits closely with the users’ workflows.

Weaknesses: Difficult to market a potential service before a working prototype is available.

Opportunities: Possibility of engaging with user community recruited by JournalTocsApi project.

Threats: Lack of local and wider community engagement.

Operations:

Strengths: Project managed / outputs will be implemented by established repository team responsible for overall development of repository infrastructure.

Weaknesses: The 6 month funding structure is a threat in terms of sustainability. What happens if we don’t have a finished service at the end of 6 months, and even if we do, how is ongoing development funded?

Opportunities: As the repository infrastructure itself is still in development there is the opportunity to integrate project outputs more easily into the evolving infrastructure.

Threats:  (As yet unseen) difficulties in appropriate integration with developing repository infrastructure.

External factors:

Political:

Strengths: Potential to fit in with broader political zeitgeist in HE – contribute to developing processes for REF.

Threats: Conflict with commercial interests of Thomson Reuters.

Economic:

Strengths: Low cost project with the potential to deliver a flexible product with wide opportunity for reuse across the sector.

Threats: Not yet clear what the extent of the ‘free’ service will be from Thomson Reuters – there may be additional, unforeseen costs.

Social:

Strengths: Well established (within) #jiscri community utilising Twitter/blogs.

Opportunities: Possibility of engaging with user community recruited by JournalTocsApi project.

Threats: Lack of engagement of wider HE communities.

Technological:

Strengths: Third party API should result in robust application.

Weaknesses: Early question marks around API / documentation – current documentation out of date; revised documentation expected by the end of August 2009.

Threats: Revised documentation won’t be available in line with project lifecycle.

Legal:

Threats: Thomson Reuters commercial model not fully defined – potential implications for reuse


Posted in SWOT

No one said it would be easy

Posted by Nick on August 7, 2009

@laytor has run into early problems trying to implement the API.

He followed the documentation supplied by Thomson Reuters, but the API is throwing an error that we have been unable to interpret due to a lack of specific expertise in Java. It seemed, in fact, that the documentation we have might not be up to date, as the terminology is a little at variance with that used in the recent webinar (Introduction to Thomson Reuters Research Evaluation Tools) that @laytor and I attended.

I’ve been in touch with our contact at Thomson Reuters, who has confirmed that the documentation is indeed a little out of date and that they are on the verge of releasing new versions of many of their WS documents, hopefully by the end of the month. To be fair, I did harangue them somewhat to ensure we had enough info for the #jiscri bid and I’m grateful they sent us what was available so quickly.

We are also fortunate that @laizydaizy agreed to help us out with her Java skills even though she is not on the project team and is very busy with the JISC funded, repository-related PC3 project – we have worked closely together before of course on earlier JISC projects – an obvious benefit of networks that have been forged by these projects (though we’re also institutional colleagues of course!)

We are also having a little difficulty disambiguating the various services offered by TR and how the API fits in. The webinar referred to WoK web services lite and premium as well as Article Match and Retrieve and researcher ID.

Here is a summary of the answers to my naive questions from the nice man at TR:

Article Match Retrieval service has one very specific purpose: it takes in metadata and pitches out URLs that can be used to link to specific data elements in Web of Science and Journal Citation Reports. As far as I know, you haven’t been entitled to this service, but it would be easy enough to arrange. The service is free.

Web Services Lite: This service responds to queries to return a limited range of data elements from the Web of Science. The fields are Author, Source (volume, number, issue, date, page span), Article Title, Keywords, and UT (a unique record identifier). The primary use case for the Lite service is to populate institutional repositories and is scheduled to be made available within the next two-to-three weeks. This service is also free.

Web Services Premium: This service is a much more robust version of WS Lite and is very similar to the API we sent you earlier. The primary differences are that the service needs to be entitled and has much, much better documentation. WS Premium is scheduled to be available within the next month to six weeks. I’m not sure what the price is (if any) for the service, but we hope to have that sorted out in the very near future.
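As a rough illustration of the Lite field list above, a record could be modelled along the following lines. This is only a sketch: the class and attribute names (`LiteRecord`, `ut` and so on) and the sample values are made up for illustration and are not taken from any Thomson Reuters documentation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LiteRecord:
    """Hypothetical container for the fields listed for Web Services Lite."""
    ut: str                      # unique record identifier
    title: str                   # article title
    authors: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    # "Source" details as described: volume, number, issue, date, page span
    source: dict = field(default_factory=dict)

    def citation(self) -> str:
        """Build a simple one-line citation for display in a repository."""
        names = "; ".join(self.authors) if self.authors else "Unknown author"
        date = self.source.get("date", "n.d.")
        return f"{names} ({date}). {self.title}."


# Invented sample values, purely for illustration.
record = LiteRecord(
    ut="EXAMPLE-UT-0001",
    title="An example article",
    authors=["Author, A.", "Writer, B."],
    source={"volume": "12", "issue": "3", "date": "2009", "pages": "1-10"},
)
print(record.citation())
```

Keeping the UT alongside the descriptive fields would give the repository a stable key for checking whether a record has been seen before.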

Though I was present while @laizydaizy and @laytor talked about Java classes and suchlike, I didn’t understand a whole lot (to be honest it was like being at some sort of Kabbalistic cult meeting), but I think the gist was that there may be a fundamental problem with the code generated by TR’s batch file itself – @laizydaizy got all excited as she pored over the esoterica on @laytor’s double screens (I want double screens!) and put it all on a usb drive to take back to her lair. She’ll let us know how she gets on when she emerges.

The other issue she raised was that the API appears to be using an older version of Java and it remains to be seen to what extent the new documentation is updated. This might mean that even if we do get the thing working we’ll end up with what is in effect legacy code – this is something else we need to clarify with TR and feed back to JISC.

Posted in Uncategorized

Project meeting – minutes

Posted by Nick on July 14, 2009

Present:  Charles Duncan, Wendy Luker, Arthur Sargeant, Mike Taylor, Babita Bhogal, Nick Sheppard

1.  Apologies

Phil Jones sent his apologies.

Peter Douglas sent his apologies – Charles Duncan attending from Intrallect in his stead.

2.  Project overview

WL chaired the meeting and began by presenting an overview of the proposed project: to exploit the Web of Science web-services API in order to promote full text deposit of author versions of published peer reviewed research papers in the Leeds Met repository; to develop an alerting service to alert the repository team/URO when a research paper associated with Leeds Met is picked up by WoS; to send an automated communication to a researcher alerting them to the presence of their citation on Web of Science and requesting an author version for the repository; and potentially also to import metadata from WoS to automatically populate the repository.
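The alerting idea in the overview could be sketched roughly as follows. This is an illustration only, not project code: the function names, the dictionary keys (`ut`, `title`, `author`) and the sample data are all assumptions, and nothing here calls the real Web of Science API.

```python
def find_new_records(wos_records, known_uts):
    """Return WoS records whose identifier is not yet held in the repository."""
    return [r for r in wos_records if r["ut"] not in known_uts]


def draft_request(record):
    """Draft a message asking the researcher for an author version."""
    return (
        f"Dear {record['author']},\n"
        f"Your paper \"{record['title']}\" now appears on Web of Science. "
        "Could you send us your author version for the repository?"
    )


# Invented sample data standing in for an API response and repository holdings.
wos_records = [
    {"ut": "UT-1", "title": "Paper one", "author": "A. Researcher"},
    {"ut": "UT-2", "title": "Paper two", "author": "B. Researcher"},
]
known_uts = {"UT-1"}

new_records = find_new_records(wos_records, known_uts)
messages = [draft_request(r) for r in new_records]
for message in messages:
    print(message)
```

In this sketch only the second record is treated as new, so only one request message is drafted; how records would actually be matched to Leeds Met researchers is left open.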

3. Project management and meetings

The project is funded under the JISC Rapid Innovation programme (tag: JISCRI; programme code repository and wiki at http://code.google.com/p/jiscri/) and is due to complete at the end of November 2009.  A rapid development cycle is therefore essential and will be based on the SCRUM methodology recommended by JISC.

  • Team and roles

The team of 6 people comprises:

a) Members responsible for project deliverables

Wendy Luker – Project Manager (or SCRUM master); Arthur Sargeant – Project consultant; Mike Taylor – Web-developer responsible for technical development; Nick Sheppard – Repository Development Officer responsible for project research; Peter Douglas – representative of Intrallect

b) Representative stakeholders who will inform development and potentially benefit from project deliverables.

Babita Bhogal – represents the University Research Office; a potential customer/user of project deliverables; Phil Jones – represents the Carnegie Research Institute; a potential customer/user of project deliverables.

There will be 5 “sprint” cycles; at the end of each cycle there will be a full team meeting to review progress and technical development.  In addition NS/MT will liaise more closely throughout the sprint cycle including face to face on a weekly basis – these meetings may also include WL, AS as necessary.

N.B.  The JISC programme manager has indicated that Bibliosight could benefit from work being done at King’s College with the R4R (Readiness for REF) project and should also liaise with another JISCRI project based at Heriot-Watt University that is building an API for ticTocs.

Action:  NS – investigate / establish contact with these projects and provide a detailed overview before the next meeting.

To reflect the scale of projects under the programme, JISC are advocating a light-weight reporting framework utilising the blog as the primary mechanism.  It is anticipated that all team members will contribute to the blog and that the subject for posts will be specified at each meeting in line with 6 subject areas specified by JISC.  These are:  Project SWOT analyses; User participation; Day to day work; Technical standards; Value add; Small wins and fails; Progress report.

Aggregation tag for the project is #bibliosight (blog posts and Twitter updates).  Other relevant tags are #JISCRI, #SWOT, #rapidInnovation, #progressPosts, #UseCase

Action:  NS/WL -  blog initial SWOT analysis in advance of next meeting.

Action:  NS – ensure all team members have administrative access to the blog.

  • Technical

The first workpackage is a “full technical review of Web of Science Web Services API / technical developments required to appropriately integrate API into repository” with the time scale June-July 2009.

NS/MT recently attended a webinar run by Thomson Reuters (Introduction to Thomson Reuters Research Evaluation Tools) which reviewed the API; MT has also reviewed the API documentation, has gained the appropriate administrative permissions to run a Java programming environment on his local machine and is now in a position to explore the API in more detail. MT may require technical input from Java programmers at Intrallect and CD confirmed that this would be acceptable under the terms of the bid.

A code repository has also been set up in line with JISC guidelines at http://code.google.com/p/bibliosight/.  This is where any code produced by the project will be stored subject to appropriate Open Source licensing (see below) and the location for all documentation and bug tracking.  The version control system implemented is Subversion; as the only developer currently associated with the project, MT is the only user who requires full access.

There was also some preliminary discussion around how the API will most appropriately be integrated into the Leeds Met repository; whether WoS data will be pulled directly into intraLibrary or into an external environment, for example, and what the implications of this might be, e.g. a prototype proof of concept build of intraLibrary.  However, it was decided that initial focus should be on manually mapping the process and on the API itself before these issues can usefully be explored further.

CD raised a technical question regarding the API: whether the interface only supports SOAP or whether it can also support REST, which would potentially provide a lower technical threshold.

Action:  NS – full review of Thomson Reuters services; article match and retrieve; Web-services lite; Researcher ID upload; Researcher ID download; Web-services premium.  Disambiguation of free vs. paid services.  SOAP vs. REST

Action:  MT – explore / implement API and document process.  Establish precisely what information can be extracted from WoS using the API.

Action:  NS/MT/AS – manually review WoS to elucidate desired process i.e. What information do we want and what information can we get a) manually b) programmatically (free vs. paid)

  • User testing and engagement

This will be facilitated through appropriate liaison with BB (URO) and PJ (CRI) and will initially focus on communication – NS is attending the CRI Readers’ and Professors’ meeting on Thursday 16th July – and generating use-cases and scenarios, possibly in collaboration with Intrallect who have experience and expertise in this area.

Action:  NS to attend CRI Readers’ and Professors’ meeting on Thursday 16th July for initial communication and feedback.

Action:  NS/BB/PD to liaise to generate preliminary use-cases/scenarios.

4.  Licensing

Software/code/project deliverables are to be made available under appropriate licence agreements in line with JISC guidelines.  The licence provisionally applied at http://code.google.com/p/bibliosight/ is GNU GENERAL PUBLIC LICENSE Version 3 – http://www.gnu.org/copyleft/gpl.html.  This may or may not be suitable for our software requirements; other project deliverables may require different licensing models; requires further research.

Action:  NS to research; liaise with OSS watch to clarify licensing issues

5.  AOB

Administrative housekeeping (unminuted)

6.  Date(s) of next meeting(s)

Given the short project lifecycle, it was decided that provisional/approximate dates should be outlined for all remaining meetings:

  • w/c 31st August 2009 (bank holiday Monday)
  • Late September
  • Mid-late October
  • Early November
  • Last week in November

Action:  NS to call next meeting w/c 31st August 2009

Posted in Bibliosight, SCRUM minutes

 