BiblioSight News

Integrating the Web of Science web-services API into the Leeds Met Repository

Posts Tagged ‘#UseCase’

Quick sketch #2

Posted by Nick on November 13, 2009

The diagram below is Arthur’s update of my earlier quick sketch to illustrate what Bibliosight will aim to achieve by the formal #jiscri deadline.

It is numbered and colour coded: stages 1 to 3 (shades of blue) are within the #jiscri timeframe; stages 4 (green) and 5 (buff) will require ongoing work beyond the deadline.


[Diagram: Bibliosight]

Posted in Bibliosight | 2 Comments

Thinking out loud…

Posted by Nick on November 11, 2009

As the deadline for #jiscri draws close, I have just returned to work after a month away from Bibliosight and am now desperately trying to catch up with the project and determine exactly what we can aim to achieve by the end of November. The candid truth is that we have only very recently got to the point where Mike can actually do some coding and begin to put together a prototype that fulfils the requirements of our (still formative) use-case[s].

Yesterday morning I had a stab at completing a more detailed template for a primary use-case (this comprises a narrative and the use case itself). In the afternoon I sat down with Mike to catch up with his progress from a technical perspective and to brainstorm precisely what functions we require from our prototype and how these may be achieved. There are also some outstanding issues of clarity in Thomson Reuters’ API documentation, specifically “WoS Search Retrieve Codes and Descriptions”: we currently have unrestricted access to the API, but it is my understanding that the free* service will actually be restricted. We are not certain:

a)  Precisely which of the fields are associated with the restricted subset that we will be able to query and/or return under the current terms of our WoS subscription*

b)  What some of the fields actually are as they lack a description in the documentation

*Free to us under existing subscription

Disclaimer:  I’m very much thinking out loud here and attempting to translate what I understand are ongoing conceptual issues for Mike as he works through the documentation.

Note:  I’ve continued to refer to ResearcherID (see http://bibliosightnews.wordpress.com/2009/10/02/visit-from-thomson-reuters/). Though it is not a service we plan on implementing as part of Bibliosight, and not necessarily even in the longer term, I’m pretty sure we are likely to require some sort of unique identifier for authors, a subject that is currently receiving a lot of attention from the repository community.

Anyway…looking back over the blog it seems that:

The requesting system can query the Web of Science using the following fields:

  • Address (including Street, City, Province, Zip Code, or Country)
  • Author
  • Conference (including title, location, date, and sponsor)
  • Group Author
  • Organization or Sub-organization
  • Source Publication (journal, book or conference)
  • Title
  • Topic
  • Year Published

The service will support the AND, OR, NOT, and SAME Boolean operators.
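To make the query side a little more concrete, here is a minimal Python sketch of how the searchable fields and Boolean operators above might be combined into a single query string. The two-letter field tags (AU, OG, PY and so on) are assumed from Web of Science advanced search conventions, and `build_query` is a hypothetical helper of my own; neither is confirmed by the API documentation we have so far.

```python
# Assumed mapping from the searchable fields listed above to WoS-style field tags.
FIELD_TAGS = {
    "address": "AD",
    "author": "AU",
    "conference": "CF",
    "group_author": "GP",
    "organization": "OG",
    "source": "SO",
    "title": "TI",
    "topic": "TS",
    "year": "PY",
}

# The four Boolean operators the service is said to support.
OPERATORS = {"AND", "OR", "NOT", "SAME"}


def build_query(terms, operator="AND"):
    """Join (field, value) pairs into one query string using a Boolean operator."""
    if operator not in OPERATORS:
        raise ValueError(f"Unsupported operator: {operator}")
    if not terms:
        raise ValueError("At least one search term is required")
    clauses = []
    for field, value in terms:
        tag = FIELD_TAGS[field]  # raises KeyError for a field that is not searchable
        clauses.append(f"{tag}=({value})")
    return f" {operator} ".join(clauses)


query = build_query([("organization", "Leeds Metropolitan University"), ("year", "2009")])
print(query)
```

A single operator for the whole query keeps the sketch short; mixing operators would need some form of nesting, which I have left out.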

The Web of Science Web Service returns five fields to the requesting system:

  • Article Title
  • Authors: all authors, book authors, and corporate authors
  • Source: includes the source title, subtitle, book series and subtitle, volume, issue, special issue, pages, article number, supplement number, and publication date
  • Keywords: all author supplied keywords
  • UT: a unique article identifier provided by Thomson Reuters

The test queries that Mike has submitted to the API have returned XML that appears to be more granular than indicated and that includes fields beyond these five (e.g. abstract). The first thing to do, perhaps, is to contact Thomson Reuters and see if they can apply the restrictions that we will ultimately need to work with, if only to remove some of the noise and make it easier to see the wood for the trees.
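Until the restrictions are applied at source, one way to cut the noise on our side would be to discard anything outside the five documented fields. The sketch below is only illustrative: the element names (`REC`, `item_title`, `ut` and so on) and the sample record are invented stand-ins, not the real WoS response format.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names standing in for the five documented fields.
ALLOWED = {"item_title", "authors", "source", "keywords", "ut"}

# Invented sample response; the real XML is more granular than this.
SAMPLE = """<records>
  <REC>
    <ut>EXAMPLE-UT-0001</ut>
    <item_title>An example article</item_title>
    <authors>Smith, J; Jones, A</authors>
    <source>Example Journal 12(3) 2009</source>
    <keywords>repositories; bibliometrics</keywords>
    <abstract>Not one of the five documented fields.</abstract>
  </REC>
</records>"""


def restrict_record(record):
    """Return a new record element holding only the allowed child elements."""
    slim = ET.Element(record.tag)
    for child in record:
        if child.tag in ALLOWED:
            slim.append(child)
    return slim


root = ET.fromstring(SAMPLE)
slim_records = [restrict_record(rec) for rec in root.findall("REC")]
kept = [child.tag for child in slim_records[0]]
print(kept)
```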

The API documentation actually lists over 100 “fields”. Only a handful of these are described, however, and while many are reasonably transparent, others are less so, and some look like they may duplicate information, or are they perhaps used as alternatives? (e.g. bib_id = Volume, issue, special, pages and year data / bib_issue = Volume and year data).  There is also some lack of consistency in this bibliographic info on a record by record basis; we need to ensure that consistent XML is returned for all records. Hopefully we can then develop a template in intraLibrary itself that reflects that consistent XML as closely as possible, so that we can devise an XSLT style-sheet to perform the appropriate transformation.
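A quick way to see how inconsistent the records are would be to compare the fields present in each one against the full set seen across the batch. This is a rough sketch: `bib_id` and `bib_issue` are the field names from the documentation, but the `REC` and `ut` element names, the flat record layout and the sample values are my assumptions.

```python
import xml.etree.ElementTree as ET

# Invented sample: the second record lacks bib_issue.
SAMPLE = """<records>
  <REC><ut>EXAMPLE-1</ut><bib_id>12 (3): 45-67 2009</bib_id><bib_issue>12 2009</bib_issue></REC>
  <REC><ut>EXAMPLE-2</ut><bib_id>7: 1-9 2008</bib_id></REC>
</records>"""


def field_report(xml_text):
    """Map each record's UT to the fields it lacks relative to all fields seen."""
    records = ET.fromstring(xml_text).findall("REC")
    seen = [{child.tag for child in rec} for rec in records]
    all_fields = set().union(*seen)
    return {
        rec.findtext("ut"): sorted(all_fields - fields)
        for rec, fields in zip(records, seen)
    }


report = field_report(SAMPLE)
print(report)
```

Any record with a non-empty list is one the eventual style-sheet would need to handle specially, or one we would need to ask Thomson Reuters about.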

Mike already has a desktop client that will take XML and perform an XSLT transformation so, once we have clarified the LOM format we require (an action for me from the last meeting), it *should* be relatively straightforward to plug into the WoS API to retrieve XML from the Web of Science which can then be transformed into appropriate LOM.
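As a rough illustration of the mapping involved, the sketch below turns one record into a very small LOM fragment. Two caveats: it is a plain Python stand-in for the XSLT step, not the style-sheet or Mike's client, and the source element names, the "WoS-UT" catalog label and the choice of LOM elements are my assumptions, since the LOM format we need is still to be clarified. The namespace is the one from the IEEE LOM XML binding.

```python
import xml.etree.ElementTree as ET

LOM_NS = "http://ltsc.ieee.org/xsd/LOM"  # IEEE LOM XML binding namespace

# Invented sample record with hypothetical element names.
RECORD = """<REC>
  <ut>EXAMPLE-UT-0001</ut>
  <item_title>An example article</item_title>
  <keywords>repositories; bibliometrics</keywords>
</REC>"""


def lom_tag(name):
    """Qualify a LOM element name with the LOM namespace."""
    return f"{{{LOM_NS}}}{name}"


def record_to_lom(record):
    """Build a minimal LOM element (identifier, title, keywords) from one record."""
    lom = ET.Element(lom_tag("lom"))
    general = ET.SubElement(lom, lom_tag("general"))
    identifier = ET.SubElement(general, lom_tag("identifier"))
    ET.SubElement(identifier, lom_tag("catalog")).text = "WoS-UT"  # hypothetical label
    ET.SubElement(identifier, lom_tag("entry")).text = record.findtext("ut")
    title = ET.SubElement(general, lom_tag("title"))
    ET.SubElement(title, lom_tag("string")).text = record.findtext("item_title")
    # One LOM keyword element per semicolon-separated keyword.
    for raw in (record.findtext("keywords") or "").split(";"):
        word = raw.strip()
        if word:
            keyword = ET.SubElement(general, lom_tag("keyword"))
            ET.SubElement(keyword, lom_tag("string")).text = word
    return lom


lom = record_to_lom(ET.fromstring(RECORD))
print(ET.tostring(lom, encoding="unicode"))
```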

Then we need to ingest that LOM into intraLibrary, preferably using SWORD…which I shall think about another time!

Posted in Progress post | 1 Comment

Generating use cases

Posted by Nick on July 22, 2009

According to the web-authority that is Wikipedia “a use case describes ‘who’ can do ‘what’ with the system in question. The use case technique is used to capture a system’s behavioral requirements by detailing scenario-driven threads through the functional requirements.” http://en.wikipedia.org/wiki/Use_case

A template for developing a use case is outlined as follows:

  • Use case name
  • Version
  • Goal
  • Summary
  • Actors
  • Preconditions
  • Triggers
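If it helps to see the template as a structure, it could be sketched as a small Python record with a check for the sections still to be filled in. The `UseCase` class and `missing_sections` method are hypothetical names of my own, and the example name is illustrative; the summary is taken from the notification sketch in the original bid.

```python
from dataclasses import dataclass, field


@dataclass
class UseCase:
    """One use case, with one attribute per section of the template."""
    name: str
    version: str = ""
    goal: str = ""
    summary: str = ""
    actors: list = field(default_factory=list)
    preconditions: list = field(default_factory=list)
    triggers: list = field(default_factory=list)

    def missing_sections(self):
        """List the template sections that are still empty."""
        return [section for section, value in vars(self).items() if not value]


notify = UseCase(
    name="Notify repository team of new Web of Science records",  # illustrative name
    summary=(
        "The repository team/URO are automatically notified when bibliographic "
        "information about an article associated with Leeds Met is available "
        "in Web of Science."
    ),
    actors=["Repository team", "URO"],
)
print(notify.missing_sections())
```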

The “use cases” in the original bid really just comprise the summary part of this template and I would now like to work with the University Research Office (and other potential “actors”) to flesh out these sketches and develop new ones:

  • The repository team/URO are automatically notified when bibliographic information about an article associated with Leeds Met is available in Web of Science. Such a facility can be incorporated into the workflow to ensure citation data is up to date in the repository.
  • Researchers have expressed the wish for targeted communications regarding their outputs which would encourage them to deposit an appropriate author produced version of a recently published / cited article. A link to Web of Science could therefore produce an automated communication which would alert them to the presence of their citation on Web of Science, and request an author version for the repository. This would be much more useful to them than a regular, generic reminder to deposit their publications, and the timeliness of it would make deposit a more likely outcome. It would have the potential to contribute to advocacy of the repository service by providing evidence of the putative link between Open Access and increased citation rate.
  • The Research Excellence Framework (REF) that will replace the Research Assessment Exercise (RAE) in 2010 is yet to be finalised; it is likely to make greater use of quantitative measures of assessment, such as bibliometrics. The need exists, therefore, to implement technologies that facilitate the extraction and collation of relevant data for use by institutions, individual academics and HEFCE. It is also important to develop use-cases that inform the evolving process of the REF.
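The first two sketches both hinge on spotting records that are in Web of Science but not yet in the repository. Assuming each record carries a unique identifier (the UT) and that we can list the identifiers already held in the repository, the comparison itself is simple. The function names and sample identifiers below are hypothetical.

```python
def new_records(wos_uts, repository_uts):
    """Return UTs found in Web of Science but not in the repository, in first-seen order."""
    known = set(repository_uts)
    seen = set()
    fresh = []
    for ut in wos_uts:
        if ut not in known and ut not in seen:
            fresh.append(ut)
            seen.add(ut)
    return fresh


def notification(uts):
    """Build a short plain-text message for the repository team, or None if nothing is new."""
    if not uts:
        return None
    lines = [f"{len(uts)} new Web of Science record(s) for Leeds Met:"]
    lines.extend(f"  - {ut}" for ut in uts)
    return "\n".join(lines)


fresh = new_records(["UT-A", "UT-B", "UT-B", "UT-C"], ["UT-A"])
print(notification(fresh))
```

The same list of new identifiers could feed either the alert to the repository team/URO or the targeted message to the author; how authors are matched to records is a separate question.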

In the first instance I hope to have a discussion on the blog to more clearly elucidate our goals, preconditions and triggers before sitting down with our actors (URO/academic staff).

Posted in Use cases | 6 Comments

 