
Decisions regarding the challenges of SWORDv2

Following some great recent discussions by the SWORD Technical Advisory Panel, we’re pleased to announce a few decisions that have been made regarding some of the details for the new version 2 of SWORD.  The full email announcing the decisions is shown below, or can be seen in the list archives of the technical advisory group: http://www.mail-archive.com/sword-app-techadvisorypanel@lists.sourceforge.net/msg00105.html

The decisions came about from discussions within the group over the past few weeks.  They relate to the following questions:

  1. Whether the Statement should be embedded in the Deposit Receipt or be a separate document referenced in an atom:link element: In order to allow SWORD v2 to move from a fire-and-forget methodology to one where a SWORD client can interact with the deposit through what we’re calling the ‘deposit lifecycle’, some form of feedback is required where the client can ask the server for details of what has happened to the deposited item(s).  The proposal is to support this via the provision of a ‘statement’.  Think of it a bit like a bank account statement: you can see what has gone into the bank account (deposits), what might have happened to the deposit (e.g. interest being added), and full details of the item.  The question here was whether a copy of the statement should be given to the SWORD client when it makes the deposit(s), or whether the client should ask for a copy of the statement whenever it wants it.
  2. Whether to use OAI-ORE for the Statement format or an Atom Feed (as per CMIS and GData): There is a decision to be made as to how the statement should be formatted.  Should it be formatted as an OAI-ORE resource map, or as an Atom Feed?  There are pros and cons for each method.
  3. How the client and server should negotiate over the format of the content returned by the edit-media link (EM-URI): If multiple formats of statement are allowed, how should the client and server agree on the best format to send, based upon a combination of the server’s capabilities and the client’s preferences?  This problem is known as content negotiation.

The full email below outlines these problems and the decisions made.  The next job is to implement the standard; based on the experiences of the developers and initial users, it will likely be refined further.

Dear All,

Thanks for your extensive feedback on the various issues that we have been discussing on this list; it has been really valuable for the project team to get this input.  We have, we think, identified 3 particular issues of contention:

1/ Whether the Statement should be embedded in the Deposit Receipt or be a separate document referenced in an atom:link element

2/ Whether to use OAI-ORE for the Statement format or an Atom Feed (as per CMIS and GData)

3/ How the client and server should negotiate over the format of the content returned by the edit-media link (EM-URI)

The project team has gone through each of these issues carefully, and attempted to extract the simplest solutions but with a view to keeping the SWORD 2.0 specification quite open at this stage, so that community best practices can actually inform the standard itself in the long run.

Therefore, we’re proposing the following approaches to these issues:

1/ Whether the Statement should be embedded in the Deposit Receipt or be a separate document referenced in an atom:link element

If the Statement is to be embedded in the Deposit Receipt, then it needs really to be in OAI-ORE form, for the purposes of being clear foreign markup.  Nonetheless, bearing in mind that there is a question as to whether the Statement should be an Atom Feed, it is clear that this solution will not be adequate by itself.  We therefore propose that the standard provided to the project’s funded developers to code against says that an OAI-ORE serialisation MAY be embedded in the Deposit Receipt (the Deposit Receipt will not be required to meet the OAI-ORE spec for being a resource map itself).

Alongside, or instead of, this, there MAY be one or more atom:link elements in the Deposit Receipt which link to an external Statement. These atom:link elements can use their type attribute to say whether the linked Statement is application/rdf+xml or application/atom+xml;type=feed.  It will be a requirement of the spec that there MUST be an embedded Statement or at least one separate Statement.

Therefore, you may see a Deposit Receipt like:

<atom:entry>
  <atom:link rel="http://purl.org/net/sword/terms/statement" type="application/rdf+xml" href="http://....."/>
  <rdf:RDF>
    <!-- ORE statement goes here -->
  </rdf:RDF>
</atom:entry>

2/ Whether to use OAI-ORE for the Statement format or an Atom Feed (as per CMIS and GData)

Another good reason for the approach in (1) is that this means we can provide different Statement URIs with different type attributes.  We plan to ask developers to produce an ORE and an Atom Feed Statement format under the project funding.  So you may see a Deposit Receipt like:

<atom:entry>
  <atom:link rel="http://purl.org/net/sword/terms/statement" type="application/rdf+xml" href="http://....."/>
  <atom:link rel="http://purl.org/net/sword/terms/statement" type="application/atom+xml;type=feed" href="http://....."/>
  <rdf:RDF>
    <!-- ORE statement goes here -->
  </rdf:RDF>
</atom:entry>
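
As a rough illustration of how a client might consume a Deposit Receipt of this shape, here is a minimal Python sketch. It is not taken from the SWORD specification: the function names find_statements and choose_statement, the example.org URLs, and the namespace declarations added to the sample receipt are all assumptions made for the example. It collects the Statement links by their type attribute, notes whether an embedded rdf:RDF block is present, enforces the "embedded or at least one linked Statement" rule, and then picks a link according to the client's preference order.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
STATEMENT_REL = "http://purl.org/net/sword/terms/statement"

# Hypothetical receipt: the example.org URLs are placeholders, and the
# namespace declarations are added so that the snippet parses on its own.
RECEIPT = """<atom:entry xmlns:atom="http://www.w3.org/2005/Atom"
            xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <atom:link rel="http://purl.org/net/sword/terms/statement"
             type="application/rdf+xml"
             href="http://example.org/statement.rdf"/>
  <atom:link rel="http://purl.org/net/sword/terms/statement"
             type="application/atom+xml;type=feed"
             href="http://example.org/statement.atom"/>
  <rdf:RDF>
    <!-- ORE statement goes here -->
  </rdf:RDF>
</atom:entry>"""


def find_statements(receipt_xml):
    """Return ({media type: href} for Statement links, embedded flag)."""
    entry = ET.fromstring(receipt_xml)
    links = {}
    for link in entry.findall(f"{{{ATOM}}}link"):
        if link.get("rel") == STATEMENT_REL:
            links[link.get("type")] = link.get("href")
    embedded = entry.find(f"{{{RDF}}}RDF") is not None
    # The proposal requires an embedded Statement or at least one link.
    if not links and not embedded:
        raise ValueError("Deposit Receipt carries no Statement")
    return links, embedded


def choose_statement(links, preferred_types):
    """Pick the first linked Statement whose type the client prefers."""
    for media_type in preferred_types:
        if media_type in links:
            return media_type, links[media_type]
    return None


links, embedded = find_statements(RECEIPT)
choice = choose_statement(
    links, ["application/atom+xml;type=feed", "application/rdf+xml"]
)
print(embedded, choice)
```

A client that prefers the ORE serialisation would simply reverse the preference list; a client that finds no acceptable link could fall back to the embedded rdf:RDF block when the embedded flag is set.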

The combination of approaches in (1) and (2) may seem woolly or indecisive, but we believe that we can’t determine in advance which of these approaches is better, and that it should be up to the community of users and implementers to decide which approach works best based on actual usage of the developed software.  Therefore, while the burden of implementation is placed on the funded portion of the project, we expect community-driven implementations/usages to favour one approach over another (possibly taking into account things like compatibility with GData and CMIS, or preferring the more semantic web approach of ORE). We can then use this information later in deriving a SWORD spec which is based on best practices.

3/ How the client and server should negotiate over the format of the content returned by the edit-media link (EM-URI)

The Content Negotiation issue arises from the fact that AtomPub requires at most one edit-media URI with a given type to be available in the Atom Entry (Deposit Receipt).  Since the SWORD server may contain multiple files rather than the one file that AtomPub assumes, what this EM-URI returns under GET is unclear.  We initially considered 2 approaches:

a/    A separate HTTP header like Accept-Packaging to allow content negotiation on a package format
b/    A separate HTTP header like Accept-Media-Features to allow general content negotiation on feature sets

As we discussed, both of these have pros and cons, and no approach to doing this is backed by established best practice, which makes the project team unwilling to commit to anything too complex or substantial, at a risk to the simplicity and overall success of SWORD. Instead we are suggesting adopting a much simpler approach:

The Deposit Receipt can already contain a sword:package element (as per SWORD 1.3), and SWORD 2 plans to allow an arbitrary number of such elements.  These elements will describe the packaging formats supported by the server, so the client will know in advance what the capabilities of the server are.  Therefore, instead of engaging in a content negotiation process, the client will just specify a separate HTTP header indicating what package format should be returned.  Whether this header re-uses the Packaging header used during deposit or specifies a new header has yet to be decided.
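
To make the simpler approach concrete, the following Python sketch shows one way a client could prepare such a request. The header name is explicitly undecided above, so it is a parameter here, and its default of "Packaging" is only an assumption. The function name build_package_request, the example.org EM-URI, and the sample package format identifier are likewise illustrative. The request is constructed but not sent.

```python
from urllib.request import Request


def build_package_request(em_uri, wanted_format, advertised_formats,
                          header_name="Packaging"):
    """Build (but do not send) a GET for the EM-URI in one package format.

    advertised_formats is the list of formats the server listed in the
    Deposit Receipt; header_name is an assumption, since the final header
    has not been decided.
    """
    # No negotiation round-trip: the client already knows what the server
    # supports, so it only checks its choice against the advertised list.
    if wanted_format not in advertised_formats:
        raise ValueError(f"server does not advertise {wanted_format!r}")
    return Request(em_uri, headers={header_name: wanted_format}, method="GET")


# Illustrative values only.
advertised = ["http://purl.org/net/sword-types/METSDSpaceSIP"]
request = build_package_request(
    "http://example.org/em/1",
    "http://purl.org/net/sword-types/METSDSpaceSIP",
    advertised,
)
print(request.get_method(), request.full_url, request.header_items())
```

Because the choice is validated locally against the advertised list, a request for an unsupported format fails before any network traffic occurs.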

Hopefully these approaches make sense to the group.  We are interested in how you think these will go down both during the project and beyond in the community, and if there are any obvious problems with what we’re proposing here as the way forward for SWORD.

All the best,

Richard
(On-Behalf-Of the SWORD project team)