The other major addition that we propose for SWORD is the content which lies under the manifest link. The following information may be of use to rich deposit clients to enhance the user experience:

1. The state of the item in the repository

Most repositories, at a high level, have three states that items are likely to be in (based on analysis of the DSpace, EPrints and Fedora software):

  • In preparation: this typically lasts a short while, and is the state that new repository items begin life in. During this stage, files and metadata are being added to the item, and it is generally only worked on by a single user. Repository administrators do not usually have access directly to the item.
  • Under review: once a repository item is fully prepared it is usually injected into some form of review process. Repository managers carry out general tasks like metadata verification and copyright compliance, or tasks more specific to their environment or organisation such as appropriateness for the archive. Within the review stage, there is no clear common workflow, as usages for repositories and for SWORD are sufficiently broad as to encompass many different workflows.
  • Archived: once review completes the repository item is archived. In some cases this may mean that the item has been made public under a stable identifier, while in others it may mean that the item has gone into the dark archive for preservation. Purposes for and consequences of archiving are not covered here.

Additionally, we expect to require states such as:

  • Deleted: the repository item was at one point in the Archive, but has been removed
  • Rejected: the repository item was rejected from the repository before reaching the Archive

We suggest introducing a short ontology to cover the above states, and making its serialisation in the manifest document extensible, so that additional states can easily be added, either by the SWORD standard at a later date or by specific implementations for local needs.
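
As a rough illustration of what such an extensible set of states might look like to a client or server implementation, the following Python sketch keeps a small registry of states keyed by URI. The namespace `http://example.org/sword/state/`, the class names and the "Embargoed" local state are all hypothetical and are not defined by SWORD or by this paper.

```python
from dataclasses import dataclass

# Hypothetical namespace: this paper does not define actual state URIs.
SWORD_STATE_NS = "http://example.org/sword/state/"


@dataclass(frozen=True)
class State:
    uri: str    # identifier for the state
    label: str  # human-readable name


class StateVocabulary:
    """A registry of states that can be extended after creation."""

    def __init__(self):
        self._states = {}

    def register(self, uri, label):
        state = State(uri, label)
        self._states[uri] = state
        return state

    def lookup(self, uri):
        # Returns None for a URI that has not been registered.
        return self._states.get(uri)


def core_vocabulary():
    """Build a vocabulary holding the five states discussed above."""
    vocab = StateVocabulary()
    for name, label in [
        ("in-preparation", "In preparation"),
        ("under-review", "Under review"),
        ("archived", "Archived"),
        ("deleted", "Deleted"),
        ("rejected", "Rejected"),
    ]:
        vocab.register(SWORD_STATE_NS + name, label)
    return vocab


vocab = core_vocabulary()
# A repository adds a local, non-standard state (hypothetical example).
vocab.register("http://repository.example.edu/states/embargoed", "Embargoed")
```

Because states are looked up by URI rather than by a fixed enumeration, a local addition sits alongside the core states without any change to the core set.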

It is also desirable to offer the repository the opportunity to describe what each of those states means to it. This would be similar in ethos to the sword:treatment element, which allows the repository to describe what processes have actually happened to the package during deposit [SWORD spec section B.9.8], except that it would give details on what is currently happening to the deposit. With each of the states represented by a URI, this would also allow for the addition of non-standard states which are still comprehensible to the client.

If the repository can describe these states, SWORD clients could easily let depositors keep track of their submissions and give them feedback on progress, rather than leaving them with the current fire-and-forget approach. This would increase the potential for creating rich clients and for making users feel invested in the deposit process.

In summary, we would expect to add two elements to the manifest:

  1. An assertion of the state of the item (in preparation, under review, archived, etc.);
  2. An identifier which resolves to a description of the meaning of that state in the repository.
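
The two elements listed above could be serialised in many ways. The sketch below shows one possibility using Python's standard `xml.etree.ElementTree`. The namespace `http://example.org/sword/manifest` and the element names `manifest`, `state` and `stateDescription` are assumptions made purely for illustration; this paper does not fix a serialisation.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and element names, chosen for illustration only.
EX_NS = "http://example.org/sword/manifest"


def build_state_fragment(state_uri, description_uri):
    """Build a manifest fragment carrying the two proposed elements."""
    manifest = ET.Element(f"{{{EX_NS}}}manifest")
    # 1. The assertion of the item's state, expressed as a URI.
    state = ET.SubElement(manifest, f"{{{EX_NS}}}state")
    state.set("href", state_uri)
    # 2. An identifier resolving to the repository's description of that state.
    description = ET.SubElement(manifest, f"{{{EX_NS}}}stateDescription")
    description.set("href", description_uri)
    return manifest


def read_state(manifest):
    """Return (state URI, description URI) from a manifest fragment."""
    state = manifest.find(f"{{{EX_NS}}}state")
    description = manifest.find(f"{{{EX_NS}}}stateDescription")
    return state.get("href"), description.get("href")


fragment = build_state_fragment(
    "http://example.org/sword/state/under-review",
    "http://repository.example.edu/about/states#under-review",
)
xml_text = ET.tostring(fragment, encoding="unicode")
```

A client would parse the manifest it receives and call something like `read_state` to decide what to show the depositor, following the description URI for a human-readable explanation of the state.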

Furthermore, since this paper is intended to generate discourse around the direction of the standard, we are also looking for common and generic states which are useful to systems which are not necessarily repositories.

2. Information about the unpackaged object in the repository

Many repositories are likely to unpackage the incoming content and import it into their native format. This typically includes:

  • A top-level object: this is effectively the equivalent of the package – it is the container within which all the content is held;
  • Associated files: the actual content of the item, extracted as individual files from the package. This may also include additional files created by the repository, such as format conversions, thumbnails, etc.;
  • Metadata: structural, administrative and bibliographic data about the item, provided in the package as metadata files or within the manifest itself, depending on package format. As with the files, this may contain metadata that was not in the original package.

It may be of interest to clients to be able to display this information to their users, for the purposes of creating a rich deposit environment.  It is certainly possible to include identifiers for the individual files which would allow further operations to be performed on them; for example HTTP DELETE could be provided to allow the client to subsequently delete individual files, or HTTP PUT could be used to replace one file with another.
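
To make the idea of per-file operations concrete, the sketch below constructs (but does not send) the HTTP DELETE and PUT requests a client might issue against a file identifier, using Python's standard `urllib.request`. The file URI and the helper names are hypothetical, and nothing here is defined behaviour in SWORD; whether a repository would accept such requests is exactly what is being proposed for discussion.

```python
import urllib.request


def delete_file_request(file_uri):
    """Build a request that would delete one file from the item."""
    return urllib.request.Request(file_uri, method="DELETE")


def replace_file_request(file_uri, content, content_type):
    """Build a request that would replace one file with new content."""
    return urllib.request.Request(
        file_uri,
        data=content,  # bytes of the replacement file
        headers={"Content-Type": content_type},
        method="PUT",
    )


# Hypothetical identifier for a single file within a deposited item.
file_uri = "http://repository.example.edu/items/123/files/article.pdf"

delete_req = delete_file_request(file_uri)
replace_req = replace_file_request(file_uri, b"%PDF-1.4 ...", "application/pdf")

# Sending is left out of this sketch; it would be done with
# urllib.request.urlopen(delete_req), plus whatever authentication
# the repository requires.
```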

Including the metadata in the description of the item would be a significant challenge, and is therefore not addressed here. Future work in this area could take operations similar to WebDAV's PROPFIND and PROPPATCH [17] as a starting point, or could consider performing an HTTP PUT to the Entry document for the repository item. It is unclear how useful or successful these approaches would be.

It is the recommendation of this paper, therefore, that the basic structural information concerning the unpackaged item is made available in some common serialisation, and that this explicitly excludes any treatment of the actual metadata content.  Instead we would focus simply on the content files.
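
As one possible reading of this recommendation, the sketch below models the unpackaged item as a top-level object with a list of content files and deliberately no metadata, then serialises it as JSON. The field names, the `derived` flag for repository-created files and the choice of JSON are all assumptions for illustration; the paper leaves the common serialisation open.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class ContentFile:
    uri: str               # identifier that further operations could target
    name: str
    derived: bool = False  # True if created by the repository, e.g. a thumbnail


@dataclass
class UnpackagedItem:
    uri: str  # the top-level object, equivalent to the package
    files: List[ContentFile] = field(default_factory=list)
    # Metadata content is intentionally not modelled here.


def serialise(item):
    """Serialise the structural description of the item as JSON."""
    return json.dumps(asdict(item), indent=2)


# Hypothetical item and file URIs.
item = UnpackagedItem(
    uri="http://repository.example.edu/items/123",
    files=[
        ContentFile("http://repository.example.edu/items/123/files/article.pdf",
                    "article.pdf"),
        ContentFile("http://repository.example.edu/items/123/files/thumb.png",
                    "thumb.png", derived=True),
    ],
)
structure_json = serialise(item)
```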

[17] WebDAV: http://www.webdav.org/specs/rfc2518.html