Much of the work leading up to this paper was carried out at Symplectic [21] while implementing a full CRUD (Create, Retrieve, Update, Delete) API against a number of digital repositories using AtomPub with SWORD extensions [22].
The objective of this work was to integrate Symplectic's publications management system (Symplectic Elements [23]) with a variety of digital repositories, providing a rich deposit client that fits smoothly into researchers' day-to-day use of the publications system. The client allows academics to upload individual files to the repository over a period of time, which makes the use of a single content package difficult.
To meet these requirements, a full AtomPub implementation with SWORD extensions was developed, providing a consistent API across the repositories so that they can be interchanged at will without the client needing to be aware of the change. The result is a simple and effective interface to these digital repositories, in which the user does not necessarily need to be aware of the repository at all. It answers the limitations of SWORD laid out in the Introduction:
- External systems have to do some of the work of a repository: The repository is considered the authority for file content, and file uploads go straight into it.
- Users must know when an item is archivable: It provides a simple file management interface and does not require users to declare when an item is complete; modifications are always possible later.
- Full AtomPub profile for SWORD is unclear: It uses the full range of AtomPub features, adapted for usage with SWORD, some of which have been drawn on for this paper.
- Dependence on structured packages: By not using packages in the same way, it bypasses the problem of needing a repository to be able to understand some particular package format.
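The approach described above can be illustrated with a minimal sketch. It is not Symplectic's implementation: the `Repository` interface, the `InMemoryRepository` stand-in, and the `deposit_over_time` helper are all hypothetical names chosen for illustration, and no network calls are made. The comments note which AtomPub operation each method would roughly correspond to in a real adapter.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Protocol, Tuple


class Repository(Protocol):
    """Hypothetical common interface each repository adapter would implement."""

    def create_item(self, metadata: Dict[str, str]) -> str: ...
    def add_file(self, item_id: str, filename: str, content: bytes) -> None: ...
    def retrieve_file(self, item_id: str, filename: str) -> bytes: ...
    def update_file(self, item_id: str, filename: str, content: bytes) -> None: ...
    def delete_file(self, item_id: str, filename: str) -> None: ...


@dataclass
class InMemoryRepository:
    """Illustrative stand-in for a real repository adapter (assumption)."""

    name: str
    metadata: Dict[str, Dict[str, str]] = field(default_factory=dict)
    files: Dict[str, Dict[str, bytes]] = field(default_factory=dict)

    def create_item(self, metadata: Dict[str, str]) -> str:
        # Roughly: POST an Atom entry to a collection.
        item_id = f"{self.name}-{len(self.files) + 1}"
        self.metadata[item_id] = dict(metadata)
        self.files[item_id] = {}
        return item_id

    def add_file(self, item_id: str, filename: str, content: bytes) -> None:
        # Roughly: POST a media resource associated with the item.
        self.files[item_id][filename] = content

    def retrieve_file(self, item_id: str, filename: str) -> bytes:
        # Roughly: GET the media resource.
        return self.files[item_id][filename]

    def update_file(self, item_id: str, filename: str, content: bytes) -> None:
        # Roughly: PUT a new representation of an existing media resource.
        if filename not in self.files[item_id]:
            raise KeyError(filename)
        self.files[item_id][filename] = content

    def delete_file(self, item_id: str, filename: str) -> None:
        # Roughly: DELETE the media resource.
        del self.files[item_id][filename]


def deposit_over_time(
    repo: Repository,
    metadata: Dict[str, str],
    uploads: Iterable[Tuple[str, bytes]],
) -> str:
    """Create an item, then attach files one at a time rather than as a package."""
    item_id = repo.create_item(metadata)
    for filename, content in uploads:
        repo.add_file(item_id, filename, content)
    return item_id


# The client code is identical whichever repository sits behind the interface.
repo = InMemoryRepository("repo-a")
item = deposit_over_time(repo, {"title": "Example dataset"}, [("paper.pdf", b"draft")])
repo.add_file(item, "data.csv", b"1,2,3")        # a later, separate upload
repo.update_file(item, "paper.pdf", b"final")   # modification remains possible
```

Because the client depends only on the interface, swapping `InMemoryRepository("repo-a")` for a different adapter would leave `deposit_over_time` unchanged, which is the interchangeability property the text describes. The repository remains the authority for file content, since every upload goes straight into it.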
During the SWORD 2 project, Symplectic will provide to the community an open source server implementation of the proposed next version of the standard, based on its existing Repository Tools system. The implementation will contain a number of additional technical proposals on the direction of the standard and its implementation, offered for comment by the community, thus helping the project define the next version of the standard in full. It will not be the final implementation of the SWORD 2.0 standard, but a discussion piece for developers to work from. The repository platform used for this example will be chosen based on demand from the community.
If Symplectic have already done a whole load of this work, then rather than propose it, why don’t they release their methodology back to the community for free?
It seems to me that this proposal has been written by someone who already has the problem solved to some degree. The question is: should JISC pay a commercial company to open source something they have already done?
Personally I’m more in favour of providing the money to those already in open source projects to go and simply re-implement it. That way it is not seen to be directly benefitting a commercial organisation.
It’s not clear to me how item 4 (the repository not needing to understand a particular package format) is being addressed here. How is the initial submission of a multi-part dataset achieved?