An email has just been sent to the SPARC-IR email list about the Repository66 maps saying:
Stuart Lewis’s world map of more than 1,000 repositories at http://www.repository66.org will be a centerpiece of the SPARC repositories meeting November 17 & 18, 2008. Don’t let your dot on the world map be overlooked. Please take a moment to update your listing at openDOAR or ROAR as soon as you can.
I thought I’d give an overview of how repositories get added to the map, so that the process is transparent. It works as follows:
- Periodically a script is run that downloads a list of repositories indexed by ROAR. Repositories that are added to ROAR can either have a location set by the user, or ROAR can take a guess at the location of the repository from its IP address.
- The script then downloads a similar list from OpenDOAR. Repositories that are added to OpenDOAR have a location added and checked by the OpenDOAR team.
- If the location of a repository is not known by either directory, it is flagged in a list for me to fix. When I have time, I locate these repositories.
- Repositories are matched across the two directories using their OAI-PMH base URL. If this isn’t possible, the script tries to match on the normal repository web URL.
- The data is then automatically ‘mashed-up’ to create a single dataset, which is then loaded on to the website.
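The steps above could be sketched roughly as follows. This is only an illustration, not the actual script: the record fields (`name`, `oai_url`, `web_url`, `location`) and the function name `merge_directories` are made up for the example, and it assumes that a location from OpenDOAR is preferred over one from ROAR when both exist, because OpenDOAR locations are checked by hand.

```python
def merge_directories(roar, opendoar):
    """Merge two lists of repository records, using ROAR as the master list.

    Each record is a dict with hypothetical keys: name, oai_url, web_url,
    location (a (lat, lon) tuple, or None if unknown).
    Returns (mapped, needs_fixing).
    """
    # Index OpenDOAR records by both kinds of URL for quick lookup.
    by_oai = {r["oai_url"]: r for r in opendoar if r.get("oai_url")}
    by_web = {r["web_url"]: r for r in opendoar if r.get("web_url")}

    mapped, needs_fixing = [], []
    for repo in roar:
        # Match on the OAI-PMH base URL first, then fall back to the web URL.
        match = by_oai.get(repo.get("oai_url")) or by_web.get(repo.get("web_url"))

        # Assumption: prefer the OpenDOAR location, else use the ROAR one.
        location = (match or {}).get("location") or repo.get("location")

        merged = {**repo, "location": location, "in_opendoar": match is not None}
        # Repositories with no known location are flagged for manual fixing.
        (mapped if location else needs_fixing).append(merged)
    return mapped, needs_fixing
```

Because the loop only walks the ROAR list, a repository that appears in OpenDOAR alone never reaches the output, which mirrors the behaviour of the map.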
The list of repositories on the site comes from ROAR, but is augmented by data held in OpenDOAR. If a repository is listed in OpenDOAR but not in ROAR, it is not currently included in the map. The reason is that some repositories are very likely listed in both directories with slightly different OAI-PMH URLs stored in each. Using all repositories from both directories would therefore put duplicates on the map, so one of the directories needs to be the master list.
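To make the duplication problem concrete, here is a small sketch of what a naive union of the two lists would do. The function name `naive_union`, the record fields and the example URLs are all invented for illustration; the two URL variants simply stand in for the kind of small difference described above.

```python
def naive_union(roar, opendoar):
    """Combine both lists, treating two entries as the same repository
    only when their OAI-PMH base URLs are exactly identical."""
    combined = {}
    for repo in roar + opendoar:
        # Keep the first record seen for each distinct OAI-PMH URL.
        combined.setdefault(repo["oai_url"], repo)
    return list(combined.values())


# The same (hypothetical) repository, recorded slightly differently.
roar = [{"name": "Example Repository", "oai_url": "http://repo.example.org/oai"}]
opendoar = [{"name": "Example Repository", "oai_url": "http://repo.example.org/oai/request"}]

dots = naive_union(roar, opendoar)
print(len(dots))  # 2: one repository would get two dots on the map
```

Treating one directory as the master list avoids this, at the cost of leaving out repositories that only the other directory knows about.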
The reason for using ROAR rather than OpenDOAR as the definitive list is that its entry criteria are not as strict as OpenDOAR’s, so it potentially contains a wider spectrum of repositories. However, now that the list of repositories in OpenDOAR is larger than the one in ROAR, this decision may change. I’m open to persuasion on this!
Tags: Repository map mashup