How many, how fast?

No, I’m clearly not referring to the number of recent blog posts (unless the answer is very few, very slowly). So, what’s been happening in Open Repository land?

Well, after a very busy couple of days with the DSpace User Group sessions (the presentations from which are now online), we’ve been hard at work tinkering under the hood of Open Repository.

All of our repositories are now registered with Google’s Webmaster Tools service, which means we can keep track of how the sites are being indexed and whether any problems are being encountered. Thanks to this, I’ve been able to track down a couple of small bugs, hiding in the RSS feed code and the XHTML headers, that caused problems when viewing a tiny minority of items. All of the problems this has shown up have now been rectified.

There is also a new tool that lets repository admins match parts of the metadata when an item is filled in from a PubMed ID or DOI, and automatically insert additional metadata into the item.

But, back to the original question, which refers to some changes to the file-type analysis tool. As you know, this was enabled for all our clients just before the Open Repositories conference. However, it was a bit slow. Actually, quite a lot slow. For a repository with 2000 items in it, the initial page took over 3 minutes to display. In fact, with 2000 items, every page in the analysis tool would take at least 3 minutes, and in the worst cases would even take over 10 minutes to display.

This has now been improved slightly. For the same repository, the initial page will typically display in under 5 seconds. And every page of the analysis tool will start to display in under 5 seconds – even for the worst case scenario of listing a breakdown of 2000 items (which should complete downloading in about 20 seconds).

I say typically, because sometimes the pages do take a bit longer to load – for example, if the tool hasn’t been used in a while, the database may need to do a bit more work to load the necessary data. But even in those cases, it is at least usable now.

So, if you haven’t looked at the file-type analysis tool – or refrained from using it due to the performance – now is the time to give it a chance.

