Record duplication – do not ask what trial registries can do for you…

Trial registration aims to give a comprehensive and unambiguous view of health research. However, when users search aggregating sites such as the World Health Organization International Clinical Trials Registry Platform (WHO ICTRP), they will often be presented with multiple records of the same trial, as reported in this article published in Systematic Reviews. Here Helene Faure explains why record duplication happens and what more could be done to highlight overlaps between trial registries.

Facts of registries’ life

Current WHO-adopted registries represent specific geographical areas and aim to be comprehensive and authoritative, but they also have to adapt to the reality of health research and its collaborative nature. The same studies may be listed in different sources. As the WHO ICTRP International Standards acknowledge: ‘it is possible that the trial will need to be registered more than once in order to meet the ethical, legal or other requirements of each country.’

The language aspect should not be ignored either. A number of registries need to engage participants in their own language, and additional effort is needed to curate information both in the local language and in English for ICTRP purposes (the Brazilian registry offers three languages). Even amongst English language-based registries, objectives may differ: some focus on very structured content, while others are expected to provide a plain English summary (ISRCTN).

All registries have their distinctive reason for existence, even if they all share common values and common standards. The issue of record duplication is a well-known fact of our activities.

When faced with duplication, aggregating sites wonder at first whether they should create a ‘best’ record summarising all duplicates. This would mean an additional, labour-intensive level of curation, when in fact different layers of information serve different purposes and readers will decide which record is more informative. To the question ‘should identified overlaps be eradicated or displayed better?’, I would say the latter.

Highlighting duplications

Acknowledging that duplication happens does not mean that registries do not have policies in place to track the nature and extent of overlaps.

As per International Standards ‘If a trial is registered on more than one registry then all known identifiers for the trial should be submitted to each registry as Secondary Identifiers. These include trial registration numbers allocated by other registries.’

All registries encourage researchers to check whether their study is already registered (see, for example, the Australian New Zealand registry’s ‘how to register’ guidance). The respective editorial teams then carry out a number of cross checks on condition, country, funder, sponsor, name of investigator, number of participants involved etc., in order to establish whether a new submission could be a duplicate.

They ultimately present any findings to applicants. This might lead to clarifying the title or some aspect of the record in order to eliminate any ambiguity with other pre-existing records. Follow-up and/or nested studies can sometimes bring up the question of whether they warrant their own trial ID. ISRCTN has general rules and also adopts a case by case approach. Decisions are made clear in fields such as Title and Editorial Notes.
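
The cross checks described above can be pictured as a simple field-by-field comparison. The following is only a minimal sketch and not any registry's actual procedure: the `TrialRecord` structure, the 10% tolerance on participant numbers and the `min_matches` threshold are all assumptions made for illustration, and the final decision would still rest with the editorial team and the applicant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialRecord:
    # Hypothetical record layout, limited to the fields named in the text.
    trial_id: str
    condition: str
    country: str
    funder: str
    sponsor: str
    investigator: str
    participants: int


TEXT_FIELDS = ("condition", "country", "funder", "sponsor", "investigator")


def _norm(value: str) -> str:
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.lower().split())


def matching_fields(a: TrialRecord, b: TrialRecord, tolerance: float = 0.1) -> list:
    """Return the names of the fields on which two records agree."""
    matches = [f for f in TEXT_FIELDS if _norm(getattr(a, f)) == _norm(getattr(b, f))]
    larger = max(a.participants, b.participants)
    # Assumption: participant numbers "agree" if they differ by at most 10%.
    if larger > 0 and abs(a.participants - b.participants) <= tolerance * larger:
        matches.append("participants")
    return matches


def possible_duplicates(new: TrialRecord, existing: list, min_matches: int = 4) -> list:
    """List existing records that agree with the new submission on enough fields."""
    hits = []
    for record in existing:
        agreed = matching_fields(new, record)
        if len(agreed) >= min_matches:
            hits.append((record.trial_id, agreed))
    return hits
```

A record flagged this way is only a candidate for a human check. Exact string matching will miss differently worded sponsors or conditions, which is one reason the manual work stays labour intensive.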

Such checks do not mean that mistakes have never taken place. The evolution of the WHO ICTRP Minimum Data Set over the years has shown that a number of elements are needed to describe a study. Still, the number of internal duplicates remains low: for example, in over 10 years of existence and over 14,600 records, the ISRCTN registry has withdrawn fewer than 40 records.

The WHO ICTRP uses the alphanumeric trial IDs as hooks between registries. Registries take great care in recording those ‘hooks’ from other registries, either in specific fields (for those registries where the likelihood of overlap is greater, e.g. the European registry EU-CTR) or in the secondary ID fields. The secondary ID field will also contain information that is meaningful to the data providers, even if it is not a recognised trial ID, as registries have to balance local needs and international requirements.
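
To illustrate how trial IDs can act as hooks, the sketch below groups records that reference one another through their secondary IDs. It is an illustration under stated assumptions, not a description of how the WHO ICTRP works internally: the input is assumed to be a plain mapping from each record's primary ID to its list of secondary IDs, the IDs in the usage note are made up, and any secondary ID that is not itself a known primary ID (a local reference number, for instance) is simply ignored.

```python
from collections import defaultdict


def group_linked_records(records: dict) -> list:
    """Group primary trial IDs that are connected through secondary IDs.

    `records` maps a primary ID to an iterable of secondary IDs
    (a hypothetical input format chosen for this sketch).
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for primary, secondaries in records.items():
        find(primary)
        for secondary in secondaries:
            # Only follow hooks that point at a record we actually hold.
            if secondary in records:
                union(primary, secondary)

    groups = defaultdict(set)
    for primary in records:
        groups[find(primary)].add(primary)
    return sorted((sorted(group) for group in groups.values()), key=lambda g: g[0])
```

With the made-up input `{"A-1": ["B-7", "local-ref"], "B-7": [], "C-3": ["B-7"], "D-9": []}`, records `A-1`, `B-7` and `C-3` end up in one group even though `B-7` quotes no other ID, while `D-9` stays on its own. A link recorded in one direction only is therefore enough, but a duplicate that no record quotes remains invisible to this approach.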

Due to its regulatory remit, EU-CTR operates a bit differently at the moment. Its underlying database EudraCT was initially designed to receive trial details from all EU member states. Duplication has existed right from the start, but regularly published statistics distinguish between the number of records and the number of trials. There are likely to be changes in the way the information is presented when Regulation EU 536/2014 becomes effective in a few months.

Looking for, identifying and flagging hidden duplicates (i.e. records that do not quote trial IDs from other trial registries) seems like a never-ending task. In good faith, registry 1 will have checked that a trial is not listed in registry 2, a likely candidate for overlap, but studies evolve and new countries can be added without the initial registry being notified immediately.

Do not ask what registries can do for you…

Identifying overlaps between data sources could be seen as the role of aggregators. However, the article makes it very clear that this is quite labour intensive. The WHO ICTRP Universal Trial Number (UTN) was once seen as a possible solution, but it has never been clear how a number that is not attached to the very dataset that unambiguously describes a trial, and is not publicly available for accountability, can actually resolve the issue.

Portals such as the WHO ICTRP site often do not have enough resources to trawl through hundreds of records and have to limit their remit to providing a more comprehensive ‘one-stop shop’ for trials. The ICTRP focuses on encouraging best practices and is also planning the addition of new data elements. There are ongoing discussions regarding results reporting, as most registries do record publications resulting from trials and it makes sense for the portal to receive more informative data feeds.

WHO ICTRP is focusing on standards for results.

Other stakeholders have a part to play. Pharmaceutical sponsors may well be asked to make one single submission for the EU-CTR feed to the WHO ICTRP in the future but they are also likely to have to keep on complying with local regulations.

All registries expect records to be updated at regular intervals and will always welcome corrections. ISRCTN strongly encourages readers as well as systematic reviewers to contact us should they spot records that could reference other trial IDs in order to make the WHO ICTRP even more efficient in presenting related records.

Trials can be updated at all times in registries.

Registries are often asked ‘how many unique trials are taking place in country x for condition y?’. To provide a bigger picture, they can only direct enquirers to other registries and/or aggregators and stress the caveats. One major caveat is that, as rightly highlighted by the article, overlaps are always underestimated.

A ‘proliferation’ of new registries will not necessarily make the identification of duplication worse. Recently accredited registries are very likely to be even more aware of the issue and to plan accordingly. By meeting local needs, they might also focus on studies beyond international drug trials and therefore not registered anywhere else, and so help the WHO ICTRP provide a greater coverage of clinical trials worldwide.
