Monday, 20 October 2014

Do we need a bioinformatics dead links database?

Keith Bradnam on his ACGT blog has recently highlighted the irksome issue of “link rot” - the unfortunate but all-too-common scenario in which a publication points to a URL that ceases to exist after an oft-too-brief time has elapsed. (In severe cases, before the paper is even published!)

Personally, I think that journals need to take greater responsibility for maintaining the integrity of their publications. Surely it is not beyond a big publishing house to have a bot that systematically checks their published links once a week and reports any errors? Authors could be notified and/or online manuscripts updated. In severe cases where the URL is critical to the paper, such as databases and webservers, lack of repair could even lead to retraction.
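To give a feel for how little such a bot would need, here is a minimal sketch in Python. It is an illustration only, not anything a publisher actually runs: the function names, the URL pattern and the "status of 400 or above means dead" rule are all my own assumptions, and a real checker would also need rate limiting, retries and a fallback for servers that refuse HEAD requests.

```python
import re
import urllib.error
import urllib.request

# Assumed pattern: a URL runs until whitespace or a closing bracket/quote.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(text):
    """Return the distinct URLs found in text, in order of first appearance."""
    seen = []
    for match in URL_PATTERN.findall(text):
        url = match.rstrip(".,;:")  # drop sentence punctuation stuck to the URL
        if url not in seen:
            seen.append(url)
    return seen


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-rot-check-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error such as 404
    except (urllib.error.URLError, OSError, ValueError):
        return None  # DNS failure, timeout, refused connection, malformed URL


def find_dead_links(text, status_of=fetch_status):
    """Return (url, status) pairs for links that look dead.

    status_of is injectable so the logic can be tried without network access.
    """
    dead = []
    for url in extract_urls(text):
        status = status_of(url)
        if status is None or status >= 400:
            dead.append((url, status))
    return dead
```

Run weekly over the text of each published article (from cron, say), anything returned by find_dead_links could be emailed to the corresponding author.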

The problem goes beyond publications, however. For (sometimes mysterious) reasons of their own, even established biological resources quite often change their URLs too. For example, Pfam embeds the Wikipedia entry for a domain into their pages. I was browsing the globin page last week and clicked on a link to see the entry in InterPro, only to be greeted with:

Sorry, the requested URL has changed

The requested URL: http://www.ebi.ac.uk/interpro/IEntry/ac=IPR002338

has moved to

/interpro/entry/IPR002338;jsessionid=BB50E35E3DD96F6C83B7AD432B8E2E2A

You will automatically be redirected to the correct URL in a few seconds. Please update your bookmarks.

It redirected OK but I wonder how long it will be before that redirect itself disappears… and how many Wikipedia articles and database cross-references run the risk of breaking.
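While the redirect still works, the new address can be harvested automatically. The sketch below (Python, standard library only) resolves the relative "has moved to" path against the old URL and strips the `;jsessionid=...` part, which I am assuming is a per-visit session token that has no place in a stored link. The function name and the regular expression are my own, not anything provided by InterPro.

```python
import re
from urllib.parse import urljoin

# Assumption: ";jsessionid=..." is a servlet session token appended to the path.
SESSION_ID = re.compile(r";jsessionid=[^?#/;]*", re.IGNORECASE)


def updated_link(old_url, new_location):
    """Turn a (possibly relative) redirect target into a clean absolute URL."""
    absolute = urljoin(old_url, new_location)  # resolve against the old address
    return SESSION_ID.sub("", absolute)        # drop the session token


old = "http://www.ebi.ac.uk/interpro/IEntry/ac=IPR002338"
moved_to = "/interpro/entry/IPR002338;jsessionid=BB50E35E3DD96F6C83B7AD432B8E2E2A"
print(updated_link(old, moved_to))
# prints http://www.ebi.ac.uk/interpro/entry/IPR002338
```

Something like this, run over a set of cross-references before the old addresses stop redirecting, would at least give a list of replacements to apply.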

Perhaps we need a database where bioinformaticians can report, track, update and hopefully fix all these rotten links? In the meantime, if anyone spots any in my own webpages or papers, please let me know!
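As a very rough idea of what the core of such a database might hold, here is a sketch using Python's built-in sqlite3 module. The table layout, column names and status values are purely hypothetical, and a real community resource would need far more (user accounts, history, moderation), but the report/track/fix cycle fits in a single table.

```python
import sqlite3

# Hypothetical schema: one row per reported dead link.
SCHEMA = """
CREATE TABLE IF NOT EXISTS dead_links (
    url          TEXT PRIMARY KEY,
    source       TEXT NOT NULL,                     -- paper or page containing the link
    status       TEXT NOT NULL DEFAULT 'reported',  -- 'reported' or 'fixed'
    replacement  TEXT,                              -- working URL, once known
    reported_on  TEXT NOT NULL DEFAULT CURRENT_DATE
)
"""


def open_registry(path=":memory:"):
    """Open (and if necessary create) the registry database."""
    connection = sqlite3.connect(path)
    connection.execute(SCHEMA)
    return connection


def report_link(connection, url, source):
    """Record a dead link; repeat reports of the same URL are ignored."""
    with connection:
        connection.execute(
            "INSERT OR IGNORE INTO dead_links (url, source) VALUES (?, ?)",
            (url, source),
        )


def mark_fixed(connection, url, replacement):
    """Record the working replacement for a previously reported link."""
    with connection:
        connection.execute(
            "UPDATE dead_links SET status = 'fixed', replacement = ? WHERE url = ?",
            (replacement, url),
        )


def open_reports(connection):
    """Return (url, source) pairs for links still waiting to be fixed."""
    rows = connection.execute(
        "SELECT url, source FROM dead_links WHERE status = 'reported' ORDER BY url"
    )
    return rows.fetchall()
```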
