Sunday 29 April 2012

SeqSuite: another blog is born

Although this is my blog, I feel a bit funny about explicitly blogging about work stuff in a blatant "look at what I've done, please read/use/cite it!" fashion. I have therefore created a new blog explicitly for that side of things. Actually, it has existed for a while but not really had much content until today's post about SLiMMaker (my second solo excursion into Python CGI). The blog is currently titled "SeqSuite: open-source bioinformatics in Python" and the central theme is the application and development of the various tools that I have cobbled together over the years, although I hope to diversify into other related stuff should time allow.

The main focus is actually SLiMSuite, which is the collection of short linear motif (SLiM) tools that I have been involved in developing. (The term "SLiM" will probably be my longest lasting contribution to science. It's weird seeing it start to appear in textbooks! (Although I must point out that I did not invent the concept. I'll save SLiMs for another post, I think.)) SLiMSuite is both the main pillar of my research and, I think, the most original and unique of my software. (Some of the other stuff I have made because I have been too lazy to hunt down something that does what I want.)

Despite the ascendancy of SLiMSuite, I've stuck with "SeqSuite" as the name, though, as (a) other "Slim Suites" seem to exist and I want to head off any legal objections in advance, and (b) I have some other tools that are nothing to do with SLiMs and, along with SLiMSuite, form the larger SeqSuite package.

I'm still not entirely sure how the two blogs will work - finding the time to write one blog is hard enough - but I think I will continue to post the more "human" aspect of work-related matters here, and the more technical aspects on the SeqSuite blog. (Hopefully, I can convince some of my collaborators to contribute there too.) I think, like a lot of academics, I realise the importance of trying to communicate what we do, but haven't quite worked out yet the best way to go about it.

Extreme cat ladders

There are cat ladders, and then there is this!

Tuesday 24 April 2012

CGI, where've you bin all my life?

I'd been meaning to play around with CGI (Common Gateway Interface) programming for some time as a way of making simple functional websites. I finally got round to it last month, thanks to a great introductory page at Tutorials Point. What I did not realise is quite how easy it was.

I've still only really scratched the surface and scanned over the page to get something up quickly but, in essence, you only need three things:
1. A webserver that supports CGI.

2. An HTML page containing some "form" code that contains a submit button and (optionally) some input options (e.g. text boxes or checkboxes).

3. A Python script (or another language) that generates HTML code based on the variables and values from the form.
And that's essentially it. Actually, you don't even need (2), as you can feed variables directly to the CGI script, but it makes it easier for the user, I think.
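
To make ingredient (3) concrete, here is a minimal sketch of what such a script might look like. It is an illustration under stated assumptions rather than the code behind any real page: the form field `name` and the function `render_page` are made up, and it assumes the web server passes the form values in the `QUERY_STRING` environment variable, as CGI servers do for GET requests.

```python
#!/usr/bin/env python3
"""Minimal CGI sketch: turn form values into an HTML page."""
import html
import os
from urllib.parse import parse_qs


def render_page(query_string):
    """Build an HTML page from a URL-encoded query string (e.g. 'name=Bob')."""
    fields = parse_qs(query_string)
    # 'name' is a hypothetical form field; fall back to a default if absent.
    name = fields.get("name", ["stranger"])[0]
    # Escape user input before placing it in the page.
    safe_name = html.escape(name)
    return (
        "<html><body>\n"
        f"<p>Hello, {safe_name}!</p>\n"
        "</body></html>"
    )


if __name__ == "__main__":
    # A CGI script prints a header, a blank line, then the page body.
    print("Content-Type: text/html")
    print()
    print(render_page(os.environ.get("QUERY_STRING", "")))
```

Feeding variables directly to the script, without a form, then amounts to requesting a URL ending in something like `?name=Bob` (again, a hypothetical field name).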

My first attempt at this can be found here. It's a bit of silly fun but it shows what can be done with just a few simple lines of code. I've cheated a little bit by using an existing Python module to generate the middle of the HTML code but, in a way, that's the point - you can easily adapt existing functional code to output text.

In this case, I use the random "Zen wisdom" text strings that are generated in my code to lighten up error messages when I'm debugging. (If you ever use one of my programs, you sometimes come across such an error message, which always causes confusion (and usually embarrassment for me!) but I think it's a small price to pay for making debugging more fun!) The scary thing is how often the random Zen Wisdoms sound deep and meaningful, e.g.
"It is bold to play jenga with blocks of passion."
Well, maybe not that deep and meaningful!
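
For the curious, a generator of this kind can be as simple as filling a sentence template with randomly chosen words. The sketch below is a guess at the general approach, not the actual module: the template, the word lists and the function name `zen_wisdom` are all invented for illustration.

```python
import random

# Invented word lists; the real module's vocabulary is not shown in this post.
ADJECTIVES = ["bold", "wise", "foolish", "rare"]
VERBS = ["play jenga with", "argue with", "juggle", "whisper to"]
NOUNS = ["blocks", "rivers", "shadows", "teacups"]
ABSTRACTS = ["passion", "regret", "patience", "doubt"]


def zen_wisdom(rng=random):
    """Fill a fixed sentence template with randomly chosen words."""
    return "It is {} to {} {} of {}.".format(
        rng.choice(ADJECTIVES),
        rng.choice(VERBS),
        rng.choice(NOUNS),
        rng.choice(ABSTRACTS),
    )


if __name__ == "__main__":
    print(zen_wisdom())
```

Passing a seeded `random.Random` instance as `rng` makes the output repeatable, which is handy when testing.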

Saturday 21 April 2012

Move over CIA World Factbook, Globe-Town is here!

I don't know an awful lot about the World Bank but it has recently been drawn to my attention that they currently have an "Apps for Climate Change" challenge. One of the entries is led by Jack Townsend, a postgrad student in one of the labs that I collaborate with and part of the University of Southampton's Web Science Doctoral Training Centre. It's called Globe-Town and it's a really good-looking and user-friendly way to navigate around the World Bank statistics.

The blurb:
Modern life is making us next-door neighbours with far-away places. We share major risks, responsibilities & opportunities with countries around the world, something that is never more applicable than to climate change which connects us all through the environment we share. Globe-Town is a way of viewing information about the countries of the world and exploring how they are connected together. With Globe-Town we can learn about the places we depend upon most and discover what connects them to us.
At the top of the page, you can pick a country and a climate-related statistic and get a nice visual on how that nation is connected to other countries. Below, Globe-Town offers data from the World Bank and rates them according to the World Average. These are divided into three categories: Society, Environment and Economy.


Clicking on the different sub-categories then brings up some graphs looking at that particular metric over time and, perhaps most interesting of all, comparing the numbers relating to your country of interest with those of the highest and lowest countries. I'm not sure how great all of the underlying data is - you can get some weird things like 140% of children completing primary school - but Jack and co. cannot be blamed for this as it all comes from the World Bank.

All in all, this is a fun, educational and informative site and well worth a vote for the App challenge.

Monday 16 April 2012

Short-cited VII: does Open Access publishing encourage bad science?

The argument about Open Access publishing is one that is going to run for some time, I am sure. (See yesterday's piece in The Economist, for example.) One spectre that raises its ugly head from time to time, though, is the notion that Open Access is bad because it encourages bad science. In essence, the argument goes: Open Access is cheap for the publishers, therefore they are willing to publish anything. In support of this, I have seen people cite the high acceptance rates at PLoS and BioMedCentral of (apparently) 50-70% as an indicator that they will publish pretty much anything. I've even seen this used in favour of removing pre-publication peer-review altogether in favour of post-publication peer-review, as in the case of WebmedCentral - the implication being that, in the Open Access world, pre-publication peer-review is useless.

I think this is all rather unfair on Open Access journals, especially the likes of PLoS and BMC. These publishers have just done away with the need for the science to be deemed "interesting" as long as the science is sound. (And this for only some of their titles.) This is a far cry from publishing anything and peer-review being worthless.

The peer-review process involves highlighting things that need to be changed to meet the "scientific soundness" criterion. Most authors then go ahead and make the required changes, usually improving the paper as a result. I would argue that this is the single biggest utility of peer-review; as an author, I want to go through this process myself, even if it is sometimes irritating. The fact that PLoS and BMC publish 50-70% of papers just indicates that scientists are capable of producing scientifically sound work (following revisions) 50-70% of the time. The higher rejection rate at Science and Nature is more to do with what science is perceived as being of general interest, rather than the soundness or quality of the work. (I also suspect the acceptance rate for the flagship PLoS and BMC journals that do stipulate a need for impact is considerably lower than across the whole series.)

The peer-review process and quality of reviewers is not always perfect but it is still better than none at all. I am very reluctant to ever publish in a journal that does not have pre-publication peer review and it is not just about the journal's "Impact Factor"; the quality of the papers is going to be lower without the opportunity for revisions in the face of peer-review and at least the threat of being in that 30-50% that aren't deemed scientifically sound enough to publish.

The biggest problem in the modern scientific literature is the pressure on scientists to publish too soon and too often, not the presence of journals that are willing to publish boring science. Until we get rewarded for the quality of our science and not the quantity or impact of our papers (first is not always best), I cannot see this changing, sadly.

Sunday 15 April 2012

Why the Titanic is a big deal to Southampton

There are several places that stake a claim to the RMS Titanic, which struck an iceberg one hundred years ago today (twenty minutes before midnight, ship's time) and sank just over two and a half hours later with over one thousand people still on board. She was built in Belfast, set sail from Southampton, and last stopped at Cobh in Cork (Ireland), then called Queenstown. All three have Titanic museums, including the new Sea City museum in Southampton, which opened recently.

I think that Southampton stands out, however, because the claim is not really about prestige or fame but genuine heart-felt loss. The reason for this is quite simple and horrifying: four fifths of the crew and about a third of the 1514 who died came from the city. This is approaching 0.5% of the population of the city at the time. This is perhaps shown best by the Daily Echo interactive map of the Titanic crew victims and survivors from Southampton.



Another way to put things in perspective: in that one night, Southampton lost almost as many people as during the entire Blitz in World War II. (Southampton was bombed 57 times, destroying or damaging over 45,000 homes!) Some of the individual stories are worse even than this. At Northam School (near the current Saints stadium, St Mary's), for example, almost half of the 250 pupils lost their father in the disaster.

Living in Southampton, therefore, it seemed wrong not to write a short post about it. (Although, not coming from Southampton, I do not really think that I have a true sense of how the impact of the Titanic's loss is still felt.) So next time you are watching Kate and Leo in 3D, spare a thought for Southampton.

Friday 13 April 2012

Eyes in their stars

I'm currently working my way through Evolution's Witness: How eyes evolved by Ivan Schwab. When (if!) I've finished, I'll review it. So far, I am not entirely sure how much I like the writing style but it is certainly packed with great images and interesting facts.

One such interesting fact is the discovery (for me) that certain kinds of brittle stars (echinoderms that are closely related to starfish) have arms that are covered in eyes. One such critter is Ophiocoma wendtii (Picture from the Weizmann Institute). These have all sorts of interesting abilities, such as being able to regenerate lost limbs (given their name, I guess this happens quite often) and some species reproduce by asexual fission of the central disc, producing two "daughters" with three long legs and, while they regrow, two short ones. I'm not sure if Ophiocoma wendtii is one that reproduces this way but it's the eyes that I'm interested in here!
Eye close-up
On the left is a scanning electron micrograph from Photonic structures in biology by Pete Vukusic & J. Roy Sambles (Nature 424, 852-855) of the top of one of the arms (top panel). Each bump is a separate "eye" featuring a calcite microlens that focuses light to photoreceptor cells within the arm tissue. These cover the surface of each arm, meaning that the brittle star can detect light across its whole body.

A close-up of an individual lens is also shown (bottom panel). (The scale bars are 10 μm, so these things are small!) The physics is a bit beyond me but these little eyes feature a "double-lens design" that "minimizes spherical aberration and birefringence that might otherwise degrade the optical function." In other words (I think), these critters get a fairly undistorted view of the direction that light is coming from.

Potentially, they can use the pattern of light and shade to help avoid predators. Perhaps more fun, though, they also use pigment-containing chromatophores to change colour at night in response to light levels. Not too shabby for an animal that we last shared an ancestor with over 500 million years ago!

Thursday 12 April 2012

Short-cited VI: the scientific skullduggery of the citation cartel

In yesterday's post, I briefly mentioned and linked to an interesting blog article on "The Emergence of a Citation Cartel", in which editors from one journal are exposed writing review articles in a different journal that almost exclusively cite their journal's papers from the last two years, thus boosting their Impact Factor.

The most striking case cited was a review article in Medical Science Monitor by Eve, Fillmore, Borlongan and Sanberg: "Stem cells have the potential to rejuvenate regenerative medicine research". This review cites 495 papers (at an average of around 9 words per citation!), of which 445 are from "Cell Transplantation - The Regenerative Medicine Journal", published between 2008 and 2009.

At the time, I wondered:
How they can write a coherent article by doing this without plagiarising heavily, I am not sure, as they must have to ignore masses of relevant literature...
The answer turns out to be depressingly simple: despite the rather intriguing title of the review, it is nothing more than literally "an analysis of the articles published in the journal Cell Transplantation - The Regenerative Medicine Journal between 2008 and 2009" under the pretext that this "reveals the topics and categories that are on the cutting edge of regenerative medicine research". If you want to read the whole article, you can download it for personal use here. It's a bit boring, though, to be honest. You can actually get the idea of what they did from the abstract:
The increasing number of publications featuring the use of stem cells in regenerative processes supports the idea that they are revolutionizing regenerative medicine research. In an analysis of the articles published in the journal Cell Transplantation - The Regenerative Medicine Journal between 2008 and 2009, which reveals the topics and categories that are on the cutting edge of regenerative medicine research, stem cells are becoming increasingly relevant as the "runner-up" category to "neuroscience" related articles. The high volume of stem cell research casts a bright light on the hope for stem cells and their role in regenerative medicine as a number of reports deal with research using stem cells entering, or seeking approval for, clinical trials. The "methods and new technologies" and "tissue engineering" sections were almost equally as popular, and in part, reflect attempts to maximize the potential of stem cells and other treatments for the repair of damaged tissue. Transplantation studies were again more popular than non-transplantation, and the contribution of stem cell-related transplants was greater than other types of transplants. The non-transplantation articles were predominantly related to new methods for the preparation, isolation and manipulation of materials for transplant by specific culture media, gene therapy, medicines, dietary supplements, and co-culturing with other cells and further elucidation of disease mechanisms. A sizeable proportion of the transplantation articles reported on how previously new methods may have aided the ability of the cells or tissue to exert beneficial effects following transplantation.
Table 1 of the paper, "Characterization of publications in Cell Transplantation from 2008 to 2009", tabulates all 453 papers from this period. (This actually raises the question as to why only 445 seem to be in the reference list but I am certainly not going to try and find out which eight are missing!) As one can imagine, this is hardly gripping reading and I would greatly question whether two years of one journal can give any unbiased insights into "the topics and categories that are on the cutting edge of regenerative medicine research"; it therefore comes as no surprise that (according to Google Scholar), this paper has only been cited once to date.

So, despite first appearances, this does not seem to be a case of scientific malpractice. It's hard to see it as anything other than scientific skullduggery (and drudgery), however. The pretext for publication is really pretty thin and, I would argue, if the intent was not at least in part to hide the activity then the choice of sources would feature in the title, not just the abstract. As highlighted in the Scholarly Kitchen article, three out of four authors are editorial board members of Cell Transplantation and, if nothing else, this surely constitutes a "conflict of interests". (None were listed.) Indeed, parts of the paper read like an advert for the journal. Perhaps there is a place for such an article, but surely it is as an editorial in the journal itself? The Scholarly Kitchen post covers more activities of a similar ilk - the authors seem to have made this review a regular activity.

I'm not generally one to point fingers and, to be fair to Eve et al., it's a competitive world; it could be argued that they are guilty of nothing other than "playing the game". (And playing it well!) That said, if Impact Factor and other metrics are going to be used to compare journals and select where to publish, surely this kind of thing must be stopped?

Wednesday 11 April 2012

Short-cited V: where to publish?

The second side of bibliometrics is the one I have not considered so much: how to use citation data to help you decide where to publish. This was more important, I suspect, when Impact Factor was the thing that everyone cared about. Happily this is no longer the case - at least, not in the UK - but it is still undeniably important and used in a general sense to establish whether you, as an author, are routinely publishing in top ranking journals.

Call me old-fashioned but I prefer to choose my journals on the basis of their research scope and target audience. Second, the format of the articles can be important - sometimes, a long article is easier to write and might have higher impact than the short, concise papers that some of the highest-impact journals favour. Nevertheless, when the scope and style of the journals appear similar, citation metrics can be an indicator of readership and influence; I know my own opinion of journals is heavily influenced by individual papers I have read from those journals, as well as being biased by my genetics background, and so an impartial view can be informative.

What I did not realise is the range of citation metrics and resources available for comparing journals. The main two sites (as listed on the bibliometrics course I attended) seem to be Scimago and JCR (Web of Knowledge), which use Scopus and ISI citation data, respectively. To compare journals, however, you need to pick a category and here is where I encountered my first problem: in Scimago, I could not find any bioinformatics or computational biology category. JCR does, at least, have a mathematical and computational biology category but the overlap with my kind of bioinformatics (i.e. molecular evolution) is not that great and so the problem still remains: if your field does not nicely align with one of their selected categories, it is nigh-on impossible to get a general overview or ranking of your options (or papers).

You can, of course, still do pairwise comparisons of specific journals. Then the question becomes, which metric? The traditional metric is "Impact Factor", which is the mean number of citations received in a given year by the papers that the journal published in the previous two years. As explored in my last post on bibliometrics, the suitability of this period is obviously very field-dependent and also genre-dependent. Methods articles, for example, are likely to have a greater lag before people start publishing papers using them than discovery articles. More worrying, though, is the way these metrics can be blatantly abused. Self-citation (such as editorials citing papers in that issue) has been used for years to boost Impact Factor a little but it seems that this at least is something that JCR etc. monitor and respond to. More sinister is something that I only discovered today: the citation cartel. I recommend reading the linked Scholarly Kitchen article but, in a nutshell, this is when a group of editors boost their journal's Impact Factor by writing a review in another journal that almost exclusively cites papers from their own journal that were published within the last two years! How they can write a coherent article by doing this without plagiarising heavily, I am not sure, as they must have to ignore masses of relevant literature, but the examples in the article are pretty horrific.
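
To make the arithmetic concrete, here is a small sketch of the usual two-year Impact Factor calculation (citations in one year to papers from the previous two years, divided by the number of citable items from those two years), and of how a single obliging review could shift it. All of the numbers are made up for illustration; they do not describe any real journal.

```python
def impact_factor(citations, citable_items):
    """Citations received in a year to papers from the previous two years,
    divided by the number of citable items published in those two years."""
    if citable_items <= 0:
        raise ValueError("citable_items must be positive")
    return citations / citable_items


# Hypothetical journal: all numbers are made up for illustration.
citable_items = 400      # papers published in the previous two years
citations = 1200         # citations to those papers this year
cartel_citations = 300   # extra citations from one obliging review elsewhere

baseline = impact_factor(citations, citable_items)
boosted = impact_factor(citations + cartel_citations, citable_items)
print(f"baseline: {baseline:.2f}, boosted: {boosted:.2f}")
# prints "baseline: 3.00, boosted: 3.75"
```

Because the denominator is only the journal's own recent output, a few hundred extra citations from one article elsewhere can move the figure a long way.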

If you are suspicious of the standard two-year Impact Factor due to these abuses, you can always opt for a five-year Impact Factor, which is probably a little less prone to abuse and, arguably, a better indicator of impact. (I know that I still make use of a lot of papers dating back to 2007!) If not sure which duration is more appropriate, you can check the cited half-life (median age of articles from that journal when cited) and citing half-life (median age of articles cited by the journal). There are also slightly more esoteric metrics out there, including the "Article Influence Score", which is Impact Factor weighted (how?!) by the journal prestige/ranking of citing articles, and the rather mysterious "Eigenfactor" score, which was described to us as the amount of time spent in a given journal when taking a random walk through article space. Hmmm. At least these ones are harder to fake, I guess.
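
The "random walk" description can be sketched in a few lines. This is a toy illustration of the general idea (a PageRank-style walk over a journal citation network), not the actual Eigenfactor algorithm, which has further refinements that are not modelled here. The three journals, their citation counts, the function name and the `damping` value are all invented for the example.

```python
def random_walk_shares(cites, damping=0.85, steps=100):
    """Estimate the share of time a random reader spends in each journal.

    cites[a][b] is the number of citations from journal a to journal b.
    At each step the reader follows a citation with probability `damping`,
    otherwise jumps to a journal chosen uniformly at random.
    """
    journals = sorted(cites)
    n = len(journals)
    share = {j: 1.0 / n for j in journals}
    for _ in range(steps):
        new = {j: (1.0 - damping) / n for j in journals}
        for a in journals:
            total = sum(cites[a].values())
            for b in journals:
                if total:
                    # Follow citations in proportion to how often a cites b.
                    weight = cites[a].get(b, 0) / total
                else:
                    # A journal that cites nothing: jump anywhere.
                    weight = 1.0 / n
                new[b] += damping * share[a] * weight
        share = new
    return share


# Invented citation counts between three hypothetical journals.
cites = {
    "A": {"B": 10, "C": 30},
    "B": {"A": 5, "C": 20},
    "C": {"A": 15, "B": 5},
}
shares = random_walk_shares(cites)
for journal, value in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(journal, round(value, 3))
```

The shares always sum to one, so a journal can only gain "time" at the expense of the others, which is part of why a score like this should be harder to inflate than a raw citation count.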

I think it's quite interesting to look at these things, partly because I am just a geek who likes data and graphs and things, but I think the general conclusion of short-cited IV still holds: don't use citation metrics alone to make decisions. Any decisions. Ever!

Tuesday 10 April 2012

Cozy is...

...two cats, one bed.

Chocolate of the gods (or kings, or god-kings)

I have posted about Montezuma's chocolate shop in Winchester before but it's so good that I could not help but write a second post, especially during this most chocolatey of seasons. (Plus, the original post was over two years ago.) On our recent trip to Winchester, we popped in to Montezuma's for a bit of Easter chocolate, having failed to be inspired by the usual Easter Eggs in the supermarket. Never afraid to challenge tradition, I opted for a bar of Dark Chocolate Orange & Geranium rather than an egg or bunny and, boy, is this stuff good! That is all!

Sunday 8 April 2012

Water meadows, water voles and watering holes

Today, we got some good fresh air on the Water Meadows walk in Winchester, returning along the Itchen Navigation. This is a lovely (and flat!) stroll of about 7-8km or so, the highlight of which was a water vole that swam along right next to us before having a quick preen and disappearing into his hole.

As an added bonus, we arrived back in Winchester at the right time and place to try out a new watering hole for lunch, in the form of the Bishop on the Bridge pub. This is a pub that I have managed to overlook on my many previous visits to Winchester but it does a good range of beers and has a good menu too.

I'm a big believer that you can tell a lot about a pub's fayre from its burgers and, after a nice stroll along the Itchen, I was in the mood for a good burger; I was not disappointed. I opted for one with cheddar and mushrooms and it really hit the spot. Although the shrooms were not outstanding and got a bit lost, the burger itself was very tasty; even if it was shaped a bit too perfectly to be hand-made, it was very meaty and juicy (and good value for money). The chips were also great and I think the tomato relish was up with the nicest I have had, all washed down with a nice glass of Leffe blond.

It may have taken over four years to discover both the walk and the pub but I don't think it will be another four years before we repeat the itinerary - highly recommended if you're in the Winchester area.

Saturday 7 April 2012

No stars for NOM bigots thanks to Marriage #EquaLatte campaign

The issue of same-sex marriage has been getting a lot of attention recently, on both sides of the Atlantic. Highlights have included a Scottish Cardinal who is so out of touch with reality that he believes legalising gay marriage would “shame the UK in the eyes of the world” (not the part whose opinion I would care about) and, in a mind-boggling statement of inverted intelligence, declared it to be a “grotesque subversion of a universally accepted human right”. (What?!) Of course, this was followed up by the greatest quote of all by Chief Executive of Stonewall, Ben Summerskill:
“If Roman Catholics don’t approve of same-sex marriage, they should make sure they don’t get married to someone of the same sex.”
It seems that things are a little more sinister across The Pond, and the (unfortunately not-ironically-named) "National Organization for Marriage" (for marriage, are you sure?) has called for a nationwide boycott of Starbucks for supporting Gay Marriage. Happily, we don't live in a world where conservatives get to enforce their narrow-minded bigotry on everyone else unchallenged, and Sum of Us have started a counter-campaign to thank Starbucks for standing up for equal human rights, in their #EquaLatte campaign. (Who can resist a good pun, eh?) So far, over 640,000 people have signed up versus 29,000 on NOM's "Dump Starbucks" pledge site. Job done, I think. Well done, Sum of Us! (You can sign up here.)

Sunday 1 April 2012

UK government would rather launch cruise missiles than scientific cruises

I found out this week that, thanks to various government cuts and the like, the National Oceanography Centre is going to have massive staff cuts. Fair enough, one might say, given the "current economic climate" but I'm more worried about the actual climate, to be honest.

It always annoys me when things that we excel at as a nation, like our oceanographic science, are hit so hard when the actual numbers are, in government terms, peanuts. How much are we talking? £1.5 million. Sounds like a lot until you consider that the annual military budget for the country is approx. £37 billion, while the annual science budget is about £4.6 billion... and cruise missiles weigh in at £500,000 a pop. So, for the cost of three cruise missiles*, we could have safeguarded the NOC and its invaluable work understanding global climate and working out how to fight/mitigate climate change. There can't be that many people who wouldn't consider that a fair swap. Sadly, it seems, they're in government.

*or six dinners with David Cameron.