Methods section driven reproducibility

A cornerstone of the scientific method has always been the ability to draw the same conclusions from different experiments. I would very much like to say that the scientific community agrees on what to call this process, but unfortunately that doesn’t seem to be the case. The terms “reproducibility”, “replicability” and “robustness” are often used interchangeably, and different people might rank them differently depending on how they interpret them. Luckily, a recent paper cleverly proposed sticking to “reproducibility” to describe the process as a whole and naming its different flavors by adding a prefix. In short, Goodman et al. identify the following kinds of reproducibility in science (the short summaries are mine):

  • Methods reproducibility: giving sufficient details about the experimental procedure and the processing of the data so that the “exact same” results can be obtained
  • Results reproducibility: carrying out an independent study with a “very similar” procedure and obtaining results “close enough” to those of the original study
  • Inferential reproducibility: drawing “qualitatively” similar conclusions from independent studies or a reanalysis of the same data

In the specific area of computational biology, the requirements to meet these three objectives can be more precisely defined:

  • Methods reproducibility: providing “machine code” that gives exactly the same output given the same input
  • Results reproducibility: providing all the relevant details about the algorithms used so that they can be re-run/reimplemented and give quantitatively similar results on the same or different data
  • Inferential reproducibility: providing an interpretation of the results of an experiment so that it can be qualitatively compared with another study
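
As a rough illustration of the first two flavors, here is a minimal Python sketch; it is my own toy example, not something taken from Goodman et al. The `analysis` function is a hypothetical stand-in for a real pipeline (a seeded bootstrap estimate of a mean), and the 5% tolerance used for “quantitatively similar” is an arbitrary assumption. Methods reproducibility is checked as byte-identical output for identical input, while results reproducibility is checked as numerical closeness between independent runs.

```python
import hashlib
import math
import random


def analysis(values, seed=0):
    """Hypothetical analysis: bootstrap estimate of the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so that reruns are deterministic
    n = len(values)
    means = []
    for _ in range(200):
        # Resample with replacement and record the mean of each resample.
        sample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    return sum(means) / len(means)


def digest(result):
    """SHA-256 of the textual form of a result, used for exact comparison."""
    return hashlib.sha256(repr(result).encode("utf-8")).hexdigest()


def methods_reproducible(values, seed=0):
    """Same code, same input, same seed: the output must be identical."""
    return digest(analysis(values, seed)) == digest(analysis(values, seed))


def results_reproducible(original, rerun, rel_tol=0.05):
    """Independent rerun: the output only has to be 'close enough'.

    The 5% relative tolerance is an assumption made for this sketch.
    """
    return math.isclose(original, rerun, rel_tol=rel_tol)


if __name__ == "__main__":
    data = [float(x) for x in range(1, 51)]
    print(methods_reproducible(data))
    # A different seed stands in for an independent reimplementation.
    print(results_reproducible(analysis(data, seed=0), analysis(data, seed=1)))
```

Inferential reproducibility has no equivalent check: comparing the interpretation of two studies is a matter of judgment, not of a test that can be automated.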

It’s easy to see how the last flavor of reproducibility is the most valuable, as reaching the same conclusions using different data or even completely different experimental strategies can sometimes provide further support by itself. Needless to say, it is also the one that requires the most work and resources to achieve.

Methods reproducibility has become pretty fashionable in computational biology; many journals explicitly ask authors to deposit all computer code as supplementary material. The extreme case is providing either VMs or so-called containers to ensure that the specific computing environment does not alter the final result, leading to perfect methods reproducibility. This is an important thing to aspire to, especially to guard against scientific fraud (or bona fide errors), and many people have proposed technologies to make this relatively easy to achieve. Despite all this, I believe that in many cases the emphasis should be on achieving better results reproducibility over perfect methods reproducibility. This usually comes in the form of none other than the good old methods section of a paper[1]. If the algorithms used in an experiment are explained in sufficient detail, it will be (relatively) trivial to reimplement them and produce very similar results on different data, thus reproducing (in the “results” sense) the original paper.

More interestingly, writing an implementation of an algorithm from scratch is a great exercise and a great way to properly understand how a method works, not to mention the possibility of improving it. In fact, I recently had to reimplement some algorithms that were very well described in other papers’ methods sections (part of this, and the whole of this with some help)[2]. In the process I came to understand the algorithms better and I ended up making improvements and extensions. It has also convinced me that trying to reimplement an algorithm from a paper could be an interesting part of a computational biology class. All of this is simply not possible through methods reproducibility alone, unless a thorough inspection of the source code is made, which in many cases can be a true nightmare.

Even the most advanced container technology or programming language will eventually fade, but a well-written couple of paragraphs will continue on for a long time.

Footnotes

[1] or the documentation of your software package, or a chapter of a book or an interactive blog post

[2] our former colleague Omar was particularly good at reimplementing existing methods to make them more user-friendly and extensible, like SIFT or motif-x

Mexico (2017)

Alternative cover for “Teoria della classe disagiata”

An alternative cover for “Teoria della classe disagiata” by Raffaele Alberto Ventura (which I guess can be translated as “Theory of the uneasiness of the middle class”, although the original title is a pun). A really thought-provoking essay describing the decline of the young middle class, especially in southern Europe. The book goes a long way towards explaining why our generation is facing worse socioeconomic prospects than our parents did, even though it does so via a really broad view of capitalism and the economy, which is hard to prove or refute. Despite its limits, the book really succeeds in describing the “gambling” mechanism by which the middle class acquires “positional goods” in order to have a better chance of climbing the crowded social ladder, even though the economic crisis has effectively transformed this process into something more similar to a lottery. I would love to see it translated into other languages so that this very original point of view finally gets discussed in the context of the future of the European Union, hopefully inspiring new forms of class solidarity.

The original cover

New York (2017)

Some snaps from a very quick (less than 48 hours!) visit to sunny and cold New York

The Smithsonian National Museum of African American History & Culture

Earlier this month we had the chance to pass through Washington D.C. for a couple of days, and we managed to visit the latest addition to the Smithsonian museums: the National Museum of African American History & Culture. Opened in 2016, it has a very central position on the National Mall, right in front of the Washington Monument and a stone’s throw from the White House. This is already enough to signal the museum’s ambition in terms of symbolism and impact. The actual inside of the museum reinforced this impression.

It turns out that a good half of the museum is below ground, where the visit is supposed to begin. Everyone is forced to take an elevator, which acts as a time machine; when the doors open, it is the end of the 15th century, and a suffocating dungeon takes the visitors through the horrors of the transatlantic slave trade. The exit of the dungeon offers some relief, but you find yourself still at the bottom of a very large open pit with dark walls. The Declaration of Independence is presented in big block letters, but its ambiguity (“all men are created equal”) towards the very large part of the US population still considered nothing more than slaves is clearly stated. Jefferson himself is a statue whose expression is masked by shadows.

The visitors can then slowly ascend through a series of ramps, each one presenting a chapter in the struggle towards emancipation and equality. The symbolism of having to walk all the way up from the bottom of a pit is perhaps obvious, but it really gives an uplifting feeling. The last turn of the ramps, just as you exit the dark pit and leave the “time machine”, holds one last uplifting surprise, and a strong message. Writing on the wall, until then invisible, states: “I, too, am America”.

The rest of the museum is above ground, separated from the pit and its horrors by two floors. The closer to the top, the more the contribution of African Americans to the military, society, sports and culture is celebrated.

This journey to hell and back was emotionally very powerful, even for someone like me who has little to no relationship with colonialism and racial tensions. The little colonialism my country ever imposed was tied to fascism, and as a country we considered it atoned for by the civil war that contributed to freeing Italy from the Axis. Recent reports of slave practices in Libya, whose government is strongly supported by Italy in the hope of stopping migration flows from reaching Europe, are however casting a shadow on my country as a whole, and on the government and opposition forces alike.

Italy (2017)

Valencia (2017)

Some snaps taken while attending the FEMS 2017 conference.

Colombia (2017)

I recently switched to a new camera (the Fujifilm X100T); I’m still adapting to it, but so far I really like its portability and quality. I’m also a big fan of fixed-lens cameras, as they force you to (try to) follow a consistent style.