2012-02-14

Reference based SAM/BAM compression

In some respects the SAM/BAM specification is quite loose, in that there is more than one way to represent a given piece of information. We can take advantage of this to reduce the size on disk of mapped reads which match the reference sequence, while still conforming to the spec. I've written a reference-based SAM/BAM compression script in Python - put this in your pipeline and smoke it!
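One example of this looseness: the SAM spec allows the SEQ field to hold "=" wherever the read base is identical to the reference base. Long runs of "=" should compress much better than the bases themselves. The sketch below is only a toy illustration of that idea on plain strings, not the script itself; the function name mask_matches and its arguments are made up for the example, and it does no real SAM/BAM parsing.

```python
import re

# One CIGAR operation: a length followed by an operator letter.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def mask_matches(seq, cigar, pos, reference):
    """Return seq with bases that match the reference replaced by '='.

    seq       -- read bases as stored in the SAM SEQ field
    cigar     -- CIGAR string, e.g. "3M1I2M"
    pos       -- 1-based leftmost mapping position (the SAM POS field)
    reference -- reference sequence the read was mapped to (plain string)
    """
    out = []
    q = 0          # offset into the read
    r = pos - 1    # 0-based offset into the reference
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=X":
            # Aligned bases: compare read and reference one by one.
            for i in range(n):
                base = seq[q + i]
                same = base.upper() == reference[r + i].upper()
                out.append("=" if same else base)
            q += n
            r += n
        elif op in "IS":
            # Insertions and soft clips have no reference base; keep as-is.
            out.append(seq[q:q + n])
            q += n
        elif op in "DN":
            # Deletions and skipped regions only advance the reference.
            r += n
        # H and P consume neither read nor reference.
    return "".join(out)


reference = "ACGTACGTAC"
print(mask_matches("GTTCG", "5M", 3, reference))       # ==T==
print(mask_matches("GTAACG", "3M1I2M", 3, reference))  # ===A==
```

The masked string is the same length as the original SEQ, so the quality string and CIGAR stay valid, and the original bases can be recovered given the reference.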

2012-01-23

Ion Torrent Suite on GitHub

Good news - the Ion Torrent Suite is now freely available open source software on GitHub under the GPL v2 licence, as promised late last year. There is now something more substantial behind talk of Ion Torrent "democratising sequencing", and a clear advantage over the closed source tools of rival companies. I commend them!

2012-01-16

Ion Torrent does the Samba

I'm a bit behind the curve here (see Lex's blog post from July 2011), but I was amused to find out that Ion Torrent call their current nucleotide flow order TACGTACGTCTGAGCATCGATCGATGTACAGC the "Samba". Apparently the idea is to stop reads going out of phase, which can happen with the traditional repeated flow TACG (still used by Roche 454), by giving the molecules which missed a base a chance to catch up. For Ion Torrent this apparently works better.
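To make "flow order" concrete, here is a minimal sketch of how a flow order turns a sequence into per-flow incorporation counts, assuming an idealised template that never misses a base. It does not model phasing at all, and the function name flowgram is just something I made up for the example.

```python
SAMBA = "TACGTACGTCTGAGCATCGATCGATGTACAGC"  # Ion Torrent flow order quoted above
CLASSIC = "TACG"                            # traditional repeated flow order


def flowgram(seq, flow_order):
    """Return the number of bases incorporated at each flow.

    The flow order is cycled for as long as needed. At each flow the
    template extends through every consecutive base matching the
    nucleotide being flowed (so homopolymers are read in one flow).
    """
    if not set(seq) <= set(flow_order):
        raise ValueError("sequence contains a base that is never flowed")
    counts = []
    pos = 0
    flow = 0
    while pos < len(seq):
        base = flow_order[flow % len(flow_order)]
        n = 0
        while pos < len(seq) and seq[pos] == base:
            n += 1
            pos += 1
        counts.append(n)
        flow += 1
    return counts


print(flowgram("TTAG", CLASSIC))  # [2, 1, 0, 1]
print(len(flowgram("ACGTTGCAAC", CLASSIC)), len(flowgram("ACGTTGCAAC", SAMBA)))
```

The second line prints how many flows each order needs for the same sequence, which is an easy way to compare flow orders on your own reads.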

I wonder if the next revision of the flow order will also get a dance-based name? I'd suggest the conga, since it is about synchronising lots of people.

2011-12-13

Validating ID via Gravatar

Most people will have seen a Gravatar user icon online (the name is short for the rather grand-sounding "Globally Recognized Avatar"). For example GitHub.com and StackOverflow use them, and many blog platforms use them for user comments (sadly Blogger doesn't, yet). To get a user's icon, you construct a URL with the MD5 checksum of their email address - and if the user isn't registered you get a default image or a unique generated abstract icon. This means you can cross-reference a list of email addresses with a list of Gravatar icon URLs (i.e. a list of email MD5 checksums).
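Here is a minimal sketch of that cross-referencing in Python. It assumes the usual Gravatar convention of trimming and lower-casing the address before taking the MD5, and the "d" (default image) and "s" (size) URL parameters; the function names and example addresses are made up for illustration.

```python
import hashlib


def gravatar_hash(email):
    """MD5 hex digest of the trimmed, lower-cased email address."""
    cleaned = email.strip().lower()
    return hashlib.md5(cleaned.encode("utf-8")).hexdigest()


def gravatar_url(email, default="identicon", size=80):
    """Build an icon URL; 'default' picks the fallback for unregistered users."""
    return "https://www.gravatar.com/avatar/%s?d=%s&s=%d" % (
        gravatar_hash(email), default, size)


def match_emails(emails, hashes):
    """Map each email whose checksum appears in hashes to that checksum."""
    wanted = set(h.lower() for h in hashes)
    matches = {}
    for email in emails:
        digest = gravatar_hash(email)
        if digest in wanted:
            matches[email] = digest
    return matches


# Hypothetical data: checksums scraped from icon URLs, plus candidate emails.
seen = [gravatar_hash("alice@example.com")]
candidates = [" Alice@Example.com ", "bob@example.com"]
print(match_emails(candidates, seen))
print(gravatar_url("alice@example.com"))
```

Note this only works in one direction: you can confirm a guessed email against a checksum, but you can't recover an email from the checksum alone other than by guessing.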

2011-12-12

Is IonTorrent open or not?

It seems IonTorrent are trying to present themselves as the open, democratising platform for high throughput sequencing, with their Ion Community, sample datasets and (in theory) open source software. That sounds great and much more open than Roche 454 or Illumina, but I don't think they're doing a very good job of it - apparently they're even managing to break the GPL (see below).

Update 14 Dec: See comments, Ion Torrent Suite v2 should be coming to GitHub in January under the GPL v2 - that counts as open in my book :)

Update 23 Jan: As planned, the Ion Torrent Suite is on GitHub under the GPL v2. Nice!