For this example, first make an empty query file:
$ touch empty_file.fasta
Here's a simple example command showing how older versions of BLAST+ handled this corner case nicely, finishing with a zero return code (meaning success - shown here using echo and the shell's special $? parameter, which holds the return code of the last command). I tried this with BLAST+ 2.2.18 through to 2.2.28 inclusive:
$ blastp -query empty_file.fasta -db nr -outfmt 6; echo "[Return code $?]"
[Return code 0]
But not any more: both BLAST+ 2.2.29 and the current release 2.2.30 have broken this:
$ blastp -query empty_file.fasta -db nr -outfmt 6; echo "[Return code $?]"
Command line argument error: Query is Empty!
[Return code 1]
Following Unix conventions for an error, the message here is printed to stderr, and a non-zero return code (one) is used. I just don't agree that this is an error.
I accept that an empty input query file is unusual, but it does happen legitimately - particularly in automated pipelines. For instance, I have written Galaxy workflows which do things like start from a protein set, filter based on the presence of a signal peptide, then run BLAST against some known false-positives, which are then removed. This pipeline might very reasonably return zero sequences - and I want BLAST to accept this and carry on.
This bug was actually reported to me by Jim Johnson (see his issue report here), who suggested we add a workaround in the Galaxy BLAST+ wrappers. The group at the University of Minnesota Supercomputing Institute has a pipeline which chunks large sequence sets by length before running BLAST. Occasionally one of the size bins could be empty, at which point BLAST+ broke their workflow.
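To make the idea of a wrapper-level workaround concrete, here is a minimal sketch in POSIX shell. It is only an illustration, not the code used in the Galaxy BLAST+ wrappers: the function name run_blast_unless_empty and its argument order are hypothetical. It assumes that an "empty query" means a zero-byte file, and that an empty output file is the right result for it (which suits tabular output such as -outfmt 6).

```shell
# Hypothetical helper, not part of BLAST+ or the Galaxy wrappers.
# Usage: run_blast_unless_empty QUERY OUTPUT COMMAND [ARGS...]
run_blast_unless_empty() {
    query=$1
    output=$2
    shift 2

    # A missing query file is a genuine error, so fail loudly.
    if [ ! -e "$query" ]; then
        echo "Error: query file '$query' not found" >&2
        return 1
    fi

    # A zero-byte query is treated as "nothing to do": warn on stderr,
    # create an empty output file, and report success.
    if [ ! -s "$query" ]; then
        echo "Warning: query file '$query' is empty, skipping BLAST" >&2
        : > "$output"
        return 0
    fi

    # Otherwise run the supplied command and pass its return code through.
    "$@"
}
```

It might be called along these lines, with the BLAST+ command given after the query and output filenames:

$ run_blast_unless_empty empty_file.fasta hits.tsv blastp -query empty_file.fasta -db nr -outfmt 6 -out hits.tsv; echo "[Return code $?]"

Note that a file containing only whitespace or blank lines is not zero bytes, so this check would still pass it on to BLAST+.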
My suggestion is for the NCBI to either remove this check, or simply downgrade it to a warning on stderr - with the critical requirement that it should revert to a zero return code. e.g.
$ blastp -query empty_file.fasta -db nr -outfmt 6; echo "[Return code $?]"
Warning: Command line argument error?: Query is Empty!
[Return code 0]
This gives some useful feedback for the user (especially if running BLAST+ by hand at the command line), without breaking legitimate use cases.
Since NCBI BLAST+ doesn't have a public bug tracker, I am blogging this here, and have reported the problem by email as well.
Update 19 October 2018
Belatedly noting that this was fixed in the BLAST+ 2.2.31 release, e.g.
$ blastp -query empty_file.fasta -db nr -outfmt 6; echo "[Return code $?]"
Warning: [blastp] Query is Empty!
[Return code 0]
Thank you!