VCFfilter: can't eliminate negative values

Hi, I am filtering a VCF file. It seems I cannot eliminate negative values for MQBZ and RPBZ. For example, when I type RPBZ > -5, the analysis crashes. When I type RPBZ < 5, to filter on the other extreme, it works. Any suggestions? Thanks

Welcome, @nicogouin

I think this prior Q&A will help you, too → Negative values with VCF filter tool - #6 by jennaj

In short, remove any spaces between the - and 5. So: TERM > -5

Let us know whether that works :slight_smile:

Hi Jennifer, thanks. That is what I did, but it does not work. I got the following messages:

Fatal error: Exit code 139 ()
/jetstream2/scratch/main/jobs/57719518/tool_script.sh: line 22: 152953 Segmentation fault      (core dumped) vcffilter -f 'RPBZ > -5' input1.vcf.gz > '/jetstream2/scratch/main/jobs/57719518/outputs/dataset_801e5ecc-f562-4084-ab55-eac5f851708f.dat'

Hi @nicogouin

Is there anything special about your VCF input file?

Things to check

  1. Datatype assigned to the dataset is vcf
  2. The file is actually uncompressed (your error leads me to think this might be the problem…meaning, the file is still compressed. Galaxy usually works with uncompressed VCF data, and this tool definitely does.)
     • You can check the first two by “re-detecting” the datatype under the pencil icon → Edit attributes “Datatypes” tab. If the guess is not vcf there is a format problem.
  3. Then, does the RPBZ field exist in the file?
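
For anyone who wants to run the compression and field checks outside Galaxy, here is a minimal Python sketch. The function names are hypothetical, and it assumes the field is declared on a `##INFO=<ID=...` header line, which may not hold for every file. It only inspects the file; it does not replace Galaxy's own datatype detection.

```python
import gzip

def is_gzip_compressed(path):
    """Return True if the file starts with the gzip magic bytes 1f 8b."""
    with open(path, "rb") as fh:
        return fh.read(2) == b"\x1f\x8b"

def header_declares_info_field(path, field):
    """Return True if the VCF meta-information lines declare INFO field `field`.

    Assumption: the declaration has the usual form ##INFO=<ID=FIELD,...>.
    """
    opener = gzip.open if is_gzip_compressed(path) else open
    needle = f"##INFO=<ID={field},"
    with opener(path, "rt") as fh:
        for line in fh:
            if not line.startswith("##"):
                break  # meta-information lines are over
            if line.startswith(needle):
                return True
    return False
```

Calling `is_gzip_compressed("input1.vcf")` and `header_declares_info_field("input1.vcf", "RPBZ")` on a local copy (the file name here is only an example) answers checks 2 and 3.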

Run those checks and if you can’t solve the problem, I’d like to review for some potential server side issue. How to share your work:

I’ll also try to find my older test and run that again to see if anything pops out. :mechanic:

Thanks for the followup!

Hi, I checked everything you suggested. The file is an uncompressed VCF file, actually generated on the Galaxy platform with vcftools. The RPBZ field exists, since it worked when I used the command RPBZ < 5. I also have the same problem when trying to filter the MQBZ field (MQBZ > -4). It seems the problem is the “-” sign. Best, Nicolas

Hi @nicogouin

Could you share an example?

  • If you are working at UseGalaxy.org, you can send in a bug report and include a link to this topic in the comments. Then let me know back here and I’ll look for it.
  • If you are working somewhere else, please post back here following the guide in the banner at this forum, also linked here: How to get faster help with your question

I’m curious what the error message is, and I need the VCF header and a few data lines for an example file to replicate the query/error. We can report any persistent issues (corner case or not) to the relevant people. This tool has had issues in the past (seems to be a fragile underlying tool), so an exact use case, including where it was run (not just the server, but the cluster) and the tool version, will probably matter. A shared history or bug report will include all of those low-level details.

Thanks!

Hi jennaj,

Here is a copy of the report I have just sent:

"Hi, I am having trouble using this function when the threshold value is negative. See link to the failed analysis: https://usegalaxy.org/api/datasets/f9cad7b01a47213553c8ed3f2a0ce121/display?to_ext=vcf

Here is the link to the help dialogue I have been having
Galaxy Community Help <incoming+a77402963681265d0f71fd170717d810@galaxy.discoursemail.com>

Hope you can help find out why this is not working. Best, Nicolas"

Hope I did it well.

Best,

Nicolas

Hi @nicogouin I found it. More feedback soon, likely today! Thanks

Hi @nicogouin

I found the problem, and learned something!

For context, please see this two-line discussion from the vcffilter tool author: vcffilter core dumped on negative values · Issue #249 · vcflib/vcflib · GitHub

That means the query you want would be one of these:

RPBZ < 0 - 5
RPBZ > 0 - 5

Please give that a try. My test run for this on your full dataset is still running but would have failed by now if the query itself was problematic.
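
If you have many thresholds to convert, the rewrite can be scripted. This is a minimal sketch, not part of vcffilter or Galaxy: the function name is hypothetical, and it assumes thresholds are plain decimal literals (no scientific notation) that directly follow one of the operators <, >, = or !. The parentheses are added only to make the grouping explicit.

```python
import re

# A comparison operator followed by a negative decimal literal, e.g. "> -5" or ">-0.5".
_NEGATIVE_THRESHOLD = re.compile(r"([<>=!])\s*-\s*(\d+(?:\.\d+)?)")

def rewrite_negative_thresholds(expression):
    """Replace each negative literal N with the subtraction ( 0 - N )."""
    return _NEGATIVE_THRESHOLD.sub(r"\1 ( 0 - \2 )", expression)

if __name__ == "__main__":
    print(rewrite_negative_thresholds("RPBZ > -5"))
    print(rewrite_negative_thresholds("MQBZ > -4"))
```

Expressions without a negative literal, such as RPBZ < 5, pass through unchanged.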

Thank you, I will try it.

Hi, it worked. I successfully used the commands MQBZ > ( 0 - 4 ) and RPBZ > ( 0 - 5 )
Can I ask if you know how to remove INDELs and keep only di-allelic SNPs using the VCF-BCF tools available on the platform? I have tried several options already, and they all failed. And I promise, I won't bother you with more questions! Thanks for your help, Nicolas

Hi @nicogouin

You are funny! Questions are what this forum is about.

For this, maybe annotate your VCF with SnpEff eff, then filter on those new annotations. You could also load your data into Gemini and run queries there. See the bottom of each tool's form for help/tutorials. If you have any questions, you can come back here.
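
As a rough illustration of what "no INDELs, di-allelic SNPs only" means at the record level, here is a plain-Python sketch. It is a different approach from the SnpEff and Gemini routes above, not a Galaxy tool, and the function names are hypothetical. It assumes tab-separated VCF records with REF in column 4 and ALT in column 5, and treats a record as a di-allelic SNP only when both are a single A, C, G or T.

```python
_BASES = {"A", "C", "G", "T"}

def is_biallelic_snp(record_line):
    """True if REF and ALT are each exactly one base (one ALT allele, no indel)."""
    fields = record_line.rstrip("\n").split("\t")
    if len(fields) < 5:
        return False
    ref, alt = fields[3].upper(), fields[4].upper()
    # A multi-allelic ALT such as "A,T" or an indel such as "AT" is not in _BASES.
    return ref in _BASES and alt in _BASES

def keep_biallelic_snps(lines):
    """Yield header lines unchanged and only those records that pass the check."""
    for line in lines:
        if line.startswith("#") or is_biallelic_snp(line):
            yield line
```

Passing the lines of an uncompressed VCF through `keep_biallelic_snps` keeps the header and drops INDELs and multi-allelic sites.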

Thanks a lot!