Mutect2 error - please advise

Hi, please can you help? When I try to run a WGS file through Mutect2, it gives the error below. It seems to suggest that a sample name isn't given, but we have labelled the file as tumorA7.bam.

Job State error
Command Line ln -s /corral4/main/objects/f/0/e/dataset_f0eacd16-136f-45df-8a30-b6ca0a2c9dd1.dat tumor.bam && ln -s /corral4/main/objects/_metadata_files/4/9/9/metadata_499047be-4732-4827-8553-37ef0b37afff.dat tumor.bam.bai && gatk GetSampleName --input="tumor.bam" --output="samplename.txt" && sample=`cat samplename.txt | sed 's/"//g'` && gatk Mutect2 --QUIET --tumor-sample "$sample" --input tumor.bam --output output.vcf.gz
Tool Standard Output empty
Tool Standard Error
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/corral4/main/jobs/061/708/61708404/tmp -Xmx3410m -Xms256m
19:53:39.306 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/usr/local/share/gatk4-4.1.7.0-0/gatk-package-4.1.7.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Oct 26, 2024 7:53:39 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
19:53:39.849 INFO GetSampleName - ------------------------------------------------------------
19:53:39.850 INFO GetSampleName - The Genome Analysis Toolkit (GATK) v4.1.7.0
19:53:39.850 INFO GetSampleName - For support and documentation go to https://software.broadinstitute.org/gatk/
19:53:39.853 INFO GetSampleName - Initializing engine
WARNING: BAM index file /corral4/main/jobs/061/708/61708404/working/tumor.bam.bai is older than BAM /corral4/main/jobs/061/708/61708404/working/tumor.bam
19:53:40.683 INFO GetSampleName - Done initializing engine
19:53:40.687 INFO GetSampleName - Shutting down engine
[October 26, 2024 7:53:40 PM GMT] org.broadinstitute.hellbender.tools.GetSampleName done. Elapsed time: 0.03 minutes.
Runtime.totalMemory()=259588096
***********************************************************************
A USER ERROR has occurred: Bad input: The given bam input has no sample names.
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
Using GATK jar /usr/local/share/gatk4-4.1.7.0-0/gatk-package-4.1.7.0-local.jar
Running: java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /usr/local/share/gatk4-4.1.7.0-0/gatk-package-4.1.7.0-local.jar GetSampleName --input=tumor.bam --output=samplename.txt
Tool Exit Code 2
Job Messages
  • desc: Fatal error: Exit code 2 ()
  • error_level: 3
  • exit_code: 2
  • type: exit_code

Job API ID bbd44e69cb8906b538e6b2cd463a7e31

Maybe this gives more info: https://gatk.broadinstitute.org/hc/en-us/articles/30332033330971-GetSampleName. You could run the Samtools view tool in Galaxy and output only the header of your BAM file in SAM format. Then you can look at the @RG lines in the header and check how the sample name looks (it is the value of the SM field).
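
A minimal sketch of the header check described above, assuming the header has been saved to a plain-text file. The function name `list_sample_names` and the file name `header.sam` are hypothetical. In a SAM header the sample name is the `SM` field of an `@RG` line, so the function prints every `SM` value it finds and prints nothing when none is present.

```shell
#!/usr/bin/env bash
# Print the SM (sample name) value of every @RG line in a SAM header file.
# Prints nothing if the header has no @RG line or no SM field.
list_sample_names() {
    awk -F'\t' '$1 == "@RG" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SM:/) print substr($i, 4)
    }' "$1"
}

# Usage, assuming header.sam (hypothetical name) holds the header, e.g. from
# the Samtools view tool with the header-only option in Galaxy, or from
# `samtools view -H tumor.bam > header.sam` on the command line:
#   list_sample_names header.sam
```

Empty output here would be consistent with the "The given bam input has no sample names" error from GetSampleName.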

This helps, thank you. I can see there is no sample name when I do this. Is there a way to use Samtools on Galaxy to add the sample name to the header of the BAM, please?

Hi @Imran_Noorani, very glad that the advice from @gbbio helped!

You can adjust BAM headers with a few tools. Search the tool panel with the keyword “header” to find these. When I do that at UseGalaxy.org, the first few tools appear to be what you are looking for.
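
For reference, here is a rough command-line sketch of what such a header edit does, assuming the header already contains an `@RG` line that simply lacks an `SM` field. The function name `add_sample_name` and the file names `header.sam`, `new_header.sam`, and `tumor.fixed.bam` are hypothetical; `tumorA7` is taken from the file name mentioned earlier in the thread.

```shell
#!/usr/bin/env bash
# Append SM:<name> to every @RG header line that has no SM field.
# All other header lines, and @RG lines that already have SM, pass through unchanged.
add_sample_name() {
    awk -F'\t' -v OFS='\t' -v sm="$2" '
        $1 == "@RG" {
            has_sm = 0
            for (i = 2; i <= NF; i++) if ($i ~ /^SM:/) has_sm = 1
            if (!has_sm) $0 = $0 OFS "SM:" sm
        }
        { print }
    ' "$1"
}

# Usage (hypothetical file names):
#   add_sample_name header.sam tumorA7 > new_header.sam
#   samtools reheader new_header.sam tumor.bam > tumor.fixed.bam
```

If the header has no `@RG` line at all, the reads probably carry no read group either, and editing the header alone is unlikely to be enough. In that case a read-group tool such as Picard AddOrReplaceReadGroups or `samtools addreplacerg`, where available on your server, is the more usual choice.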

Let us know if that works or not!

Thanks. After this, when trying to run Mutect2, I get this error: ‘A USER ERROR has occurred: Contig chrEBV not present in the sequence dictionary’

I use the hg38 human reference (appropriate for my BAM). I tried excluding this interval by adding ‘chrEBV’ as text to exclude, but this did not help. Please can you advise on the best way to overcome this error?

Do you know where chrEBV came from? It is not a chromosome identifier in the server's hg38 reference genome. Your BAM needs to contain data for only the chromosomes in the genome index.

If you used a different genome index to generate the BAM, then use that same genome version here too, instead of the server genome version. This has scientific implications: any data associated with chrEBV cannot be processed by this tool until you provide the matching fasta index, because that is how the tool compares the bases (aka the "nucleotides") in the BAM to the reference bases. This is really important to get right in any analysis, but especially for anything variant related.
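
One way to see every mismatch at once, sketched here under the assumption that you have the BAM header as text and a fasta index (`.fai`, as produced by `samtools faidx`) for the reference you plan to use. The function name `contigs_missing_from_reference` and the file names are hypothetical. It lists contigs declared in the BAM header (`@SQ` lines, `SN` field) that are absent from the reference index.

```shell
#!/usr/bin/env bash
# Print contig names found in the BAM header (@SQ SN: fields) that are
# not present in the reference fasta index (.fai, first column).
#   $1 = SAM header file, $2 = reference .fai file
contigs_missing_from_reference() {
    awk -F'\t' '
        NR == FNR { in_ref[$1] = 1; next }   # first file: the .fai
        $1 == "@SQ" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SN:/ && !(substr($i, 4) in in_ref))
                    print substr($i, 4)
        }
    ' "$2" "$1"
}

# Usage (hypothetical file names):
#   contigs_missing_from_reference header.sam genome.fa.fai
```

If chrEBV, or many other names, show up in the output, that would suggest the BAM was aligned against a different hg38 build than the one selected for Mutect2. chrEBV is typically the Epstein-Barr virus decoy sequence included in some hg38 "analysis set" builds, though that is only a guess about this particular BAM.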

We have a guide that explains what you can check for and adjust. Maybe start there? → Reference genomes at public Galaxy servers: GRCh38/hg38 example