paired-end histogram failure

I have a project due for class and keep getting an error at the Paired-end histogram step. I asked the professor, but he does not know what is wrong or how to help. Any ideas about what the error means and how I can fix it?
These are the instructions:

  • Map the cutadapt output to the reference genome using Bowtie2
    • The genome and gene annotations are in the “Sp5.0 Mapping Test” History that I already shared with you, and also posted in the shared DropBox folder IGV_Sp5_Genome_links.txt
    • If you mapped to the whole genome, great! Otherwise, map the reads to Sp5.0_Chromosome_NW_022145615.1.fa as if it were the entire genome. The chromosome and its corresponding index are also located in the DropBox folder.
  • Filter duplicate reads with MarkDuplicates
  • Visualize and analyze the distribution of read lengths with the tool Paired-end histogram
  • Peak call and visualization of normalized data

This is the error
An error occurred with this dataset:
format tabular database papHam1

Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/corral4/main/jobs/054/233/54233597/tmp -Xmx2488m -Xms256m
Exception in thread "main" htsjdk.samtools.SAMFormatException: SAM validation error: ERROR: Record 941707, Read name LH00401:11:GW231120000:5:1105:14959:

Hi @michele_palermo

The "database papHam1" part of the log is reporting the target reference genome.

papHam1 is the Baboon genome, which is the default for runs that use a native index on the server.

It sounds like you want to map against a custom reference genome instead. To do that, adjust the settings for the target genome, then select your FASTA from the history.

Or, if you actually want to use a native index, you’ll need to choose the correct option on the form.

Hope that helps!

I just looked, but Paired-end histogram does not give you the option to pick a reference genome, and this is for urchin. It only allows you to set an upper and lower bp limit.

Hi @michele_palermo

I was probably unclear … but what you want to do is back up to the prior mapping step. It looks like you mapped against the wrong genome.

To review what was done: click on the “i” icon for the BAM output of Bowtie2 and you’ll likely spot the problem in the summary table.
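If you would rather check outside the Galaxy interface, the same review can be sketched in a few lines of Python. This is only an illustration, not something Galaxy runs for you: it assumes you have exported the BAM header as text (for example with `samtools view -H`), and the function names, the example header, and its `LN:1000` length are all made up for the example. The `@SQ` lines of a SAM header list the reference sequences the reads were mapped against, so the urchin chromosome name should appear there if the mapping used the right genome.

```python
# Sketch: list the reference sequences recorded in a SAM/BAM header.
# Assumes the header was exported as text, e.g. with `samtools view -H`.
# Function names and the example header below are hypothetical.


def reference_names(header_text):
    """Return the SN (sequence name) value of every @SQ header line."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue  # only @SQ lines describe reference sequences
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def mapped_to_expected(header_text, expected):
    """True if at least one expected sequence name appears in the header."""
    return any(name in expected for name in reference_names(header_text))


# Made-up header for illustration only; a real one comes from your BAM.
example_header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:NW_022145615.1\tLN:1000\n"
    "@PG\tID:bowtie2\tPN:bowtie2\n"
)

print(reference_names(example_header))
print(mapped_to_expected(example_header, {"NW_022145615.1"}))
```

If the names printed for your own header do not include the urchin chromosome, the reads were mapped against a different reference and the Bowtie2 step needs to be rerun.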