Hello!
I am trying to extract DNA sequences from a summit file that contains the chromosome, summit + 50, summit - 50, and region names, but the output is 0 bytes and I get this warning message: (5345 warnings, 1st is: Chromosome by name ‘chr1’ was not found for build ‘hg19’. Skipped 5345 invalid lines, 1st is #1, “chr1 868552 868652 name_1”)
I want the sequence for motif discovery!
Do you have any suggestions?
Thank you!
Hi @shamjdeed
Which Galaxy server are you working on? Please share the URL if it is public. If it is your own Galaxy server, please describe it.
I am using the Galaxy Europe server (https://usegalaxy.eu/).
Thanks for the info.
The EU server does have the hg19 database indexed for this tool, but it seems to be problematic. I was able to reproduce your error, and it looks similar to another error reported a few weeks ago (a problem with the bedtools indexes). We’ll need an administrator for that server to find out exactly what is going wrong and when it is expected to be fixed: ping @bjoern.gruening
Just as an aside, your coordinates appear to have a 1-based start. MACS will produce data that is labeled as bed format, but the data does not quite follow that specification. The Extract tool expects a 0-based start coordinate (true bed format). To transform your data, “1” needs to be subtracted from the second column of data.
The tools to use are:
- Compute an expression on every row (Galaxy Version 1.2.0). Enter the expression c2-1 and set “Round the result” to “Yes” to add a whole number column to your dataset with the adjusted start coordinate.
- Cut columns from a table (Galaxy Version 1.0.2). Enter the expression c1,c5,c3,c4 to replace the original start coordinate with the adjusted start coordinate.
- The last step is to reassign the datatype. Click on the dataset’s pencil icon to reach the Edit Attributes forms, then on the tab for “Datatypes” use the “Detect datatype” button. Your data would be expected to be set to the datatype bed by that process when the formatting is correct.
Note: Be careful about the full name of the Cut tool used. There are two available, and only the one listed above will rearrange columns.
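If you ever need to make the same adjustment outside Galaxy, it can be sketched in a few lines of Python. This is only an illustration, assuming a tab-separated file with the start coordinate in the second column; the function name to_zero_based_start is made up for this example and is not part of Galaxy or MACS.

```python
def to_zero_based_start(lines):
    """Yield each interval line with 1 subtracted from column 2 (the start)."""
    for line in lines:
        line = line.rstrip("\n")
        # Pass blank lines and comment lines through unchanged.
        if not line or line.startswith("#"):
            yield line
            continue
        fields = line.split("\t")  # assumption: columns are tab-separated
        fields[1] = str(int(fields[1]) - 1)  # 1-based start -> 0-based start
        yield "\t".join(fields)


# Example using the first line quoted in the warning message.
example = ["chr1\t868552\t868652\tname_1"]
for row in to_zero_based_start(example):
    print(row)
```

The end coordinate and name are left untouched, which matches the c1,c5,c3,c4 column order used in the Cut step above.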
Thanks for reporting the problem! The EU team will help with follow-up.
Thank you very much for the information.