I am trying to extract DNA sequences from a summit file that contains the chromosome, summit+50, summit-50, and region names, but I am getting a 0-byte output and this warning message: (5345 warnings, 1st is: Chromosome by name ‘chr1’ was not found for build ‘hg19’. Skipped 5345 invalid lines, 1st is #1, “chr1 868552 868652 name_1”)
I want the sequence for motif discovery!
Do you have any suggestions?
Which Galaxy server are you working at? URL if public. If your own Galaxy server, please describe.
I am using the Galaxy Europe server (https://usegalaxy.eu/).
Thanks for the info.
The EU server does have the hg19 database indexed for this tool, but it seems to be problematic. I was able to reproduce your error, and it looks similar to another error reported a few weeks ago (a problem with the bedtools indexes). We’ll need an administrator for that server to find out exactly what is going wrong and when it is expected to be fixed: ping @bjoern.gruening
Just as an aside, your coordinates appear to have a 1-based start. MACS will produce data that is labeled as bed format, but the format does not quite follow that specification. The Extract tool expects a 0-based start coordinate (true bed format). To transform your data, “1” needs to be subtracted from the second column of data.
The tools to use are:
- Compute an expression on every row (Galaxy Version 1.2.0). Enter the expression c2-1 and set “Round the result” to “Yes” to add a whole number column to your dataset with the adjusted start coordinate.
- Cut columns from a table (Galaxy Version 1.0.2). Enter the expression c1,c5,c3,c4 to replace the original start coordinate with the adjusted start coordinate.
- The last step is to reassign the datatype. Click on the dataset’s pencil icon to reach the Edit Attributes forms, then on the “Datatypes” tab use the “Detect datatype” button. Your data would be expected to be set to the datatype bed by that process when the formatting is correct.
Note: Be careful about the full name of the Cut tool used. There are two available, and only the one listed above will rearrange columns.
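If you ever need to make the same adjustment outside of Galaxy, the start-coordinate shift described above can be sketched in a few lines of Python. This is only a minimal sketch, assuming tab-separated input with the start in the second column and starts that really are 1-based; the function name shift_start_to_zero_based is made up for illustration and is not part of Galaxy, MACS, or bedtools.

```python
def shift_start_to_zero_based(lines):
    """Subtract 1 from the start (2nd) column of tab-separated interval lines.

    Hypothetical helper for illustration only. Assumes every data line has
    at least three tab-separated columns: chromosome, start, end.
    """
    converted = []
    for line in lines:
        line = line.rstrip("\n")
        # Pass through blank lines and comment lines unchanged.
        if not line or line.startswith("#"):
            converted.append(line)
            continue
        fields = line.split("\t")
        start = int(fields[1]) - 1
        if start < 0:
            # A start of 0 suggests the data were already 0-based.
            raise ValueError(f"start would become negative: {line!r}")
        fields[1] = str(start)
        converted.append("\t".join(fields))
    return converted


# Example using the first line from the warning message above.
example = ["chr1\t868552\t868652\tname_1"]
print(shift_start_to_zero_based(example)[0])
# prints: chr1	868551	868652	name_1
```

Only the start column changes; the end coordinate and the region name are left as they are, which matches what the Compute and Cut steps do together.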
Thanks for reporting the problem! The EU team will help with follow-up.
Thank you very much for the information.