Hi! I am still having a problem with the SortBED tool while trying to sort the output of two concatenated BED files. The error says: Chromosome “X” undefined in /cvmfs/data.galaxyproject.org/managed/len/ucsc/mm10.len. Could someone help me with it?
Welcome @Amina19
This is a chromosome mismatch problem. The Mouse GRCm38/mm10 reference genome is sourced from UCSC and has a specific chromosome naming format (chr1, chrX, chrM). What you have in your BED file is likely based on the Ensembl format (1, X, MT), which is why “X” is reported as undefined. Tools expect exact matches.
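To see which names are causing the mismatch, something like the sketch below could help. It assumes the `.len` file is tab-separated with the chromosome name in the first column, and `names_missing_from_len` is a hypothetical helper name, not part of Galaxy or bedtools.

```python
def names_missing_from_len(bed_lines, len_lines):
    """Return sorted chromosome names used in the BED data but absent from the .len data."""
    # Assumption: each .len line is "<chromosome>\t<length>".
    known = {line.split("\t")[0].strip() for line in len_lines if line.strip()}
    seen = set()
    for line in bed_lines:
        # Skip blank lines and the usual BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        seen.add(line.split("\t")[0].strip())
    return sorted(seen - known)
```

Any name this returns (for example `X` when the reference only knows `chrX`) would need converting before sorting.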
What to do:
- Confirm that your data is actually based on some Ensembl version of a mouse assembly.
- Confirm that it is GRCm38. If it is a different assembly, make sure that the UCSC version of the assembly is a match (GRCm37/mm9, GRCm38/mm10, GRCm39/mm39) before assigning it as a database or using it as an input choice on a tool form.
- Try using a tool like this to convert the identifiers: Replace column by values which are defined in a convert file (see the help section about where to source a mapping file between the two different naming formats)
- More details are in this FAQ: Mismatched Chromosome identifiers and how to avoid them. The exact manipulations are rarely the same, so some trial and error will probably be needed.
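If you prefer to do the conversion outside Galaxy, here is a minimal sketch of the idea in Python. It only handles the mouse primary chromosomes (1-19, X, Y, MT), on the assumption that your data really is Ensembl-style GRCm38; the function names are made up for illustration. Unplaced scaffolds are left unchanged and reported, since those need a proper mapping file.

```python
# Mouse primary chromosomes in Ensembl style: 1-19, X, Y (MT handled separately).
PRIMARY = {str(i) for i in range(1, 20)} | {"X", "Y"}

def ensembl_to_ucsc(name):
    """Return the UCSC-style name for a primary chromosome, or None if unknown."""
    if name in PRIMARY:
        return "chr" + name
    if name == "MT":
        return "chrM"  # UCSC calls the mitochondrial genome chrM
    return None

def convert_bed_lines(lines):
    """Rename column 1 of BED lines; return (converted_lines, unmapped_names)."""
    converted, unmapped = [], set()
    for line in lines:
        text = line.rstrip("\n")
        # Pass through blank lines, header lines, and names already in UCSC style.
        if not text or text.startswith(("#", "track", "browser", "chr")):
            converted.append(text)
            continue
        fields = text.split("\t")
        new_name = ensembl_to_ucsc(fields[0])
        if new_name is None:
            unmapped.add(fields[0])  # left unchanged; needs a mapping file
        else:
            fields[0] = new_name
        converted.append("\t".join(fields))
    return converted, unmapped
```

Anything that ends up in `unmapped` will still fail against mm10.len, so either map those names with a convert file or filter those lines out before sorting.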
Hope that helps!