Too slow with CVMFS reference data

yukieymd · June 16, 2021, 8:37am

Does a job using CVMFS normally take a long time? I configured CVMFS on my local Galaxy following https://galaxyproject.org/admin/reference-data-repo/. I then ran a bwa test job against hg38, but it is still running after about 6 hours. A smaller genome, dm3, did finish, but it took about 50 minutes. That seems slow for the size of the data. In both cases the input was the same: paired-end FASTQ files of about 400 MB each. A second run with dm3 finished in less than a minute.

I don’t think our network is the problem, because downloads from other sites, such as UCSC chromosome data, run at 2-3 MB/s. I would like to know whether this performance is normal for Galaxy with CVMFS, and whether there is a setting that would improve it.

Thank you,
Yukie

hexylena · June 17, 2021, 6:50pm

What have you set as your cache size for cvmfs? This can affect production loads.

nate · June 17, 2021, 11:53pm

Which Stratum 1 was selected, and the connection speed to that Stratum 1, would also make a major difference. If you’re going to make heavy use of the CVMFS repo, it’s recommended to at least run a local squid cache, if not a full Stratum 1 (which can be private to you).
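
For anyone following along, here is a rough sketch of what those two knobs look like in a client config file such as /etc/cvmfs/default.local. The parameter names (CVMFS_SERVER_URL, CVMFS_HTTP_PROXY, CVMFS_QUOTA_LIMIT) are real CVMFS client settings, but the host names are placeholders and the helper function is hypothetical, written only to show the shape of the values. Where the server list is actually set depends on how your client was installed.

```python
def render_cvmfs_client_config(stratum1_hosts, proxy_url=None, quota_mb=None):
    """Return default.local-style lines for a CVMFS client.

    stratum1_hosts: host names in order of preference (placeholders here).
    proxy_url: URL of a local squid cache, or None for no proxy.
    quota_mb: local cache soft limit in megabytes, or None to leave it unset.
    """
    if not stratum1_hosts:
        raise ValueError("at least one Stratum 1 host is required")

    # Servers are separated by ';'. @fqrn@ is a placeholder that the
    # CVMFS client replaces with the repository name.
    server_urls = ";".join(
        f"http://{host}/cvmfs/@fqrn@" for host in stratum1_hosts
    )
    lines = [f'CVMFS_SERVER_URL="{server_urls}"']

    # DIRECT means "connect to the Stratum 1 without a proxy".
    proxy = proxy_url if proxy_url else "DIRECT"
    lines.append(f'CVMFS_HTTP_PROXY="{proxy}"')

    if quota_mb is not None:
        lines.append(f'CVMFS_QUOTA_LIMIT="{quota_mb}"')

    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical hosts: the nearest Stratum 1 first, then a fallback.
    print(
        render_cvmfs_client_config(
            ["stratum1-near.example.org", "stratum1-far.example.org"],
            proxy_url="http://squid.example.org:3128",
            quota_mb=100000,
        )
    )
```

After changing the config, the client has to pick up the new settings before a re-run of the same job will tell you anything.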

yukieymd · June 18, 2021, 9:23am

Hi @hexylena ,

I set CVMFS_QUOTA_LIMIT="100000". I would like to avoid re-downloading data that has already been downloaded, as much as possible. Is that too large?

Thanks,
Yukie
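
A note on units, since it bears on the question: CVMFS_QUOTA_LIMIT is given in megabytes, so 100000 is roughly 100 GB. Whether that is too large depends mainly on how much disk is free where the cache lives. Below is a small sketch for reading the value back out of a config file and converting it; both function names are made up for illustration, and the parser only handles simple KEY="value" lines.

```python
def read_quota_limit_mb(config_text):
    """Return CVMFS_QUOTA_LIMIT (megabytes) from config text, or None if unset."""
    value = None
    for raw_line in config_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        key, sep, val = line.partition("=")
        if sep and key.strip() == "CVMFS_QUOTA_LIMIT":
            # Accept quoted or unquoted integers; a later line wins.
            value = int(val.strip().strip("\"'"))
    return value


def quota_in_gb(quota_mb):
    """Convert a quota in megabytes to (approximate) gigabytes."""
    return quota_mb / 1000


if __name__ == "__main__":
    sample = 'CVMFS_QUOTA_LIMIT="100000"  # soft limit for the local cache\n'
    quota = read_quota_limit_mb(sample)
    print(f"cache quota: {quota} MB, about {quota_in_gb(quota):.0f} GB")
```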

yukieymd · June 18, 2021, 9:52am

Hi @nate ,

Thank you for telling me about squid. I don’t use it that much now, but I’ll try it when I need it.

Thanks,
Yukie

yukieymd · June 23, 2021, 3:13am

A follow-up: I changed the Stratum 1 server and performance improved. A bwa job with the same input files and dm3 finished in less than 10 minutes. Thanks for the advice, both of you.

Best,
Yukie
