import dataset NCBI

Anna_Maria · May 1, 2023, 8:49am

Dear all, I cannot import a dataset from UNITE or NCBI. Is there a tutorial about this issue?
Thanks

jennaj · May 1, 2023, 6:08pm

These FAQs have some tips about loading data to Galaxy. https://training.galaxyproject.org/training-material/faqs/galaxy/#data%20upload

And many tutorials have examples that include an Upload step. You can also navigate the training site by analysis domain. https://training.galaxyproject.org/training-material/search?query=upload

If you still need more help after reviewing, please share more details. See: Troubleshooting errors

Anna_Maria · May 2, 2023, 5:06pm

Dear Jennifer, thanks for the support, but I really do not know how to proceed. Could you send me a protocol for the analysis of 18S (fungi)?
Thanks
Anna Maria

jennaj · May 2, 2023, 5:11pm

Please see Analyses of metagenomics data - The global picture

Anna_Maria · May 3, 2023, 4:09pm

Dear Jennifer, many thanks. The document refers to 16S analysis. I repeated all the steps, but I do not know how to go on when I have to align my ITS sequences (18S). What should I set for:

Select Reference Template
“reference”: should it point to the reference file in the NCBI nrDNA database? How can I upload the file?

many thanks

Anna Maria

jennaj · May 3, 2023, 4:20pm

Hello,

Tutorials are examples. Going through them will help you learn which inputs and parameters are appropriate. None of this is exact, and you would need to do the same even if working outside of Galaxy.

In short, you’ll need to adapt methods for your own data. Suggestions:

find a few publications that do something similar to what you want to do
consider visiting general bioinformatics forums
tool author forums
public data repositories
sites hosted by scientific communities focused on your research domain
Leveraging resources that are not currently using Galaxy is expected. GTN tutorials cover a small slice of the available bioinformatics tools/methods.

Others are welcome to add more comments!

Anna_Maria · May 9, 2023, 7:25pm

Dear Jennifer, I finally got a result, but something seems wrong. If I copy the sequences obtained (filtered, merged, etc.) and BLAST them on NCBI, they align only from position 200 nt, as if only one read of each pair had been processed. I add here the steps I did.

Anna_Maria · May 9, 2023, 7:31pm

Dear Jennifer, I finally got a result, but something seems wrong. If I copy the sequences obtained (filtered, merged, etc.) and BLAST them on NCBI, they align only from position 200 nt, as if only one read of each pair had been processed. I add here the steps I did and the files. The sections in yellow are the more critical points.

Could you please …thank you!!!

data
PS8_1.fastq.gz
PS8_2.fastq.gz
PS8_F_filt.fastq.gz
PS8_R_filt.fastq.gz
PS8count
PS8dadaehisotry.Rhistory
PS8taxa
tabella
UNITE.fasta

help me to understand what is wrong?

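# Forward (_1) and reverse (_2) FASTQ files in the working directory
fnFs <- sort(list.files(pattern="PS8_1.fastq"))
fnRs <- sort(list.files(pattern="PS8_2.fastq"))
# Sample name = text before the first underscore of the file name
sample.names <- sapply(strsplit(fnFs, "_"), `[`, 1)
fnFs <- file.path(fnFs)
fnRs <- file.path(fnRs)
# Inspect the quality profile of the forward reads
plotQualityProfile(fnFs[1:2])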

> filt_path <- file.path( "filtered")
> filtFs <- file.path(filt_path, paste0(sample.names, "_F_filt.fastq.gz"))
> filtRs <- file.path(filt_path, paste0(sample.names, "_R_filt.fastq.gz"))
> out <- filterAndTrim(fnFs, filtFs, fnRs, filtRs,
+                      maxN=0, maxEE=c(2,2), truncQ=2, rm.phix=TRUE,
+                      compress=TRUE, multithread=FALSE)
Creating output directory: filtered
> errF <- learnErrors(filtFs, multithread=TRUE)
14339212 total bases in 50672 reads from 1 samples will be used for learning the error rates.
> errR <- learnErrors(filtRs, multithread=TRUE)
11466811 total bases in 50672 reads from 1 samples will be used for learning the error rates.
> derepFs <- derepFastq(filtFs, verbose=TRUE)
Dereplicating sequence entries in Fastq file: filtered/PS8_F_filt.fastq.gz
Encountered 29001 unique sequences from 50672 total sequences read.
> derepRs <- derepFastq(filtRs, verbose=TRUE)
Dereplicating sequence entries in Fastq file: filtered/PS8_R_filt.fastq.gz
Encountered 36009 unique sequences from 50672 total sequences read.
> names(derepFs) <- sample.names
Warning message:

jennaj · May 10, 2023, 2:09pm

Hi @Anna_Maria

Are you still working in Galaxy?

Are you following/adapting a published protocol?

Anna_Maria · May 10, 2023, 2:20pm

Dear Jennifer, I am following this protocol:
https://training.galaxyproject.org/training-material/topics/metagenomics/tutorials/mothur-miseq-sop/tutorial.html

Anna_Maria · May 11, 2023, 5:41pm

Dear Jennifer, could you please help me?
thanks
Anna Maria

jennaj · May 12, 2023, 7:32pm

I’m not sure why you are having problems when mapping at NCBI. I can let you know that BLAST accepts FASTA data as input, not FASTQ.
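If it helps, here is a minimal sketch of converting FASTQ to FASTA outside Galaxy, using only the Python standard library. It assumes plain four-line FASTQ records (no wrapped sequence lines) and simply drops the quality strings; the function names and the output file name in the usage note are hypothetical, chosen only for illustration. Galaxy servers usually also offer a FASTQ to FASTA converter tool, which may be the easier route.

```python
import gzip
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (read_id, sequence) pairs from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:          # end of file
            return
        if not header.strip():  # tolerate stray blank lines between records
            continue
        if not header.startswith("@"):
            raise ValueError(f"Not a FASTQ header line: {header!r}")
        sequence = handle.readline().strip()
        handle.readline()       # '+' separator line, ignored
        handle.readline()       # quality string, ignored (FASTA has no qualities)
        yield header[1:].strip(), sequence


def fastq_to_fasta(fastq: TextIO, fasta: TextIO) -> int:
    """Write every FASTQ record as a FASTA record; return the number written."""
    count = 0
    for read_id, sequence in read_fastq(fastq):
        fasta.write(f">{read_id}\n{sequence}\n")
        count += 1
    return count


def convert_file(fastq_path: str, fasta_path: str) -> int:
    """Convert a FASTQ file (optionally gzip-compressed) to a FASTA file."""
    opener = gzip.open if fastq_path.endswith(".gz") else open
    with opener(fastq_path, "rt") as src, open(fasta_path, "w") as dst:
        return fastq_to_fasta(src, dst)


if __name__ == "__main__":
    # Tiny in-memory demonstration with one made-up read.
    demo_in = io.StringIO("@read1\nACGTACGT\n+\nIIIIIIII\n")
    demo_out = io.StringIO()
    fastq_to_fasta(demo_in, demo_out)
    print(demo_out.getvalue(), end="")
```

For example, `convert_file("PS8_F_filt.fastq.gz", "PS8_F_filt.fasta")` would write the filtered forward reads as FASTA. Note that this only changes the file format; it does not merge the forward and reverse reads.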

The tutorial you are following, 16S Microbial Analysis with mothur (extended), includes a workflow. Maybe you can adapt that to fit your analysis?
