TP on Galaxy - SarTools
This TP was created in the context of the R Bilille training program in 2026.
Open usegalaxy.fr and log in.
Differential analysis
create a new history
upload data, selecting the tsv format while importing
The file Lobel2016Count.zip is composed of counts from
the paper Lobel L, Herskovits AA (2016) Systems Level Analyses
Reveal Multiple Regulatory Activities of CodY Controlling Metabolism,
Motility and Virulence in Listeria monocytogenes. PLoS Genet 12(2):
e1005870. doi:10.1371/journal.pgen.1005870.
The file Target.txt contains the description of the samples
to be analysed in SarTools: 11 samples across 2 conditions (6 WT vs 5
codY replicates).
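A quick way to check such a sample description before running the tool is sketched below in Python. This is only an illustration: the column names, sample labels and file names are invented, and only the 6 WT / 5 codY split and the position of the strain column (third) come from this TP.

```python
import csv
import io
from collections import Counter

# Hypothetical target file content: the column names, labels and file names
# are invented for illustration. Only the 6 WT / 5 codY split and the fact
# that strain is the third column come from the tutorial.
TARGET_TEXT = "label\tfiles\tstrain\n" + "".join(
    f"s{i}\ts{i}.counts.txt\t{strain}\n"
    for i, strain in enumerate(["WT"] * 6 + ["codY"] * 5, start=1)
)


def summarize_target(handle, factor="strain"):
    """Check that the table is cleanly tab-separated and count samples per level."""
    rows = list(csv.reader(handle, delimiter="\t"))
    header, body = rows[0], rows[1:]
    # Every row must have the same number of tab-separated fields as the header.
    if any(len(row) != len(header) for row in body):
        raise ValueError("inconsistent number of columns: not a clean TSV")
    column = header.index(factor)
    return Counter(row[column] for row in body)


counts = summarize_target(io.StringIO(TARGET_TEXT))
print(dict(counts))
```

To run it on a real file, the `io.StringIO(...)` argument can be replaced with `open("target.txt", newline="")`, assuming the file has a header line containing the factor name.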
Visualize the target.txt file and check that it is in tsv or tabular format.
Open the SARTools DESeq2 tool from the Tools panel and fill in:
target.txt as the target file
Lobel2016Count.zip as the archive of count files
strain as ‘Factor of interest’: it corresponds to the third column of the target file and contains the conditions to compare
WT as ‘Reference biological condition’: it is the condition of reference
open the report
look at the graphs on raw data
look at the SERE, PCA and HC graphs: what can you observe?
run the tool again, replacing:
the factor of interest with medium
the reference biological condition with BHI
look at the PCA: it confirms that medium separates the samples on the first axis.
Since ‘medium’ has a strong effect on the data, we have to take it into account when running the analysis on strain. To do so, we run the analysis again, including this effect as a blocking factor.
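The idea behind a blocking factor can be illustrated with a toy Python sketch. It is not what DESeq2 computes (DESeq2 fits a generalized linear model on counts); it only shows, on made-up values for one hypothetical gene, that removing the medium effect leaves the strain effect unchanged while strongly reducing the variability within a strain. The medium called "other" is a placeholder for any second medium.

```python
from statistics import mean, pvariance

# Made-up log-scale expression values for ONE hypothetical gene.
# "BHI" is a medium named in the tutorial; "other" is a placeholder medium.
samples = [
    ("WT", "BHI", 10.0), ("WT", "BHI", 10.2),
    ("WT", "other", 14.0), ("WT", "other", 14.2),
    ("codY", "BHI", 11.0), ("codY", "BHI", 11.2),
    ("codY", "other", 15.0), ("codY", "other", 15.2),
]


def strain_values(rows, strain):
    """Return the expression values of all samples of one strain."""
    return [value for s, _, value in rows if s == strain]


# Remove the medium effect by centring each value on the mean of its medium.
medium_mean = {
    m: mean(v for _, med, v in samples if med == m)
    for m in {med for _, med, _ in samples}
}
centred = [(s, m, v - medium_mean[m]) for s, m, v in samples]

# Strain effect (codY minus WT) with and without removing the medium effect.
effect_raw = mean(strain_values(samples, "codY")) - mean(strain_values(samples, "WT"))
effect_blocked = mean(strain_values(centred, "codY")) - mean(strain_values(centred, "WT"))

# Variability among WT samples with and without removing the medium effect.
noise_raw = pvariance(strain_values(samples, "WT"))
noise_blocked = pvariance(strain_values(centred, "WT"))

print(f"strain effect: raw={effect_raw:.2f}, blocked={effect_blocked:.2f}")
print(f"within-WT variance: raw={noise_raw:.2f}, blocked={noise_blocked:.2f}")
```

With less unexplained variability for the same strain effect, a test on strain gains power, which is consistent with finding more differentially expressed genes once the medium is included.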
In a new SarTools analysis:
fill in strain and WT again in ‘Factor of interest’ and ‘Reference biological condition’
click on ‘advanced parameters’ and set medium as the blocking factor
open the report:
the PCA is still the same, OK
Analysis:
the histogram of raw p-values is OK
there are more differentially expressed genes when the batch effect is taken into account
independent filtering is performed
SarTools creates several files :
Differential and enrichment analyses
create a new history
upload data, selecting the tsv format while importing
This data comes from a previously published article, Hardiville et al. (2020).
In short: we want to compare 3 conditions with 3 replicates each: wild-type cells (without mutation), T114A (a mutant with a Threonine-to-Alanine substitution at position 114) and S158A (a mutant with a Serine-to-Alanine substitution at position 158) of the TATA-Box Binding Protein.
The FastQ files were analysed with the RNA-Seq pipeline from nf-core.
We provide the count table generated by the pipeline.
T114A and S158A have been renamed Mutant A and Mutant B for easier
comprehension.
A subset of genes was randomly selected to reduce the size of the
data to import into Galaxy.
Visualize the metadata.txt file and check that it is in tsv or tabular format.
Run SarTools with condition and WT as ‘Factor of interest’ and ‘Reference biological condition’.
The final job may appear in red (with errors), but the report is generated anyway.
open the report
condition variable
We will perform the enrichment analysis on the MSigDB website.
There is no need to create an account: you only have to provide your email address.
In this analysis, we will identify the pathways that are
overrepresented among the overexpressed genes in the comparison
Mutation A vs Mutation B.
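To make "overrepresented" concrete, here is a minimal Python sketch of an over-representation test based on the hypergeometric distribution, a common choice for this kind of analysis. The exact computation done by the website may differ, the function name is hypothetical, and all numbers are made up.

```python
from math import comb


def overrepresentation_pvalue(universe, pathway, selected, overlap):
    """Return P(X >= overlap) when `selected` genes are drawn at random from
    `universe` genes, `pathway` of which belong to the pathway."""
    upper = min(pathway, selected)
    total = comb(universe, selected)
    # Sum the probabilities of every overlap at least as large as the observed one.
    tail = sum(
        comb(pathway, k) * comb(universe - pathway, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Made-up example: 20000 genes in total, a pathway of 100 genes,
# 200 overexpressed genes, 8 of which fall in the pathway.
p = overrepresentation_pvalue(universe=20000, pathway=100, selected=200, overlap=8)
print(f"p-value = {p:.2e}")
```

Only about one pathway gene would be expected by chance in this example, so an overlap of 8 gives a very small p-value. In a real analysis many pathways are tested, so the p-values also need a multiple testing correction.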
The files generated by SarTools cannot be used directly in Galaxy. We have to download them and re-import them into Galaxy.
As we’re interested in overexpressed genes, we will retrieve the file
indexed by ‘up’ in the results of the Mutation A vs
Mutation B comparison.
right-click on ‘MutationBvsMutationA.up.txt’ and save the file on
your computer (Save Link As)
import this file into Galaxy, selecting the tabular format
we have to extract the first column (c1), which contains the gene IDs
We then get the SYMBOL IDs of overexpressed genes.
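Outside Galaxy, the same extraction of the first column can be sketched in a few lines of Python. The table content below is invented for illustration (header names and gene IDs are hypothetical); the only fact taken from this TP is that the gene IDs are in the first column.

```python
import csv
import io

# Hypothetical excerpt of an "up" table: header names and gene IDs are invented.
UP_TABLE = "Id\tlog2FoldChange\tpadj\nGENE1\t2.1\t0.001\nGENE2\t1.4\t0.03\n"


def first_column(handle, has_header=True):
    """Return the values of the first column (c1) of a tab-separated table."""
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the header line
    # Ignore empty lines, keep the first field of every other line.
    return [row[0] for row in reader if row]


gene_ids = first_column(io.StringIO(UP_TABLE))
print("\n".join(gene_ids))
```

For a downloaded file, `io.StringIO(UP_TABLE)` can be replaced with `open("MutationBvsMutationA.up.txt", newline="")`, assuming the file starts with a header line.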
On the website:
MSigDB website, on the left panel
In the results you can find:
You can download the results in tsv format.
You can perform the enrichment analysis on the MSigDB
website.
This analysis can also be performed directly in R, with more flexibility.