Available Modules

Modules are the building blocks of all DSL2 nf-core pipelines. If you would like to write your own module, you can find more information on the nf-core website.

  • metagenomics 91
  • genomics 23
  • fastq 20
  • classify 20
  • bam 19
  • classification 19
  • taxonomic profiling 18
  • database 17
  • alignment 11
  • assembly 10
  • download 10
  • taxonomy 10
  • taxonomic classification 10
  • sort 9
  • long reads 9
  • fasta 8
  • coverage 8
  • binning 8
  • kmer 8
  • metagenome 8
  • db 8
  • contamination 7
  • quality 7
  • build 7
  • mags 7
  • completeness 7
  • virus 7
  • checkm 7
  • mag 7
  • kraken2 7
  • genome 6
  • contigs 6
  • visualisation 6
  • phage 6
  • mapping 6
  • markduplicates 6
  • kmers 6
  • vsearch 6
  • index 5
  • cram 5
  • bacteria 5
  • statistics 5
  • sequences 5
  • bins 5
  • complexity 5
  • sketch 5
  • ptr 5
  • isolates 5
  • sourmash 5
  • coptr 5
  • profiling 5
  • microbiome 5
  • gatk4 4
  • sam 4
  • merge 4
  • k-mer 4
  • ancient DNA 4
  • table 4
  • aDNA 4
  • archaeogenomics 4
  • palaeogenomics 4
  • diversity 4
  • malt 4
  • ganon 4
  • skani 4
  • redundancy 4
  • umitools 4
  • clustering 3
  • population genetics 3
  • dedup 3
  • profile 3
  • report 3
  • antimicrobial resistance genes 3
  • distance 3
  • de novo assembly 3
  • merging 3
  • bin 3
  • abundance 3
  • containment 3
  • UMI 3
  • amplicon sequences 3
  • vrhyme 3
  • kraken 3
  • bracken 3
  • sylph 3
  • checkv 3
  • vcf 2
  • annotation 2
  • nanopore 2
  • pacbio 2
  • bcftools 2
  • isoseq 2
  • long-read 2
  • imaging 2
  • depth 2
  • antimicrobial resistance 2
  • damage 2
  • ncbi 2
  • peaks 2
  • plasmid 2
  • deduplication 2
  • prediction 2
  • detection 2
  • arg 2
  • compare 2
  • telomere 2
  • deep learning 2
  • deeparg 2
  • host 2
  • microbes 2
  • krona chart 2
  • dist 2
  • profiles 2
  • krakentools 2
  • split_kmers 2
  • barcode 2
  • iphop 2
  • krakenuniq 2
  • taxon tables 2
  • otu tables 2
  • standardisation 2
  • standardise 2
  • ome-tif 2
  • MCMICRO 2
  • signature 2
  • FracMinHash sketch 2
  • metamaps 2
  • taxonomic profile 2
  • metagenomic 2
  • MaltExtract 2
  • HOPS 2
  • authentication 2
  • edit distance 2
  • single cells 2
  • genome bins 2
  • metagenomes 2
  • genomad 2
  • bed 1
  • filter 1
  • map 1
  • gtf 1
  • split 1
  • sentieon 1
  • count 1
  • VCF 1
  • copy number 1
  • imputation 1
  • rnaseq 1
  • trimming 1
  • reporting 1
  • indexing 1
  • QC 1
  • metrics 1
  • repeat 1
  • example 1
  • plot 1
  • amr 1
  • cluster 1
  • machine learning 1
  • iCLIP 1
  • umi 1
  • antimicrobial peptides 1
  • extract 1
  • duplicates 1
  • visualization 1
  • cat 1
  • amps 1
  • FASTQ 1
  • fragment 1
  • antibiotic resistance 1
  • riboseq 1
  • ccs 1
  • normalization 1
  • add 1
  • retrotransposon 1
  • hybrid capture sequencing 1
  • copy number alteration calling 1
  • DNA sequencing 1
  • targeted sequencing 1
  • bedgraph 1
  • fgbio 1
  • arriba 1
  • spark 1
  • html 1
  • fusion 1
  • rsem 1
  • genome mining 1
  • hi-c 1
  • gatk4spark 1
  • comparison 1
  • krona 1
  • atac-seq 1
  • chip-seq 1
  • population genomics 1
  • interactive 1
  • primer 1
  • instrain 1
  • long terminal repeat 1
  • dereplicate 1
  • salmon 1
  • orf 1
  • registration 1
  • contig 1
  • sequenzautils 1
  • varcal 1
  • scaffold 1
  • trim 1
  • identifier 1
  • RNA-Seq 1
  • UMIs 1
  • dereplication 1
  • microbial genomics 1
  • drep 1
  • md 1
  • nm 1
  • uq 1
  • melon 1
  • extractunbinned 1
  • linkbins 1
  • sintax 1
  • vsearch/sort 1
  • usearch 1
  • chromap 1
  • quality assurnce 1
  • qa 1
  • postprocessing 1
  • confidence 1
  • sorted 1
  • taxonomic composition 1
  • prepare 1
  • catpack 1
  • multiqc 1
  • vsearch/dereplicate 1
  • Staging 1
  • vsearch/fastqfilter 1
  • fastqfilter 1
  • CRISPRi 1
  • tag2tag 1
  • drug categorization 1
  • impute-info 1
  • tags 1
  • 16S 1
  • standard 1
  • haplotag 1
  • staging 1
  • post Post-processing 1
  • metagenome assembler 1
  • source tracking 1
  • tag 1
  • cell_barcodes 1
  • rna velocity 1
  • cobra 1
  • extension 1
  • duplicate removal 1
  • maxbin2 1
  • metagenome-assembled genomes 1
  • megahit 1
  • denovo 1
  • debruijn 1
  • vqsr 1
  • variant quality score recalibration 1
  • AMP 1
  • peptide prediction 1
  • metaphlan 1
  • otu table 1
  • bgc 1
  • splitcram 1
  • GTDB taxonomy 1
  • genome taxonomy database 1
  • archaea 1
  • combining 1
  • gc_wiggle 1
  • calmd 1
  • peak-caller 1
  • cut&tag 1
  • cut&run 1
  • chromatin 1
  • seacr 1
  • assembly-binning 1
  • applyvarcal 1
  • VQSR 1
  • fracminhash sketch 1
  • hash sketch 1
  • nucleotide composition 1
  • concoct 1
  • signatures 1
  • duplicate marking 1
  • ARGs 1
  • antibiotic resistance genes 1
  • pcr 1
  • groupreads 1
  • consensus sequence 1
  • ChIP-Seq 1
  • phantom peaks 1
  • illumina datasets 1
  • phylogenetic composition 1
  • multimapper 1
  • LCA 1
  • Ancestor 1
  • intervals coverage 1
  • reference 0
  • structural variants 0
  • variant calling 0
  • align 0
  • gff 0
  • variants 0
  • qc 0
  • quality control 0
  • cnv 0
  • MSA 0
  • variant 0
  • gfa 0
  • somatic 0
  • conversion 0
  • convert 0
  • proteomics 0
  • single-cell 0
  • phylogeny 0
  • bedtools 0
  • graph 0
  • bisulfite 0
  • sv 0
  • gvcf 0
  • variation graph 0
  • methylation 0
  • databases 0
  • wgs 0
  • bisulphite 0
  • methylseq 0
  • picard 0
  • compression 0
  • protein 0
  • bqsr 0
  • illumina 0
  • cna 0
  • consensus 0
  • stats 0
  • tsv 0
  • serotype 0
  • 5mC 0
  • demultiplex 0
  • openms 0
  • DNA methylation 0
  • base quality score recalibration 0
  • protein sequence 0
  • histogram 0
  • searching 0
  • scWGBS 0
  • pairs 0
  • samtools 0
  • WGBS 0
  • haplotype 0
  • filtering 0
  • structure 0
  • pangenome graph 0
  • matrix 0
  • expression 0
  • neural network 0
  • bisulfite sequencing 0
  • mappability 0
  • transcriptome 0
  • aligner 0
  • LAST 0
  • bwa 0
  • plink2 0
  • low-coverage 0
  • transcript 0
  • genotype 0
  • bcf 0
  • seqkit 0
  • cooler 0
  • phasing 0
  • gzip 0
  • germline 0
  • annotate 0
  • sequence 0
  • validation 0
  • gene 0
  • mmseqs2 0
  • biscuit 0
  • decompression 0
  • hmmer 0
  • ucsc 0
  • gff3 0
  • feature 0
  • spatial 0
  • newick 0
  • genotyping 0
  • segmentation 0
  • evaluation 0
  • msa 0
  • blast 0
  • bismark 0
  • mkref 0
  • glimpse 0
  • hmmsearch 0
  • pangenome 0
  • reads 0
  • json 0
  • demultiplexing 0
  • mitochondria 0
  • cnvkit 0
  • snp 0
  • differential 0
  • multiple sequence alignment 0
  • low frequency variant calling 0
  • prokaryote 0
  • bedGraph 0
  • short-read 0
  • scRNA-seq 0
  • single 0
  • splicing 0
  • NCBI 0
  • tumor-only 0
  • mirna 0
  • benchmark 0
  • deamination 0
  • mem 0
  • concatenate 0
  • interval 0
  • single cell 0
  • tabular 0
  • fastx 0
  • csv 0
  • de novo 0
  • text 0
  • mutect2 0
  • kallisto 0
  • summary 0
  • ont 0
  • call 0
  • MAF 0
  • counts 0
  • indels 0
  • svtk 0
  • structural 0
  • wxs 0
  • idXML 0
  • adapters 0
  • mpileup 0
  • reference-free 0
  • 3-letter genome 0
  • clipping 0
  • query 0
  • gridss 0
  • view 0
  • family 0
  • bedpe 0
  • preprocessing 0
  • ngscheckmate 0
  • genome assembler 0
  • matching 0
  • fai 0
  • bigwig 0
  • read depth 0
  • ampir 0
  • fungi 0
  • peak-calling 0
  • CLIP 0
  • dna 0
  • diamond 0
  • circrna 0
  • rna 0
  • microarray 0
  • ATAC-seq 0
  • microsatellite 0
  • union 0
  • miscoding lesions 0
  • isomir 0
  • compress 0
  • palaeogenetics 0
  • archaeogenetics 0
  • bgzip 0
  • interval_list 0
  • hic 0
  • paf 0
  • cut 0
  • haplotypecaller 0
  • resistance 0
  • pypgx 0
  • HMM 0
  • enrichment 0
  • chromosome 0
  • gsea 0
  • logratio 0
  • happy 0
  • STR 0
  • HiFi 0
  • chunk 0
  • biosynthetic gene cluster 0
  • bcl2fastq 0
  • propr 0
  • hmmcopy 0
  • image 0
  • parsing 0
  • quantification 0
  • BGC 0
  • public datasets 0
  • clean 0
  • ranking 0
  • phylogenetic placement 0
  • xeniumranger 0
  • SV 0
  • genmod 0
  • transcriptomics 0
  • DNA sequence 0
  • mtDNA 0
  • sample 0
  • sequencing 0
  • ancestry 0
  • snps 0
  • fcs-gx 0
  • macrel 0
  • mlst 0
  • amplify 0
  • fastk 0
  • das tool 0
  • structural_variants 0
  • C to T 0
  • DRAMP 0
  • das_tool 0
  • angsd 0
  • insert 0
  • fam 0
  • bim 0
  • SNP 0
  • small indels 0
  • subsample 0
  • pangolin 0
  • panel 0
  • pan-genome 0
  • pairsam 0
  • duplication 0
  • prokaryotes 0
  • replace 0
  • bacterial 0
  • covid 0
  • benchmarking 0
  • dictionary 0
  • lineage 0
  • polishing 0
  • indel 0
  • PCA 0
  • mapper 0
  • fingerprint 0
  • prokka 0
  • regions 0
  • typing 0
  • RNA-seq 0
  • genomes 0
  • neubi 0
  • entrez 0
  • eukaryotes 0
  • scores 0
  • seqtk 0
  • mcmicro 0
  • aln 0
  • bwameth 0
  • npz 0
  • windowmasker 0
  • bakta 0
  • nucleotide 0
  • highly_multiplexed_imaging 0
  • mkfastq 0
  • image_analysis 0
  • cellranger 0
  • gene expression 0
  • zip 0
  • unzip 0
  • uncompress 0
  • untar 0
  • mask 0
  • RNA 0
  • rna_structure 0
  • proteome 0
  • guide tree 0
  • long_read 0
  • somatic variants 0
  • transposons 0
  • complement 0
  • roh 0
  • transcripts 0
  • organelle 0
  • remove 0
  • converter 0
  • intervals 0
  • genome assembly 0
  • mzml 0
  • chimeras 0
  • PacBio 0
  • comparisons 0
  • combine 0
  • quality trimming 0
  • score 0
  • adapter trimming 0
  • popscle 0
  • pileup 0
  • genotype-based deconvoltion 0
  • bamtools 0
  • variant_calling 0
  • hidden Markov model 0
  • archiving 0
  • minimap2 0
  • amplicon sequencing 0
  • notebook 0
  • reports 0
  • ataqv 0
  • informative sites 0
  • kinship 0
  • identity 0
  • relatedness 0
  • repeat expansion 0
  • virulence 0
  • cut up 0
  • survivor 0
  • miRNA 0
  • cool 0
  • pseudoalignment 0
  • dump 0
  • lossless 0
  • observations 0
  • shapeit 0
  • khmer 0
  • CRISPR 0
  • prefetch 0
  • spaceranger 0
  • wastewater 0
  • wig 0
  • tabix 0
  • ambient RNA removal 0
  • ligate 0
  • cfDNA 0
  • uLTRA 0
  • png 0
  • gstama 0
  • ichorcna 0
  • mash 0
  • tama 0
  • pigz 0
  • bustools 0
  • refine 0
  • resolve_bioscience 0
  • gene set 0
  • trancriptome 0
  • gene set analysis 0
  • spatial_transcriptomics 0
  • lofreq 0
  • screen 0
  • phase 0
  • haplotypes 0
  • reformat 0
  • serogroup 0
  • minhash 0
  • GC content 0
  • maximum likelihood 0
  • megan 0
  • polyA_tail 0
  • hla 0
  • hlala 0
  • k-mer frequency 0
  • hla_typing 0
  • hlala_typing 0
  • checksum 0
  • corrupted 0
  • tree 0
  • nanostring 0
  • mapcounter 0
  • nacho 0
  • haplogroups 0
  • mRNA 0
  • find 0
  • pair 0
  • trgt 0
  • cgMLST 0
  • regression 0
  • taxids 0
  • SimpleAF 0
  • taxon name 0
  • zlib 0
  • differential expression 0
  • variation 0
  • vg 0
  • vcflib 0
  • ampgram 0
  • amptransformer 0
  • orthologs 0
  • WGS 0
  • image_processing 0
  • standardization 0
  • repeats 0
  • svdb 0
  • de novo assembler 0
  • small genome 0
  • interactions 0
  • functional analysis 0
  • join 0
  • reformatting 0
  • function 0
  • pharokka 0
  • bloom filter 0
  • k-mer index 0
  • COBS 0
  • archive 0
  • xz 0
  • mudskipper 0
  • long terminal retrotransposon 0
  • transcriptomic 0
  • kma 0
  • parallelized 0
  • orthology 0
  • rrna 0
  • genetics 0
  • tnhaplotyper2 0
  • rgfa 0
  • small variants 0
  • multiallelic 0
  • nucleotides 0
  • cnvnator 0
  • proportionality 0
  • mitochondrion 0
  • leviosam2 0
  • lift 0
  • mirdeep2 0
  • cancer genomics 0
  • homoploymer 0
  • ped 0
  • Duplication purging 0
  • purge duplications 0
  • library 0
  • preseq 0
  • adapter 0
  • import 0
  • doublets 0
  • variant pruning 0
  • anndata 0
  • bfiles 0
  • subset 0
  • gene labels 0
  • read-group 0
  • hostile 0
  • duplicate 0
  • decontamination 0
  • GPU-accelerated 0
  • graph layout 0
  • human removal 0
  • screening 0
  • nextclade 0
  • removal 0
  • msisensor-pro 0
  • cleaning 0
  • micro-satellite-scan 0
  • tumor 0
  • msi 0
  • instability 0
  • MSI 0
  • Read depth 0
  • RNA sequencing 0
  • soft-clipped clusters 0
  • snpsift 0
  • snpeff 0
  • effect prediction 0
  • shigella 0
  • switch 0
  • ancient dna 0
  • Streptococcus pneumoniae 0
  • transformation 0
  • rename 0
  • salmonella 0
  • smrnaseq 0
  • fusions 0
  • Pharmacogenetics 0
  • fixmate 0
  • retrotransposons 0
  • dict 0
  • collate 0
  • bam2fq 0
  • frame-shift correction 0
  • long-read sequencing 0
  • scaffolding 0
  • rtgtools 0
  • sequence analysis 0
  • junctions 0
  • pharmacogenetics 0
  • runs_of_homozygosity 0
  • polish 0
  • assembly evaluation 0
  • concordance 0
  • duplex 0
  • deconvolution 0
  • bayesian 0
  • merge mate pairs 0
  • reads merging 0
  • short reads 0
  • xenograft 0
  • graft 0
  • unaligned 0
  • fetch 0
  • realignment 0
  • GEO 0
  • microscopy 0
  • expansionhunterdenovo 0
  • repeat_expansions 0
  • metadata 0
  • tab 0
  • microbial 0
  • allele-specific 0
  • emboss 0
  • panelofnormals 0
  • gatk 0
  • joint genotyping 0
  • secondary metabolites 0
  • NRPS 0
  • RiPP 0
  • interval list 0
  • evidence 0
  • antibiotics 0
  • antismash 0
  • filtermutectcalls 0
  • simulate 0
  • artic 0
  • aggregate 0
  • demultiplexed reads 0
  • concat 0
  • tbi 0
  • gwas 0
  • CNV 0
  • sra-tools 0
  • settings 0
  • BAM 0
  • blastn 0
  • version 0
  • correction 0
  • calling 0
  • cnv calling 0
  • immunoprofiling 0
  • structural-variant calling 0
  • cvnkit 0
  • estimation 0
  • vdj 0
  • recombination 0
  • eCLIP 0
  • splice 0
  • parse 0
  • fasterq-dump 0
  • awk 0
  • intersect 0
  • intersection 0
  • normalize 0
  • norm 0
  • scatter 0
  • reheader 0
  • eigenstrat 0
  • validate 0
  • samplesheet 0
  • format 0
  • eido 0
  • windows 0
  • blastp 0
  • deseq2 0
  • rna-seq 0
  • region 0
  • heatmap 0
  • sizes 0
  • bases 0
  • spatial_omics 0
  • random forest 0
  • allele 0
  • gem 0
  • ChIP-seq 0
  • baf 0
  • getfasta 0
  • derived alleles 0
  • tnfilter 0
  • covariance model 0
  • jaccard 0
  • overlap 0
  • array_cgh 0
  • cytosure 0
  • decomposeblocksub 0
  • ancestral alleles 0
  • gprofiler2 0
  • gost 0
  • genomecov 0
  • closest 0
  • rad 0
  • bamtobed 0
  • sorting 0
  • structural variant 0
  • bam2fastx 0
  • bam2fastq 0
  • immcantation 0
  • airrseq 0
  • vector 0
  • site frequency spectrum 0
  • immunoinformatics 0
  • f coefficient 0
  • bioawk 0
  • unionBedGraphs 0
  • reverse complement 0
  • simulation 0
  • hmmfetch 0
  • decompose 0
  • pca 0
  • pruning 0
  • subtract 0
  • linkage equilibrium 0
  • slopBed 0
  • transmembrane 0
  • genome graph 0
  • chunking 0
  • tnseq 0
  • homozygous genotypes 0
  • decoy 0
  • heterozygous genotypes 0
  • htseq 0
  • inbreeding 0
  • shiftBed 0
  • multinterval 0
  • sompy 0
  • overlapped bed 0
  • maskfasta 0
  • peak picking 0
  • homology 0
  • co-orthology 0
  • clumping fastqs 0
  • deduping 0
  • plastid 0
  • smaller fastqs 0
  • resfinder 0
  • resistance genes 0
  • raw 0
  • mgf 0
  • parquet 0
  • parser 0
  • dbsnp 0
  • standardize 0
  • quarto 0
  • masking 0
  • python 0
  • r 0
  • low-complexity 0
  • coexpression 0
  • correlation 0
  • corpcor 0
  • GFF/GTF 0
  • assay 0
  • trio binning 0
  • tandem repeats 0
  • phylogenetics 0
  • minimum_evolution 0
  • parallel 0
  • csi 0
  • Read coverage histogram 0
  • biallelic 0
  • sequence similarity 0
  • spectral clustering 0
  • agat 0
  • longest 0
  • comparative genomics 0
  • isoform 0
  • autozygosity 0
  • homozygosity 0
  • deep variant 0
  • variancepartition 0
  • mutect 0
  • idx 0
  • update header 0
  • intron 0
  • dream 0
  • transform 0
  • gaps 0
  • introns 0
  • install 0
  • joint-genotyping 0
  • genotypegvcf 0
  • BCF 0
  • short 0
  • file manipulation 0
  • plink2_pca 0
  • propd 0
  • verifybamid 0
  • vcf2db 0
  • gemini 0
  • maf 0
  • lua 0
  • toml 0
  • plant 0
  • vcfbreakmulti 0
  • uniq 0
  • deduplicate 0
  • SINE 0
  • VCFtools 0
  • network 0
  • downsample bam 0
  • DNA contamination estimation 0
  • wget 0
  • mkvdjref 0
  • construct 0
  • graph projection to vcf 0
  • cellpose 0
  • hifi 0
  • Assembly 0
  • subsample bam 0
  • downsample 0
  • unmarkduplicates 0
  • bedtobigbed 0
  • genepred 0
  • refflat 0
  • gtftogenepred 0
  • ucsc/liftover 0
  • mobile genetic elements 0
  • genome annotation 0
  • trna 0
  • covariance models 0
  • umicollapse 0
  • snv 0
  • scanner 0
  • scRNA-Seq 0
  • crispr 0
  • antibody capture 0
  • files 0
  • antigen capture 0
  • helitron 0
  • multiomics 0
  • remove samples 0
  • upd 0
  • uniparental 0
  • disomy 0
  • domains 0
  • long read alignment 0
  • nucleotide sequence 0
  • tnscope 0
  • copyratios 0
  • comp 0
  • denoisereadcounts 0
  • readwriter 0
  • dnamodelapply 0
  • dnascope 0
  • tblastn 0
  • bedcov 0
  • genome polishing 0
  • groupby 0
  • assembly polishing 0
  • genotype dosages 0
  • vcf file 0
  • bgen 0
  • subtyping 0
  • chloroplast 0
  • blat 0
  • alr 0
  • clr 0
  • Salmonella enterica 0
  • boxcox 0
  • bgen file 0
  • Escherichia coli 0
  • createreadcountpanelofnormals 0
  • workflow_mode 0
  • pangenome-scale 0
  • yahs 0
  • all versus all 0
  • mashmap 0
  • wavefront 0
  • whamg 0
  • wham 0
  • compartments 0
  • copy-number 0
  • copy number analysis 0
  • gender determination 0
  • topology 0
  • copy number alterations 0
  • copy number variation 0
  • geo 0
  • workflow 0
  • mapad 0
  • adna 0
  • c to t 0
  • cumulative coverage 0
  • proteus 0
  • readproteingroups 0
  • calder2 0
  • eigenvectors 0
  • hicPCA 0
  • sliding 0
  • cadd 0
  • snakemake 0
  • distance-based 0
  • long read 0
  • homologs 0
  • telseq 0
  • admixture 0
  • mzML 0
  • microRNA 0
  • mass_error 0
  • search engine 0
  • poolseq 0
  • variant-calling 0
  • stardist 0
  • ATACseq 0
  • shift 0
  • ATACshift 0
  • http(s) 0
  • utility 0
  • setgt 0
  • jvarkit 0
  • translate 0
  • tar 0
  • tarball 0
  • adapterremoval 0
  • HLA 0
  • nanoq 0
  • Read filters 0
  • Read trimming 0
  • Read report 0
  • hhsuite 0
  • ATLAS 0
  • uniques 0
  • Illumina 0
  • functional 0
  • sequencing_bias 0
  • mkarv 0
  • hashing-based deconvolution 0
  • rank 0
  • java 0
  • script 0
  • post mortem damage 0
  • xml 0
  • svg 0
  • atlas 0
  • targz 0
  • Computational Immunology 0
  • bias 0
  • scanpy 0
  • nuclear contamination estimate 0
  • resegment 0
  • morphology 0
  • fix 0
  • malformed 0
  • partitioning 0
  • chip 0
  • updatedata 0
  • run 0
  • model 0
  • AMPs 0
  • allele counts 0
  • antimicrobial peptide prediction 0
  • plotting 0
  • regtools 0
  • leafcutter 0
  • amp 0
  • pdb 0
  • recovery 0
  • mgi 0
  • Staphylococcus aureus 0
  • affy 0
  • block substitutions 0
  • reference panels 0
  • relabel 0
  • cell segmentation 0
  • Bioinformatics Tools 0
  • quality_control 0
  • bclconvert 0
  • nucBed 0
  • AT content 0
  • Immune Deconvolution 0
  • nucleotide content 0
  • elfasta 0
  • elprep 0
  • doublet 0
  • patterns 0
  • controlstatistics 0
  • emoji 0
  • regex 0
  • nuclear segmentation 0
  • paired reads re-pairing 0
  • installation 0
  • doublet_detection 0
  • barcodes 0
  • doCounts 0
  • subsetting 0
  • logFC 0
  • significance statistic 0
  • p-value 0
  • scvi 0
  • solo 0
  • import segmentation 0
  • redundant 0
  • hmmpress 0
  • identity-by-descent 0
  • go 0
  • scimap 0
  • Bayesian 0
  • host removal 0
  • structural-variants 0
  • omics 0
  • biological activity 0
  • bamtools/split 0
  • prior knowledge 0
  • haploype 0
  • mygene 0
  • yaml 0
  • associations 0
  • impute 0
  • bedgraphtobigwig 0
  • bamtools/convert 0
  • reference compression 0
  • pile up 0
  • mouse 0
  • reference panel 0
  • bacphlip 0
  • virulent 0
  • nanopore sequencing 0
  • spatial_neighborhoods 0
  • Indel 0
  • grea 0
  • seqfu 0
  • multi-tool 0
  • predict 0
  • background_correction 0
  • illumiation_correction 0
  • hardy-weinberg 0
  • hwe statistics 0
  • hwe equilibrium 0
  • reference-independent 0
  • genotype likelihood 0
  • collapse 0
  • liftover 0
  • probabilistic realignment 0
  • n50 0
  • case/control 0
  • cell_type_identification 0
  • cell_phenotyping 0
  • machine_learning 0
  • element 0
  • trimBam 0
  • bamUtil 0
  • shuffleBed 0
  • SNV 0
  • clahe 0
  • refresh 0
  • association 0
  • GWAS 0
  • temperate 0
  • read group 0
  • cram-size 0
  • bwamem2 0
  • bwameme 0
  • grabix 0
  • ribosomal 0
  • 10x 0
  • background 0
  • single-stranded 0
  • regulatory network 0
  • ancientDNA 0
  • transcription factors 0
  • paraphase 0
  • selector 0
  • size 0
  • Pacbio 0
  • quality check 0
  • realign 0
  • circular 0
  • phylogenies 0
  • hmmscan 0
  • spot 0
  • orthogroup 0
  • authentict 0
  • sage 0
  • mass spectrometry 0
  • featuretable 0
  • extraction 0
  • guidetree 0
  • AC/NS/AF 0
  • functional enrichment 0
  • autofluorescence 0
  • translation 0
  • paired reads merging 0
  • overlap-based merging 0
  • check 0
  • lifestyle 0
  • hamming-distance 0
  • hashing-based deconvoltion 0
  • gnu 0
  • coreutils 0
  • generic 0
  • transposable element 0
  • retrieval 0
  • cycif 0
  • vcflib/vcffixup 0
  • contiguate 0
  • junction 0
  • MMseqs2 0
  • InterProScan 0
  • busco 0
  • droplet based single cells 0
  • antimicrobial reistance 0
  • lexogen 0
  • genotype-based demultiplexing 0
  • donor deconvolution 0
  • cellsnp 0
  • trimfq 0
  • bigbed 0
  • cmseq 0
  • bedtointervallist 0
  • mash/sketch 0
  • calibratedragstrmodel 0
  • reduced 0
  • representations 0
  • getpileupsummaries 0
  • cross-samplecontamination 0
  • mass-spectroscopy 0
  • calculatecontamination 0
  • mcr-1 0
  • MD5 0
  • 128 bit 0
  • taxonomic assignment 0
  • asereadcounter 0
  • daa 0
  • rma6 0
  • Neisseria meningitidis 0
  • 3D heat map 0
  • contour map 0
  • Merqury 0
  • annotateintervals 0
  • targets 0
  • cnnscorevariants 0
  • collectreadcounts 0
  • ploidy 0
  • collapsing 0
  • determinegermlinecontigploidy 0
  • legionella 0
  • clinical 0
  • pneumophila 0
  • createsomaticpanelofnormals 0
  • limma 0
  • Listeria monocytogenes 0
  • createsequencedictionary 0
  • condensedepthevidence 0
  • lofreq/call 0
  • lofreq/filter 0
  • qualities 0
  • estimate 0
  • dragstr 0
  • functional genomics 0
  • sgRNA 0
  • CRISPR-Cas9 0
  • maximum-likelihood 0
  • rra 0
  • composestrtablefile 0
  • short variant discovery 0
  • combinegvcfs 0
  • DNA damage 0
  • NGS 0
  • damage patterns 0
  • collectsvevidence 0
  • smudgeplot 0
  • unionsum 0
  • train 0
  • graph drawing 0
  • SNP table 0
  • contaminant 0
  • single molecule 0
  • cancer genome 0
  • somatic structural variations 0
  • mobile element insertions 0
  • sequencing summary 0
  • NextGenMap 0
  • ngm 0
  • Neisseria gonorrhoeae 0
  • gender 0
  • zipperbams 0
  • graph construction 0
  • ubam 0
  • Beautiful stand-alone HTML report 0
  • squeeze 0
  • odgi 0
  • combine graphs 0
  • graph stats 0
  • graph unchopping 0
  • graph formats 0
  • graph viz 0
  • tumor/normal 0
  • hla-typing 0
  • ILP 0
  • HLA-I 0
  • block-compressed 0
  • unmapped 0
  • GATK UnifiedGenotyper 0
  • bioinformatics tools 0
  • bootstrapping 0
  • methylation bias 0
  • mbias 0
  • heattree 0
  • gangstr 0
  • assembler 0
  • de Bruijn 0
  • microrna 0
  • gene-calling 0
  • target prediction 0
  • mitochondrial genome 0
  • reference genome 0
  • gamma 0
  • UShER 0
  • mosdepth 0
  • mitochondrial to nuclear ratio 0
  • bacterial variant calling 0
  • germline variant calling 0
  • somatic variant calling 0
  • variant caller 0
  • rust 0
  • microsatellite instability 0
  • fq 0
  • lint 0
  • random 0
  • scan 0
  • mtnucratio 0
  • ratio 0
  • generate 0
  • adapter removal 0
  • spliced 0
  • flip 0
  • txt 0
  • abricate 0
  • amrfinderplus 0
  • fARGene 0
  • rgi 0
  • ibd 0
  • hbd 0
  • beagle 0
  • mitochondrial 0
  • genome profile 0
  • Haemophilus influenzae 0
  • haplotype resolution 0
  • file parsing 0
  • gawk 0
  • extractvariants 0
  • variantrecalibrator 0
  • recalibration model 0
  • variantfiltration 0
  • svcluster 0
  • svannotate 0
  • gccounter 0
  • splitintervals 0
  • readcounter 0
  • site depth 0
  • HMMER 0
  • amino acid 0
  • shiftintervals 0
  • compound 0
  • extract_variants 0
  • Hidden Markov Model 0
  • gene model 0
  • Haplotypes 0
  • Imputation 0
  • joint-variant-calling 0
  • GNU 0
  • merge compare 0
  • genomes on a tree 0
  • low coverage 0
  • gget 0
  • genome statistics 0
  • genome manipulation 0
  • genome summary 0
  • tama_collapse.py 0
  • gfastats 0
  • TAMA 0
  • gvcftools 0
  • Mykrobe 0
  • gstama/merge 0
  • Salmonella Typhi 0
  • repeat content 0
  • gstama/polyacleanup 0
  • genome heterozygosity 0
  • genome size 0
  • gunc 0
  • gunzip 0
  • models 0
  • shiftfasta 0
  • hmtnote 0
  • reorder 0
  • Klebsiella 0
  • readorientationartifacts 0
  • learnreadorientationmodel 0
  • indexfeaturefile 0
  • readcountssummary 0
  • getpileupsumaries 0
  • kallisto/index 0
  • quant 0
  • germlinevariantsites 0
  • germlinecnvcaller 0
  • germline contig ploidy 0
  • digital normalization 0
  • k-mer counting 0
  • effective genome size 0
  • pneumoniae 0
  • jupytext 0
  • panelofnormalscreation 0
  • kegg 0
  • kofamscan 0
  • jointgenotyping 0
  • genomicsdbimport 0
  • genomicsdb 0
  • gatherbqsrreports 0
  • tranche filtering 0
  • filtervarianttranches 0
  • filterintervals 0
  • estimatelibrarycomplexity 0
  • duplication metrics 0
  • papermill 0
  • Jupyter 0
  • annotations 0
  • pixel_classification 0
  • shiftchain 0
  • pos 0
  • haemophilus 0
  • selectvariants 0
  • revert 0
  • panel_of_normals 0
  • IDR 0
  • igv 0
  • igv.js 0
  • js 0
  • genome browser 0
  • multicut 0
  • pixel classification 0
  • probability_maps 0
  • Python 0
  • reblockgvcf 0
  • printsvevidence 0
  • printreads 0
  • interproscan 0
  • preprocessintervals 0
  • postprocessgermlinecnvcalls 0
  • genomic islands 0
  • insertion 0
  • snvs 0
  • mutectstats 0
  • mergebamalignment 0
  • leftalignandtrimvariants 0
  • jasminesv 0
  • jasmine 0
  • PCR/optical duplicates 0
  • upper-triangular matrix 0
  • sequencing adapters 0
  • custom 0
  • sertotype 0
  • interleave 0
  • header 0
  • seq 0
  • na 0
  • selection 0
  • random draw 0
  • pseudohaploid 0
  • pseudodiploid 0
  • freqsum 0
  • bam2seqz 0
  • induce 0
  • sex determination 0
  • sequence headers 0
  • genetic sex 0
  • relative coverage 0
  • Cores 0
  • Segmentation 0
  • rare variants 0
  • error 0
  • TMA dearray 0
  • de-novo 0
  • longread 0
  • sha256 0
  • 256 bit 0
  • UNet 0
  • shinyngs 0
  • cls 0
  • grep 0
  • boxplot 0
  • scramble 0
  • amplicon 0
  • ampliconclip 0
  • scatterplot 0
  • corrrelation 0
  • faidx 0
  • track 0
  • insert size 0
  • repair 0
  • paired 0
  • read pairs 0
  • readgroup 0
  • paired-end 0
  • cluster analysis 0
  • subseq 0
  • clusteridentifier 0
  • pcr duplicates 0
  • cutesv 0
  • variant recalibration 0
  • gct 0
  • exploratory 0
  • density 0
  • sambamba 0
  • rdtest2vcf 0
  • spatype 0
  • spa 0
  • streptococcus 0
  • sccmec 0
  • variantcalling 0
  • Sample 0
  • protein coding genes 0
  • detecting svs 0
  • short-read sequencing 0
  • polymorphic sites 0
  • svtk/baftest 0
  • baftest 0
  • countsvtypes 0
  • rdtest 0
  • antitarget 0
  • polymorphic 0
  • vcf2bed 0
  • decompress 0
  • polymut 0
  • polya tail 0
  • fast5 0
  • chromosome_visualization 0
  • Mycobacterium tuberculosis 0
  • chromosomal rearrangements 0
  • eucaryotes 0
  • coding 0
  • cds 0
  • transcroder 0
  • access 0
  • features 0
  • cload 0
  • mcool 0
  • sliding window 0
  • genomic bins 0
  • makebins 0
  • CRAM 0
  • SMN1 0
  • SMN2 0
  • POA 0
  • sniffles 0
  • core 0
  • snippy 0
  • enzyme 0
  • digest 0
  • cooler/balance 0
  • subcontigs 0
  • dbnsfp 0
  • predictions 0
  • SNPs 0
  • invariant 0
  • constant 0
  • partition histograms 0
  • rRNA 0
  • ribosomal RNA 0
  • target 0
  • export 0
  • flagstat 0
  • ligation junctions 0
  • genetic 0
  • deletions 0
  • insertions 0
  • tandem duplications 0
  • CoPRO 0
  • GRO-cap 0
  • PRO-cap 0
  • CAGE 0
  • NETCAGE 0
  • RAMPAGE 0
  • csRNA-seq 0
  • STRIPE-seq 0
  • PRO-seq 0
  • GRO-seq 0
  • picard/renamesampleinvcf 0
  • faqcs 0
  • exclude 0
  • variant identifiers 0
  • str 0
  • indep 0
  • indep pairwise 0
  • recode 0
  • whole genome association 0
  • identifiers 0
  • scoring 0
  • cache 0
  • variant genetic 0
  • sortvcf 0
  • porechop_abi 0
  • pbp 0
  • pairtools 0
  • pairstools 0
  • restriction fragments 0
  • select 0
  • duplexumi 0
  • public 0
  • paragraph 0
  • graphs 0
  • pbbam 0
  • pbmerge 0
  • subreads 0
  • pair-end 0
  • liftovervcf 0
  • read 0
  • pedigrees 0
  • ENA 0
  • motif 0
  • prophage 0
  • identification 0
  • SRA 0
  • ANI 0
  • hybrid-selection 0
  • mate-pair 0
  • pmdtools 0
  • percent on target 0
  • read distribution 0
  • subsampling 0
  • long uncorrected reads 0
  • rhocall 0
  • R 0
  • escherichia coli 0
  • bamstat 0
  • depth information 0
  • strandedness 0
  • experiment 0
  • read_pairs 0
  • fragment_size 0
  • inner_distance 0
  • structural variation 0
  • duphold 0
  • PEP 0
  • sequence-based 0
  • mapping-based 0
  • segment 0
  • integrity 0
  • rtg 0
  • blastx 0
  • pedfilter 0
  • rocplot 0
  • rtg-tools 0
  • salsa 0
  • salsa2 0
  • neighbour-joining 0
  • quast 0
  • endogenous DNA 0
  • circos 0
  • Streptococcus pyogenes 0
  • swissprot 0
  • genbank 0
  • contact 0
  • pretext 0
  • jpg 0
  • bmp 0
  • contact maps 0
  • gene finding 0
  • embl 0
  • split by chromosome 0
  • deletion 0
  • genomic intervals 0
  • schema 0
  • normal database 0
  • panel of normals 0
  • cutoff 0
  • eklipse 0
  • haplotype purging 0
  • duplicate purging 0
  • false duplications 0
  • assembly curation 0
  • Haplotype purging 0
  • eigenstratdatabasetools 0
  • False duplications 0
  • Assembly curation 0
  • pep 0
  • purging 0
  • integron 0

Post-processing script of the MaltExtract component of the HOPS package

json summary_pdf tsv candidate_pdfs versions

Normalize antibiotic resistance genes (ARGs) using the ARO ontology (developed by CARD).

tsv versions

Arriba is a command-line tool for the detection of gene fusions from RNA-Seq data.

meta bam meta2 fasta meta3 gtf meta4 blacklist meta5 known_fusions meta6 structural_variants meta7 tags meta8 protein_domains

meta versions fusions fusions_fail

Adds imputation information metrics to the INFO field based on selected FORMAT tags. Only the IMPUTE2 INFO metric from FORMAT/GP tags is currently available.

vcf tbi csi versions

bcftools:

BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed. Most commands accept VCF, bgzipped VCF and BCF with filetype detected automatically even when streaming from a pipe. Indexed VCF and BCF will work in all situations. Un-indexed VCF and BCF and streams will work in most, but not all situations.

bcftools plugin impute-info:

Bcftools plugins are tools that can be used with bcftools to manipulate variant calls in Variant Call Format (VCF) and BCF. The impute-info plugin adds imputation information metrics to the INFO field based on selected FORMAT tags. Only the IMPUTE2 INFO metric from FORMAT/GP tags is currently available.
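As a rough illustration of what an IMPUTE2-style INFO metric measures, the sketch below computes it from per-sample genotype probabilities (the kind of values stored in FORMAT/GP). It assumes the commonly cited IMPUTE2 definition of the info score; the function name `impute2_info` is hypothetical, and the plugin's actual implementation may differ in details such as missing-data handling.

```python
def impute2_info(gps):
    """IMPUTE2-style info score for one biallelic site (a sketch, not the plugin's code).

    gps: list of (P(0/0), P(0/1), P(1/1)) tuples, one per sample.
    """
    n = len(gps)
    if n == 0:
        raise ValueError("need at least one sample")
    # Expected alternate-allele dosage and expected squared dosage per sample.
    e = [p1 + 2.0 * p2 for _, p1, p2 in gps]
    f = [p1 + 4.0 * p2 for _, p1, p2 in gps]
    # Estimated alternate-allele frequency across samples.
    theta = sum(e) / (2.0 * n)
    # Assumed convention: a monomorphic site gets a score of 1.
    if theta <= 0.0 or theta >= 1.0:
        return 1.0
    # Sum of per-sample dosage variances, compared with the binomial expectation.
    variance = sum(fi - ei * ei for ei, fi in zip(e, f))
    return 1.0 - variance / (2.0 * n * theta * (1.0 - theta))


certain = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 0.0)]
uninformative = [(0.25, 0.5, 0.25), (0.25, 0.5, 0.25)]
print(impute2_info(certain), impute2_info(uninformative))
```

Under this definition, fully certain genotypes score 1, while probabilities that merely restate Hardy-Weinberg proportions at the estimated allele frequency score 0.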

Converts between similar tags, such as GL,PL,GP or QR,QA,QS, or localized alleles, e.g. LPL,LAD.

vcf tbi csi versions

view:

Converts between similar tags, such as GL,PL,GP or QR,QA,QS, or localized alleles, e.g. LPL,LAD.
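To make the relationship between these tags concrete, here is a minimal sketch of how GL values (log10-scaled genotype likelihoods) relate to PL (phred-scaled, rounded to integers) and to GP (probabilities). It assumes a flat prior for GP and the common convention of shifting PL so the best genotype is 0; the function names are hypothetical and this is not necessarily how the plugin performs the conversion.

```python
def gl_to_pl(gl):
    """Phred-scale log10 likelihoods and shift so the best genotype has PL 0."""
    raw = [-10.0 * g for g in gl]
    best = min(raw)
    return [int(round(r - best)) for r in raw]


def gl_to_gp(gl):
    """Turn log10 likelihoods into probabilities, assuming a flat prior."""
    best = max(gl)
    # Subtract the maximum before exponentiating to avoid underflow.
    weights = [10.0 ** (g - best) for g in gl]
    total = sum(weights)
    return [w / total for w in weights]


gl = [0.0, -1.0, -2.0]
print(gl_to_pl(gl))
print(gl_to_gp(gl))
```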

Locate and tag duplicate reads in a BAM file

bam metrics versions

biobambam:

biobambam is a set of tools for early stage alignment file processing.
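The general idea behind duplicate marking can be sketched as follows: reads that share the same alignment coordinates are grouped, and all but the best-scoring read in each group are flagged. This is a deliberately simplified illustration with a hypothetical `Read` record and `mark_duplicates` function; it is not biobambam's algorithm, and real duplicate markers typically also take into account information such as mate position, clipping, and library.

```python
from collections import namedtuple

# Hypothetical, simplified read record used only for this illustration.
Read = namedtuple("Read", "name chrom pos strand qual")


def mark_duplicates(reads):
    """Return {read name: True if flagged as duplicate} for a list of reads."""
    best = {}
    for r in reads:
        key = (r.chrom, r.pos, r.strand)
        # Keep the highest-quality read per coordinate group; ties keep the first seen.
        if key not in best or r.qual > best[key].qual:
            best[key] = r
    return {r.name: best[(r.chrom, r.pos, r.strand)] is not r for r in reads}


reads = [
    Read("a", "chr1", 100, "+", 30),
    Read("b", "chr1", 100, "+", 40),
    Read("c", "chr1", 200, "+", 20),
]
print(mark_duplicates(reads))
```

Flagging rather than removing duplicates keeps the original reads in the file, so downstream tools can decide whether to ignore them.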

Merge a list of sorted BAM files

bam bam_index checksum versions

biobambam:

biobambam is a set of tools for early stage alignment file processing.

Parallel sorting and duplicate marking

0101

bam bam_index cram metrics versions

biobambam:

biobambam is a set of tools for early stage alignment file processing.

Re-estimate taxonomic abundance of metagenomic samples analyzed by kraken.

010

reports txt versions

bracken:

Bracken (Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.

Extends a Kraken2 database to be compatible with Bracken

01

db bracken_files versions

bracken:

Bracken (Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.

Combine output of metagenomic samples analyzed by bracken.

01

txt versions

bracken:

Bracken (Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.

Taxonomic classification of long DNA sequences and metagenome assembled genomes (e.g. MAGs / bins).

0101

txt versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Taxonomic classification of long DNA sequences and metagenome assembled genomes (e.g. MAGs / bins).

0101010101

orf2lca bin2classification log diamond faa gff versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Taxonomic classification of long DNA sequences and metagenome assembled genomes (e.g. contigs, MAGs / bins).

0101010101

orf2lca contig2classification log diamond faa gff versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Downloads the required files for either Nr or GTDB for building into a CAT database

01

rawdb versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Creates a CAT_pack database based on input FASTAs

01000

db taxonomy versions versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Taxonomic classification plus read-based abundance estimation from long DNA sequences and metagenome assembled genomes (e.g. contigs, MAGs / bins).

0101010101001010101010101

rat_log complete_abundance contig_abundance read2classification alignment_diamond contig2classification cat_log orf2lca faa gff unmapped_diamond unmapped_fasta unmapped2classification versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Summarises results from CAT/BAT/RAT classification steps

0101

txt versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Build centrifuge database for taxonomic profiling

010000

cf versions

centrifuge:

Classifier for metagenomic sequences

Classifies metagenomic sequence data

01000

report results sam fastq_mapped fastq_unmapped versions

centrifuge:

Centrifuge is a classifier for metagenomic sequences.

Creates Kraken-style reports from centrifuge out files

010

kreport versions

centrifuge:

Centrifuge is a classifier for metagenomic sequences.

CheckM provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes.

0100

checkm_output marker_file checkm_tsv versions

checkm:

Assess the quality of microbial genomes recovered from isolates, single cells, and metagenomes.

CheckM provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes.

01230

output fasta versions

checkm:

Assess the quality of microbial genomes recovered from isolates, single cells, and metagenomes.

CheckM2 database download

0

database versions

checkm2:

CheckM2 - Rapid assessment of genome bin quality using machine learning

CheckM2 bin quality prediction

0101

checkm2_output checkm2_tsv versions

checkm2:

CheckM2 - Rapid assessment of genome bin quality using machine learning

Construct the database necessary for checkv's quality assessment

NO input

checkv_db versions

checkv:

Assess the quality of metagenome-assembled viral genomes.

Assess the quality of metagenome-assembled viral genomes.

010

quality_summary completeness contamination complete_genomes proviruses viruses versions

checkv:

Assess the quality of metagenome-assembled viral genomes.

Construct the database necessary for checkv's quality assessment

010

checkv_db versions

checkv:

Assess the quality of metagenome-assembled viral genomes.

Performs preprocessing and alignment of chromatin fastq files to fasta reference files using chromap.

0101010000

bed bam tagAlign pairs versions

chromap:

Fast alignment and preprocessing of chromatin profiles

Binning of metagenomic sequences

01

fasta bins fm index links result versions

A tool to raise the quality of viral genomes assembled from short-read metagenomes via resolving and joining of contigs fragmented during de novo assembly.

01010101000

self_circular extended_circular extended_partial extended_failed orphan_end all_cobra_assemblies joining_summary log versions

cobra-meta:

COBRA is a tool to get higher quality viral genomes assembled from metagenomes.

Unsupervised binning of metagenomic contigs by using nucleotide composition - kmer frequencies - and coverage data for multiple samples

012

args_txt clustering_csv log_txt original_data_csv pca_components_csv pca_transformed_csv versions

concoct:

Clustering cONtigs with COverage and ComposiTion

Calculate confidence scores from Kraken2 output

010

score versions

Calculates peak-to-trough ratio (PTR) from metagenomic sequence data

01

ptr versions

coptr:

Accurate and robust inference of microbial growth dynamics from metagenomic sequencing reads.

Computes the coverage map along the reference genome

01

coverage versions

coptr:

Accurate and robust inference of microbial growth dynamics from metagenomic sequencing reads.

Indexes a directory of fasta files for use with CoPTR

01

index_dir versions

coptr:

Accurate and robust inference of microbial growth dynamics from metagenomic sequencing reads.

Maps the reads to the reference database

0101

bam versions

coptr:

Accurate and robust inference of microbial growth dynamics from metagenomic sequencing reads.

Merge reads that were mapped to multiple indices

01

bam versions

coptr:

Accurate and robust inference of microbial growth dynamics from metagenomic sequencing reads.

Map reads to contigs and estimate coverage

010100

coverage versions

coverm:

CoverM aims to be a configurable, easy to use and fast DNA read coverage and relative abundance calculator focused on metagenomics applications

A deep learning based approach to predict Antibiotic Resistance Genes (ARGs) from metagenomes

NO input

db versions

deeparg:

A deep learning based approach to predict Antibiotic Resistance Genes (ARGs) from metagenomes

A deep learning based approach to predict Antibiotic Resistance Genes (ARGs) from metagenomes

0120

daa daa_tsv arg potential_arg versions

deeparg:

A deep learning based approach to predict Antibiotic Resistance Genes (ARGs) from metagenomes

Performs rapid genome comparisons for a group of genomes and visualizes their relatedness

01

directory versions

drep:

De-replication of microbial genomes assembled from multiple samples

A taxonomic profiler for metagenomic 16S data optimized for error prone long reads.

010

report assignment_report samfile unclassified_fa versions

emu:

Emu is a relative abundance estimator for 16S genomic data.

A tool that takes either fragmented metagenomic data or longer sequences as input and predicts and delivers full-length antibiotic resistance genes as output.

010

log txt hmm hmm_genes orfs orfs_amino contigs contigs_pept filtered filtered_pept fragments trimmed spades metagenome tmp versions

A program that counts sequence occurrences in FASTQ files.

0101

count_matrix stats distribution_plot reads_plot reads_plot_percentage versions

2FAST2Q:

2FAST2Q is ideal for CRISPRi-Seq, and for extracting and counting any kind of information from reads in the fastq format, such as barcodes in Bar-seq experiments. 2FAST2Q can work with sequence mismatches, Phred-score, and be used to find and extract unknown sequences delimited by known sequences. 2FAST2Q can extract multiple features per read using either fixed positions or delimiting search sequences.

Calls consensus sequences from reads with the same unique molecular tag.

0100

bam versions

fgbio:

Tools for working with genomic and high throughput sequencing data.

Groups reads together that appear to have come from the same original molecule. Reads are grouped by template, and then templates are sorted by the 5' mapping positions of the reads from the template, used from earliest mapping position to latest. Reads that have the same end positions are then sub-grouped by UMI sequence. Note: the MQ tag is required on reads with mapped mates. This can be added using samblaster with the optional argument --addMateTags.

010

bam histogram versions

fgbio:

A set of tools for working with genomic and high throughput sequencing data, including UMIs
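The grouping described above can be illustrated with a minimal Python sketch. It is not fgbio's implementation: the tuple layout `(name, start5, end5, umi)` and the function name `group_by_umi` are assumptions made for this example, and UMIs are matched exactly, with no tolerance for sequencing errors in the UMI.

```python
from collections import defaultdict


def group_by_umi(templates):
    """Group templates that share both 5' end positions and the same UMI.

    `templates` is an iterable of (name, start5, end5, umi) tuples.
    This layout is hypothetical and chosen only for illustration.
    """
    groups = defaultdict(list)
    for name, start5, end5, umi in templates:
        # Same end positions first, then sub-group by UMI sequence.
        groups[(start5, end5, umi)].append(name)
    # Return groups ordered from earliest mapping position to latest.
    return [groups[key] for key in sorted(groups)]


templates = [
    ("r1", 100, 250, "ACGT"),
    ("r2", 100, 250, "ACGT"),
    ("r3", 100, 250, "TTGA"),
    ("r4", 90, 240, "ACGT"),
]
families = group_by_umi(templates)
```

Here `r1` and `r2` end up in one group because they share positions and UMI, while `r3` (different UMI) and `r4` (different positions) each form their own group.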

Cluster genome FASTA files by average nucleotide identity

0123

tsv dereplicated_bins versions

Build ganon database using custom reference sequences.

01000

db info versions

ganon:

ganon classifies short DNA sequences against large sets of genomic reference sequences efficiently

Classify FASTQ files against ganon database

010

tre report one all unc log versions

ganon:

ganon classifies short DNA sequences against large sets of genomic reference sequences efficiently

Generate a ganon report file from the output of ganon classify

010

tre versions

ganon:

ganon classifies short DNA sequences against large sets of genomic reference sequences efficiently

Generate a multi-sample report file from the output of ganon report runs

01

txt versions

ganon:

ganon classifies short DNA sequences against large sets of genomic reference sequences efficiently

Apply a score cutoff to filter variants based on a recalibration table. ApplyVQSR performs the second pass in a two-stage process called Variant Quality Score Recalibration (VQSR). Specifically, it applies filtering to the input variants based on the recalibration table produced in the first step by VariantRecalibrator and a target sensitivity value.

012345000

vcf tbi versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.

0100

cram bam crai bai metrics versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.

meta bam fasta fai dict

meta versions output bam_index

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Splits CRAM files efficiently by taking advantage of their container based structure

01

split_crams versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.

01000

output bam_index metrics versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

GECCO is a fast and scalable method for identifying putative novel Biosynthetic Gene Clusters (BGCs) in genomic and metagenomic data using Conditional Random Fields (CRFs).

0120

genes features clusters gbk json versions

gecco:

Biosynthetic Gene Cluster prediction with Conditional Random Fields.

Download geNomad databases and related files

NO input

genomad_db versions

genomad:

Identification of mobile genetic elements

Identify mobile genetic elements present in genomic assemblies

010

aggregated_classification taxonomy provirus compositions calibrated_classification plasmid_fasta plasmid_genes plasmid_proteins plasmid_summary virus_fasta virus_genes virus_proteins virus_summary versions

genomad:

Identification of mobile genetic elements

GTDB-Tk is a software toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB).

010100

summary tree markers msa user_msa filtered failed log warnings versions

gtdbtk:

GTDB-Tk is a software toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB).

Create a tag directory with the HOMER suite

010

tagdir taginfo versions

homer:

HOMER (Hypergeometric Optimization of Motif EnRichment) is a suite of tools for Motif Discovery and next-gen sequencing analysis.

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

DESeq2:

Differential gene expression analysis based on the negative binomial distribution

edgeR:

Empirical Analysis of Digital Gene Expression Data in R

Plot a metagene of cross-link events/sites around various transcriptomic landmarks.

010

tsv versions

icount:

Computational pipeline for analysis of iCLIP data

inStrain is a Python program for analysis of co-occurring genome populations from metagenomes that allows highly accurate genome comparisons, analysis of coverage, microdiversity, and linkage, and sensitive SNP detection with gene localization and synonymous/non-synonymous identification.

01000

profile snvs gene_info genome_info linkage mapping_info scaffold_info versions

instrain:

Calculation of strain-level metrics

Download, extract, and check md5 of iPHoP databases

NO input

iphop_db versions

iphop:

Predict host genus from genomes of uncultivated phages.

Predict phage host using iPHoP

010

iphop_genus iphop_genome iphop_detailed_output versions

iphop:

Predict host genus from genomes of uncultivated phages.

Extract UMI and cell barcodes

010

bam pbi versions

isoseq3:

Iso-Seq - Scalable De Novo Isoform Discovery

Taxonomic classification of metagenomic sequence data using a protein reference database

010

results versions

kaiju:

Fast and sensitive taxonomic classification for metagenomics

Convert Kaiju's tab-separated output file into a tab-separated text file which can be imported into Krona.

010

txt versions

kaiju:

Fast and sensitive taxonomic classification for metagenomics

Summarise Kaiju classification results into a table

0100

summary versions

kaiju:

Fast and sensitive taxonomic classification for metagenomics

Merge two tab-separated output files of Kaiju and Kraken in the column format

0120

merged versions

kaiju:

Fast and sensitive taxonomic classification for metagenomics

Make Kaiju FMI-index file from a protein FASTA file

010

fmi bwt sa versions

kaiju:

Fast and sensitive taxonomic classification for metagenomics

Generate k-mers (sketches) from FASTA/Q sequences

01

outdir info versions

kmcp:

Accurate metagenomic profiling of both prokaryotic and viral populations by pseudo-mapping

Construct KMCP database from k-mer files

01

kmcp log versions

kmcp:

Accurate metagenomic profiling of both prokaryotic and viral populations by pseudo-mapping

Merge search results from multiple databases.

01

result versions

kmcp:

Accurate metagenomic profiling of both prokaryotic and viral populations by pseudo-mapping

Generate taxonomic profile from search results

010

profile versions

kmcp:

Accurate metagenomic profiling of both prokaryotic and viral populations by pseudo-mapping

Search sequences against database

010

result versions

kmcp:

Accurate metagenomic profiling of both prokaryotic and viral populations by pseudo-mapping

Adds fasta files to a Kraken2 taxonomic database

010000

db versions

kraken2:

Kraken2 is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.

Builds Kraken2 database

010

db versions

kraken2:

Kraken2 is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.

Downloads and builds Kraken2 standard database

0

db versions

kraken2:

Kraken2 is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.

Classifies metagenomic sequence data

01000

classified_reads_fastq unclassified_reads_fastq classified_reads_assignment report versions

kraken2:

Kraken2 is a taxonomic sequence classifier that assigns taxonomic labels to sequence reads

Takes multiple kraken-style reports and combines them into a single report file

01

txt versions

krakentools:

KrakenTools is a suite of scripts to be used for post-analysis of Kraken/KrakenUniq/Kraken2/Bracken results. Please cite the relevant paper if using KrakenTools with any of the listed programs.

Extract reads classified at any user-specified taxonomy IDs.

0010101

extracted_kraken2_reads versions

krakentools:

KrakenTools is a suite of scripts to be used for post-analysis of Kraken/KrakenUniq/Kraken2/Bracken results. Please cite the relevant paper if using KrakenTools with any of the listed programs.

Takes a Kraken report file and prints out a krona-compatible TEXT file

01

txt versions

krakentools:

KrakenTools is a suite of scripts to be used for post-analysis of Kraken/KrakenUniq/Kraken2/Bracken results. Please cite the relevant paper if using KrakenTools with any of the listed programs.

Download and build (custom) KrakenUniq databases

01230

db versions

krakenuniq:

Metagenomics classifier with unique k-mer counting for more specific results

Download KrakenUniq databases and related files

0

output versions

krakenuniq:

Metagenomics classifier with unique k-mer counting for more specific results

Classifies metagenomic sequence data using unique k-mer counts

012000000

classified_reads unclassified_reads classified_assignment report versions

krakenuniq:

Metagenomics classifier with unique k-mer counting for more specific results

Creates a Krona chart from text files listing quantities and lineages.

01

html versions

krona:

Krona Tools is a set of scripts to create Krona charts from several Bioinformatics tools as well as from text and XML files.

lima - The PacBio Barcode Demultiplexer and Primer Remover

010

counts report summary versions bam pbi fasta fastagz fastq fastqgz xml json clips guess

LongPhase is an ultra-fast program for simultaneously co-phasing SNPs, small indels, large SVs, and (5mC) modifications for Nanopore and PacBio platforms.

0123450101

bam log versions

longphase:

LongPhase is an ultra-fast program for simultaneously co-phasing SNPs, small indels, large SVs, and (5mC) modifications for Nanopore and PacBio platforms.

Identifies LTR retrotransposons using LTR_retriever

meta genome harvest finder mgescan non_tgca

meta log pass_list pass_list_gff ltrlib annotation_out annotation_gff versions

LTR_retriever:

Sensitive and accurate identification of LTR retrotransposons

A tool that mines antimicrobial peptides (AMPs) from (meta)genomes by predicting peptides from genomes (provided as contigs) and outputs all the predicted anti-microbial peptides found.

01

smorfs all_orfs amp_prediction readme_file log_file versions

macrel:

A pipeline for AMP (antimicrobial peptide) prediction

MALT, an acronym for MEGAN alignment tool, is a sequence alignment and analysis tool designed for processing high-throughput sequencing data, especially in the context of metagenomics.

0000

index versions log

malt:

A tool for mapping metagenomic data

MALT, an acronym for MEGAN alignment tool, is a sequence alignment and analysis tool designed for processing high-throughput sequencing data, especially in the context of metagenomics.

010

rma6 alignments log versions

malt:

A tool for mapping metagenomic data

Tool for evaluation of MALT results for true positives of ancient metagenomic taxonomic screening

0100

results versions

MaxBin is software for clustering metagenomic contigs

0123

binned_fastas summary abundance log marker_counts unbinned_fasta tooshort_fasta marker_bins marker_genes versions

Staging module for MCMICRO transforming Imaging Mass Cytometry .txt files to .tif files with OME-XML metadata. Includes optional hot pixel removal.

01

tif versions

mcstaging:

Staging modules for MCMICRO

Staging module for MCMICRO transforming PhenoImager .tif files into stacked and normalized ome-tif files per cycle, compatible as ASHLAR input.

01

tif versions

mcstaging:

Staging modules for MCMICRO

An ultra-fast metagenomic assembler for large and complex metagenomics

012

contigs k_contigs addi_contigs local_contigs kfinal_contigs log versions

pigz:

Parallel implementation of the gzip algorithm.

Performs taxonomic profiling of long metagenomic reads against the melon database

0100

tsv_output json_output log versions

Depth computation per contig step of metabat2

012

depth versions

metabat2:

Metagenome binning

Metagenome binning of contigs

012

tooshort lowdepth unbinned membership fasta versions

metabat2:

Metagenome binning

Annotation of eukaryotic metagenomes using MetaEuk

010

faa codon tsv gff versions

metaeuk:

MetaEuk - sensitive, high-throughput gene discovery and annotation for large-scale eukaryotic metagenomics

Strain-level metagenomic assignment

012340

wimp evidence_unknown_species reads2taxon em contig_coverage length_and_id krona versions

metamaps:

MetaMaps is a tool for long-read metagenomic analysis

Maps long reads to a metamaps database

010

classification_res meta_file meta_unmappedreadsLengths para_file versions

metamaps:

MetaMaps is a tool for long-read metagenomic analysis

Metagenome assembler for long-read sequences (HiFi and ONT).

010

contigs log versions

metamdbg:

MetaMDBG: a lightweight assembler for long and accurate metagenomics reads.

Build MetaPhlAn database for taxonomic profiling.

NO input

db versions

metaphlan:

Identify clades (phyla to species) present in the metagenome obtained from a microbiome sample and their relative abundance

Merges output abundance tables from MetaPhlAn4

01

txt versions

metaphlan4:

Identify clades (phyla to species) present in the metagenome obtained from a microbiome sample and their relative abundance

MetaPhlAn is a tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data.

010

profile biom bt2out versions

metaphlan:

Identify clades (phyla to species) present in the metagenome obtained from a microbiome sample and their relative abundance

Merges output abundance tables from MetaPhlAn3

01

txt versions

metaphlan3:

Identify clades (phyla to species) present in the metagenome obtained from a microbiome sample and their relative abundance

MetaPhlAn is a tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data.

010

profile biom bt2out versions

metaphlan3:

Identify clades (phyla to species) present in the metagenome obtained from a microbiome sample and their relative abundance

A tool to estimate bacterial species abundance

0100

results versions

midas:

An integrated pipeline for estimating strain-level genomic variation from metagenomic data

Download the mOTUs database

0

db versions

motus:

The mOTU profiler is a computational tool that estimates relative taxonomic abundance of known and currently unknown microbial community members using metagenomic shotgun sequencing data.

Taxonomic meta-omics profiling using universal marker genes

0100

txt biom versions

motus:

Marker gene-based OTU (mOTU) profiling

Taxonomic meta-omics profiling using universal marker genes

010

out versions

motus:

Marker gene-based operational taxonomic unit (mOTU) profiling

Taxonomic meta-omics profiling using universal marker genes

010

out bam mgc log versions

motus:

Marker gene-based OTU (mOTU) profiling

Calculate metagenome redundancy curve from FASTQ files

meta reads format mode

meta versions npa npc npl npo

Visualise metagenome redundancy curve in PNG format from a single Nonpareil npo file

01

png versions

nonpareil:

Estimate average coverage and create curves for metagenomic datasets

Calculate metagenome redundancy curve from FASTQ files

0100

npa npc npl npo versions

nonpareil:

Estimate average coverage and create curves for metagenomic datasets

Generate summary reports with raw data for Nonpareil NPO curves, including MultiQC compatible JSON/TSV files

01

json tsv csv pdf versions

nonpareil:

Estimate average coverage and create curves for metagenomic datasets

Visualise metagenome redundancy curves in PNG format from multiple Nonpareil npo files in a single image

01

png versions

nonpareil:

Estimate average coverage and create curves for metagenomic datasets

"This package computes informative enrichment and quality measures for ChIP-seq/DNase-seq/FAIRE-seq/MNase-seq data. It can also be used to obtain robust estimates of the predominant fragment length or characteristic tag shift values in these assays."

01

spp pdf rdata versions

phyloFlash is a pipeline to rapidly reconstruct the SSU rRNAs and explore phylogenetic composition of an Illumina (meta)genomic dataset.

0100

results versions

Locate and tag duplicate reads in a BAM file

010101

bam bai cram metrics versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

This tool takes in a coordinate-sorted SAM or BAM and calculates the NM, MD, and UQ tags by comparing with the reference.

0101

bam bai versions

picard:

Java tools for working with NGS data in the BAM format

PRINSEQ++ is a C++ implementation of the prinseq-lite.pl program. It can be used to filter, reformat or trim genomic and metagenomic sequence data

01

good_reads single_reads bad_reads log versions

Calculate interval coverage for each sample. N.B. the tool cannot handle staging files with symlinks; stageInMode should be set to 'link'.

0120

txt png loess_qc_txt loess_txt versions

purecn:

Copy number calling and SNV classification using targeted short read sequencing

Predict antibiotic resistance from protein or nucleotide data

0100

json tsv tmp tool_version db_version versions

rgi:

This tool provides a preliminary annotation of your DNA sequence(s) based upon the data available in The Comprehensive Antibiotic Resistance Database (CARD). Hits to genes tagged with Antibiotic Resistance ontology terms will be highlighted. As CARD expands to include more pathogens, genomes, plasmids, and ontology terms this tool will grow increasingly powerful in providing first-pass detection of antibiotic resistance associated genes. See license at CARD website

Accurate detection of short and long active ORFs using Ribo-seq data

01201

protocol bam_summary read_length_dist metagene_profile_5p metagene_profile_3p metagene_plots psite_offsets pos_wig neg_wig orfs versions

ribotricer:

Python package to detect translating ORF from Ribo-seq data

Calling lowest common ancestors from multi-mapped reads in SAM/BAM/CRAM files

0120

csv json bam versions

sam2lca:

Lowest Common Ancestor on SAM/BAM/CRAM alignment files
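As a rough illustration of what a lowest common ancestor (LCA) call means for a multi-mapped read, the Python sketch below finds the deepest taxon shared by the lineages of all taxa a read maps to. It is not sam2lca's code: the `parent` mapping is a toy taxonomy invented for this example, and the function names are hypothetical. It assumes the taxonomy has a single root.

```python
def lineage(taxon, parent):
    """Return the path from `taxon` up to the root, taxon first."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lowest_common_ancestor(taxa, parent):
    """Return the deepest node present in the lineage of every taxon."""
    taxa = list(taxa)
    common = lineage(taxa[0], parent)
    for taxon in taxa[1:]:
        seen = set(lineage(taxon, parent))
        # Keep only shared ancestors, preserving deepest-first order.
        common = [node for node in common if node in seen]
    return common[0]


# Toy child -> parent taxonomy (an assumption for this example).
parent = {
    "Escherichia coli": "Escherichia",
    "Escherichia albertii": "Escherichia",
    "Escherichia": "Enterobacteriaceae",
    "Salmonella enterica": "Salmonella",
    "Salmonella": "Enterobacteriaceae",
    "Enterobacteriaceae": "root",
}

same_genus = lowest_common_ancestor(
    ["Escherichia coli", "Escherichia albertii"], parent
)
same_family = lowest_common_ancestor(
    ["Escherichia coli", "Salmonella enterica"], parent
)
```

A read hitting two species of one genus is assigned to the genus, while a read hitting species from different genera falls back to their shared family.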

This module combines samtools and samblaster in order to use samblaster's capability to filter or tag SAM files, with the advantage of maintaining both input and output in BAM format. Samblaster input must contain a sequence header; for this reason it is piped from the "samtools view -h" command. Additional arguments for samtools can be passed using options.args2 for the input BAM file and options.args3 for the output BAM file.

01

bam versions

Calculates MD and NM tags

0101

bam versions

samtoolscalmd:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Call peaks using SEACR on sequenced reads in bedgraph format

0120

bed versions

seacr:

SEACR is intended to call peaks and enriched regions from sparse CUT&RUN or chromatin profiling data in which background is dominated by "zeroes" (i.e. regions with no read coverage).

Metagenomic binning with self-supervised learning

012

csv model output_fasta recluster_fasta tsv versions

semibin:

Metagenomic binning with semi-supervised siamese neural network

Apply a score cutoff to filter variants based on a recalibration table. Sentieon's ApplyVarCal performs the second pass in a two-stage process called Variant Quality Score Recalibration (VQSR). Specifically, it applies filtering to the input variants based on the recalibration table produced in the previous step (VarCal) and a target sensitivity value. https://support.sentieon.com/manual/usages/general/#applyvarcal-algorithm

0123450101

vcf tbi versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Sequenza-utils gc_wiggle computes the GC percentage across the sequences, and returns a file in the UCSC wiggle format, given a fasta file and a window size.

01

wig versions

sequenzautils:

Sequenza-utils provides 3 main command line programs to transform common NGS file formats, such as FASTA and BAM, into input files for the Sequenza R package. The gc_wiggle program takes a FASTA file as input, computes the GC percentage across the sequences, and returns a file in the UCSC wiggle format.
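To make the windowed GC computation concrete, here is a minimal Python sketch that turns one sequence into fixedStep wiggle lines. It is not the sequenza-utils implementation and its output layout may differ from the tool's; the function name `gc_wiggle`, the one-decimal rounding, and the choice to drop a trailing partial window are assumptions for this example.

```python
def gc_wiggle(name, sequence, window=50):
    """Return UCSC fixedStep wiggle lines with GC percent per full window."""
    # Wiggle coordinates are 1-based, so the first window starts at 1.
    lines = [f"fixedStep chrom={name} start=1 step={window} span={window}"]
    seq = sequence.upper()
    # Only full windows are reported; a trailing partial window is skipped.
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        lines.append(f"{100.0 * gc / window:.1f}")
    return lines


wiggle = gc_wiggle("chr1", "GGCCAATTGCAT", window=4)
```

With a window of 4, the three windows `GGCC`, `AATT` and `GCAT` give GC values of 100.0, 0.0 and 50.0.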

Calculate pairwise distances and basic clustering from SKA sketches

012

distances cluster_list cluster_files dot versions

ska:

SKA (Split Kmer Analysis)

Create genome sketch using split k-mers

012

skf versions

ska:

SKA (Split Kmer Analysis)

Simple ANI calculation between reference and query genomes.

0101

dist versions

skani:

skani is a fast and robust tool for calculating ANI between metagenome assembled genomes and contigs.

Memory-efficient ANI database queries with skani.

0101

search versions

skani:

skani is a fast and robust tool for calculating ANI between metagenome assembled genomes and contigs.

Storing skani sketches/indices on disk.

01

sketch_dir sketch markers versions

skani:

skani is a fast and robust tool for calculating ANI between metagenome assembled genomes and contigs.

All-to-all ANI computation.

01

triangle versions

skani:

skani is a fast and robust tool for calculating ANI between metagenome assembled genomes and contigs.

Classifies and predicts the origin of metagenomic samples

010000

report versions

Compare many FracMinHash signatures generated by sourmash sketch.

01000

matrix labels csv versions

sourmash:

Compute and compare FracMinHash signatures for DNA and protein data sets.

Search a metagenome sourmash signature against one or many reference databases and return the minimum set of genomes that contain the k-mers in the metagenome.

0100000

result unassigned matches prefetch prefetchcsv versions

sourmash:

Compute and compare FracMinHash signatures for DNA data sets.
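Finding a minimum set of reference genomes that covers the k-mers of a metagenome is a set-cover problem, and a greedy strategy is one common way to approximate it. The sketch below assumes signatures are plain Python sets of hash values; the function name `gather` and the return shape are illustrative and are not sourmash's API.

```python
def gather(query, references):
    """Greedy set-cover sketch.

    query: set of hashes from the metagenome signature.
    references: dict mapping reference name -> set of hashes.
    Returns (matches, unassigned), where matches is a list of
    (reference name, number of newly covered hashes).
    """
    remaining = set(query)
    matches = []
    while remaining:
        best_name, best_overlap = None, 0
        # Iterate in sorted order so ties are broken deterministically
        for name in sorted(references):
            overlap = len(references[name] & remaining)
            if overlap > best_overlap:
                best_name, best_overlap = name, overlap
        if best_name is None:
            break  # no reference covers any of the remaining hashes
        matches.append((best_name, best_overlap))
        remaining -= references[best_name]
    return matches, remaining
```

Each round picks the reference that explains the most still-unexplained hashes, so a genome whose k-mers are already covered by an earlier match is not reported again. Hashes left in `remaining` correspond to the part of the metagenome that no reference accounts for.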

Create a database of sourmash signatures (a group of FracMinHash sketches) to be used as references.

010

signature_index versions

sourmash:

Compute and compare FracMinHash signatures for DNA data sets.

Create a signature (a group of FracMinHash sketches) of a sequence using sourmash

01

signatures versions

sourmash:

Compute and compare FracMinHash signatures for DNA and protein data sets.
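As a rough illustration of what a FracMinHash sketch is: every k-mer is hashed, and only hashes below a fixed fraction (1/scaled) of the hash space are kept. The Python below is a minimal sketch under stated assumptions. The `blake2b` hash is a stand-in and not the hash function sourmash uses, reverse-complement handling is omitted, and the names `kmer_hash`, `frac_minhash`, and `jaccard` are hypothetical.

```python
import hashlib

MAX_HASH = 2 ** 64  # size of the 64-bit hash space


def kmer_hash(kmer):
    # Stand-in 64-bit hash (assumption); sourmash uses a different hash function
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def frac_minhash(seq, k=21, scaled=1000):
    """Keep the hashes of all k-mers that fall in the lowest 1/scaled of the hash space."""
    threshold = MAX_HASH // scaled
    seq = seq.upper()
    sketch = set()
    for i in range(len(seq) - k + 1):
        h = kmer_hash(seq[i:i + k])
        if h < threshold:
            sketch.add(h)
    return sketch


def jaccard(a, b):
    """Jaccard similarity between two sketches (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Because the retention rule depends only on the hash value, two sketches built with the same `k` and `scaled` can be compared directly, regardless of how long the underlying sequences are.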

Annotate list of metagenome members (based on sourmash signature matches) with taxonomic information.

010

result versions

sourmash:

Compute and compare FracMinHash signatures for DNA data sets.

Sylph profile command for taxonomic profiling

010

profile_out versions

sylph:

Sylph quickly enables querying of genomes against even low-coverage shotgun metagenomes to find nearest neighbour ANI.

Sketching/indexing sequencing reads

010

sketch_fastq_genome versions

sylph:

Sylph quickly enables querying of genomes against even low-coverage shotgun metagenomes to find nearest neighbour ANI.

Incorporates taxonomy into the sylph metagenomic classifier

010

taxprof_output versions

sylphtax:

Integrating taxonomic information into the sylph metagenome profiler.

A tool for tagging BAM files.

01

bam versions

Standardise and merge two or more taxonomic profiles into a single table

010000

merged_profiles versions

taxpasta:

TAXonomic Profile Aggregation and STAndardisation

Standardise the output of a wide range of taxonomic profilers

01000

standardised_profile versions

taxpasta:

TAXonomic Profile Aggregation and STAndardisation

Domain-level classification of contigs as bacterial, archaeal, eukaryotic, or organellar

01

classifications log fasta versions

tiara:

Deep-learning-based approach for identification of eukaryotic sequences in the metagenomic data powered by PyTorch.

tidk explore attempts to find the simple telomeric repeat unit in the genome provided. It will report this repeat in its canonical form (e.g. TTAGG -> AACCT).

01

explore_tsv top_sequence versions

tidk:

tidk is a toolkit to identify and visualise telomeric repeats in genomes
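The canonical form mentioned for tidk explore (e.g. TTAGG -> AACCT) is consistent with taking the lexicographically smallest string among all rotations of the repeat unit and of its reverse complement. The Python sketch below implements that rule as an assumption; it is not taken from the tidk source, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_repeat(unit):
    """Smallest rotation (lexicographically) of the unit or of its reverse complement."""
    unit = unit.upper()
    candidates = []
    for strand in (unit, reverse_complement(unit)):
        for i in range(len(strand)):
            # Rotate the repeat unit by i positions
            candidates.append(strand[i:] + strand[:i])
    return min(candidates)


if __name__ == "__main__":
    print(canonical_repeat("TTAGG"))
```

Reporting one representative per repeat means the same telomeric unit is counted together whichever strand or phase it was observed in.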

Searches a genome for a telomere string such as TTAGGG

010

tsv bedgraph versions

tidk:

tidk is a toolkit to identify and visualise telomeric repeats in genomes

Deduplicate reads based on the mapping co-ordinate and the UMI attached to the read.

0120

bam log tsv_edit_distance tsv_per_umi tsv_umi_per_position versions

umi_tools:

UMI-tools contains tools for dealing with Unique Molecular Identifiers (UMIs)/Random Molecular Tags (RMTs) and single cell RNA-Seq cell barcodes
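The idea behind deduplicating on mapping coordinate plus UMI can be sketched in a few lines of Python. This is a simplified illustration, not the UMI-tools algorithm: it only collapses reads whose UMIs match exactly, whereas UMI-tools also accounts for sequencing errors within UMIs. The read dictionaries and their keys are a hypothetical input shape.

```python
def dedup_reads(reads):
    """Keep one read per (chrom, pos, strand, UMI), preferring the highest mapping quality.

    reads: iterable of dicts with keys name, chrom, pos, strand, umi, mapq
    (hypothetical structure for illustration).
    """
    best = {}
    for read in reads:
        key = (read["chrom"], read["pos"], read["strand"], read["umi"])
        # Exact-match grouping only (assumption): no UMI error correction
        if key not in best or read["mapq"] > best[key]["mapq"]:
            best[key] = read
    return list(best.values())
```

Reads that share a position but carry different UMIs are kept as separate molecules, which is the distinction that coordinate-only duplicate marking cannot make.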

Extracts the UMI barcode from a read and adds it to the read name, leaving any sample barcode in place

01

reads log versions

umi_tools:

UMI-tools contains tools for dealing with Unique Molecular Identifiers (UMIs)/Random Molecular Tags (RMTs) and single cell RNA-Seq cell barcodes

Group reads based on their UMI and mapping coordinates

01200

log bam tsv versions

umi_tools:

UMI-tools contains tools for dealing with Unique Molecular Identifiers (UMIs)/Random Molecular Tags (RMTs) and single cell RNA-Seq cell barcodes

Make the output from umi_tools dedup or group compatible with RSEM

012

bam log versions

umi_tools:

UMI-tools contains tools for dealing with Unique Molecular Identifiers (UMIs)/Random Molecular Tags (RMTs) and single cell RNA-Seq cell barcodes

Velocyto is a library for the analysis of RNA velocity. The velocyto.py CLI uses Path(resolve_path=True), which breaks Nextflow's symbolic-link logic. If velocyto finds a file named EXACTLY cellsorted_[ORIGINAL_BAM_NAME] in the work dir, it skips the samtools sort step. The cellsorted BAM file should be sorted by cell barcode with:

    samtools sort -t CB -O BAM -o cellsorted_input.bam input.bam

See module test for an example with the SAMTOOLS_SORT nf-core module. Config example to cellsort input bam using SAMTOOLS_SORT:

    withName: SAMTOOLS_SORT {
        ext.prefix = { "cellsorted_${bam.baseName}" }
        ext.args = '-t CB -O BAM'
    }

An optional mask must be passed through ext.args with the --mask option. Because velocyto looks for the cellsorted file by name, two BAM files (cellsorted and original) need to be staged in the work dir. See also the velocyto tutorial.

01230

loom versions

Extracting sequences that were unbinned by vRhyme into a FASTA file

0101

unbinned_sequences versions

vrhyme:

vRhyme functions by utilizing coverage variance comparisons and supervised machine learning classification of sequence features to construct viral metagenome-assembled genomes (vMAGs).

Linking bins output by vRhyme to create one sequence per bin

01

linked_bins versions

vrhyme:

vRhyme functions by utilizing coverage variance comparisons and supervised machine learning classification of sequence features to construct viral metagenome-assembled genomes (vMAGs).

Binning virus genomes from metagenomes

0101

bins membership summary versions

vrhyme:

vRhyme functions by utilizing coverage variance comparisons and supervised machine learning classification of sequence features to construct viral metagenome-assembled genomes (vMAGs).

Cluster sequences using a single-pass, greedy centroid-based clustering algorithm.

01

aln biom mothur otu bam out blast uc centroids clusters profile msa versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)
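Single-pass, greedy centroid-based clustering can be illustrated as follows: sequences are processed once, in input order, and each one either joins an existing centroid it is similar enough to or becomes a new centroid. The Python below is a sketch under assumptions. The `identity` function uses `difflib` as a stand-in similarity score rather than the alignment-based identity vsearch computes, a sequence joins the first centroid that passes the threshold, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def identity(a, b):
    # Stand-in similarity in [0, 1] (assumption); not an alignment-based identity
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(sequences, threshold):
    """Return a dict mapping each centroid to the list of its cluster members."""
    clusters = {}
    for seq in sequences:
        for centroid in clusters:
            if identity(centroid, seq) >= threshold:
                clusters[centroid].append(seq)
                break
        else:
            # No existing centroid is similar enough: seq starts a new cluster
            clusters[seq] = [seq]
    return clusters
```

Because centroids are fixed as soon as they are created, the result depends on input order, which is why inputs to this kind of clustering are often sorted (for example by abundance or length) beforehand.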

Merge strictly identical sequences contained in filename. Identical sequences are defined as having the same length and the same string of nucleotides (case insensitive, T and U are considered the same).

01

fasta clustering log versions

vsearch:

A versatile open source tool for metagenomics (USEARCH alternative)
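The definition of "strictly identical" given above (same length, same nucleotides, case-insensitive, T and U equivalent) translates directly into a normalised dictionary key. The Python sketch below is illustrative only; the function name, the (header, sequence) tuple input, and sorting the result by decreasing count are assumptions, not vsearch behaviour taken from its documentation.

```python
def dereplicate(records):
    """Merge strictly identical sequences.

    records: iterable of (header, sequence) tuples.
    Returns a list of (first header, first sequence, count), most abundant first.
    """
    groups = {}
    for header, seq in records:
        # Normalise case and treat U as T so RNA and DNA spellings compare equal
        key = seq.upper().replace("U", "T")
        if key in groups:
            groups[key][2] += 1
        else:
            groups[key] = [header, seq, 1]
    merged = [tuple(group) for group in groups.values()]
    # Stable sort: ties keep their order of first appearance
    merged.sort(key=lambda group: -group[2])
    return merged
```

For example, `ACGT`, `acgu`, and a second `ACGT` collapse into one entry with a count of 3, while `ACG` stays separate because its length differs.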

Performs quality filtering and/or conversion of a FASTQ file to FASTA format.

01

fasta log versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Taxonomic classification using the sintax algorithm.

010

tsv versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Sort fasta entries by decreasing abundance (--sortbysize) or sequence length (--sortbylength).

010

fasta versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)
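The two sort orders described above can be sketched in Python. The example assumes abundance is stored in the FASTA header as a `;size=N` annotation and that a header without one counts as abundance 1; the function names and the (header, sequence) tuple input are hypothetical.

```python
import re

# Matches a size annotation at the start of the header or after a semicolon
_SIZE = re.compile(r"(?:^|;)size=(\d+)")


def abundance(header):
    match = _SIZE.search(header)
    # Assumption: entries without an annotation are treated as abundance 1
    return int(match.group(1)) if match else 1


def sort_by_size(records):
    """Sort (header, sequence) tuples by decreasing abundance."""
    return sorted(records, key=lambda rec: abundance(rec[0]), reverse=True)


def sort_by_length(records):
    """Sort (header, sequence) tuples by decreasing sequence length."""
    return sorted(records, key=lambda rec: len(rec[1]), reverse=True)
```

Python's sort is stable, so entries with equal abundance or length keep their input order in this sketch.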

Compare target sequences to fasta-formatted query sequences using global pairwise alignment.

010000

aln biom lca mothur otu sam tsv txt uc versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)
