Available Modules

Modules are the building blocks of all nf-core DSL2 pipelines. If you would like to write your own module, you can find more information on the nf-core website.

  • alignment 70
  • bam 62
  • fasta 27
  • genomics 25
  • cram 25
  • sam 24
  • fastq 20
  • MSA 19
  • index 14
  • map 14
  • genome 13
  • metagenomics 11
  • vcf 9
  • bisulfite 9
  • methylseq 9
  • bisulphite 9
  • reference 8
  • methylation 8
  • 5mC 8
  • sort 7
  • variant calling 7
  • align 7
  • statistics 7
  • graph 7
  • LAST 7
  • msa 7
  • qc 6
  • phylogeny 6
  • metrics 6
  • multiple sequence alignment 6
  • assembly 5
  • merge 5
  • gff 5
  • ancient DNA 5
  • consensus 5
  • structure 5
  • aDNA 5
  • palaeogenomics 5
  • archaeogenomics 5
  • newick 5
  • bismark 5
  • vsearch 5
  • 3-letter genome 5
  • microbiome 5
  • MAF 5
  • filter 4
  • variants 4
  • split 4
  • gfa 4
  • quality 4
  • mapping 4
  • markduplicates 4
  • samtools 4
  • bwa 4
  • feature 4
  • low frequency variant calling 4
  • mem 4
  • malt 4
  • skani 4
  • bed 3
  • structural variants 3
  • quality control 3
  • clustering 3
  • variation graph 3
  • bqsr 3
  • depth 3
  • cluster 3
  • damage 3
  • transcript 3
  • hmmer 3
  • evaluation 3
  • sketch 3
  • duplicates 3
  • clipping 3
  • HMM 3
  • distance 3
  • counts 3
  • view 3
  • nucleotide 3
  • dist 3
  • pseudoalignment 3
  • guide tree 3
  • insert 3
  • pan-genome 3
  • bwameth 3
  • atac-seq 3
  • chip-seq 3
  • database 2
  • bacteria 2
  • gtf 2
  • somatic 2
  • sentieon 2
  • rnaseq 2
  • protein 2
  • stats 2
  • aligner 2
  • gene 2
  • sequence 2
  • hmmsearch 2
  • population genetics 2
  • gff3 2
  • genotyping 2
  • splicing 2
  • bedGraph 2
  • json 2
  • kallisto 2
  • peak-calling 2
  • chromosome 2
  • amplicon sequences 2
  • mask 2
  • structural_variants 2
  • SNP 2
  • indel 2
  • vg 2
  • reformatting 2
  • realignment 2
  • MaltExtract 2
  • fixmate 2
  • recombination 2
  • HOPS 2
  • authentication 2
  • edit distance 2
  • ragtag 2
  • soft-clipped clusters 2
  • gatk4 1
  • coverage 1
  • classification 1
  • nanopore 1
  • variant 1
  • taxonomy 1
  • pacbio 1
  • convert 1
  • proteomics 1
  • trimming 1
  • long reads 1
  • isoseq 1
  • build 1
  • mags 1
  • long-read 1
  • wgs 1
  • picard 1
  • visualisation 1
  • sequences 1
  • taxonomic classification 1
  • matrix 1
  • plot 1
  • base quality score recalibration 1
  • neural network 1
  • scWGBS 1
  • pangenome graph 1
  • WGBS 1
  • DNA methylation 1
  • filtering 1
  • example 1
  • mappability 1
  • biscuit 1
  • germline 1
  • bisulfite sequencing 1
  • machine learning 1
  • peaks 1
  • blast 1
  • rna 1
  • report 1
  • pangenome 1
  • profile 1
  • short-read 1
  • reads 1
  • snp 1
  • single cell 1
  • ont 1
  • cat 1
  • deamination 1
  • amps 1
  • structural 1
  • mpileup 1
  • archaeogenetics 1
  • palaeogenetics 1
  • miscoding lesions 1
  • phylogenetic placement 1
  • SV 1
  • circrna 1
  • paf 1
  • abundance 1
  • clean 1
  • scaffolding 1
  • fusions 1
  • uLTRA 1
  • minimap2 1
  • long_read 1
  • hidden Markov model 1
  • mapper 1
  • subsample 1
  • pileup 1
  • hi-c 1
  • eukaryotes 1
  • fingerprint 1
  • reformat 1
  • amino acid 1
  • maximum likelihood 1
  • demultiplexed reads 1
  • aggregate 1
  • artic 1
  • variation 1
  • duplicate 1
  • pair 1
  • import 1
  • reheader 1
  • rrna 1
  • primer 1
  • homologs 1
  • lofreq 1
  • salmon 1
  • kma 1
  • registration 1
  • image_processing 1
  • blastp 1
  • ancient dna 1
  • dict 1
  • estimation 1
  • orthologs 1
  • splice 1
  • collate 1
  • emboss 1
  • patch 1
  • sintax 1
  • vsearch/sort 1
  • usearch 1
  • long read alignment 1
  • pangenome-scale 1
  • all versus all 1
  • mashmap 1
  • transcroder 1
  • cds 1
  • wavefront 1
  • coding 1
  • downsample bam 1
  • subsample bam 1
  • downsample 1
  • construct 1
  • graph projection to vcf 1
  • iterative model refinement 1
  • eucaryotes 1
  • mapad 1
  • adna 1
  • c to t 1
  • cram-size 1
  • size 1
  • ribosomal RNA 1
  • droplet based single cells 1
  • guidetree 1
  • bwamem2 1
  • bwameme 1
  • pdb 1
  • vsearch/fastqfilter 1
  • fastqfilter 1
  • ATACseq 1
  • shift 1
  • ATACshift 1
  • taxonomic composition 1
  • covariance model 1
  • junction 1
  • phylogenies 1
  • hhsuite 1
  • multi-tool 1
  • probabilistic realignment 1
  • rRNA 1
  • targets 1
  • ANI 1
  • mergebamalignment 1
  • post Post-processing 1
  • sorted 1
  • track 1
  • segment 1
  • duplicate removal 1
  • chromap 1
  • constant 1
  • purging 1
  • neighbour-joining 1
  • hybrid-selection 1
  • induce 1
  • invariant 1
  • SNPs 1
  • snippy 1
  • core 1
  • POA 1
  • amplicon 1
  • paired 1
  • repair 1
  • insert size 1
  • faidx 1
  • calmd 1
  • ampliconclip 1
  • readgroup 1
  • multimapper 1
  • Ancestor 1
  • LCA 1
  • read pairs 1
  • scramble 1
  • cluster analysis 1
  • clusteridentifier 1
  • lofreq/call 1
  • train 1
  • reorder 1
  • spliced 1
  • Hidden Markov Model 1
  • HMMER 1
  • quant 1
  • graphs 1
  • paragraph 1
  • mbias 1
  • methylation bias 1
  • ngm 1
  • SNP table 1
  • NextGenMap 1
  • GATK UnifiedGenotyper 1

Post-processing script of the MaltExtract component of the HOPS package

Outputs: json, summary_pdf, tsv, candidate_pdfs, versions

Run the alignment/variant-call/consensus logic of the artic pipeline

Outputs: results, bam, bai, bam_trimmed, bai_trimmed, bam_primertrimmed, bai_primertrimmed, fasta, vcf, tbi, json, versions

artic:

ARTIC pipeline - a bioinformatics pipeline for working with virus sequencing data sequenced with nanopore

Alignment by Simultaneous Harmonization of Layer/Adjacency Registration

Outputs: tif, versions

This module is used to clip primer sequences from your alignments.

Outputs: bam, bai, versions

Locate and tag duplicate reads in a BAM file

Outputs: bam, metrics, versions

biobambam:

biobambam is a set of tools for early stage alignment file processing.

Merge a list of sorted bam files

Outputs: bam, bam_index, checksum, versions

biobambam:

biobambam is a set of tools for early stage alignment file processing.

Parallel sorting and duplicate marking

Outputs: bam, bam_index, cram, metrics, versions

biobambam:

biobambam is a set of tools for early stage alignment file processing.

A fast, compact one-liner to produce duplicate-marked, sorted, and indexed BAM files using Biscuit

Outputs: bam, bai, versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

samblaster:

samblaster is a fast and flexible program for marking duplicates in read-id grouped paired-end SAM files. It can also optionally output discordant read pairs and/or split read mappings to separate SAM files, and/or unmapped/clipped reads to a separate FASTQ file. By default, samblaster reads SAM input from stdin and writes SAM to stdout.

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Performs alignment of BS-Seq reads using bismark

010101

bam report unmapped versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Removes alignments to the same position in the genome from the Bismark mapping output.

01

bam report versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.
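As a rough illustration of what "removing alignments to the same position" means, the sketch below keeps only the first alignment seen for each (chromosome, start, strand) key. This is a simplified single-end model under stated assumptions: the `deduplicate` function and the dictionary field names are hypothetical and are not Bismark's implementation.

```python
from typing import Dict, List


def deduplicate(alignments: List[Dict]) -> List[Dict]:
    """Keep the first alignment per (chrom, start, strand); drop later ones.

    Hypothetical record layout: each alignment is a dict with the keys
    'chrom', 'start' and 'strand'. Input order is preserved.
    """
    seen = set()
    kept = []
    for aln in alignments:
        key = (aln["chrom"], aln["start"], aln["strand"])
        if key in seen:
            continue  # same position and strand already kept: treat as duplicate
        seen.add(key)
        kept.append(aln)
    return kept


reads = [
    {"name": "r1", "chrom": "chr1", "start": 100, "strand": "+"},
    {"name": "r2", "chrom": "chr1", "start": 100, "strand": "+"},  # duplicate of r1
    {"name": "r3", "chrom": "chr1", "start": 100, "strand": "-"},  # other strand, kept
]
print([r["name"] for r in deduplicate(reads)])
```

For paired-end data a fuller key would also need to account for the mate; that is left out here to keep the sketch short.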

Converts a specified reference genome into two different bisulfite converted versions and indexes them for alignments.

01

index versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.
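The "two different bisulfite converted versions" can be pictured as a C-to-T copy and a G-to-A copy of the reference. The sketch below shows only that string transformation, with a hypothetical helper name (`bisulfite_convert`); it does not build any aligner index and is not Bismark's code.

```python
from typing import Tuple


def bisulfite_convert(seq: str) -> Tuple[str, str]:
    """Return (C->T converted, G->A converted) copies of a DNA sequence."""
    upper = seq.upper()
    c_to_t = upper.replace("C", "T")  # models conversion on the forward strand
    g_to_a = upper.replace("G", "A")  # models conversion on the reverse strand
    return c_to_t, g_to_a


ct, ga = bisulfite_convert("ACGTTCGA")
print(ct)  # ATGTTTGA
print(ga)  # ACATTCAA
```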

Extracts methylation information for individual cytosines from alignments.

0101

bedgraph methylation_calls coverage report mbias versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Collects bismark alignment reports

01234

report versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

BLASTP (Basic Local Alignment Search Tool - Protein) compares an amino acid (protein) query sequence against a protein database

01010

xml tsv csv versions

blast:

BLAST+ is a new suite of BLAST tools that utilizes the NCBI C++ Toolkit.

Construct species phylogenies using BUSCO proteins

01

gene_trees supermatrix versions

busco:

Construct species phylogenies using BUSCO proteins

Performs fastq alignment to a fasta reference using BWA

0101010

bam cram csi crai versions

bwa:

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Performs fastq alignment to a fasta reference using BWA

0101010

sam bam cram crai csi versions

bwa:

BWA-mem2 is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Performs fastq alignment to a fasta reference using BWA-MEME

010101000

sam bam cram crai csi versions

bwameme:

Faster BWA-MEM2 using learned-index

Performs alignment of BS-Seq reads using bwameth

010101

bam versions

bwameth:

Fast and accurate alignment of BS-Seq reads using bwa-mem and a 3-letter genome.

Performs indexing of c2t converted reference genome

01

index versions

bwameth:

Fast and accurate alignment of BS-Seq reads using bwa-mem and a 3-letter genome.

Taxonomic classification plus read-based abundance estimation from long DNA sequences and metagenome assembled genomes (e.g. contigs, MAGs / bins).

0101010101001010101010101

rat_log complete_abundance contig_abundance read2classification alignment_diamond contig2classification cat_log orf2lca faa gff unmapped_diamond unmapped_fasta unmapped2classification versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Cluster protein sequences using sequence similarity

01

fasta clusters versions

cdhit:

Clusters and compares protein or nucleotide sequences

Cluster nucleotide sequences using sequence similarity

01

fasta clusters versions

cdhit:

Clusters and compares protein or nucleotide sequences

Cellsnp-lite is a C/C++ tool for efficient genotyping of bi-allelic SNPs in single cells. Use mode A of cellsnp-lite after read alignment to obtain SNP x cell pileup matrices of UMI or read counts for each allele of given or detected SNPs in droplet-based single-cell data.

01234

base cell sample allele_depth depth_coverage depth_other versions

cellsnp:

Efficient genotyping of bi-allelic SNPs on single cells

Performs preprocessing and alignment of chromatin fastq files to fasta reference files using chromap.

0101010000

bed bam tagAlign pairs versions

chromap:

Fast alignment and preprocessing of chromatin profiles

Indexes a fasta reference genome ready for chromatin profiling.

01

index versions

chromap:

Fast alignment and preprocessing of chromatin profiles

ClipKIT is a fast and flexible alignment trimming tool that keeps phylogenetically informative sites and removes those that display characteristics of poor phylogenetic signal.

010

clipkit log versions

Predict recombination events in bacterial genomes

012

emsim em status newick fasta pos_ref versions

Align sequences using Clustal Omega

010100000

alignment versions

clustalo:

Latest version of Clustal: a multiple sequence alignment program for DNA or proteins

pigz:

Parallel implementation of the gzip algorithm.

Renders a guidetree in clustalo

01

tree versions

clustalo:

Latest version of Clustal: a multiple sequence alignment program for DNA or proteins

Make a transcript/gene mapping from a GTF and cross-reference with transcript quantifications.

0101000

tx2gene versions

custom:

"Custom module to create a transcript to gene mapping from a GTF and check it against transcript quantifications"

This tool filters alignments in a BAM/CRAM file according to the specified parameters.

012

bam logs versions

deeptools:

A set of user-friendly tools for normalization and visualization of deep-sequencing data

This tool takes an alignment of reads or fragments as input (BAM file) and generates a coverage track (bigWig or bedGraph) as output.

01200

bigwig bedgraph versions

deeptools:

A set of user-friendly tools for normalization and visualization of deep-sequencing data

Transforms the input alignments to a format suitable for the deep neural network variant caller

012301010101

examples gvcf small_model_calls versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

Calculate clusters of highly similar sequences

01

tsv versions

diamond:

Accelerated BLAST compatible local sequence aligner

Performs fastq alignment to a reference using DRAGMAP

0101010

sam bam cram crai csi log versions

dragmap:

Dragmap is the Dragen mapper/aligner Open Source Software.

Export assembly segment sequences in GFA 1.0 format to FASTA format

01

fasta versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Filter features in gzipped BED format

01

bed versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Filter features in gzipped GFF3 format

01

gff3 versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Split features in gzipped BED format

01

bed versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Split features in gzipped GFF3 format

01

gff3 versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

cons calculates a consensus sequence from a multiple sequence alignment. To obtain the consensus, the sequence weights and a scoring matrix are used to calculate a score for each amino acid residue or nucleotide at each position in the alignment.

01

consensus versions

emboss:

The European Molecular Biology Open Software Suite
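To make the idea of a per-column consensus concrete, here is a deliberately simplified sketch that takes the most frequent non-gap character in each alignment column. It is an assumption-laden stand-in: EMBOSS cons uses sequence weights and a scoring matrix as described above, whereas the hypothetical `plurality_consensus` below is an unweighted vote.

```python
from collections import Counter
from typing import List


def plurality_consensus(alignment: List[str], gap: str = "-") -> str:
    """Unweighted per-column vote over an aligned set of equal-length rows.

    Ties are broken in favour of the character seen first in the column.
    A column containing only gaps yields a gap.
    """
    if not alignment:
        return ""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned rows must have the same length")

    consensus = []
    for column in zip(*alignment):
        counts = Counter(ch for ch in column if ch != gap)
        consensus.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(consensus)


msa = ["ACGT-", "ACGA-", "TCGTA"]
print(plurality_consensus(msa))  # ACGTA
```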

Splits an alignment into reference and query parts

012

query reference versions

epang:

Massively parallel phylogenetic placement of genetic sequences

Aligns sequences using FAMSA

01010

alignment versions

famsa:

Algorithm for large-scale multiple sequence alignments

Renders a guidetree in famsa

01

tree versions

famsa:

Algorithm for large-scale multiple sequence alignments

Alignment-free computation of Average Nucleotide Identity (ANI)

010

ani versions

Produces a Newick format phylogeny from a multiple sequence alignment. Capable of bacterial genome size alignments.

0

phylogeny versions

Creates a database for Foldmason.

01

db versions

foldmason:

Multiple Protein Structure Alignment at Scale with FoldMason

Aligns protein structures using foldmason

01010

msa_3di msa_aa versions

foldmason:

Multiple Protein Structure Alignment at Scale with FoldMason

Renders a visualization report using foldmason

01010101

html versions

foldmason:

Multiple Protein Structure Alignment at Scale with FoldMason

Performs local realignment around indels to correct for mapping errors

012301010101

bam versions

gatk:

The full Genome Analysis Toolkit (GATK) framework, license restricted.

Generates a list of locations that should be considered for local realignment prior to genotyping.

01201010101

intervals versions

gatk:

The full Genome Analysis Toolkit (GATK) framework, license restricted.

Merge unmapped with mapped BAM files

0120101

bam versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Performs fastq alignment to a fasta reference using gem3-mapper

01010

bam versions

gem3:

The GEM indexer (v3).

Gubbins (Genealogies Unbiased By recomBinations In Nucleotide Sequences) is an algorithm that iteratively identifies loci containing elevated densities of base substitutions while concurrently constructing a phylogeny based on the putative point mutations outside of these regions.

0

fasta gff vcf stats phylip embl_predicted embl_branch tree tree_labelled versions

Reformat a Multiple Sequence Alignment (MSA) file

0100

msa versions

hhsuite:

HH-suite3 for fast remote homology detection and deep protein annotation

Align RNA-Seq reads to a reference with HISAT2

010101

bam summary fastq versions

hisat2:

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.

Builds HISAT2 index for reference genome

010101

index versions

hisat2:

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.

Extracts splice sites from a GTF file

01

txt versions

hisat2:

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.

Mask multiple sequence alignments

012345670

maskedaln fmask_rf fmask_all gmask_rf gmask_all pmask_rf pmask_all versions

hmmer:

Biosequence analysis using profile hidden Markov models

hmmalign from the HMMER suite aligns a number of sequences to an HMM profile

010

sto versions

hmmer:

Biosequence analysis using profile hidden Markov models

Create an HMM profile from a multiple sequence alignment

010

hmm hmmbuildout versions

hmmer:

Biosequence analysis using profile hidden Markov models

Search profile(s) against a sequence database

012345

output alignments target_summary domain_summary versions

hmmer:

Biosequence analysis using profile hidden Markov models

Iterative searches to detect distant homologs by refining an HMM profile from hits

012345

output alignments target_summary domain_summary versions

hmmer:

Biosequence analysis using profile hidden Markov models

Create a tag directory with the HOMER suite

010

tagdir taginfo versions

homer:

HOMER (Hypergeometric Optimization of Motif EnRichment) is a suite of tools for Motif Discovery and next-gen sequencing analysis.

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

DESeq2:

Differential gene expression analysis based on the negative binomial distribution

edgeR:

Empirical Analysis of Digital Gene Expression Data in R

Search covariance models against a sequence database

01200

output alignments target_summary versions

infernal:

Infernal is for searching DNA sequence databases for RNA structure and sequence similarities.

Produces a Newick format phylogeny from a multiple sequence alignment using the maximum likelihood algorithm. Capable of bacterial genome size alignments.

012000000000000

phylogeny report mldist lmap_svg lmap_eps lmap_quartetlh sitefreq_out bootstrap state contree nex splits suptree alninfo partlh siteprob sitelh treels rate mlrate exch_matrix log versions

Aligns sequences using kalign

010

alignment versions

kalign:

Kalign is a fast and accurate multiple sequence alignment algorithm.

Computes equivalence classes for reads and quantifies abundances

01010000

results json_info log versions

kallisto:

Quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads.

This module wraps the index module of the KMA alignment tool.

01

index versions

kma:

Rapid and precise alignment of raw reads against redundant databases with KMA

Makes a dotplot (Oxford Grid) of pair-wise sequence alignments

0120100

gif png versions

last:

LAST finds & aligns related regions of sequences.

Prepare sequences for subsequent alignment with lastal.

01

index versions

last:

LAST finds & aligns related regions of sequences.

Converts MAF alignments to another format.

012010101

axt_gz bam blast_gz blasttab_gz chain_gz cram gff_gz html_gz psl_gz sam_gz tab_gz versions

last:

LAST finds & aligns related regions of sequences.

Reorder alignments in a MAF file

01

maf versions

last:

LAST finds & aligns related regions of sequences.

Post-alignment masking

01

maf versions

last:

LAST finds & aligns related regions of sequences.

Find split or spliced alignments in a MAF file

01

maf multiqc versions

last:

LAST finds & aligns related regions of sequences.

Find suitable score parameters for sequence alignment

010

param_file multiqc versions

last:

LAST finds & aligns related regions of sequences.

Align sequences using learnMSA

01

alignment versions

learnmsa:

learnMSA: Learning and Aligning large Protein Families

Lofreq subcommand to insert base and indel alignment qualities

010

bam versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Lofreq subcommand to call low frequency variants from alignments

0120

vcf versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Lofreq subcommand to call low frequency variants from alignments when tumor-normal paired samples are available

0123450101

vcf versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Lofreq subcommand to call low frequency variants from alignments when tumor-normal paired samples are available

0101

bam versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Peak calling of enriched genomic regions of ChIP-seq and ATAC-seq experiments

0120

peak xls versions gapped bed bdg

macs2:

Model Based Analysis for ChIP-Seq data

Peak calling of enriched genomic regions of ChIP-seq and ATAC-seq experiments

0120

peak xls versions gapped bed bdg

macs3:

Model Based Analysis for ChIP-Seq data

Multiple sequence alignment using MAFFT

0101010101010

fas versions

pigz:

Parallel implementation of the gzip algorithm.

Multiple sequence alignment using MAFFT

0101010101010

fas versions

mafft:

Multiple alignment program for amino acid or nucleotide sequences based on fast Fourier transform

pigz:

Parallel implementation of the gzip algorithm.

Guide tree rendering using MAFFT

01

tree versions

mafft:

Multiple alignment program for amino acid or nucleotide sequences based on fast Fourier transform

Multiple Sequence Alignment using Graph Clustering

01010

alignment versions

magus:

Multiple Sequence Alignment using Graph Clustering

Multiple Sequence Alignment using Graph Clustering

01

tree versions

magus:

Multiple Sequence Alignment using Graph Clustering

MALT, an acronym for MEGAN alignment tool, is a sequence alignment and analysis tool designed for processing high-throughput sequencing data, especially in the context of metagenomics.

0000

index versions log

malt:

A tool for mapping metagenomic data

MALT, an acronym for MEGAN alignment tool, is a sequence alignment and analysis tool designed for processing high-throughput sequencing data, especially in the context of metagenomics.

010

rma6 alignments log versions

malt:

A tool for mapping metagenomic data

Tool for evaluation of MALT results for true positives of ancient metagenomic taxonomic screening

0100

results versions

Map short-reads to an indexed reference genome

01010000000

bam versions

mapad:

An aDNA aware short-read mapper

Extracts per-base methylation metrics from alignments

01200

bedgraph methylkit versions

methyldackel:

Methylation caller from MethylDackel, a (mostly) universal methylation extractor for methyl-seq experiments.

Generates methylation bias plots from alignments

01200

txt versions

methyldackel:

Read position methylation bias tools from MethylDackel, a (mostly) universal extractor for methyl-seq experiments.

Provides fasta index required by minimap2 alignment.

01

index versions

minimap2:

A versatile pairwise aligner for genomic and spliced nucleotide sequences.

Provides fasta index required by miniprot alignment.

01

index versions

miniprot:

A versatile pairwise aligner for genomic and protein sequences.

Aligns protein structures using mTM-align

010

alignment structure versions

mTM-align:

Algorithm for structural multiple sequence alignments

pigz:

Parallel implementation of the gzip algorithm.

SNP table generator from GATK UnifiedGenotyper with functionality geared for aDNA

010101010000001

full_alignment info_txt snp_alignment snp_genome_alignment snpstatistics snptable snptable_snpeff snptable_uncertainty structure_genotypes structure_genotypes_nomissing json versions

MUSCLE is a program for creating multiple alignments of amino acid or nucleotide sequences. A range of options are provided that give you the choice of optimizing accuracy, speed, or some compromise between the two

01

aligned_fasta phyi phys clustalw html msf tree log versions

MUSCLE is a program for creating multiple alignments of amino acid or nucleotide sequences. This particular module uses the super5 algorithm for very large alignments. It can permute the guide tree according to a set of flags.

010

alignment versions

muscle -super5:

Muscle v5 is a major re-write of MUSCLE based on new algorithms.

pigz:

Parallel implementation of the gzip algorithm.

Compare multiple runs of long read sequencing data and alignments

01

report_html lengths_violin_html log_length_violin_html n50_html number_of_reads_html overlay_histogram_html overlay_histogram_normalized_html overlay_log_histogram_html overlay_log_histogram_normalized_html total_throughput_html quals_violin_html overlay_histogram_identity_html overlay_histogram_phredscore_html percent_identity_violin_html active_pores_over_time_html cumulative_yield_plot_gigabases_html sequencing_speed_over_time_html stats_txt versions

Performs fastq alignment to a reference using NARFMAP

0101010

bam log versions

narfmap:

narfmap is a fork of the Dragen mapper/aligner Open Source Software.

Performs fastq alignment to a fasta reference using NextGenMap

010

bam versions

bwa:

NextGenMap is a flexible highly sensitive short read mapping tool that handles much higher mismatch rates than comparable algorithms while still outperforming them in terms of runtime

NUCmer is a pipeline for the alignment of multiple closely related nucleotide sequences.

012

delta coords versions

A fast and scalable tool for bacterial pangenome analysis

01

results aln versions

panaroo:

panaroo - an updated pipeline for pangenome investigation

NVIDIA Clara Parabricks GPU-accelerated alignment, sorting, BQSR calculation, and duplicate marking. Note this nf-core module requires files to be copied into the working directory and not symlinked.

01010101010

bam bai cram crai bqsr_table qc_metrics duplicate_metrics versions

parabricks:

NVIDIA Clara Parabricks GPU-accelerated genomics tools

NVIDIA Clara Parabricks GPU-accelerated fast, accurate algorithm for mapping methylated DNA sequence reads to a reference genome, performing local alignment, and producing alignment for different parts of the query sequence

0101010

bam bai qc_metrics bqsr_table duplicate_metrics versions

parabricks:

NVIDIA Clara Parabricks GPU-accelerated genomics tools

Determines the depth in a BAM/CRAM file

0120101

depth binned_depth versions

paragraph:

Graph realignment tools for structural variants

Genotype structural variants using paragraph and grmpy

0123450101

vcf json versions

paragraph:

Graph realignment tools for structural variants

Convert a VCF file to a JSON graph

0101

graph versions

paragraph:

Graph realignment tools for structural variants

Alignment with PacBio's minimap2 frontend

0101

bam versions

pbmm2:

A minimap2 frontend for PacBio native data formats

Cleans the provided BAM, soft-clipping beyond-end-of-reference alignments and setting MAPQ to 0 for unmapped reads

01

bam versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Collects hybrid-selection (HS) metrics for a SAM or BAM file.

01234010101

metrics versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Collect metrics about the insert size distribution of a paired-end library.

01

metrics histogram versions

picard:

Java tools for working with NGS data in the BAM format

Collect multiple metrics from a BAM file

0120101

metrics pdf versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Collect metrics from a RNAseq BAM file

01000

metrics pdf versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Collect metrics about coverage and performance of whole genome sequencing (WGS) experiments.

01201010

metrics versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Checks that all data in the set of input files appear to come from the same individual

01234501

crosscheck_metrics versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Merges multiple BAM files into a single file

01

bam versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Pangenome toolbox for bacterial genomes

01

results aln versions

Run all Portcullis steps in one go

010101

log pass_junctions_bed pass_junctions_tab intron_gff exon_gff spliced_bam spliced_bai versions

portcullis:

Portcullis is a tool that filters out invalid splice junctions from RNA-seq alignment data. It accepts BAM files from various RNA-seq mappers, analyzes splice junctions and removes likely false positives, outputting filtered results in multiple formats for downstream analysis.

Split fasta file by 'N's to aid in self alignment for duplicate purging

01

split_fasta versions

purgedups:

Purge_dups is a package used to purge haplotigs and overlaps in an assembly based on read depth
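Splitting a sequence at runs of 'N' can be sketched in a few lines. The function name `split_on_n` and the `name:start-end` naming of the pieces (1-based, inclusive) are assumptions made for illustration and may not match the headers written by purge_dups.

```python
import re
from typing import List, Tuple


def split_on_n(name: str, sequence: str) -> List[Tuple[str, str]]:
    """Split a sequence into its N-free pieces.

    Returns (piece_name, piece_sequence) pairs. Piece names use a
    hypothetical 'name:start-end' scheme with 1-based inclusive coordinates.
    """
    pieces = []
    for match in re.finditer(r"[^Nn]+", sequence):
        start, end = match.start(), match.end()
        pieces.append((f"{name}:{start + 1}-{end}", match.group()))
    return pieces


for piece_name, piece_seq in split_on_n("chr1", "ACGTNNNNGGCCNA"):
    print(piece_name, piece_seq)
# chr1:1-4 ACGT
# chr1:9-12 GGCC
# chr1:14-14 A
```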

Evaluate alignment data

010

results versions

qualimap:

Qualimap 2 is a platform-independent application written in Java and R that provides both a Graphical User Interface and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.

Evaluate alignment data

012000

results versions

qualimap:

Qualimap 2 is a platform-independent application written in Java and R that provides both a Graphical User Interface and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.

Evaluate alignment data

0101

results versions

qualimap:

Qualimap 2 is a platform-independent application written in Java and R that provides both a Graphical User Interface and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.

Homology-based assembly patching: Make continuous joins and fill gaps in 'target.fa' using sequences from 'query.fa'

01010101

patch_fasta patch_agp patch_components_fasta assembly_alignments target_splits_agp target_splits_fasta qry_rename_agp qry_rename_fasta stderr versions

ragtag:

Fast reference-guided genome assembly scaffolding

Scaffolding is the process of ordering and orienting draft assembly (query) sequences into longer sequences. Gaps (stretches of "N" characters) are placed between adjacent query sequences to indicate the presence of unknown sequence. RagTag uses whole-genome alignments to a reference assembly to scaffold query sequences. RagTag does not alter input query sequence in any way and only orders and orients sequences, joining them with gaps.

010101012

corrected_assembly corrected_agp corrected_stats versions

ragtag:

Fast reference-guided genome assembly scaffolding

Produces a Newick format phylogeny from a multiple sequence alignment using a Neighbour-Joining algorithm. Capable of bacterial genome size alignments.

0

stockholm_alignment phylogeny versions

Calculate pan-genome from annotated bacterial assemblies in GFF3 format

01

results aln versions

Calling lowest common ancestors from multi-mapped reads in SAM/BAM/CRAM files

0120

csv json bam versions

sam2lca:

Lowest Common Ancestor on SAM/BAM/CRAM alignment files

Clips read alignments where they match BED file defined regions

01000

bam stats rejects_bam versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Calculates MD and NM tags

0101

bam versions

samtoolscalmd:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
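For readers unfamiliar with the two tags: NM is the edit distance to the reference and MD is a string encoding the mismatched reference bases. The sketch below computes both for the simplest case only, an ungapped alignment with no clipping, where NM reduces to the mismatch count. The helper `nm_md_ungapped` is hypothetical; the real command works from the CIGAR string and also handles insertions and deletions.

```python
from typing import Tuple


def nm_md_ungapped(read: str, ref: str) -> Tuple[int, str]:
    """Return (NM, MD) for a read aligned to a reference without gaps.

    MD alternates match-run lengths with mismatched reference bases and
    always starts and ends with a number (possibly 0).
    """
    if len(read) != len(ref):
        raise ValueError("ungapped sketch requires equal-length sequences")

    mismatches = 0
    run = 0          # length of the current run of matching bases
    md_parts = []
    for read_base, ref_base in zip(read.upper(), ref.upper()):
        if read_base == ref_base:
            run += 1
        else:
            md_parts.append(str(run))
            md_parts.append(ref_base)  # MD records the reference base
            run = 0
            mismatches += 1
    md_parts.append(str(run))
    return mismatches, "".join(md_parts)


print(nm_md_ungapped("ACGTAC", "ACCTAC"))  # (1, '2C3')
print(nm_md_ungapped("AAGG", "AATT"))      # (2, '2T0T0')
```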

Concatenate BAM or CRAM file

01

bam cram versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Produces a consensus FASTA/FASTQ/PILEUP

01

fasta fastq pileup versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Convert a CRAM file to BAM, or a BAM file to CRAM, and then index the result

0120101

bam cram bai crai versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Produces a histogram or table of coverage per chromosome

0120101

coverage versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

List CRAM Content-ID and Data-Series sizes

01

size versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Create a sequence dictionary file from a FASTA file

01

dict versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Index FASTA file, and optionally generate a file of chromosome sizes

01010

fa fai sizes gzi versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
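The "file of chromosome sizes" is simply one sequence name and its length per record. A minimal sketch of that computation over FASTA lines follows; `chrom_sizes` is a hypothetical helper, it assumes every header line has a non-empty name, and it does not write the byte offsets that a real `.fai` index contains.

```python
from typing import Dict, Iterable


def chrom_sizes(fasta_lines: Iterable[str]) -> Dict[str, int]:
    """Map each FASTA record name (first word of the header) to its length."""
    sizes: Dict[str, int] = {}
    current = None
    for raw in fasta_lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]  # drop the description after the name
            sizes[current] = 0
        elif current is not None:
            sizes[current] += len(line)
    return sizes


fasta = [">chr1 example record", "ACGT", "AC", ">chr2", "GGG"]
for name, size in chrom_sizes(fasta).items():
    print(f"{name}\t{size}")
# chr1	6
# chr2	3
```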

Converts a SAM/BAM/CRAM file to FASTQ

010

fastq interleaved singleton other versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Samtools fixmate is a tool that can fill in information (insert size, cigar, mapq) about paired end reads onto the corresponding other read. Also has options to remove secondary/unmapped alignments and recalculate whether reads are proper pairs.

01

bam cram sam versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Counts the number of alignments in a BAM/CRAM/SAM file for each FLAG type

012

flagstat versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
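Counting alignments per FLAG type amounts to testing each bit of the SAM FLAG field. The sketch below tallies the standard FLAG bits over a list of integer flags; the category names and the `count_flags` helper are illustrative assumptions, and the actual flagstat report groups and labels its categories differently.

```python
from typing import Dict, Iterable

# Bit values as defined for the SAM FLAG field; the labels are informal.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def count_flags(flags: Iterable[int]) -> Dict[str, int]:
    """Count how many records have each FLAG bit set, plus a total."""
    counts = {name: 0 for name in FLAG_BITS.values()}
    counts["total"] = 0
    for flag in flags:
        counts["total"] += 1
        for bit, name in FLAG_BITS.items():
            if flag & bit:
                counts[name] += 1
    return counts


summary = count_flags([99, 147, 4, 1123])
print(summary["total"], summary["paired"], summary["unmapped"], summary["duplicate"])
# 4 3 1 1
```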

Filter/convert SAM/BAM/CRAM file

01

readgroup versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Reports alignment summary statistics for a BAM/CRAM/SAM file

012

idxstats versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Converts FASTQ files to unmapped SAM/BAM/CRAM

01

sam bam cram versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Index SAM/BAM/CRAM file

01

bai csi crai versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

mark duplicate alignments in a coordinate sorted file

0101

bam cram sam versions

samtools:

Tools for dealing with SAM, BAM and CRAM files

Merge BAM or CRAM file

010101

bam cram csi crai versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

BAM

0120

mpileup versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Replace the header in the bam file with the header generated by the command. This command is much faster than replacing the header with a BAM→SAM→BAM conversion.

01

bam versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Collate/Fixmate/Sort/Markdup SAM/BAM/CRAM file

0101

bam cram csi crai metrics versions

samtools_cat:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_collate:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_fixmate:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_sort:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_markdup:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Sort SAM/BAM/CRAM file

0101

bam cram crai csi versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Produces comprehensive statistics from SAM/BAM/CRAM file

01201

stats versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

filter/convert SAM/BAM/CRAM file

0120100

bam cram sam bai csi crai unselected unselected_index versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

The Cluster Analysis tool of Scramble analyses and interprets the soft-clipped clusters found by cluster_identifier

0100

meis_tab dels_tab vcf versions

scramble:

Soft Clipped Read Alignment Mapper

The cluster_identifier tool of Scramble identifies soft clipped clusters

0120

clusters versions

scramble:

Soft Clipped Read Alignment Mapper

A multi-split mapping algorithm for circular RNA, splicing, trans-splicing and fusion detection

0100

alignment trans_alignments multi_bed single_bed versions

segemehl:

A multi-split mapping algorithm for circular RNA, splicing, trans-splicing and fusion detection

Performs fastq alignment to a fasta reference using Sentieon's BWA MEM

01010101

bam_and_bai versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Generate recalibration table and optionally perform base quality recalibration

01201010101010

table table_post recal_alignment csv pdf versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Induce a variation graph in GFA format from alignments in PAF format

012

gfa versions

seqwish:

seqwish implements a lossless conversion from pairwise alignments between sequences to a variation graph encoding the sequences and their alignments.

Severus is a somatic structural variation (SV) caller for long reads (both PacBio and ONT)

01234501

log read_qual breakpoints_double read_alignments read_ids collapsed_dup loh all_vcf all_breakpoints_clusters_list all_breakpoints_clusters all_plots somatic_vcf somatic_breakpoints_clusters_list somatic_breakpoints_clusters somatic_plots versions

Simple ANI calculation between reference and query genomes.

0101

dist versions

skani:

skani is a fast and robust tool for calculating ANI between metagenome assembled genomes and contigs.

Memory-efficient ANI database queries with skani.

0101

search versions

skani:

skani is a fast and robust tool for calculating ANI between metagenome assembled genomes and contigs.

Storing skani sketches/indices on disk.

01

sketch_dir sketch markers versions

skani:

skani is a fast and robust tool for calculating ANI between metagenome assembled genomes and contigs.

All-to-all ANI computation.

01

triangle versions

skani:

skani is a fast and robust tool for calculating ANI between metagenome assembled genomes and contigs.

Linearize and simplify variation graph in GFA format using blocked partial order alignment

01

gfa maf versions

smoove simplifies and speeds calling and genotyping SVs for short reads. It also improves specificity by removing many spurious alignment signals that are indicative of low-level noise and often contribute to spurious calls. Developed by Brent Pedersen.

01230101

vcf versions

smoove:

structural variant calling and genotyping with existing tools, but, smoothly

Performs fastq alignment to a fasta reference using SNAP

0101

bam bai versions

snapaligner:

Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data

Create a SNAP index for reference genome

01234

index versions

snapaligner:

Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data

Core-SNP alignment from Snippy outputs

0120

aln full_aln tab vcf txt versions

snippy:

Rapid bacterial SNP calling and core genome alignments

Rapid haploid variant calling

010

tab csv html vcf bed gff bam bai log aligned_fa consensus_fa consensus_subs_fa raw_vcf filt_vcf vcf_gz vcf_csi txt versions

snippy:

Rapid bacterial SNP calling and core genome alignments

Pairwise SNP distance matrix from a FASTA sequence alignment

01

tsv versions
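A pairwise SNP distance matrix can be sketched in a few lines of Python. The example below assumes an aligned multi-FASTA given as a string, counts only positions where both sequences carry A, C, G or T (an assumption of this sketch), and uses hypothetical names (`parse_fasta`, `snp_distance`, `distance_matrix`) and made-up sequences. It is not the module's implementation.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict of name -> uppercase sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def snp_distance(a, b, valid="ACGT"):
    """Count columns where both bases are valid and differ."""
    return sum(1 for x, y in zip(a, b) if x != y and x in valid and y in valid)


def distance_matrix(records):
    """Return sequence names and the symmetric matrix of pairwise distances."""
    names = list(records)
    matrix = [[snp_distance(records[i], records[j]) for j in names] for i in names]
    return names, matrix


# Hypothetical three-sequence alignment used only for demonstration.
alignment = """>s1
ACGTACGT
>s2
ACGTACGA
>s3
AC-TTCGA
"""
names, matrix = distance_matrix(parse_fasta(alignment))
for name, row in zip(names, matrix):
    print(name, *row, sep="\t")
```

The gap in `s3` is skipped rather than counted as a difference, so `s1` and `s3` differ at two positions.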

Rapidly extracts SNPs from a multi-FASTA alignment.

0

fasta constant_sites versions constant_sites_string
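Extracting SNPs from an alignment amounts to keeping only the columns that vary between sequences. The following minimal sketch assumes equal-length aligned sequences held in a list and treats every character, including gaps, as an ordinary symbol; the function names and input data are hypothetical, and the real tool's handling of gaps and ambiguous bases may differ.

```python
def variable_sites(seqs):
    """Return the 0-based column indices where at least two sequences differ."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    return [i for i in range(length) if len({s[i] for s in seqs}) > 1]


def extract_snps(seqs):
    """Reduce each sequence to its variable columns only."""
    cols = variable_sites(seqs)
    return ["".join(s[i] for i in cols) for s in seqs]


# Hypothetical aligned sequences used only for demonstration.
seqs = ["ACGTACGT", "ACGTACGA", "ACCTACGA"]
sites = variable_sites(seqs)
snps = extract_snps(seqs)
print(sites)
print(snps)
```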

Local sequence alignment tool for filtering, mapping and clustering.

010101

reads log index versions

SortMeRNA:

The core algorithm is based on approximate seeds and allows for sensitive analysis of NGS reads. The main application of SortMeRNA is filtering rRNA from metatranscriptomic data. SortMeRNA takes as input files of reads (fasta, fastq, fasta.gz, fastq.gz) and one or multiple rRNA database file(s), and sorts apart aligned and rejected reads into two files. Additional applications include clustering and taxonomy assignation available through QIIME v1.9.1. SortMeRNA works with Illumina, Ion Torrent and PacBio data, and can produce SAM and BLAST-like alignments.

Aligns sequences using T_COFFEE

01010120

alignment lib versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

pigz:

Parallel implementation of the gzip algorithm.

Compares 2 alternative MSAs to evaluate them.

012

scores versions

tcoffee:

A collection of tools for Multiple Alignments of DNA, RNA, Protein Sequence

pigz:

Parallel implementation of the gzip algorithm.

Computes a consensus alignment using T_COFFEE

01010

alignment eval versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

pigz:

Parallel implementation of the gzip algorithm.

Reformats the header of PDB files with t-coffee

01

formatted_pdb versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

Computes the irmsd score for a given alignment and the structures.

01012

irmsd versions

tcoffee:

A collection of tools for Multiple Alignments of DNA, RNA, Protein Sequence

pigz:

Parallel implementation of the gzip algorithm.

Aligns sequences using the regressive algorithm as implemented in the T_COFFEE package

01010120

alignment versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

pigz:

Parallel implementation of the gzip algorithm.

Reformats files with t-coffee

01

formatted_file versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

Computes the TCS score for an MSA, or for an MSA plus a library file. Outputs the TCS file as is, plus a CSV containing only the total TCS score.

0101

tcs scores versions

tcoffee:

A collection of tools for Multiple Alignments of DNA, RNA, Protein Sequence

pigz:

Parallel implementation of the gzip algorithm.

TransDecoder identifies candidate coding regions within transcript sequences. It is used to build a GFF file.

01

pep gff3 cds dat folder versions

transdecoder:

TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks.

TransDecoder identifies candidate coding regions within transcript sequences. It is used to build a GFF file. You can use this module after transdecoder_longorf.

010

pep gff3 cds bed versions

transdecoder:

TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks.

Cluster contigs from multiple assemblies by similarity

012

cluster_dir versions

trycycler:

Trycycler is a tool for generating consensus long-read assemblies for bacterial genomes

Import transcript-level abundances and estimated counts for gene-level analysis packages

01010

tpm_gene counts_gene counts_gene_length_scaled counts_gene_scaled lengths_gene tpm_transcript counts_transcript lengths_transcript versions

tximeta:

Transcript Quantification Import with Automatic Metadata

uLTRA aligner - A wrapper around minimap2 to improve small exon detection - Index gtf file for reads alignment

00

index versions

ultra:

Splice aligner of long transcriptomic reads to genome.

Aligns protein structures using UPP

01010

alignment versions

upp:

SATe-enabled phylogenetic placement

Filtering, downsampling and profiling alignments in BAM/CRAM formats

01

bam versions

To assess candidate indels and structural variants, Varlociraptor needs to know certain properties of the underlying sequencing experiment in combination with the read aligner used.

010101

alignment_properties_json versions

varlociraptor:

Flexible, uncertainty-aware variant calling with parameter free filtration via FDR control.

Constructs a graph from a reference and variant calls or a multiple sequence alignment file

01230101

graph versions

vg:

Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.

Deconstruct snarls present in a variation graph in GFA format to variants in VCF format

0100

vcf versions

vg:

Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.

write your description here

01

xg vg_index versions

vg:

Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.

Cluster sequences using a single-pass, greedy centroid-based clustering algorithm.

01

aln biom mothur otu bam out blast uc centroids clusters profile msa versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)
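The idea of single-pass, greedy centroid-based clustering can be sketched as follows: sequences are visited once, each one joins the first centroid it is similar enough to, and otherwise becomes a new centroid. In this toy version, the similarity function is Python's `difflib.SequenceMatcher` ratio, used as a stand-in for the alignment-based identity a real tool computes; the processing order (longest first), the threshold, the names and the sequences are all assumptions of this sketch.

```python
from difflib import SequenceMatcher


def identity(a, b):
    """Stand-in similarity in [0, 1]; not an alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(seqs, threshold=0.9):
    """Single pass over seqs: join the first matching centroid or start a new one."""
    clusters = []  # list of (centroid, members) tuples
    for seq in sorted(seqs, key=len, reverse=True):
        for centroid, members in clusters:
            if identity(centroid, seq) >= threshold:
                members.append(seq)
                break
        else:
            # No existing centroid was similar enough: seq becomes a centroid.
            clusters.append((seq, [seq]))
    return clusters


# Hypothetical sequences used only for demonstration.
clusters = greedy_cluster(["ACGTACGTACGT", "ACGTACGTACGA", "TTTTGGGGCCCC"])
for centroid, members in clusters:
    print(centroid, len(members))
```

Because each sequence is compared only against existing centroids and never reassigned, the result depends on the input order, which is the usual trade-off of a greedy single-pass scheme.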

Performs quality filtering and / or conversion of a FASTQ file to FASTA format.

01

fasta log versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Taxonomic classification using the sintax algorithm.

010

tsv versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Sort fasta entries by decreasing abundance (--sortbysize) or sequence length (--sortbylength).

010

fasta versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)
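Sorting FASTA entries by decreasing abundance or length can be illustrated with a short Python sketch. It assumes abundance is carried in the header as a `;size=N` annotation and treats a missing annotation as abundance 1 (an assumption of this sketch); the function names and records are hypothetical, and tie-breaking may differ from the real tool.

```python
import re

# Matches an abundance annotation such as ";size=10" in a FASTA header.
SIZE_RE = re.compile(r";size=(\d+)")


def abundance(header):
    """Return the annotated abundance, or 1 if no annotation is present."""
    match = SIZE_RE.search(header)
    return int(match.group(1)) if match else 1


def sort_by_size(records):
    """Sort (header, sequence) pairs by decreasing abundance."""
    return sorted(records, key=lambda rec: abundance(rec[0]), reverse=True)


def sort_by_length(records):
    """Sort (header, sequence) pairs by decreasing sequence length."""
    return sorted(records, key=lambda rec: len(rec[1]), reverse=True)


# Hypothetical records used only for demonstration.
records = [("a;size=3", "ACGT"), ("b;size=10", "AC"), ("c", "ACGTAC")]
by_size = [header for header, _ in sort_by_size(records)]
by_length = [header for header, _ in sort_by_length(records)]
print(by_size)
print(by_length)
```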

Compare target sequences to fasta-formatted query sequences using global pairwise alignment.

010000

aln biom lca mothur otu sam tsv txt uc versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

a pangenome-scale aligner

0123400

paf versions
