Available Modules

Modules are the building blocks of all DSL2 nf-core pipelines. If you would like to write your own module, you can find more information on the nf-core website.

  • alignment 70
  • bam 63
  • fasta 31
  • genomics 29
  • fastq 26
  • cram 25
  • sam 24
  • metagenomics 20
  • MSA 19
  • index 16
  • map 14
  • genome 13
  • proteomics 11
  • vcf 10
  • openms 10
  • bisulfite 9
  • bisulphite 9
  • methylseq 9
  • reference 8
  • methylation 8
  • 5mC 8
  • sort 7
  • variant calling 7
  • align 7
  • filter 7
  • statistics 7
  • graph 7
  • LAST 7
  • msa 7
  • merge 6
  • qc 6
  • ancient DNA 6
  • phylogeny 6
  • consensus 6
  • metrics 6
  • multiple sequence alignment 6
  • gff 5
  • split 5
  • structure 5
  • aDNA 5
  • archaeogenomics 5
  • palaeogenomics 5
  • newick 5
  • bismark 5
  • vsearch 5
  • 3-letter genome 5
  • microbiome 5
  • MAF 5
  • idXML 5
  • database 4
  • variants 4
  • gtf 4
  • gfa 4
  • quality 4
  • mapping 4
  • samtools 4
  • markduplicates 4
  • bwa 4
  • hmmer 4
  • feature 4
  • low frequency variant calling 4
  • mem 4
  • clipping 4
  • genmod 4
  • ranking 4
  • containment 4
  • skani 4
  • malt 4
  • assembly 3
  • bed 3
  • gatk4 3
  • annotation 3
  • structural variants 3
  • bacteria 3
  • quality control 3
  • classify 3
  • variant 3
  • taxonomy 3
  • clustering 3
  • long reads 3
  • variation graph 3
  • kmer 3
  • bqsr 3
  • depth 3
  • taxonomic classification 3
  • cluster 3
  • mappability 3
  • db 3
  • gene 3
  • damage 3
  • transcript 3
  • hmmsearch 3
  • evaluation 3
  • gff3 3
  • sketch 3
  • duplicates 3
  • sourmash 3
  • counts 3
  • view 3
  • distance 3
  • guide tree 3
  • lineage 3
  • pan-genome 3
  • covid 3
  • pangolin 3
  • pseudoalignment 3
  • insert 3
  • dist 3
  • atac-seq 3
  • bwameth 3
  • nucleotide 3
  • chip-seq 3
  • nanopore 2
  • taxonomic profiling 2
  • somatic 2
  • sentieon 2
  • rnaseq 2
  • trimming 2
  • protein 2
  • stats 2
  • demultiplex 2
  • aligner 2
  • genotyping 2
  • sequence 2
  • population genetics 2
  • bedGraph 2
  • json 2
  • splicing 2
  • kallisto 2
  • mutect2 2
  • HMM 2
  • peak-calling 2
  • chromosome 2
  • structural_variants 2
  • score 2
  • indel 2
  • mask 2
  • SNP 2
  • amplicon sequences 2
  • mzml 2
  • authentication 2
  • edit distance 2
  • vg 2
  • HOPS 2
  • reformatting 2
  • MaltExtract 2
  • metamaps 2
  • nextclade 2
  • fixmate 2
  • soft-clipped clusters 2
  • recombination 2
  • signature 2
  • FracMinHash sketch 2
  • realignment 2
  • coverage 1
  • classification 1
  • contamination 1
  • pacbio 1
  • convert 1
  • mags 1
  • isoseq 1
  • build 1
  • long-read 1
  • picard 1
  • wgs 1
  • visualisation 1
  • sequences 1
  • matrix 1
  • filtering 1
  • example 1
  • plot 1
  • DNA methylation 1
  • WGBS 1
  • pangenome graph 1
  • scWGBS 1
  • base quality score recalibration 1
  • neural network 1
  • bisulfite sequencing 1
  • machine learning 1
  • biscuit 1
  • annotate 1
  • germline 1
  • umi 1
  • blast 1
  • peaks 1
  • profile 1
  • report 1
  • short-read 1
  • reads 1
  • snp 1
  • kmers 1
  • pangenome 1
  • merging 1
  • csv 1
  • cat 1
  • ont 1
  • amps 1
  • single cell 1
  • compare 1
  • mpileup 1
  • deamination 1
  • structural 1
  • preprocessing 1
  • bgzip 1
  • bedgraph 1
  • circrna 1
  • paf 1
  • SV 1
  • phylogenetic placement 1
  • DNA sequencing 1
  • targeted sequencing 1
  • miscoding lesions 1
  • fgbio 1
  • palaeogenetics 1
  • archaeogenetics 1
  • hybrid capture sequencing 1
  • copy number alteration calling 1
  • rna 1
  • clean 1
  • abundance 1
  • mapper 1
  • hidden Markov model 1
  • uLTRA 1
  • minimap2 1
  • long_read 1
  • subsample 1
  • pileup 1
  • hi-c 1
  • wig 1
  • fingerprint 1
  • eukaryotes 1
  • variation 1
  • rrna 1
  • import 1
  • orthology 1
  • screen 1
  • kma 1
  • maximum likelihood 1
  • primer 1
  • registration 1
  • image_processing 1
  • pair 1
  • salmon 1
  • lofreq 1
  • artic 1
  • hlala 1
  • hla 1
  • hla_typing 1
  • demultiplexed reads 1
  • aggregate 1
  • hlala_typing 1
  • reformat 1
  • emboss 1
  • fusions 1
  • blastp 1
  • collate 1
  • dict 1
  • estimation 1
  • splice 1
  • ancient dna 1
  • switch 1
  • duplicate 1
  • bayesian 1
  • orthologs 1
  • reheader 1
  • filtermutectcalls 1
  • decoy 1
  • rRNA 1
  • ribosomal RNA 1
  • core 1
  • subsample bam 1
  • SNPs 1
  • invariant 1
  • constant 1
  • downsample 1
  • snippy 1
  • downsample bam 1
  • pangenome-scale 1
  • all versus all 1
  • long read alignment 1
  • usearch 1
  • c to t 1
  • wavefront 1
  • mapad 1
  • mashmap 1
  • adna 1
  • graph projection to vcf 1
  • fracminhash sketch 1
  • construct 1
  • transcroder 1
  • eucaryotes 1
  • vsearch/sort 1
  • sintax 1
  • coding 1
  • cds 1
  • junction 1
  • phylogenies 1
  • rank 1
  • hhsuite 1
  • 16S 1
  • mass_error 1
  • search engine 1
  • droplet based single cells 1
  • guidetree 1
  • bwamem2 1
  • bwameme 1
  • cram-size 1
  • size 1
  • POA 1
  • taxonomic composition 1
  • updatedata 1
  • run 1
  • pdb 1
  • mzML 1
  • vsearch/fastqfilter 1
  • fastqfilter 1
  • ATACseq 1
  • shift 1
  • ATACshift 1
  • md 1
  • nm 1
  • uq 1
  • peak picking 1
  • covariance model 1
  • paired reads merging 1
  • overlap-based merging 1
  • liftover 1
  • multi-tool 1
  • probabilistic realignment 1
  • refresh 1
  • targets 1
  • ANI 1
  • rust 1
  • zipperbams 1
  • ubam 1
  • unmapped 1
  • models 1
  • mergebamalignment 1
  • readorientationartifacts 1
  • learnreadorientationmodel 1
  • compound 1
  • post Post-processing 1
  • segment 1
  • track 1
  • sorted 1
  • duplicate removal 1
  • chromap 1
  • intervals coverage 1
  • neighbour-joining 1
  • purging 1
  • hybrid-selection 1
  • induce 1
  • multimapper 1
  • Ancestor 1
  • LCA 1
  • amplicon 1
  • scramble 1
  • clusteridentifier 1
  • cluster analysis 1
  • readgroup 1
  • ampliconclip 1
  • read pairs 1
  • paired 1
  • repair 1
  • insert size 1
  • faidx 1
  • calmd 1
  • train 1
  • pneumophila 1
  • clinical 1
  • legionella 1
  • collapsing 1
  • adapter removal 1
  • spliced 1
  • lofreq/call 1
  • reorder 1
  • taxonomic assignment 1
  • Hidden Markov Model 1
  • amino acid 1
  • HMMER 1
  • quant 1
  • ngm 1
  • NextGenMap 1
  • graphs 1
  • paragraph 1
  • SNP table 1
  • methylation bias 1
  • mbias 1
  • GATK UnifiedGenotyper 1

Post-processing script of the MaltExtract component of the HOPS package

Outputs: json, summary_pdf, tsv, candidate_pdfs, versions

Run the alignment/variant-call/consensus logic of the artic pipeline

Outputs: results, bam, bai, bam_trimmed, bai_trimmed, bam_primertrimmed, bai_primertrimmed, fasta, vcf, tbi, json, versions

artic:

ARTIC pipeline - a bioinformatics pipeline for working with virus sequencing data sequenced with nanopore

Alignment by Simultaneous Harmonization of Layer/Adjacency Registration

Outputs: tif, versions

Generate tables of feature metadata from GTF files

Outputs: feature_annotation, filtered_cdna, versions

atlasgeneannotationmanipulation:

Scripts for manipulating gene annotation

This module is used to clip primer sequences from your alignments.

Outputs: bam, bai, versions

Merging overlapping paired reads into a single read.

Outputs: merged, unmerged, ihist, versions, log

bbmap:

BBMap is a short read aligner, as well as various other bioinformatic tools.
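
The idea of merging overlapping paired reads can be illustrated with a deliberately naive sketch. This is not the algorithm the bbmap module uses: it assumes error-free reads, exact matching, and an illustrative `min_overlap` parameter, and the function names are hypothetical. The second read is reverse-complemented, then the longest suffix of read 1 that equals a prefix of that mate is treated as the overlap.

```python
from typing import Optional

# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(read1: str, read2: str, min_overlap: int = 4) -> Optional[str]:
    """Merge a read pair by exact suffix/prefix overlap (illustrative only).

    Returns the merged sequence, or None when no overlap of at least
    `min_overlap` bases is found.
    """
    mate = reverse_complement(read2)
    longest = min(len(read1), len(mate))
    # Try the longest candidate overlap first and shrink until one matches.
    for overlap in range(longest, min_overlap - 1, -1):
        if read1[-overlap:] == mate[:overlap]:
            return read1 + mate[overlap:]
    return None


# A 12-base fragment read from both ends with 8-base reads.
print(merge_pair("ACGTTGCA", "GCCTTGCA"))  # ACGTTGCAAGGC
print(merge_pair("AAAAAAAA", "CCCCCCCC"))  # None
```

Pairs for which no overlap is found correspond to the kind of reads that would end up in an `unmerged` output. A practical merger also has to tolerate sequencing errors and use base qualities, which this sketch ignores.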

Locate and tag duplicate reads in a BAM file

Outputs: bam, metrics, versions

biobambam:

biobambam is a set of tools for early stage alignment file processing.

Merge a list of sorted bam files

Outputs: bam, bam_index, checksum, versions

biobambam:

biobambam is a set of tools for early stage alignment file processing.

Parallel sorting and duplicate marking

Outputs: bam, bam_index, cram, metrics, versions

biobambam:

biobambam is a set of tools for early stage alignment file processing.

A fast, compact one-liner to produce duplicate-marked, sorted, and indexed BAM files using Biscuit

Outputs: bam, bai, versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

samblaster:

samblaster is a fast and flexible program for marking duplicates in read-id grouped paired-end SAM files. It can also optionally output discordant read pairs and/or split read mappings to separate SAM files, and/or unmapped/clipped reads to a separate FASTQ file. By default, samblaster reads SAM input from stdin and writes SAM to stdout.

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Performs alignment of BS-Seq reads using bismark

010101

bam report unmapped versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Removes alignments to the same position in the genome from the Bismark mapping output.

01

bam report versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Converts a specified reference genome into two different bisulfite converted versions and indexes them for alignments.

01

index versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Extracts methylation information for individual cytosines from alignments.

0101

bedgraph methylation_calls coverage report mbias versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Collects bismark alignment reports

01234

report versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

BLASTP (Basic Local Alignment Search Tool - Protein) compares an amino acid (protein) query sequence against a protein database

01010

xml tsv csv versions

blast:

BLAST+ is a new suite of BLAST tools that utilizes the NCBI C++ Toolkit.

Construct species phylogenies using BUSCO proteins

01

gene_trees supermatrix versions

busco:

Construct species phylogenies using BUSCO proteins

Performs fastq alignment to a fasta reference using BWA

0101010

bam cram csi crai versions

bwa:

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Performs fastq alignment to a fasta reference using BWA

0101010

sam bam cram crai csi versions

bwa:

BWA-mem2 is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Performs fastq alignment to a fasta reference using BWA-MEME

010101000

sam bam cram crai csi versions

bwameme:

Faster BWA-MEM2 using learned-index

Performs alignment of BS-Seq reads using bwameth

010101

bam versions

bwameth:

Fast and accurate alignment of BS-Seq reads using bwa-mem and a 3-letter genome.

Performs indexing of c2t converted reference genome

01

index versions

bwameth:

Fast and accurate alignment of BS-Seq reads using bwa-mem and a 3-letter genome.

Taxonomic classification plus read-based abundance estimation from long DNA sequences and metagenome assembled genomes (e.g. contigs, MAGs / bins).

0101010101001010101010101

rat_log complete_abundance contig_abundance read2classification alignment_diamond contig2classification cat_log orf2lca faa gff unmapped_diamond unmapped_fasta unmapped2classification versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Cluster protein sequences using sequence similarity

01

fasta clusters versions

cdhit:

Clusters and compares protein or nucleotide sequences

Cluster nucleotide sequences using sequence similarity

01

fasta clusters versions

cdhit:

Clusters and compares protein or nucleotide sequences

Cellsnp-lite is a C/C++ tool for efficient genotyping of bi-allelic SNPs in single cells. You can use mode A of cellsnp-lite after read alignment to obtain the SNP x cell pileup UMI or read count matrices for each allele of given or detected SNPs in droplet-based single-cell data.

01234

base cell sample allele_depth depth_coverage depth_other versions

cellsnp:

Efficient genotyping bi-allelic SNPs on single cells

Classifies metagenomic sequence data

01000

report results sam fastq_mapped fastq_unmapped versions

centrifuge:

Centrifuge is a classifier for metagenomic sequences.

Performs preprocessing and alignment of chromatin fastq files to fasta reference files using chromap.

0101010000

bed bam tagAlign pairs versions

chromap:

Fast alignment and preprocessing of chromatin profiles

Indexes a fasta reference genome ready for chromatin profiling.

01

index versions

chromap:

Fast alignment and preprocessing of chromatin profiles

ClipKIT is a fast and flexible alignment trimming tool that keeps phylogenetically informative sites and removes those that display characteristics of poor phylogenetic signal.

010

clipkit log versions

Predict recombination events in bacterial genomes

012

emsim em status newick fasta pos_ref versions

Align sequences using Clustal Omega

010100000

alignment versions

clustalo:

Latest version of Clustal: a multiple sequence alignment program for DNA or proteins

pigz:

Parallel implementation of the gzip algorithm.

Renders a guidetree in clustalo

01

tree versions

clustalo:

Latest version of Clustal: a multiple sequence alignment program for DNA or proteins

Make a transcript/gene mapping from a GTF and cross-reference with transcript quantifications.

0101000

tx2gene versions

custom:

Custom module to create a transcript to gene mapping from a GTF and check it against transcript quantifications

This tool filters alignments in a BAM/CRAM file according to the specified parameters.

012

bam logs versions

deeptools:

A set of user-friendly tools for normalization and visualization of deep-sequencing data

This tool takes an alignment of reads or fragments as input (BAM file) and generates a coverage track (bigWig or bedGraph) as output.

01200

bigwig bedgraph versions

deeptools:

A set of user-friendly tools for normalization and visualization of deep-sequencing data

Transforms the input alignments to a format suitable for the deep neural network variant caller

012301010101

examples gvcf small_model_calls versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

calculate clusters of highly similar sequences

01

tsv versions

diamond:

Accelerated BLAST compatible local sequence aligner

Performs fastq alignment to a reference using DRAGMAP

0101010

sam bam cram crai csi log versions

dragmap:

Dragmap is the Dragen mapper/aligner Open Source Software.

Export assembly segment sequences in GFA 1.0 format to FASTA format

01

fasta versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Filter features in gzipped BED format

01

bed versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Filter features in gzipped GFF3 format

01

gff3 versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Split features in gzipped BED format

01

bed versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Split features in gzipped GFF3 format

01

gff3 versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Fast genome-wide functional annotation through orthology assignment.

010001

annotations orthologs hits versions

cons calculates a consensus sequence from a multiple sequence alignment. To obtain the consensus, the sequence weights and a scoring matrix are used to calculate a score for each amino acid residue or nucleotide at each position in the alignment.

01

consensus versions

emboss:

The European Molecular Biology Open Software Suite
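The scoring idea in the cons description above (sequence weights combined with a scoring matrix, column by column) can be illustrated with a small sketch. This is not EMBOSS's implementation and does not reproduce its options or thresholds; the function names `weighted_consensus` and `identity_score`, the gap handling, and the toy identity matrix are all assumptions made for illustration.

```python
def identity_score(a, b):
    """Toy scoring matrix (assumption): 1 for a match, 0 otherwise."""
    return 1.0 if a == b else 0.0


def weighted_consensus(alignment, weights, score, gap="-"):
    """Pick, per column, the residue with the highest weighted score.

    alignment: list of equal-length aligned sequences
    weights:   one weight per sequence
    score:     function (candidate, residue) -> float, standing in
               for a scoring matrix lookup
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for col in range(length):
        column = [seq[col] for seq in alignment]
        # Candidates are the residues observed in this column (gaps excluded).
        candidates = sorted(set(column) - {gap})
        if not candidates:
            consensus.append(gap)
            continue
        # Score each candidate against every non-gap residue in the column,
        # scaled by the weight of the sequence that residue came from.
        totals = {
            cand: sum(
                w * score(cand, res)
                for res, w in zip(column, weights)
                if res != gap
            )
            for cand in candidates
        }
        # Ties resolve to the alphabetically first candidate.
        consensus.append(max(candidates, key=lambda c: totals[c]))
    return "".join(consensus)


alignment = ["ACGT-", "ACGA-", "TCGAC"]
weights = [1.0, 1.0, 1.5]
result = weighted_consensus(alignment, weights, identity_score)
print(result)
```

Swapping `identity_score` for a lookup into a substitution matrix would let similar residues support each other, which is the role the scoring matrix plays in the description.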

A taxonomic profiler for metagenomic 16S data optimized for error prone long reads.

010

report assignment_report samfile unclassified_fa versions

emu:

Emu is a relative abundance estimator for 16S genomic data.

splits an alignment into reference and query parts

012

query reference versions

epang:

Massively parallel phylogenetic placement of genetic sequences

Aligns sequences using FAMSA

01010

alignment versions

famsa:

Algorithm for large-scale multiple sequence alignments

Renders a guidetree in famsa

01

tree versions

famsa:

Algorithm for large-scale multiple sequence alignments

Alignment-free computation of average nucleotide Identity (ANI)

010

ani versions

Produces a Newick format phylogeny from a multiple sequence alignment. Capable of bacterial genome size alignments.

0

phylogeny versions

FGBIO tool to zip together an unmapped and mapped BAM to transfer metadata into the output BAM

01010101

bam versions

fgbio:

A set of tools for working with genomic and high throughput sequencing data, including UMIs

Creates a database for Foldmason.

01

db versions

foldmason:

Multiple Protein Structure Alignment at Scale with FoldMason

Aligns protein structures using foldmason

01010

msa_3di msa_aa versions

foldmason:

Multiple Protein Structure Alignment at Scale with FoldMason

Renders a visualization report using foldmason

01010101

html versions

foldmason:

Multiple Protein Structure Alignment at Scale with FoldMason

Demultiplex fastq files

0123

sample_fastq metrics most_frequent_unmatched versions

Performs local realignment around indels to correct for mapping errors

012301010101

bam versions

gatk:

The full Genome Analysis Toolkit (GATK) framework, license restricted.

Generates a list of locations that should be considered for local realignment prior to genotyping.

01201010101

intervals versions

gatk:

The full Genome Analysis Toolkit (GATK) framework, license restricted.

Filters the raw output of mutect2, can optionally use outputs of calculatecontamination and learnreadorientationmodel to improve filtering.

01234567010101

vcf tbi stats versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Uses f1r2 counts collected during mutect2 to learn the prior probability of read orientation artifacts

01

artifactprior versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Merge unmapped with mapped BAM files

0120101

bam versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Performs fastq alignment to a fasta reference using gem3-mapper

01010

bam versions

gem3:

The GEM indexer (v3).

create index file for genmap

01

index versions

genmap:

Ultra-fast computation of genome mappability.

create mappability files for a genome

0101

wig bedgraph txt csv versions

genmap:

Ultra-fast computation of genome mappability.

for annotating regions, frequencies, cadd scores

01

vcf versions

genmod:

Annotate genetic inheritance models in variant files

Score compounds

01

vcf versions

genmod:

Annotate genetic inheritance models in variant files

annotate models of inheritance

0120

vcf versions

genmod:

Annotate genetic inheritance models in variant files

Score the variants of a vcf based on their annotation

0120

vcf versions

genmod:

Annotate genetic inheritance models in variant files

Gubbins (Genealogies Unbiased By recomBinations In Nucleotide Sequences) is an algorithm that iteratively identifies loci containing elevated densities of base substitutions while concurrently constructing a phylogeny based on the putative point mutations outside of these regions.

0

fasta gff vcf stats phylip embl_predicted embl_branch tree tree_labelled versions

Reformat a Multiple Sequence Alignment (MSA) file

0100

msa versions

hhsuite:

HH-suite3 for fast remote homology detection and deep protein annotation

Align RNA-Seq reads to a reference with HISAT2

010101

bam summary fastq versions

hisat2:

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.

Builds HISAT2 index for reference genome

010101

index versions

hisat2:

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.

Extracts splicing sites from a GTF file

01

txt versions

hisat2:

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.

Performs HLA typing based on a population reference graph and employs a new linear projection method to align reads to the graph.

0123

results extraction extraction_mapped extraction_unmpapped hla fastq reads_per_level remapped versions

hlala:

HLA typing from short and long reads

Mask multiple sequence alignments

012345670

maskedaln fmask_rf fmask_all gmask_rf gmask_all pmask_rf pmask_all versions

hmmer:

Biosequence analysis using profile hidden Markov models

hmmalign from the HMMER suite aligns a number of sequences to an HMM profile

010

sto versions

hmmer:

Biosequence analysis using profile hidden Markov models

create an hmm profile from a multiple sequence alignment

010

hmm hmmbuildout versions

hmmer:

Biosequence analysis using profile hidden Markov models

R script that scores output from multiple runs of hmmer/hmmsearch

01

hmmrank versions

hmmer:

Biosequence analysis using profile hidden Markov models

R:

A Language and Environment for Statistical Computing

Tidyverse:

Tidyverse: R packages for data science

search profile(s) against a sequence database

012345

output alignments target_summary domain_summary versions

hmmer:

Biosequence analysis using profile hidden Markov models

Create a tag directory with the HOMER suite

010

tagdir taginfo versions

homer:

HOMER (Hypergeometric Optimization of Motif EnRichment) is a suite of tools for Motif Discovery and next-gen sequencing analysis.

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

DESeq2:

Differential gene expression analysis based on the negative binomial distribution

edgeR:

Empirical Analysis of Digital Gene Expression Data in R

Search covariance models against a sequence database

01200

output alignments target_summary versions

infernal:

Infernal is for searching DNA sequence databases for RNA structure and sequence similarities.

Produces a Newick format phylogeny from a multiple sequence alignment using the maximum likelihood algorithm. Capable of bacterial genome size alignments.

012000000000000

phylogeny report mldist lmap_svg lmap_eps lmap_quartetlh sitefreq_out bootstrap state contree nex splits suptree alninfo partlh siteprob sitelh treels rate mlrate exch_matrix log versions

Aligns sequences using kalign

010

alignment versions

kalign:

Kalign is a fast and accurate multiple sequence alignment algorithm.

Computes equivalence classes for reads and quantifies abundances

01010000

results json_info log versions

kallisto:

Quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads.

This module wraps the index module of the KMA alignment tool.

01

index versions

kma:

Rapid and precise alignment of raw reads against redundant databases with KMA

Classifies metagenomic sequence data

01000

classified_reads_fastq unclassified_reads_fastq classified_reads_assignment report versions

kraken2:

Kraken2 is a taxonomic sequence classifier that assigns taxonomic labels to sequence reads

Classifies metagenomic sequence data using unique k-mer counts

012000000

classified_reads unclassified_reads classified_assignment report versions

krakenuniq:

Metagenomics classifier with unique k-mer counting for more specific results

Makes a dotplot (Oxford Grid) of pair-wise sequence alignments

0120100

gif png versions

last:

LAST finds & aligns related regions of sequences.

Prepare sequences for subsequent alignment with lastal.

01

index versions

last:

LAST finds & aligns related regions of sequences.

Converts MAF alignments to another format.

012010101

axt_gz bam blast_gz blasttab_gz chain_gz cram gff_gz html_gz psl_gz sam_gz tab_gz versions

last:

LAST finds & aligns related regions of sequences.

Reorder alignments in a MAF file

01

maf versions

last:

LAST finds & aligns related regions of sequences.

Post-alignment masking

01

maf versions

last:

LAST finds & aligns related regions of sequences.

Find split or spliced alignments in a MAF file

01

maf multiqc versions

last:

LAST finds & aligns related regions of sequences.

Find suitable score parameters for sequence alignment

010

param_file multiqc versions

last:

LAST finds & aligns related regions of sequences.

Align sequences using learnMSA

01

alignment versions

learnmsa:

learnMSA: Learning and Aligning large Protein Families

Bayesian reconstruction of ancient DNA fragments

01

bam fq_pass fq_fail unmerged_r1_fq_pass unmerged_r1_fq_fail unmerged_r2_fq_pass unmerged_r2_fq_fail log versions

Typing of clinical and environmental isolates of Legionella pneumophila

01

tsv versions

Uses Liftoff to accurately map annotations in GFF or GTF between assemblies of the same, or closely-related species

01000

gff3 polished_gff3 unmapped_txt versions

Lofreq subcommand to insert base and indel alignment qualities

010

bam versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Lofreq subcommand to call low frequency variants from alignments

0120

vcf versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Lofreq subcommand to call low frequency variants from alignments when tumor-normal paired samples are available

0123450101

vcf versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Lofreq subcommand to call low frequency variants from alignments when tumor-normal paired samples are available

0101

bam versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Peak calling of enriched genomic regions of ChIP-seq and ATAC-seq experiments

0120

peak xls versions gapped bed bdg

macs2:

Model Based Analysis for ChIP-Seq data

Peak calling of enriched genomic regions of ChIP-seq and ATAC-seq experiments

0120

peak xls versions gapped bed bdg

macs3:

Model Based Analysis for ChIP-Seq data

Multiple sequence alignment using MAFFT

0101010101010

fas versions

pigz:

Parallel implementation of the gzip algorithm.

Multiple sequence alignment using MAFFT

0101010101010

fas versions

mafft:

Multiple alignment program for amino acid or nucleotide sequences based on fast Fourier transform

pigz:

Parallel implementation of the gzip algorithm.

Guide tree rendering using MAFFT

01

tree versions

mafft:

Multiple alignment program for amino acid or nucleotide sequences based on fast Fourier transform

Multiple Sequence Alignment using Graph Clustering

01010

alignment versions

magus:

Multiple Sequence Alignment using Graph Clustering

Multiple Sequence Alignment using Graph Clustering

01

tree versions

magus:

Multiple Sequence Alignment using Graph Clustering

MALT, an acronym for MEGAN alignment tool, is a sequence alignment and analysis tool designed for processing high-throughput sequencing data, especially in the context of metagenomics.

0000

index versions log

malt:

A tool for mapping metagenomic data

MALT, an acronym for MEGAN alignment tool, is a sequence alignment and analysis tool designed for processing high-throughput sequencing data, especially in the context of metagenomics.

010

rma6 alignments log versions

malt:

A tool for mapping metagenomic data

Tool for evaluation of MALT results for true positives of ancient metagenomic taxonomic screening

0100

results versions

Map short-reads to an indexed reference genome

01010000000

bam versions

mapad:

An aDNA aware short-read mapper

Screens query sequences against large sequence databases

0101

screen versions

mash:

Fast sequence distance estimator that uses MinHash

Strain-level metagenomic assignment

012340

wimp evidence_unknown_species reads2taxon em contig_coverage length_and_id krona versions

metamaps:

MetaMaps is a tool for long-read metagenomic analysis

Maps long reads to a metamaps database

010

classification_res meta_file meta_unmappedreadsLengths para_file versions

metamaps:

MetaMaps is a tool for long-read metagenomic analysis

Extracts per-base methylation metrics from alignments

01200

bedgraph methylkit versions

methyldackel:

Methylation caller from MethylDackel, a (mostly) universal methylation extractor for methyl-seq experiments.

Generates methylation bias plots from alignments

01200

txt versions

methyldackel:

Read position methylation bias tools from MethylDackel, a (mostly) universal extractor for methyl-seq experiments.

Provides fasta index required by minimap2 alignment.

01

index versions

minimap2:

A versatile pairwise aligner for genomic and spliced nucleotide sequences.

Provides fasta index required by miniprot alignment.

01

index versions

miniprot:

A versatile pairwise aligner for genomic and protein sequences.

Aligns protein structures using mTM-align

010

alignment structure versions

mTM-align:

Algorithm for structural multiple sequence alignments

pigz:

Parallel implementation of the gzip algorithm.

SNP table generator from GATK UnifiedGenotyper with functionality geared for aDNA

010101010000001

full_alignment info_txt snp_alignment snp_genome_alignment snpstatistics snptable snptable_snpeff snptable_uncertainty structure_genotypes structure_genotypes_nomissing json versions

MUSCLE is a program for creating multiple alignments of amino acid or nucleotide sequences. A range of options are provided that give you the choice of optimizing accuracy, speed, or some compromise between the two

01

aligned_fasta phyi phys clustalw html msf tree log versions

Muscle is a program for creating multiple alignments of amino acid or nucleotide sequences. This particular module uses the super5 algorithm for very big alignments. It can permutate the guide tree according to a set of flags.

010

alignment versions

muscle -super5:

Muscle v5 is a major re-write of MUSCLE based on new algorithms.

pigz:

Parallel implementation of the gzip algorithm.

Compare multiple runs of long read sequencing data and alignments

01

report_html lengths_violin_html log_length_violin_html n50_html number_of_reads_html overlay_histogram_html overlay_histogram_normalized_html overlay_log_histogram_html overlay_log_histogram_normalized_html total_throughput_html quals_violin_html overlay_histogram_identity_html overlay_histogram_phredscore_html percent_identity_violin_html active_pores_over_time_html cumulative_yield_plot_gigabases_html sequencing_speed_over_time_html stats_txt versions

Performs fastq alignment to a reference using NARFMAP

0101010

bam log versions

narfmap:

narfmap is a fork of the Dragen mapper/aligner Open Source Software.

Get dataset for SARS-CoV-2 genome clade assignment, mutation calling, and sequence quality checks (C++ implementation)

00

dataset versions

nextclade:

SARS-CoV-2 genome clade assignment, mutation calling, and sequence quality checks

SARS-CoV-2 genome clade assignment, mutation calling, and sequence quality checks (C++ implementation)

010

csv csv_errors csv_insertions tsv json json_auspice ndjson fasta_aligned fasta_translation nwk versions

nextclade:

SARS-CoV-2 genome clade assignment, mutation calling, and sequence quality checks

Performs fastq alignment to a fasta reference using NextGenMap

010

bam versions

bwa:

NextGenMap is a flexible highly sensitive short read mapping tool that handles much higher mismatch rates than comparable algorithms while still outperforming them in terms of runtime

NUCmer is a pipeline for the alignment of multiple closely related nucleotide sequences.

012

delta coords versions

Create a decoy peptide database from a standard FASTA database.

01

decoy_fasta versions

openms:

OpenMS is an open-source software C++ library for LC-MS data management and analyses

Filters peptide/protein identification results by different criteria.

01

mzml featurexml consensusxml versions

openms:

OpenMS is an open-source software C++ library for LC-MS data management and analyses

Filters peptide/protein identification results by different criteria.

012

filtered versions

openms:

OpenMS is an open-source software C++ library for LC-MS data management and analyses

Calculates a distribution of the mass error from given mass spectra and IDs.

012

frag_err prec_err versions

openms:

OpenMS is an open-source software C++ library for LC-MS data management and analyses

Merges several idXML files into one idXML file.

01

idxml versions

openms:

OpenMS is an open-source software C++ library for LC-MS data management and analyses

Split a merged identification file into their originating identification files

01

idxmls versions

openms:

OpenMS is an open-source software C++ library for LC-MS data management and analyses

Switches between different scores of peptide or protein hits in identification data

01

idxml versions

openms:

OpenMS is an open-source software C++ library for LC-MS data management and analyses

A tool for peak detection in high-resolution profile data (Orbitrap or FTICR)

01

mzml versions

openms:

OpenMS is an open-source software C++ library for LC-MS data management and analyses

Refreshes the protein references for all peptide hits.

012

indexed_idxml versions

openms:

OpenMS is an open-source software C++ library for LC-MS data management and analyses

Annotates MS/MS spectra using Comet.

012

idxml pin versions

openms:

OpenMS is an open-source software C++ library for LC-MS data management and analyses

A fast and scalable tool for bacterial pangenome analysis

01

results aln versions

panaroo:

panaroo - an updated pipeline for pangenome investigation

Phylogenetic Assignment of Named Global Outbreak LINeages

01

report versions

Phylogenetic Assignment of Named Global Outbreak LINeages

010

report versions

pangolin:

Phylogenetic Assignment of Named Global Outbreak LINeages

Phylogenetic Assignment of Named Global Outbreak LINeages

0

db versions

pangolin:

Phylogenetic Assignment of Named Global Outbreak LINeages

NVIDIA Clara Parabricks GPU-accelerated alignment, sorting, BQSR calculation, and duplicate marking. Note this nf-core module requires files to be copied into the working directory and not symlinked.

01010101010

bam bai cram crai bqsr_table qc_metrics duplicate_metrics versions

parabricks:

NVIDIA Clara Parabricks GPU-accelerated genomics tools

NVIDIA Clara Parabricks GPU-accelerated fast, accurate algorithm for mapping methylated DNA sequence reads to a reference genome, performing local alignment, and producing alignment for different parts of the query sequence

0101010

bam bai qc_metrics bqsr_table duplicate_metrics versions

parabricks:

NVIDIA Clara Parabricks GPU-accelerated genomics tools

Determines the depth in a BAM/CRAM file

0120101

depth binned_depth versions

paragraph:

Graph realignment tools for structural variants

Genotype structural variants using paragraph and grmpy

0123450101

vcf json versions

paragraph:

Graph realignment tools for structural variants

Convert a VCF file to a JSON graph

0101

graph versions

paragraph:

Graph realignment tools for structural variants

Alignment with PacBio's minimap2 frontend

0101

bam versions

pbmm2:

A minimap2 frontend for PacBio native data formats

Cleans the provided BAM, soft-clipping beyond-end-of-reference alignments and setting MAPQ to 0 for unmapped reads

01

bam versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Collects hybrid-selection (HS) metrics for a SAM or BAM file.

01234010101

metrics versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Collect metrics about the insert size distribution of a paired-end library.

01

metrics histogram versions

picard:

Java tools for working with NGS data in the BAM format

Collect multiple metrics from a BAM file

0120101

metrics pdf versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Collect metrics from a RNAseq BAM file

01000

metrics pdf versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Collect metrics about coverage and performance of whole genome sequencing (WGS) experiments.

01201010

metrics versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Checks that all data in the set of input files appear to come from the same individual

01234501

crosscheck_metrics versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Merges multiple BAM files into a single file

01

bam versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

This tool takes in a coordinate-sorted SAM or BAM and calculates the NM, MD, and UQ tags by comparing with the reference.

0101

bam bai versions

picard:

Java tools for working with NGS data in the BAM format

Pangenome toolbox for bacterial genomes

01

results aln versions

Run all Portcullis steps in one go

010101

log bed tab versions

portcullis:

Portcullis is a tool that filters out invalid splice junctions from RNA-seq alignment data. It accepts BAM files from various RNA-seq mappers, analyzes splice junctions and removes likely false positives, outputting filtered results in multiple formats for downstream analysis.

Calculate interval coverage for each sample. N.B. the tool cannot handle staging files with symlinks; stageInMode should be set to 'link'.

0120

txt png loess_qc_txt loess_txt versions

purecn:

Copy number calling and SNV classification using targeted short read sequencing

Split fasta file by 'N's to aid in self alignment for duplicate purging

01

split_fasta versions

purgedups:

Purge_dups is a package used to purge haplotigs and overlaps in an assembly based on read depth

Evaluate alignment data

010

results versions

qualimap:

Qualimap 2 is a platform-independent application written in Java and R that provides both a Graphical User Interface and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.

Evaluate alignment data

012000

results versions

qualimap:

Qualimap 2 is a platform-independent application written in Java and R that provides both a Graphical User Interface and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.

Evaluate alignment data

0101

results versions

qualimap:

Qualimap 2 is a platform-independent application written in Java and R that provides both a Graphical User Interface and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.

Produces a Newick format phylogeny from a multiple sequence alignment using a Neighbour-Joining algorithm. Capable of bacterial genome size alignments.

0

stockholm_alignment phylogeny versions

Calculate pan-genome from annotated bacterial assemblies in GFF3 format

01

results aln versions

Calling lowest common ancestors from multi-mapped reads in SAM/BAM/CRAM files

0120

csv json bam versions

sam2lca:

Lowest Common Ancestor on SAM/BAM/CRAM alignment files

Clips read alignments where they match BED file defined regions

01000

bam stats rejects_bam versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

calculates MD and NM tags

0101

bam versions

samtoolscalmd:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Concatenate BAM or CRAM file

01

bam cram versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Produces a consensus FASTA/FASTQ/PILEUP

01

fasta fastq pileup versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

convert and then index CRAM -> BAM or BAM -> CRAM file

0120101

bam cram bai crai versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

produces a histogram or table of coverage per chromosome

0120101

coverage versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

List CRAM Content-ID and Data-Series sizes

01

size versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Create a sequence dictionary file from a FASTA file

01

dict versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Index FASTA file, and optionally generate a file of chromosome sizes

01010

fa fai sizes gzi versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Converts a SAM/BAM/CRAM file to FASTQ

010

fastq interleaved singleton other versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Samtools fixmate is a tool that can fill in information (insert size, cigar, mapq) about paired end reads onto the corresponding other read. Also has options to remove secondary/unmapped alignments and recalculate whether reads are proper pairs.

01

bam cram sam versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Counts the number of alignments in a BAM/CRAM/SAM file for each FLAG type

012

flagstat versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

filter/convert SAM/BAM/CRAM file

01

readgroup versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Reports alignment summary statistics for a BAM/CRAM/SAM file

012

idxstats versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

converts FASTQ files to unmapped SAM/BAM/CRAM

01

sam bam cram versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Index SAM/BAM/CRAM file

01

bai csi crai versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

mark duplicate alignments in a coordinate sorted file

0101

bam cram sam versions

samtools:

Tools for dealing with SAM, BAM and CRAM files

Merge BAM or CRAM file

010101

bam cram csi crai versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

BAM

0120

mpileup versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Replace the header in the BAM file with the header generated by the command. This command is much faster than replacing the header with a BAM→SAM→BAM conversion.

01

bam versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Collate/Fixmate/Sort/Markdup SAM/BAM/CRAM file

0101

bam cram csi crai metrics versions

samtools_cat:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_collate:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_fixmate:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_sort:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_markdup:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Sort SAM/BAM/CRAM file

0101

bam cram crai csi versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Produces comprehensive statistics from SAM/BAM/CRAM file

01201

stats versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

filter/convert SAM/BAM/CRAM file

0120100

bam cram sam bai csi crai unselected unselected_index versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

The Cluster Analysis tool of Scramble analyses and interprets the soft-clipped clusters found by cluster_identifier

0100

meis_tab dels_tab vcf versions

scramble:

Soft Clipped Read Alignment Mapper

The cluster_identifier tool of Scramble identifies soft clipped clusters

0120

clusters versions

scramble:

Soft Clipped Read Alignment Mapper

A multi-split mapping algorithm for circular RNA, splicing, trans-splicing and fusion detection

0100

alignment trans_alignments multi_bed single_bed versions

segemehl:

A multi-split mapping algorithm for circular RNA, splicing, trans-splicing and fusion detection

Performs fastq alignment to a fasta reference using Sentieon's BWA MEM

01010101

bam_and_bai versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Generate recalibration table and optionally perform base quality recalibration

01201010101010

table table_post recal_alignment csv pdf versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Induce a variation graph in GFA format from alignments in PAF format

012

gfa versions

seqwish:

seqwish implements a lossless conversion from pairwise alignments between sequences to a variation graph encoding the sequences and their alignments.

Severus is a somatic structural variation (SV) caller for long reads (both PacBio and ONT)

01234501

log read_qual breakpoints_double read_alignments read_ids collapsed_dup loh all_vcf all_breakpoints_clusters_list all_breakpoints_clusters all_plots somatic_vcf somatic_breakpoints_clusters_list somatic_breakpoints_clusters somatic_plots versions

Demultiplex bgzip'd fastq files

012

sample_fastq metrics most_frequent_unmatched per_project_metrics per_sample_metrics sample_barcode_hop_metrics versions

Simple ANI calculation between reference and query genomes.

0101

dist versions

skani:

skani is a fast and robust tool for calculating ANI between metagenome assembled genomes and contigs.

Memory-efficient ANI database queries with skani.

0101

search versions

skani:

skani is a fast and robust tool for calculating ANI between metagenome assembled genomes and contigs.

Storing skani sketches/indices on disk.

01

sketch_dir sketch markers versions

skani:

skani is a fast and robust tool for calculating ANI between metagenome assembled genomes and contigs.

All-to-all ANI computation.

01

triangle versions

skani:

skani is a fast and robust tool for calculating ANI between metagenome assembled genomes and contigs.

Linearize and simplify variation graph in GFA format using blocked partial order alignment

01

gfa maf versions

smoove simplifies and speeds calling and genotyping SVs for short reads. It also improves specificity by removing many spurious alignment signals that are indicative of low-level noise and often contribute to spurious calls. Developed by Brent Pedersen.

01230101

vcf versions

smoove:

structural variant calling and genotyping with existing tools, but, smoothly

Performs fastq alignment to a fasta reference using SNAP

0101

bam bai versions

snapaligner:

Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data

Create a SNAP index for reference genome

01234

index versions

snapaligner:

Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data

Core-SNP alignment from Snippy outputs

0120

aln full_aln tab vcf txt versions

snippy:

Rapid bacterial SNP calling and core genome alignments

Rapid haploid variant calling

010

tab csv html vcf bed gff bam bai log aligned_fa consensus_fa consensus_subs_fa raw_vcf filt_vcf vcf_gz vcf_csi txt versions

snippy:

Rapid bacterial SNP calling and core genome alignments

Pairwise SNP distance matrix from a FASTA sequence alignment

01

tsv versions

Rapidly extracts SNPs from a multi-FASTA alignment.

0

fasta constant_sites versions constant_sites_string

Local sequence alignment tool for filtering, mapping and clustering.

010101

reads log index versions

SortMeRNA:

The core algorithm is based on approximate seeds and allows for sensitive analysis of NGS reads. The main application of SortMeRNA is filtering rRNA from metatranscriptomic data. SortMeRNA takes as input files of reads (fasta, fastq, fasta.gz, fastq.gz) and one or multiple rRNA database file(s), and sorts apart aligned and rejected reads into two files. Additional applications include clustering and taxonomy assignation available through QIIME v1.9.1. SortMeRNA works with Illumina, Ion Torrent and PacBio data, and can produce SAM and BLAST-like alignments.

Compare many FracMinHash signatures generated by sourmash sketch.

01000

matrix labels csv versions

sourmash:

Compute and compare FracMinHash signatures for DNA and protein data sets.

Search a metagenome sourmash signature against one or many reference databases and return the minimum set of genomes that contain the k-mers in the metagenome.

0100000

result unassigned matches prefetch prefetchcsv versions

sourmash:

Compute and compare FracMinHash signatures for DNA data sets.

Annotate list of metagenome members (based on sourmash signature matches) with taxonomic information.

010

result versions

sourmash:

Compute and compare FracMinHash signatures for DNA data sets.

Aligns sequences using T_COFFEE

01010120

alignment lib versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

pigz:

Parallel implementation of the gzip algorithm.

Compares 2 alternative MSAs to evaluate them.

012

scores versions

tcoffee:

A collection of tools for Multiple Alignments of DNA, RNA, Protein Sequence

pigz:

Parallel implementation of the gzip algorithm.

Computes a consensus alignment using T_COFFEE

01010

alignment eval versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

pigz:

Parallel implementation of the gzip algorithm.

Reformats the header of PDB files with t-coffee

01

formatted_pdb versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

Computes the irmsd score for a given alignment and the structures.

01012

irmsd versions

tcoffee:

A collection of tools for Multiple Alignments of DNA, RNA, Protein Sequence

pigz:

Parallel implementation of the gzip algorithm.

Aligns sequences using the regressive algorithm as implemented in the T_COFFEE package

01010120

alignment versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

pigz:

Parallel implementation of the gzip algorithm.

Reformats files with t-coffee

01

formatted_file versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

Compute the TCS score for an MSA or for an MSA plus a library file. Outputs the TCS file as is, plus a CSV with just the total TCS score.

0101

tcs scores versions

tcoffee:

A collection of tools for Multiple Alignments of DNA, RNA, Protein Sequence

pigz:

Parallel implementation of the gzip algorithm.

TransDecoder identifies candidate coding regions within transcript sequences. It is used to build a GFF file.

01

pep gff3 cds dat folder versions

transdecoder:

TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks.

TransDecoder identifies candidate coding regions within transcript sequences. It is used to build a GFF file. You can use this module after transdecoder_longorf.

010

pep gff3 cds bed versions

transdecoder:

TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks.

Cluster contigs from multiple assemblies by similarity

012

cluster_dir versions

trycycler:

Trycycler is a tool for generating consensus long-read assemblies for bacterial genomes

Import transcript-level abundances and estimated counts for gene-level analysis packages

01010

tpm_gene counts_gene counts_gene_length_scaled counts_gene_scaled lengths_gene tpm_transcript counts_transcript lengths_transcript versions

tximeta:

Transcript Quantification Import with Automatic Metadata

uLTRA aligner - A wrapper around minimap2 to improve small exon detection - Index gtf file for reads alignment

00

index versions

ultra:

Splice aligner of long transcriptomic reads to genome.

Aligns protein structures using UPP

01010

alignment versions

upp:

SATe-enabled phylogenetic placement

Filtering, downsampling and profiling alignments in BAM/CRAM formats

01

bam versions

To assess candidate indels and structural variants, Varlociraptor needs to know certain properties of the underlying sequencing experiment in combination with the read aligner used.

010101

alignment_properties_json versions

varlociraptor:

Flexible, uncertainty-aware variant calling with parameter free filtration via FDR control.

Constructs a graph from a reference and variant calls or a multiple sequence alignment file

01230101

graph versions

vg:

Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.

Deconstruct snarls present in a variation graph in GFA format to variants in VCF format

0100

vcf versions

vg:

Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.

write your description here

01

xg vg_index versions

vg:

Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.

Cluster sequences using a single-pass, greedy centroid-based clustering algorithm.

01

aln biom mothur otu bam out blast uc centroids clusters profile msa versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Performs quality filtering and / or conversion of a FASTQ file to FASTA format.

01

fasta log versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Taxonomic classification using the sintax algorithm.

010

tsv versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Sort fasta entries by decreasing abundance (--sortbysize) or sequence length (--sortbylength).

010

fasta versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Compare target sequences to fasta-formatted query sequences using global pairwise alignment.

010000

aln biom lca mothur otu sam tsv txt uc versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

a pangenome-scale aligner

0123400

paf versions
