Available Modules

Modules are the building blocks of all DSL2 nf-core pipelines. If you would like to write your own module, you can find more information on the nf-core website.

  • reference 65
  • fasta 44
  • index 37
  • genome 35
  • fastq 25
  • align 22
  • bam 19
  • map 19
  • sam 12
  • alignment 11
  • genomics 10
  • metagenomics 8
  • k-mer 8
  • bwa 8
  • mkref 7
  • vcf 6
  • taxonomy 5
  • count 5
  • short-read 5
  • reference-free 5
  • assembly 4
  • gatk4 4
  • sort 4
  • classification 4
  • build 4
  • illumina 4
  • picard 4
  • mapping 4
  • mem 4
  • profiling 4
  • ganon 4
  • cram 3
  • gtf 3
  • imputation 3
  • phylogeny 3
  • bisulfite 3
  • DNA methylation 3
  • WGBS 3
  • scWGBS 3
  • mappability 3
  • splicing 3
  • bcl2fastq 3
  • cellranger 3
  • aln 3
  • mkfastq 3
  • bed 2
  • structural variants 2
  • variant calling 2
  • database 2
  • download 2
  • classify 2
  • cnv 2
  • taxonomic profiling 2
  • sentieon 2
  • single-cell 2
  • copy number 2
  • rnaseq 2
  • consensus 2
  • kmer 2
  • bisulphite 2
  • methylseq 2
  • sequences 2
  • phasing 2
  • aligner 2
  • bisulfite sequencing 2
  • biscuit 2
  • genotype 2
  • population genetics 2
  • sketch 2
  • report 2
  • rna 2
  • NCBI 2
  • ptr 2
  • sourmash 2
  • coptr 2
  • 3-letter genome 2
  • distance 2
  • query 2
  • phylogenetic placement 2
  • clean 2
  • spaceranger 2
  • mapper 2
  • rsem 2
  • dictionary 2
  • bwameth 2
  • assembly evaluation 2
  • haplotypes 2
  • immunoprofiling 2
  • vdj 2
  • ragtag 2
  • junctions 2
  • interval list 2
  • merge 1
  • qc 1
  • quality control 1
  • split 1
  • contamination 1
  • convert 1
  • proteomics 1
  • VCF 1
  • trimming 1
  • graph 1
  • sv 1
  • methylation 1
  • databases 1
  • table 1
  • QC 1
  • bqsr 1
  • openms 1
  • serotype 1
  • taxonomic classification 1
  • 5mC 1
  • histogram 1
  • example 1
  • expression 1
  • low-coverage 1
  • sequence 1
  • damage 1
  • palaeogenomics 1
  • gene 1
  • transcript 1
  • archaeogenomics 1
  • evaluation 1
  • bismark 1
  • cnvkit 1
  • reads 1
  • duplicates 1
  • clipping 1
  • deamination 1
  • mpileup 1
  • idXML 1
  • counts 1
  • interval_list 1
  • microsatellite 1
  • fusion 1
  • quantification 1
  • containment 1
  • archaeogenetics 1
  • ancestry 1
  • miscoding lesions 1
  • skani 1
  • palaeogenetics 1
  • paf 1
  • hmmcopy 1
  • dist 1
  • typing 1
  • mlst 1
  • pseudoalignment 1
  • npz 1
  • fusions 1
  • scaffolding 1
  • angsd 1
  • pileup 1
  • chip-seq 1
  • arriba 1
  • gene expression 1
  • regions 1
  • hi-c 1
  • atac-seq 1
  • scatter 1
  • k-mer frequency 1
  • hlala_typing 1
  • hla_typing 1
  • hlala 1
  • hla 1
  • leviosam2 1
  • lift 1
  • homoploymer 1
  • MSI 1
  • vg 1
  • FracMinHash sketch 1
  • signature 1
  • gwas 1
  • ancient dna 1
  • Streptococcus pneumoniae 1
  • nucleotides 1
  • graft 1
  • simulate 1
  • xenograft 1
  • version 1
  • gem 1
  • RNA sequencing 1
  • mirdeep2 1
  • cvnkit 1
  • human removal 1
  • decontamination 1
  • hostile 1
  • c to t 1
  • mapad 1
  • signatures 1
  • adna 1
  • copy number alterations 1
  • bgen 1
  • references 1
  • patch 1
  • construct 1
  • impute 1
  • reference-independent 1
  • bwameme 1
  • haploype 1
  • bwamem2 1
  • circular 1
  • realign 1
  • trio binning 1
  • reference compression 1
  • refresh 1
  • reference panel 1
  • variant-calling 1
  • poolseq 1
  • simulation 1
  • uq 1
  • md 1
  • nm 1
  • dragstr 1
  • bedtointervallist 1
  • composestrtablefile 1
  • createsequencedictionary 1
  • getpileupsumaries 1
  • germlinevariantsites 1
  • readcountssummary 1
  • gget 1
  • Mykrobe 1
  • Salmonella Typhi 1
  • allele counts 1
  • reference panels 1
  • admixture 1
  • doCounts 1
  • multiomics 1
  • mkvdjref 1
  • antibody capture 1
  • antigen capture 1
  • crispr 1
  • access 1
  • duplicate removal 1
  • chromap 1
  • liftovervcf 1
  • gccounter 1
  • estimate 1
  • ngm 1
  • unionsum 1
  • mitochondrial genome 1
  • Merqury 1
  • reference genome 1
  • NextGenMap 1

ADMIXTURE is a program for estimating ancestry in a model-based manner from large autosomal SNP genotype datasets, where the individuals are unrelated (for example, the individuals in a case-control association study).

ancestry_fractions allele_frequencies versions

ALE: assembly likelihood estimator.

ale versions

Calculates base frequency statistics across reference positions from BAM.

depth_sample depth_global qs pos counts icounts versions

angsd:

ANGSD: Analysis of Next Generation Sequencing Data
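
To make "base frequency statistics across reference positions" concrete, the following is a minimal sketch that tallies A/C/G/T observations per reference position. It is not ANGSD's implementation and does not read BAM files: the function name `count_bases` and the toy input of ungapped `(start, sequence)` pairs are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict


def count_bases(alignments):
    """Count observed bases per 1-based reference position.

    `alignments` is an iterable of (start, sequence) pairs, where `start`
    is the 1-based position of the first aligned base. Only ungapped
    alignments are handled in this sketch.
    """
    counts = defaultdict(Counter)
    for start, sequence in alignments:
        for offset, base in enumerate(sequence.upper()):
            if base in "ACGT":  # skip N and other ambiguity codes
                counts[start + offset][base] += 1
    return counts


# Hypothetical toy data standing in for reads parsed from a BAM file.
toy_alignments = [(1, "ACGT"), (2, "CGTA"), (3, "TTAN")]
for position, bases in sorted(count_bases(toy_alignments).items()):
    # One row per covered position: counts in A, C, G, T order.
    print(position, [bases[b] for b in "ACGT"])
```

A real run would also apply base-quality and mapping-quality filters before counting, which this sketch omits.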

Arriba is a command-line tool for the detection of gene fusions from RNA-Seq data.

blacklist cytobands protein_domains known_fusions versions

arriba:

Fast and accurate gene fusion detection from RNA-Seq data

Removes unused references from the header of sorted BAM/CRAM files.

bam versions

Align short or PacBio reads to a reference genome using BBMap

bam log versions

bbmap:

BBMap is a short-read aligner, bundled with various other bioinformatic tools.

Split sequencing reads by mapping them to multiple references simultaneously

index primary_fastq all_fastq stats log versions

bbmap:

BBMap is a short-read aligner, bundled with various other bioinformatic tools.

Compares query sketches to reference sketches hosted on a remote server via the Internet.

hits versions

bbmap:

BBMap is a short-read aligner, bundled with various other bioinformatic tools.

Creates a consensus sequence by applying VCF variants to a reference FASTA file

fasta versions

consensus:

Create consensus sequence by applying VCF variants to a reference fasta file.
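
The idea of applying VCF variants to a reference can be sketched on a toy sequence. The code below is only an illustration under stated assumptions: it is not how bcftools works internally, it does not parse VCF or FASTA files, and the helper `apply_variants` with its `(pos, ref, alt)` tuples is a hypothetical stand-in for real records. Positions are 1-based, as in VCF, and variants must not overlap.

```python
def apply_variants(reference: str, variants: list[tuple[int, str, str]]) -> str:
    """Apply non-overlapping (pos, ref, alt) variants to a reference string."""
    consensus = []
    cursor = 0  # 0-based index of the next unprocessed reference base
    for pos, ref, alt in sorted(variants):
        start = pos - 1  # convert the 1-based VCF position to a 0-based index
        if start < cursor:
            raise ValueError(f"overlapping variant at position {pos}")
        if reference[start:start + len(ref)] != ref:
            raise ValueError(f"REF allele mismatch at position {pos}")
        consensus.append(reference[cursor:start])  # unchanged stretch
        consensus.append(alt)                      # substituted allele
        cursor = start + len(ref)
    consensus.append(reference[cursor:])  # remainder after the last variant
    return "".join(consensus)


toy_reference = "ACGTACGT"
toy_variants = [(2, "C", "T"), (5, "AC", "A")]  # one SNP, one 1-bp deletion
print(apply_variants(toy_reference, toy_variants))
```

Checking the REF allele against the reference before substituting is a cheap way to catch a VCF that was called against a different genome build.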

Aligns single- or paired-end reads from bisulfite-converted libraries to a reference genome using Biscuit.

010101

bam bai versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

Indexes a reference genome for use with Biscuit

01

index versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

Computes cytosine methylation and callable SNV mutations, optionally in reference to a germline BAM to call somatic variants

012340101

vcf versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

Converts a specified reference genome into two different bisulfite converted versions and indexes them for alignments.

01

index versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.
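As a rough illustration of what "two different bisulfite converted versions" of a reference means, the sketch below builds a C-to-T and a G-to-A copy of a sequence in plain Python. It is not Bismark's code, and the function name `bisulfite_convert` is hypothetical, chosen only for this example.

```python
def bisulfite_convert(seq: str) -> tuple[str, str]:
    """Return (C->T converted, G->A converted) copies of a DNA sequence.

    Illustrative only: a real genome preparation step also writes the
    converted sequences to disk and indexes them for an aligner.
    """
    upper = seq.upper()
    c_to_t = upper.replace("C", "T")  # every C read as T
    g_to_a = upper.replace("G", "A")  # every G read as A (opposite strand)
    return c_to_t, g_to_a


ct, ga = bisulfite_convert("ACGTCCGG")
print(ct)
print(ga)
```

Reads are converted in the same way before alignment, so a read can match the converted reference whether or not its cytosines were methylated.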

Align reads to a reference genome using bowtie

01010

bam log fastq versions

bowtie:

bowtie is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Create bowtie index for reference genome

01

index versions

bowtie:

bowtie is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Align reads to a reference genome using bowtie2

01010100

sam bam cram csi crai log fastq versions

bowtie2:

Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences.

Builds bowtie index for reference genome

01

index versions

bowtie2:

Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences.

Find SA coordinates of the input reads for bwa short-read mapping

0101

sai versions

bwa:

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Create BWA index for reference genome

01

index versions

bwa:

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Performs fastq alignment to a fasta reference using BWA

0101010

bam cram csi crai versions

bwa:

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Convert paired-end bwa SA coordinate files to SAM format

01201

bam versions

bwa:

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Convert bwa SA coordinate file to SAM format

01201

bam versions

bwa:

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Create BWA-mem2 index for reference genome

01

index versions

bwamem2:

BWA-mem2 is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Performs fastq alignment to a fasta reference using BWA-mem2

0101010

sam bam cram crai csi versions

bwa:

BWA-mem2 is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Create BWA-MEME index for reference genome

01

index versions

bwameme:

Faster BWA-MEM2 using learned-index

Performs fastq alignment to a fasta reference using BWA-MEME

010101000

sam bam cram crai csi versions

bwameme:

Faster BWA-MEM2 using learned-index

Performs indexing of c2t converted reference genome

01

index versions

bwameth:

Fast and accurate alignment of BS-Seq reads using bwa-mem and a 3-letter genome.

Module to use Cell Ranger's pipelines to analyze sequencing data produced from Chromium Single Cell Gene Expression.

010

outs versions

cellranger:

Cell Ranger by 10x Genomics is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.

Module to create FASTQs needed by the 10x Genomics Cell Ranger tool. Uses the cellranger mkfastq command.

012

fastq undetermined_fastq reports stats interop versions

cellranger:

Cell Ranger by 10x Genomics is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.

Module to build a filtered GTF needed by the 10x Genomics Cell Ranger tool. Uses the cellranger mkgtf command.

0

gtf versions

cellranger:

Cell Ranger by 10x Genomics is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.

Module to build the reference needed by the 10x Genomics Cell Ranger tool. Uses the cellranger mkref command.

000

reference versions

cellranger:

Cell Ranger by 10x Genomics is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.

Module to build the VDJ reference needed by the 10x Genomics Cell Ranger tool. Uses the cellranger mkvdjref command.

0000

reference versions

cellranger:

Cell Ranger processes data from 10X Genomics Chromium kits. cellranger vdj takes FASTQ files from cellranger mkfastq or bcl2fastq for V(D)J libraries and performs sequence assembly and paired clonotype calling. It uses the Chromium cellular barcodes and UMIs to assemble V(D)J transcripts per cell. Clonotypes and CDR3 sequences are output as a .vloupe file which can be loaded into Loupe V(D)J Browser.

Module to use Cell Ranger's pipelines to analyze sequencing data produced from various Chromium technologies, including Single Cell Gene Expression, Single Cell Immune Profiling, Feature Barcoding, and Cell Multiplexing.

00101010101010000000000000

config outs versions

cellranger:

Cell Ranger by 10x Genomics is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.

Module to use Cell Ranger's pipelines to analyze sequencing data produced from Chromium Single Cell Immune Profiling.

010

outs versions

cellranger:

Cell Ranger processes data from 10X Genomics Chromium kits. cellranger vdj takes FASTQ files from cellranger mkfastq or bcl2fastq for V(D)J libraries and performs sequence assembly and paired clonotype calling. It uses the Chromium cellular barcodes and UMIs to assemble V(D)J transcripts per cell. Clonotypes and CDR3 sequences are output as a .vloupe file which can be loaded into Loupe V(D)J Browser.

Module to use Cell Ranger's ARC pipelines to analyze sequencing data produced from Chromium Single Cell ARC. Uses the cellranger-arc count command.

01230

outs lib versions

cellrangerarc:

Cell Ranger ARC is a set of analysis pipelines that process Chromium Single Cell ARC data.

Module to create fastqs needed by the 10x Genomics Cell Ranger Arc tool. Uses the cellranger-arc mkfastq command.

00

versions fastq

cellrangerarc:

Cell Ranger Arc by 10x Genomics is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.

Module to build a filtered gtf needed by the 10x Genomics Cell Ranger Arc tool. Uses the cellranger-arc mkgtf command.

0

gtf versions

cellrangerarc:

Cell Ranger Arc by 10x Genomics is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.

Module to build the reference needed by the 10x Genomics Cell Ranger Arc tool. Uses the cellranger-arc mkref command.

00000

reference config versions

cellrangerarc:

Cell Ranger Arc is a set of analysis pipelines that process Chromium Single Cell Arc data.

Module to use Cell Ranger's ATAC pipelines to analyze sequencing data produced from Chromium Single Cell ATAC.

010

outs versions

cellranger-atac:

Cell Ranger ATAC is a set of analysis pipelines that process Chromium Single Cell ATAC data.

Module to create fastqs needed by the 10x Genomics Cell Ranger ATAC tool. Uses the cellranger-atac mkfastq command.

00

versions fastq

cellranger-atac:

Cell Ranger ATAC by 10x Genomics is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.

Module to build the reference needed by the 10x Genomics Cell Ranger ATAC tool. Uses the cellranger-atac mkref command.

00000

reference versions

cellranger-atac:

Cell Ranger ATAC is a set of analysis pipelines that process Chromium Single Cell ATAC data.

Performs preprocessing and alignment of chromatin fastq files to fasta reference files using chromap.

0101010000

bed bam tagAlign pairs versions

chromap:

Fast alignment and preprocessing of chromatin profiles

Indexes a fasta reference genome ready for chromatin profiling.

01

index versions

chromap:

Fast alignment and preprocessing of chromatin profiles

A method to improve mappings on circular genomes, using the BWA mapper.

010101

fasta elongated versions

circulargenerator:

Creates a modified reference genome, elongated by a specified number of bases

Realign reads mapped with BWA to elongated reference genome

01010101

bam versions

circularmapper:

A method to improve mappings on circular genomes such as Mitochondria.

Calculate the sequence-accessible coordinates in chromosomes from the given reference genome, output as a BED file.

0101

bed versions

cnvkit:

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Compile a coverage reference from the given files (normal samples).

000

cnn versions

cnvkit:

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Computes the coverage map along the reference genome

01

coverage versions

coptr:

Accurate and robust inference of microbial growth dynamics from metagenomic sequencing reads.

Maps the reads to the reference database

0101

bam versions

coptr:

Accurate and robust inference of microbial growth dynamics from metagenomic sequencing reads.

Make a transcript/gene mapping from a GTF and cross-reference with transcript quantifications.

0101000

tx2gene versions

custom:

Custom module to create a transcript to gene mapping from a GTF and check it against transcript quantifications

Performs fastq alignment to a reference using DRAGMAP

0101010

sam bam cram crai csi log versions

dragmap:

Dragmap is the Dragen mapper/aligner Open Source Software.

Create DRAGEN hashtable for reference genome

01

hashmap versions

dragmap:

Dragmap is the Dragen mapper/aligner Open Source Software.

Perform phasing of genotyped data with or without a reference panel

012345

phased_variants versions

Phylogenetic placement of query sequences in a reference tree

012300

epang jplace log versions

epang:

Massively parallel phylogenetic placement of genetic sequences

Splits an alignment into reference and query parts

012

query reference versions

epang:

Massively parallel phylogenetic placement of genetic sequences

Build fastq screen config file from bowtie index files

00

database versions

fastqscreen:

FastQ Screen allows you to screen a library of sequences in FastQ format against a set of sequence databases so you can see if the composition of the library matches with what you expect.

Align reads to multiple reference genomes using fastq-screen

010

txt png html fastq versions

fastqscreen:

FastQ Screen allows you to screen a library of sequences in FastQ format against a set of sequence databases so you can see if the composition of the library matches with what you expect.

Run FCS-GX on assembled genomes. The contigs of the assembly are searched against a reference database excluding the given taxid.

010

fcs_gx_report taxonomy_report versions

fcs:

The Foreign Contamination Screening (FCS) tool rapidly detects contaminants from foreign organisms in genome assemblies to prepare your data for submission. Therefore, the submission process to NCBI is faster and fewer contaminated genomes are submitted. This reduces errors in analyses and conclusions, not just for the original data submitter but for all subsequent users of the assembly.

Build references for fusioncatcher

0

reference versions

fusioncatcher:

Build genome for fusioncatcher

Build ganon database using custom reference sequences.

01000

db info versions

ganon:

ganon classifies short DNA sequences against large sets of genomic reference sequences efficiently

Classify FASTQ files against ganon database

010

tre report one all unc log versions

ganon:

ganon classifies short DNA sequences against large sets of genomic reference sequences efficiently

Generate a ganon report file from the output of ganon classify

010

tre versions

ganon:

ganon classifies short DNA sequences against large sets of genomic reference sequences efficiently

Generate a multi-sample report file from the output of ganon report runs

01

txt versions

ganon:

ganon classifies short DNA sequences against large sets of genomic reference sequences efficiently

Grafts query sequences from phylogenetic placement on the reference tree

01

newick versions

gappa:

Genesis Applications for Phylogenetic Placement Analysis

Creates an interval list from a bed file and a reference dict

0101

interval_list versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

This tool looks for low-complexity STR sequences along the reference that are later used to estimate the Dragstr model during single sample auto calibration CalibrateDragstrModel.

000

str_table versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Creates a sequence dictionary for a reference sequence

01

dict versions

gatk:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Summarizes counts of reads that support reference, alternate and other alleles for given sites. Results can be used with CalculateContamination. Requires a common germline variant sites file, such as from gnomAD.

012301010100

table versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Create a GEM index from a FASTA file

01

index log versions

gem2:

GEM2 is a high-performance mapping tool. It also provides a unique tool to evaluate mappability.

Define the mappability of a reference

010

map versions

gem2:

GEM2 is a high-performance mapping tool. It also provides a unique tool to evaluate mappability.

Performs fastq alignment to a fasta reference using gem3-mapper

01010

bam versions

gem3:

The GEM indexer (v3).

Genotype Salmonella Typhi from Mykrobe results

01

tsv versions

genotyphi:

Assign genotypes to Salmonella Typhi genomes based on VCF files (mapped to Typhi CT18 reference genome)

gget is a free, open-source command-line tool and Python package that enables efficient querying of genomic databases. gget consists of a collection of separate but interoperable modules, each designed to facilitate one type of database querying in a single line of code.

01

files output versions

gget:

gget enables efficient querying of genomic databases

Tool to create a binary reference panel for quick reading time.

0123401

bin_ref versions

glimpse2:

GLIMPSE2 is a phasing and imputation method for large-scale low-coverage sequencing studies.

A versatile pairwise aligner for genomic and spliced nucleotide sequences

0100

sam versions

graphmap2:

A versatile pairwise aligner for genomic and spliced nucleotide sequences.

A versatile pairwise aligner for genomic and spliced nucleotide sequences

0

index versions

graphmap2:

A versatile pairwise aligner for genomic and spliced nucleotide sequences.

Align RNA-Seq reads to a reference with HISAT2

010101

bam summary fastq versions

hisat2:

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.

Builds HISAT2 index for reference genome

010101

index versions

hisat2:

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.

Extracts splice sites from a GTF file

01

txt versions

hisat2:

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.

Performs HLA typing based on a population reference graph and employs a new linear projection method to align reads to the graph.

0123

results extraction extraction_mapped extraction_unmpapped hla fastq reads_per_level remapped versions

hlala:

HLA typing from short and long reads

gcCounter function from HMMcopy utilities, used to generate GC content in non-overlapping windows from a fasta reference

01

wig versions

hmmcopy:

C++ based programs for analyzing BAM files and preparing read counts -- used with bioconductor-hmmcopy

Downloads required reference genomes for Hostile

NO input

reference versions

hostile:

Hostile: accurate host decontamination

Taxonomic classification of metagenomic sequence data using a protein reference database

010

results versions

kaiju:

Fast and sensitive taxonomic classification for metagenomics

Converting aligned short and long reads records from one reference to another

0101

bam versions

leviosam2:

Fast and accurate coordinate conversion between assemblies

Create mapAD index for reference genome

01

index versions

mapad:

An aDNA aware short-read mapper

Map short-reads to an indexed reference genome

01010000000

bam versions

mapad:

An aDNA aware short-read mapper

Calculate Mash distances between reference and query sequences

010

dist versions

mash:

Fast sequence distance estimator that uses MinHash

Produces maternal and paternal FastK kmer tables from maternal, paternal and child FastK tables

010101

mat_hap_ktab pat_hap_ktab versions

merquryfk:

FastK based version of Merqury

FastK based version of Merqury

012340101

stats bed assembly_qv spectra_cn_fl spectra_cn_ln spectra_cn_st qv spectra_asm_fl spectra_asm_ln spectra_asm_st phased_block_bed phased_block_stats continuity_N block_N block_blob hapmers_blob versions

merquryfk:

FastK based version of Merqury

A genomic k-mer counter (and sequence utility) with nice features.

010

meryl_db versions

meryl:

A genomic k-mer counter (and sequence utility) with nice features.

A genomic k-mer counter (and sequence utility) with nice features.

010

hist versions

meryl:

A genomic k-mer counter (and sequence utility) with nice features.

A genomic k-mer counter (and sequence utility) with nice features.

010

meryl_db versions

meryl:

A genomic k-mer counter (and sequence utility) with nice features.

Compression of a reference panel for genotype imputation to .msav format

012

msav versions

minimac4:

Computationally efficient genotype imputation

Imputation of genotypes using a reference panel

0123456

vcf versions

minimac4:

Computationally efficient genotype imputation

A versatile pairwise aligner for genomic and spliced nucleotide sequences

01010000

paf bam index versions

minimap2:

A versatile pairwise aligner for genomic and spliced nucleotide sequences.

Provides fasta index required by minimap2 alignment.

01

index versions

minimap2:

A versatile pairwise aligner for genomic and spliced nucleotide sequences.

Provides fasta index required by miniprot alignment.

01

index versions

miniprot:

A versatile pairwise aligner for genomic and protein sequences.

miRDeep2 Mapper is a tool that prepares deep sequencing reads for downstream miRNA detection by collapsing reads, mapping them to a genome, and outputting the required files for miRNA discovery.

0101

outputs versions

mirdeep2:

miRDeep2 Mapper (mapper.pl) is part of the miRDeep2 suite. It collapses identical reads, maps them to a reference genome, and outputs both collapsed FASTA and ARF files for downstream miRNA detection and analysis.

Download a mitochondrial genome to be used as reference for MitoHiFi

01

fasta gb versions

findMitoReference.py:

Fetch mitochondrial genome in Fasta and Genbank format from NCBI

Scan a reference genome to get microsatellite & homopolymer information

01

txt versions

msisensor:

MSIsensor is a C++ program to detect replication slippage variants at microsatellite regions, and differentiate them as somatic or germline.

Performs fastq alignment to a reference using NARFMAP

0101010

bam log versions

narfmap:

narfmap is a fork of the Dragen mapper/aligner Open Source Software.

Create DRAGEN hashtable for reference genome

01

hashmap versions

narfmap:

narfmap is a fork of the Dragen mapper/aligner Open Source Software.

Performs fastq alignment to a fasta reference using NextGenMap

010

bam versions

bwa:

NextGenMap is a flexible highly sensitive short read mapping tool that handles much higher mismatch rates than comparable algorithms while still outperforming them in terms of runtime

Refreshes the protein references for all peptide hits.

012

indexed_idxml versions

openms:

OpenMS is an open-source software C++ library for LC-MS data management and analyses

NVIDIA Clara Parabricks GPU-accelerated fast, accurate algorithm for mapping methylated DNA sequence reads to a reference genome, performing local alignment, and producing alignment for different parts of the query sequence

0101010

bam bai qc_metrics bqsr_table duplicate_metrics versions

parabricks:

NVIDIA Clara Parabricks GPU-accelerated genomics tools

Creates an interval list from a bed file and a reference dict

0101

interval_list versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Cleans the provided BAM, soft-clipping beyond-end-of-reference alignments and setting MAPQ to 0 for unmapped reads

01

bam versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Creates a sequence dictionary for a reference sequence.

01

reference_dict versions

picard:

Creates a sequence dictionary file (with ".dict" extension) from a reference sequence provided in FASTA format, which is required by many processing and analysis tools. The output file contains a header but no SAMRecords, and the header contains only sequence records.

Lifts over a VCF file from one reference build to another.

01010101

vcf_lifted vcf_unlifted versions

picard:

Move annotations from one assembly to another

Writes an interval list created by splitting a reference at Ns. A program for breaking up a reference into intervals of alternating regions of N and ACGT bases

010101

intervals versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
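To make "intervals of alternating regions of N and ACGT bases" concrete, here is a minimal Python sketch of the idea. It is not Picard's implementation: the function name `split_at_ns` is hypothetical, and the sketch treats every N as a break, with no tolerance for short N runs.

```python
from itertools import groupby


def split_at_ns(seq: str) -> list[tuple[int, int, str]]:
    """Return 1-based inclusive (start, end, kind) runs of a sequence.

    kind is "N" for a run of N bases and "ACGT" for any other run.
    """
    intervals = []
    pos = 1
    # groupby yields consecutive runs that share the same key (N or not N)
    for is_n, run in groupby(seq.upper(), key=lambda base: base == "N"):
        length = sum(1 for _ in run)
        intervals.append((pos, pos + length - 1, "N" if is_n else "ACGT"))
        pos += length
    return intervals


runs = split_at_ns("ACGTNNNNACG")
for start, end, kind in runs:
    print(start, end, kind)
```

Downstream steps would typically keep only the ACGT runs, so that work is not scattered over regions with no called bases.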

This tool takes in a coordinate-sorted SAM or BAM and calculates the NM, MD, and UQ tags by comparing with the reference.

0101

bam bai versions

picard:

Java tools for working with NGS data in the BAM format

PoolSNP is a heuristic SNP caller, which uses an MPILEUP file and a reference genome in FASTA format as inputs.

0101012

vcf max_cov bad_sites versions

Calculate pairwise nucleotide identity with respect to a reference sequence

01010

valid_fasta invalid_fasta report log versions

QUILT is an R and C++ program for rapid genotype imputation from low-coverage sequence using a large reference panel.

012345678910111213141501

vcf tbi rdata plots versions

quilt:

Read aware low coverage whole genome sequence imputation from a reference panel

Homology-based assembly patching: Make continuous joins and fill gaps in 'target.fa' using sequences from 'query.fa'

01010101

patch_fasta patch_agp patch_components_fasta assembly_alignments target_splits_agp target_splits_fasta qry_rename_agp qry_rename_fasta stderr versions

ragtag:

Fast reference-guided genome assembly scaffolding

Scaffolding is the process of ordering and orienting draft assembly (query) sequences into longer sequences. Gaps (stretches of "N" characters) are placed between adjacent query sequences to indicate the presence of unknown sequence. RagTag uses whole-genome alignments to a reference assembly to scaffold query sequences. RagTag does not alter input query sequence in any way and only orders and orients sequences, joining them with gaps.

010101012

corrected_assembly corrected_agp corrected_stats versions

ragtag:

Fast reference-guided genome assembly scaffolding

Calculate expression with RSEM

010

counts_gene counts_transcript stat logs versions bam_star bam_genome bam_transcript

rseqc:

RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome

Prepare a reference genome for RSEM

00

index transcript_fasta versions

rseqc:

RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome

Compares detected splice junctions to a reference gene model

010

xls rscript log bed interact_bed pdf events_pdf versions

rseqc:

RSeQC package provides a number of useful modules that can comprehensively evaluate high throughput sequence data especially RNA-seq data.

Compares detected splice junctions to a reference gene model

010

pdf rscript versions

rseqc:

RSeQC package provides a number of useful modules that can comprehensively evaluate high throughput sequence data especially RNA-seq data.

Create index for salmon

00

index versions

salmon:

Salmon is a tool for wicked-fast transcript quantification from RNA-seq data

Gene/transcript quantification with Salmon

0100000

results json_info lib_format_counts versions

salmon:

Salmon is a tool for wicked-fast transcript quantification from RNA-seq data

Create BWA index for reference genome

01

index versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Performs fastq alignment to a fasta reference using Sentieon's BWA MEM

01010101

bam_and_bai versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Determine Streptococcus pneumoniae serotype from Illumina paired-end reads

01

tsv txt versions

seroba:

SeroBA is a k-mer based pipeline to identify the Serotype from Illumina NGS reads for given references.

Simple ANI calculation between reference and query genomes.

0101

dist versions

skani:

skani is a fast and robust tool for calculating ANI between metagenome assembled genomes and contigs.

Performs fastq alignment to a fasta reference using SNAP

0101

bam bai versions

snapaligner:

Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data

Create a SNAP index for reference genome

01234

index versions

snapaligner:

Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data

Search a metagenome sourmash signature against one or many reference databases and return the minimum set of genomes that contain the k-mers in the metagenome.

0100000

result unassigned matches prefetch prefetchcsv versions

sourmash:

Compute and compare FracMinHash signatures for DNA data sets.

Create a database of sourmash signatures (a group of FracMinHash sketches) to be used as references.

010

signature_index versions

sourmash:

Compute and compare FracMinHash signatures for DNA data sets.

Module to build a filtered GTF needed by the 10x Genomics Space Ranger tool. Uses the spaceranger mkgtf command.

0

gtf versions

spaceranger:

Visium Spatial Gene Expression is a next-generation molecular profiling solution for classifying tissue based on total mRNA. Space Ranger is a set of analysis pipelines that process Visium Spatial Gene Expression data with brightfield and fluorescence microscope images. Space Ranger allows users to map the whole transcriptome in formalin fixed paraffin embedded (FFPE) and fresh frozen tissues to discover novel insights into normal development, disease pathology, and clinical translational research. Space Ranger provides pipelines for end to end analysis of Visium Spatial Gene Expression experiments.

Module to build the reference needed by the 10x Genomics Space Ranger tool. Uses the spaceranger mkref command.

000

reference versions

spaceranger:

Visium Spatial Gene Expression is a next-generation molecular profiling solution for classifying tissue based on total mRNA. Space Ranger is a set of analysis pipelines that process Visium Spatial Gene Expression data with brightfield and fluorescence microscope images. Space Ranger allows users to map the whole transcriptome in formalin fixed paraffin embedded (FFPE) and fresh frozen tissues to discover novel insights into normal development, disease pathology, and clinical translational research. Space Ranger provides pipelines for end to end analysis of Visium Spatial Gene Expression experiments.

Short Read Sequence Typing for Bacterial Pathogens is a program designed to take Illumina sequence data, a MLST database and/or a database of gene sequences (e.g. resistance genes, virulence genes, etc) and report the presence of STs and/or reference genes.

012

gene_results fullgene_results mlst_results pileup sorted_bam versions

srst2:

Short Read Sequence Typing for Bacterial Pathogens

Align reads to a reference genome using STAR

010101000

log_final log_out log_progress versions bam bam_sorted bam_sorted_aligned bam_transcript bam_unsorted fastq tab spl_junc_tab read_per_gene_tab junction sam wig bedgraph

star:

STAR is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Create index for STAR

0101

index versions

star:

STAR is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Get the minimal allowed index version from STAR

No input

index_version versions

star:

STAR is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Create a counts matrix for single-cell data using STARSolo, handling cell barcodes and UMI information.

counts log_final log_out log_progress summary versions

STITCH is an R program for reference-panel-free, read-aware genotype imputation from low-coverage sequencing. STITCH runs on a set of samples with sequencing reads in BAM format, as well as a list of positions to genotype, and outputs imputed genotypes in VCF format.

input rdata plots vcf bgen versions

Merges the annotation GTF file with the StringTie output GTF files

gtf versions

stringtie2:

Transcript assembly and quantification for RNA-Seq

Count reads that map to genomic features

counts summary versions

featurecounts:

featureCounts is a highly efficient general-purpose read summarization program that counts mapped reads for genomic features such as genes, exons, promoters, gene bodies, genomic bins and chromosomal locations. It can be used to count both RNA-seq and genomic DNA-seq reads.
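
To make the idea of "counting mapped reads for genomic features" concrete, here is a minimal Python sketch of interval-overlap counting. It is a toy illustration, not featureCounts itself or its algorithm: the function name `count_reads_per_feature`, the tuple layouts, and the rule that reads overlapping zero or several features go uncounted are all assumptions made for this example.

```python
from collections import Counter


def count_reads_per_feature(features, reads):
    """Count reads that overlap exactly one feature.

    features: list of (name, chrom, start, end), 0-based half-open (assumed layout)
    reads:    list of (chrom, start, end), 0-based half-open (assumed layout)
    Reads overlapping no feature or more than one feature are reported
    as unassigned; this rule is an assumption of the sketch.
    """
    counts = Counter({name: 0 for name, *_ in features})
    unassigned = 0
    for chrom, r_start, r_end in reads:
        # Two half-open intervals overlap when each starts before the other ends.
        hits = [
            name
            for name, f_chrom, f_start, f_end in features
            if f_chrom == chrom and r_start < f_end and f_start < r_end
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
        else:
            unassigned += 1
    return dict(counts), unassigned


# Hypothetical toy data: two overlapping genes and four aligned reads.
features = [("geneA", "chr1", 100, 200), ("geneB", "chr1", 180, 300)]
reads = [
    ("chr1", 110, 160),  # geneA only
    ("chr1", 190, 240),  # overlaps both genes, so ambiguous
    ("chr1", 250, 290),  # geneB only
    ("chr2", 10, 60),    # no feature on this chromosome
]
counts, unassigned = count_reads_per_feature(features, reads)
```

A real run works on BAM alignments and a GTF/SAF annotation and uses indexed lookups rather than this quadratic scan; the sketch only shows the counting rule.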

Simulate an SV VCF file based on a reference genome

parameters vcf bed fasta insertions versions

survivor:

Toolset for SV simulation, comparison and filtering

Constructs a graph from a reference and variant calls or a multiple sequence alignment file

graph versions

vg:

Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.

Simulate sequencing reads from a reference genome

fastq versions
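
As a rough picture of what simulating reads from a reference involves, the following Python sketch samples fixed-length, error-free substrings at random positions and formats them as FASTQ records. It is not the module's underlying tool: the helper names `simulate_reads` and `to_fastq`, the uniform sampling, the absence of sequencing errors, and the constant quality character are all assumptions chosen to keep the example small.

```python
import random


def simulate_reads(reference, read_length, n_reads, seed=0):
    """Sample error-free reads uniformly from a reference string.

    Returns a list of (read_name, start, sequence) tuples.
    Uniform sampling without errors is a simplifying assumption.
    """
    if read_length > len(reference):
        raise ValueError("read_length exceeds reference length")
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    reads = []
    for i in range(n_reads):
        start = rng.randrange(len(reference) - read_length + 1)
        reads.append((f"read{i}", start, reference[start:start + read_length]))
    return reads


def to_fastq(reads, quality_char="I"):
    """Format reads as FASTQ text with a constant placeholder quality."""
    lines = []
    for name, start, seq in reads:
        lines.append(f"@{name}_pos{start}")
        lines.append(seq)
        lines.append("+")
        lines.append(quality_char * len(seq))
    return "\n".join(lines) + "\n"


# Hypothetical toy reference; real references are read from a FASTA file.
reference = "ACGTTGCAAGGCTTACGATCGATCGGATCCA"
sim_reads = simulate_reads(reference, read_length=10, n_reads=5, seed=42)
fastq_text = to_fastq(sim_reads)
```

Realistic simulators also model substitutions, indels, paired ends and quality profiles, none of which are covered here.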

Create a new reference using healthy reference samples

npz versions

wisecondorx:

WIthin-SamplE COpy Number aberration DetectOR, including sex chromosomes

Fast lightweight accurate xenograft sorting

hash info versions

xengsort:

A fast xenograft read sorter based on space-efficient k-mer hashing

Builds a YARA index for a reference genome

index versions

yara:

Yara is an exact tool for aligning DNA sequencing reads to reference genomes.

Align reads to a reference genome using YARA

bam bai versions

yara:

Yara is an exact tool for aligning DNA sequencing reads to reference genomes.
