Available Modules

Modules are the building blocks of all DSL2 nf-core pipelines. If you would like to write your own module, you can find more information on the nf-core website.

  • vcf 57
  • structural variants 40
  • variants 27
  • variant calling 18
  • gatk4 11
  • sv 10
  • bed 9
  • filter 9
  • bam 8
  • annotation 8
  • wgs 7
  • neural network 7
  • machine learning 7
  • fasta 6
  • low frequency variant calling 6
  • variant 5
  • somatic 5
  • svtk 5
  • gridss 5
  • conversion 4
  • imputation 4
  • haplotype 4
  • low-coverage 4
  • glimpse 4
  • genotyping 4
  • feature 4
  • wxs 4
  • benchmark 4
  • indels 4
  • bedpe 4
  • cram 3
  • nanopore 3
  • graph 3
  • plink2 3
  • phasing 3
  • validation 3
  • json 3
  • structural 3
  • snps 3
  • wastewater 3
  • ligate 3
  • panel 3
  • benchmarking 3
  • structural_variants 3
  • small indels 3
  • somatic variants 3
  • survivor 3
  • merge 2
  • download 2
  • split 2
  • gfa 2
  • sentieon 2
  • annotate 2
  • germline 2
  • gff3 2
  • call 2
  • SV 2
  • pypgx 2
  • happy 2
  • microsatellite 2
  • normalization 2
  • comparison 2
  • observations 2
  • shapeit 2
  • interactions 2
  • regression 2
  • qualty 2
  • snpeff 2
  • lofreq 2
  • effect prediction 2
  • standardization 2
  • svdb 2
  • small variants 2
  • vg 2
  • multiallelic 2
  • samples 2
  • MSI 2
  • homoploymer 2
  • tab 2
  • deconvolution 2
  • genomics 1
  • genome 1
  • index 1
  • alignment 1
  • assembly 1
  • sort 1
  • database 1
  • statistics 1
  • pacbio 1
  • VCF 1
  • long reads 1
  • consensus 1
  • isoseq 1
  • build 1
  • gvcf 1
  • variation graph 1
  • bisulfite 1
  • long-read 1
  • table 1
  • tsv 1
  • metrics 1
  • depth 1
  • pangenome graph 1
  • DNA methylation 1
  • scWGBS 1
  • filtering 1
  • WGBS 1
  • bcf 1
  • transcriptome 1
  • pangenome 1
  • mitochondria 1
  • concatenate 1
  • query 1
  • detection 1
  • dna 1
  • hybrid capture sequencing 1
  • targeted sequencing 1
  • fusion 1
  • ranking 1
  • genmod 1
  • copy number alteration calling 1
  • DNA sequencing 1
  • preprocessing 1
  • indel 1
  • fam 1
  • amplicon sequencing 1
  • score 1
  • bim 1
  • combine 1
  • arriba 1
  • pileup 1
  • virulence 1
  • genetics 1
  • variation 1
  • Pharmacogenetics 1
  • varcal 1
  • pharmacogenetics 1
  • haplotypes 1
  • variant pruning 1
  • instability 1
  • RNA-Seq 1
  • intersection 1
  • gatk 1
  • norm 1
  • normalize 1
  • estimation 1
  • calling 1
  • phase 1
  • baf 1
  • standardize 1
  • construct 1
  • associations 1
  • Bayesian 1
  • structural-variants 1
  • probabilistic realignment 1
  • simulation 1
  • decompose 1
  • Assembly 1
  • hifi 1
  • dnamodelapply 1
  • cadd 1
  • graph projection to vcf 1
  • cytosure 1
  • structural variant 1
  • array_cgh 1
  • sompy 1
  • reference panel 1
  • installation 1
  • Staphylococcus aureus 1
  • decomposeblocksub 1
  • block substitutions 1
  • hwe 1
  • missingness 1
  • linkage equilibrium 1
  • pruning 1
  • check 1
  • cnnscorevariants 1
  • collectsvevidence 1
  • vqsr 1
  • variant quality score recalibration 1
  • lofreq/call 1
  • lofreq/filter 1
  • qualities 1
  • UShER 1
  • bootstrapping 1
  • cancer genome 1
  • somatic structural variations 1
  • mobile element insertions 1
  • variantfiltration 1
  • svcluster 1
  • svannotate 1
  • hmtnote 1
  • tama_collapse.py 1
  • gene model 1
  • TAMA 1
  • gvcftools 1
  • extract_variants 1
  • extractvariants 1
  • readcountssummary 1
  • getpileupsumaries 1
  • germlinevariantsites 1
  • germlinecnvcaller 1
  • germline contig ploidy 1
  • leftalignandtrimvariants 1
  • selectvariants 1
  • printsvevidence 1
  • jasminesv 1
  • jasmine 1
  • rare variants 1
  • applyvarcal 1
  • VQSR 1
  • svtk/baftest 1
  • baftest 1
  • countsvtypes 1
  • rdtest2vcf 1
  • rdtest 1
  • vcf2bed 1
  • short-read sequencing 1
  • detecting svs 1
  • variantcalling 1
  • genetic 1
  • indep pairwise 1
  • paragraph 1
  • graphs 1
  • deletions 1
  • insertions 1
  • tandem duplications 1
  • segment 1
  • rtg-tools 1
  • depth information 1
  • structural variation 1
  • duphold 1
  • cache 1

Rapid identification of Staphylococcus aureus agr locus type and agr operon variants

summary results_dir versions

Annotation and Ranking of Structural Variation

tsv unannotated_tsv vcf versions

annotsv:

Annotation and Ranking of Structural Variation

Install the AnnotSV annotations

No input

annotations versions

annotsv:

Annotation and Ranking of Structural Variation

Arriba is a command-line tool for the detection of gene fusions from RNA-Seq data.

meta bam meta2 fasta meta3 gtf meta4 blacklist meta5 known_fusions meta6 structural_variants meta7 tags meta8 protein_domains

meta versions fusions fusions_fail

Compresses VCF files

fasta versions

consensus:

Create consensus sequence by applying VCF variants to a reference fasta file.

Convert a BED file to a VCF file according to a YAML config

vcf versions

Computes cytosine methylation and callable SNV mutations, optionally in reference to a germline BAM to call somatic variants

vcf versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

CADD is a tool for scoring the deleteriousness of single nucleotide variants as well as insertion/deletion variants in the human genome.

tsv versions

Accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads.

report assembly contigs corrected_reads corrected_trimmed_reads metadata contig_position contig_info versions

DeepSomatic is an extension of the deep learning-based variant caller DeepVariant. It takes aligned reads (in BAM or CRAM format) from tumor and normal data, produces pileup image tensors from them, classifies each tensor using a convolutional neural network, and finally reports somatic variants in a standard VCF or gVCF file.

0123401010101

vcf vcf_tbi gvcf gvcf_tbi versions

(DEPRECATED - see main.nf) DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

012301010101

vcf vcf_tbi gvcf gvcf_tbi versions

Call variants from the examples produced by make_examples

01

call_variants_tfrecords versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

Transforms the input alignments to a format suitable for the deep neural network variant caller

012301010101

examples gvcf small_model_calls versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

01234010101

vcf vcf_index gvcf gvcf_index versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

012301010101

vcf vcf_tbi gvcf gvcf_tbi versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

01

report versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

Call structural variants

0123450101

bcf csi versions

delly:

Structural variant discovery by integrated paired-end and split-read analysis

Export assembly segment sequences in GFA 1.0 format to FASTA format

01

fasta versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Filter features in gzipped BED format

01

bed versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Filter features in gzipped GFF3 format

01

gff3 versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Split features in gzipped BED format

01

bed versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Split features in gzipped GFF3 format

01

gff3 versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

SV callers like lumpy look at split reads and pair distances to find structural variants. This tool is a fast way to add depth information to those calls, which can then be used as additional evidence when filtering variants; for example, we will be skeptical of deletion calls that do not have lower than average coverage compared to regions with similar GC content.

01234500

vcf versions

Dysgu calls structural variants (SVs) from mapped sequencing reads. It is designed for accurate and efficient detection of structural variations.

012012

vcf tbi versions

Perform phasing of genotyped data with or without a reference panel

012345

phased_variants versions

Ensembl Variant Effect Predictor (VEP). The cache downloading options are controlled through task.ext.args.

0123

cache versions

ensemblvep:

VEP determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.

Filter variants based on Ensembl Variant Effect Predictor (VEP) annotations.

010

output versions

ensemblvep:

VEP determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.

Ensembl Variant Effect Predictor (VEP). The output-file-format is controlled through task.ext.args.

0120000010

vcf tbi tab json report versions

ensemblvep:

VEP determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.

Bootstrap sample demixing by resampling each site based on a multinomial distribution of read depth across all sites, where the event probabilities were determined by the fraction of the total sample reads found at each site, followed by a secondary resampling at each site according to a multinomial distribution (that is, binomial when there was only one SNV at a site), where event probabilities were determined by the frequencies of each base at the site, and the number of trials is given by the sequencing depth.

012000

lineages summarized versions

freyja:

Freyja recovers relative lineage abundances from mixed SARS-CoV-2 samples and provides functionality to analyze lineage dynamics.
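The two-stage resampling described for the bootstrap module can be sketched in a few lines of Python. This is an illustrative sketch only, not Freyja's implementation: the function name `bootstrap_sample`, the nested-dictionary input layout, and the choice to use the resampled depth as the number of trials in the second stage are all assumptions made for the example.

```python
import random
from collections import Counter


def bootstrap_sample(site_base_counts, rng):
    """Resample per-site base counts in two multinomial stages.

    site_base_counts maps a site to a dict of base -> read count
    (a hypothetical layout chosen for this sketch).
    """
    sites = list(site_base_counts)
    depths = [sum(site_base_counts[s].values()) for s in sites]
    total_depth = sum(depths)

    # Stage 1: redistribute the total read depth across sites; each site's
    # probability is its share of all reads in the sample.
    resampled_depth = Counter(rng.choices(sites, weights=depths, k=total_depth))

    # Stage 2: at each site, redraw bases according to their observed
    # frequencies, with the resampled depth as the number of trials.
    resampled = {}
    for site in sites:
        k = resampled_depth[site]
        if k == 0:
            resampled[site] = {}
            continue
        bases = list(site_base_counts[site])
        weights = [site_base_counts[site][b] for b in bases]
        resampled[site] = dict(Counter(rng.choices(bases, weights=weights, k=k)))
    return resampled


counts = {100: {"A": 90, "G": 10}, 200: {"C": 50}, 300: {"T": 30, "C": 20}}
replicate = bootstrap_sample(counts, random.Random(0))
```

Each call yields one bootstrap replicate; demixing every replicate and summarizing the spread of the estimated abundances would give the confidence intervals. The sketch assumes the sample contains at least one read.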

specify the relative abundance of each known haplotype

01200

demix versions

freyja:

Freyja recovers relative lineage abundances from mixed SARS-CoV-2 samples and provides functionality to analyze lineage dynamics.

downloads new versions of the curated SARS-CoV-2 lineage file and barcodes

0

barcodes lineages_topology lineages_meta versions

freyja:

Freyja recovers relative lineage abundances from mixed SARS-CoV-2 samples and provides functionality to analyze lineage dynamics.

Call variants and collect sequencing depth information for them

010

variants versions

freyja:

Freyja recovers relative lineage abundances from mixed SARS-CoV-2 samples and provides functionality to analyze lineage dynamics.

Apply a score cutoff to filter variants based on a recalibration table. ApplyVQSR performs the second pass in a two-stage process called Variant Quality Score Recalibration (VQSR). Specifically, it applies filtering to the input variants based on the recalibration table produced in the first step by VariantRecalibrator and a target sensitivity value.

012345000

vcf tbi versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Apply a Convolutional Neural Net to filter annotated variants

0123400000

vcf tbi versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Gathers paired-end and split read evidence files for use in the GATK-SV pipeline. Output files are a file containing the location of and orientation of read pairs marked as discordant, and a file containing the clipping location of all soft clipped reads and the orientation of the clipping.

01234000

split_read_evidence split_read_evidence_index paired_end_evidence paired_end_evidence_index site_depths site_depths_index versions

gatk4:

Genome Analysis Toolkit (GATK4)

Calls copy-number variants in germline samples given their counts and the output of DetermineGermlineContigPloidy.

01234

cohortcalls cohortmodel casecalls versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Summarizes counts of reads that support reference, alternate and other alleles for given sites. Results can be used with CalculateContamination. Requires a common germline variant sites file, such as from gnomAD.

012301010100

table versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Left align and trim variants using GATK4 LeftAlignAndTrimVariants.

0123000

vcf tbi versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

WARNING - this tool is still experimental and shouldn't be used in a production setting. Gathers paired-end and split read evidence files for use in the GATK-SV pipeline. Output files are a file containing the location of and orientation of read pairs marked as discordant, and a file containing the clipping location of all soft clipped reads and the orientation of the clipping.

0120000

printed_evidence printed_evidence_index versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Select a subset of variants from a VCF file

0123

vcf tbi versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Adds predicted functional consequence, gene overlap, and noncoding element overlap annotations to SV VCF from GATK-SV pipeline. Input files are an SV VCF, a GTF file containing primary or canonical transcripts, and a BED file containing noncoding elements. Output file is an annotated SV VCF.

0123000

annotated_vcf index versions

gatk4:

Genome Analysis Toolkit (GATK4)

Clusters structural variants based on coordinates, event type, and supporting algorithms

0120000

clustered_vcf clustered_vcf_index versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Filter variants

01201010101

vcf tbi versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Extract fields from a VCF file to a tab-delimited table

012345010101

table versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Score the variants of a vcf based on their annotation

0120

vcf versions

genmod:

Annotate genetic inheritance models in variant files

Concatenates imputation chunks in a single VCF/BCF file ligating phased information.

012

merged_variants versions

glimpse:

GLIMPSE is a phasing and imputation method for large-scale low-coverage sequencing studies.

main GLIMPSE algorithm, performs phasing and imputation refining genotype likelihoods

012345678

phased_variants versions

glimpse:

GLIMPSE is a phasing and imputation method for large-scale low-coverage sequencing studies.

Ligation of multiple phased BCF/VCF files into a single whole chromosome file. GLIMPSE2 is run in chunks that are ligated into chromosome-wide files maintaining the phasing.

012

merged_variants versions

glimpse2:

GLIMPSE2 is a phasing and imputation method for large-scale low-coverage sequencing studies.

Tool for imputation and phasing from vcf file or directly from bam files.

0123456789012

phased_variants stats_coverage versions

glimpse2:

GLIMPSE2 is a phasing and imputation method for large-scale low-coverage sequencing studies.

Tools for population-scale genotyping using pangenome graphs.

01201010

vcf tbi versions

graphtyper:

A graph-based variant caller capable of genotyping population-scale short read data sets while incorporating previously discovered variants.

Tools for population-scale genotyping using pangenome graphs.

01

vcf tbi versions

graphtyper:

A graph-based variant caller capable of genotyping population-scale short read data sets while incorporating previously discovered variants.

GRIDSS is a modular software suite containing tools useful for the detection of genomic rearrangements.

0123010101

bedpe bed versions

gridss:

GRIDSS: the Genomic Rearrangement IDentification Software Suite

GRIDSS is a modular software suite containing tools useful for the detection of genomic rearrangements.

01010101

vcf versions

gridss:

GRIDSS: the Genomic Rearrangement IDentification Software Suite

GRIDSS is a modular software suite containing tools useful for the detection of genomic rearrangements.

0123010101

bedpe bed versions

gridss:

GRIDSS: the Genomic Rearrangement IDentification Software Suite

GRIDSS is a modular software suite containing tools useful for the detection of genomic rearrangements.

0101

high_conf_sv all_sv versions

gridss:

GRIDSS: the Genomic Rearrangement IDentification Software Suite

GRIDSS is a modular software suite containing tools useful for the detection of genomic rearrangements.

0101

high_conf_sv all_sv versions

gridss:

GRIDSS: the Genomic Rearrangement IDentification Software Suite

Collapse redundant transcript models in Iso-Seq data.

010

bed bed_trans_reads local_density_error polya read strand_check trans_report versions varcov variants

tama_collapse.py:

Collapse similar gene model

Removes all non-variant blocks from a gVCF file to produce a smaller variant-only VCF file.

01

vcf versions

gvcftools:

gvcftools is a package of small utilities for creating and analyzing gVCF files

Hap.py is a tool to compare diploid genotypes at haplotype level. Rather than comparing VCF records row by row, hap.py will generate and match alternate sequences in a superlocus. A superlocus is a small region of the genome (sized between 1 and around 1000 bp) that contains one or more variants.

012340101010101

summary_csv roc_all_csv roc_indel_locations_csv roc_indel_locations_pass_csv roc_snp_locations_csv roc_snp_locations_pass_csv extended_csv runinfo metrics_json vcf tbi versions

happy:

Haplotype VCF comparison tools

Hap.py is a tool to compare diploid genotypes at haplotype level. som.py, a part of hap.py, compares somatic variants.

012340101010101

features metrics stats versions

sompy:

Haplotype VCF comparison tools somatic variant comparison

Human mitochondrial variants annotation using HmtVar. Contains .plk file with annotation, so can be run offline

01

vcf versions

hmtnote:

Human mitochondrial variants annotation using HmtVar.

This tool takes a background VCF, such as gnomAD, that has full genome (though in some cases, users will instead want whole exome) coverage and uses that as an expectation of variants.

012012

tsv versions

htsnimtools:

useful command-line tools written to show-case hts-nim

Call variants from a BAM file using iVar

010000

tsv mpileup versions

ivar:

iVar - a computational package that contains functions broadly useful for viral amplicon-based sequencing.

Jointly Accurate Sv Merging with Intersample Network Edges

012301010

vcf versions

Lofreq subcommand to insert base and indel alignment qualities

010

bam versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Lofreq subcommand to call low frequency variants from alignments

0120

vcf versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

It predicts variants using multiple processors

01230101

vcf tbi versions

lofreq:

Lofreq is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data. Its call-parallel programme predicts variants using multiple processors

Lofreq subcommand to remove variants with low coverage or strand bias potential

01

vcf versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Inserts indel qualities in a BAM file

0101

bam versions

lofreq:

Lofreq is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data. Its indelqual programme inserts indel qualities in a BAM file

Lofreq subcommand to call low frequency variants from alignments when tumor-normal paired samples are available

0123450101

vcf versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Lofreq subcommand to call low frequency variants from alignments when tumor-normal paired samples are available

0101

bam versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. This script reformats inversions into single inverted sequence junctions which was the format used in Manta versions <= 1.4.0.

0101

vcf tbi versions

manta:

Structural variant and indel caller for mapped sequencing data

Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs.

0123401010

candidate_small_indels_vcf candidate_small_indels_vcf_tbi candidate_sv_vcf candidate_sv_vcf_tbi diploid_sv_vcf diploid_sv_vcf_tbi versions

manta:

Structural variant and indel caller for mapped sequencing data

Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs.

012345601010

candidate_small_indels_vcf candidate_small_indels_vcf_tbi candidate_sv_vcf candidate_sv_vcf_tbi diploid_sv_vcf diploid_sv_vcf_tbi somatic_sv_vcf somatic_sv_vcf_tbi versions

manta:

Structural variant and indel caller for mapped sequencing data

Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs.

0123401010

candidate_small_indels_vcf candidate_small_indels_vcf_tbi candidate_sv_vcf candidate_sv_vcf_tbi tumor_sv_vcf tumor_sv_vcf_tbi versions

manta:

Structural variant and indel caller for mapped sequencing data

Evaluate microsatellite instability (MSI) using paired tumor-normal sequencing data

0123456

output output_dis output_germline output_somatic versions

msisensor:

MSIsensor is a C++ program to detect replication slippage variants at microsatellite regions, and differentiate them as somatic or germline.

Scan a reference genome to get microsatellite & homopolymer information

01

txt versions

msisensor:

MSIsensor is a C++ program to detect replication slippage variants at microsatellite regions, and differentiate them as somatic or germline.

Computes tier-based cutoffs from a sample-specific error model which is generated by muse/call and reports the finalized variants

01012

vcf tbi versions

MuSE:

Somatic point mutation caller based on Markov substitution model for molecular evolution

Parse all the supporting reads of putative somatic SVs using nanomonsv. After successful completion, you will find supporting reads stratified by deletions, insertions, and rearrangements. A precursor to "nanomonsv get"

012

insertions insertions_index deletions deletions_index rearrangements rearrangements_index bp_info bp_info_index versions

nanomonsv:

nanomonsv is a software for detecting somatic structural variations from paired (tumor and matched control) cancer genome sequence data.

Determines the depth in a BAM/CRAM file

0120101

depth binned_depth versions

paragraph:

Graph realignment tools for structural variants

Genotype structural variants using paragraph and grmpy

0123450101

vcf json versions

paragraph:

Graph realignment tools for structural variants

Convert a VCF file to a JSON graph

0101

graph versions

paragraph:

Graph realignment tools for structural variants

Pindel can detect breakpoints of large deletions, medium-sized insertions, inversions, tandem duplications and other structural variants at single-base resolution from next-gen sequence data

012000

bp cem del dd int_final inv li rp si td versions

pindel:

Pindel can detect breakpoints of large deletions, medium-sized insertions, inversions, tandem duplications and other structural variants at single-base resolution from next-gen sequence data

Platypus is a tool for efficiently and accurately calling genetic variants from next-generation DNA sequencing data

01234000

vcf tbi log version

Epistasis in PLINK, analyzing how the effects of one gene depend on the presence of others.

0123010101

epi episummary log nosex versions

plink:

Whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.

Fast Epistasis in PLINK, analyzing how the effects of one gene depend on the presence of others.

0123010101

fepi fepisummary flog fnosex versions

plink:

Whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.

Produce a pruned subset of markers that are in approximate linkage equilibrium with each other. Pairs of variants in the current window with squared correlation greater than the threshold are noted and variants are greedily pruned from the window until no such pairs remain.

0123000

prunein pruneout versions

plink:

Whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.
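The greedy window-based pruning described for the LD pruning module can be illustrated with a small Python sketch. It is not PLINK's implementation: the function names, the genotype layout (one list of 0/1/2 allele counts per variant), and the rule for choosing which variant to drop (the one involved in the most over-threshold pairs, lowest index on ties) are assumptions made for the example.

```python
from itertools import combinations


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic variant carries no correlation signal
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def prune_window(genotypes, indices, threshold):
    """Greedily drop variants until no pair in the window exceeds the threshold."""
    kept = set(indices)
    while True:
        pairs = [(i, j) for i, j in combinations(sorted(kept), 2)
                 if r_squared(genotypes[i], genotypes[j]) > threshold]
        if not pairs:
            return kept
        counts = {}
        for i, j in pairs:
            counts[i] = counts.get(i, 0) + 1
            counts[j] = counts.get(j, 0) + 1
        # Assumed tie-breaking rule: drop the variant in the most noted pairs.
        kept.discard(max(sorted(counts), key=counts.get))


def ld_prune(genotypes, window, step, threshold):
    """Slide a window over the variants and return (prune_in, prune_out) indices."""
    n = len(genotypes)
    kept = set(range(n))
    for start in range(0, n, step):
        in_window = [i for i in range(start, min(start + window, n)) if i in kept]
        kept -= set(in_window) - prune_window(genotypes, in_window, threshold)
    return sorted(kept), sorted(set(range(n)) - kept)


genotypes = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],  # identical to variant 0
    [2, 1, 0, 2, 1, 0],  # perfectly anti-correlated with variant 0
    [0, 0, 1, 2, 2, 1],  # uncorrelated with the others
]
prune_in, prune_out = ld_prune(genotypes, window=4, step=2, threshold=0.5)
```

The two returned lists play the role of the module's prunein and pruneout outputs: variants retained in approximate linkage equilibrium, and variants removed.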

LD analysis in PLINK examines genetic variant associations within populations

0123010101

ld log nosex versions

plink:

Whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.

Filters plink bfiles or pfiles with filters such as maf or var

0123

bed bim fam pgen pvar psam versions

plink2:

Whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner

Filters plink bfiles or pfiles with maf filters

01230

bed bim fam pgen pvar psam versions

plink2:

Whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner

Produce a pruned set of variants in approximate linkage equilibrium

0123000

prune_in prune_out versions

plink2:

Whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner

Run PureCN workflow to normalize, segment and determine purity and ploidy

01200

pdf local_optima_pdf seg genes_csv amplification_pvalues_csv vcf_gz variants_csv loh_csv chr_pdf segmentation_pdf multisample_seg versions

purecn:

Copy number calling and SNV classification using targeted short read sequencing

Call SNVs/indels from BAM files for all target genes.

0120100

vcf tbi versions

pypgx:

A Python package for pharmacogenomics research

PyPGx pharmacogenomics genotyping pipeline for NGS data.

012345010

results cnv_calls consolidated_variants versions

pypgx:

A Python package for pharmacogenomics research

The VCFeval tool of RTG tools. It is used to evaluate called variants for agreement with a baseline variant set

012345601

tp_vcf tp_tbi fn_vcf fn_tbi fp_vcf fp_tbi baseline_vcf baseline_tbi snp_roc non_snp_roc weighted_roc summary phasing versions

rtgtools:

RealTimeGenomics Tools -- Utilities for accurate VCF comparison and manipulation

Apply a score cutoff to filter variants based on a recalibration table. Sentieon's ApplyVarCal performs the second pass in a two-stage process called Variant Quality Score Recalibration (VQSR). Specifically, it applies filtering to the input variants based on the recalibration table produced in the previous step, VarCal, and a target sensitivity value. https://support.sentieon.com/manual/usages/general/#applyvarcal-algorithm

0123450101

vcf tbi versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

modifies the input VCF file by adding the MLrejected FILTER to the variants

012010101

vcf index versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Ligate multiple phased BCF/VCF files into a single whole chromosome file. Typically run to ligate multiple chunks of phased common variants.

012

merged_variants versions

shapeit5:

Fast and accurate method for estimation of haplotypes (phasing)

Tool to phase rare variants onto a scaffold of common variants (output of phase_common / ligate). Requires the AVX2 CPU feature.

01234012301

phased_variant versions

shapeit5:

Fast and accurate method for estimation of haplotypes (phasing)

smoove simplifies and speeds calling and genotyping SVs for short reads. It also improves specificity by removing many spurious alignment signals that are indicative of low-level noise and often contribute to spurious calls. Developed by Brent Pedersen.

01230101

vcf versions

smoove:

structural variant calling and genotyping with existing tools, but, smoothly

Genetic variant annotation and functional effect prediction toolbox

012

cache versions

snpeff:

SnpEff is a variant annotation and effect prediction tool. It annotates and predicts the effects of genetic variants on genes and proteins (such as amino acid changes).

Genetic variant annotation and functional effect prediction toolbox

01001

vcf report summary_html genes_txt versions

snpeff:

SnpEff is a variant annotation and effect prediction tool. It annotates and predicts the effects of genetic variants on genes and proteins (such as amino acid changes).

Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation

0123400

vcf vcf_tbi genome_vcf genome_vcf_tbi versions

strelka:

Strelka calls somatic and germline small variants from mapped sequencing reads

Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation in small cohorts and somatic variation in tumor/normal sample pairs

01234567800

vcf_indels vcf_indels_tbi vcf_snvs vcf_snvs_tbi versions

strelka:

Strelka calls somatic and germline small variants from mapped sequencing reads

Converts a bedpe file to a VCF file (beta version)

01

vcf versions

survivor:

Toolset for SV simulation, comparison and filtering

Filter a vcf file based on size and/or regions to ignore

0120000

vcf versions

survivor:

Toolset for SV simulation, comparison and filtering

Compare or merge VCF files to generate a consensus or multi sample VCF files.

01000000

vcf versions

survivor:

Toolset for SV simulation, comparison and filtering

Simulate an SV VCF file based on a reference genome

01010100

parameters vcf bed fasta insertions versions

survivor:

Toolset for SV simulation, comparison and filtering

Report multiple stats over a VCF file

01000

stats versions

survivor:

Toolset for SV simulation, comparison and filtering

SvABA is an efficient and accurate method for detecting SVs from short-read sequencing data using genome-wide local assembly with low memory and computing requirements

01234010101010101

sv indel germ_indel germ_sv som_indel som_sv unfiltered_sv unfiltered_indel unfiltered_germ_indel unfiltered_germ_sv unfiltered_som_indel unfiltered_som_sv raw_calls discordants log versions

SVbenchmark compares a set of "test" structural variants in VCF format to a known truth set (also in VCF format) and outputs estimates of sensitivity and specificity.

0123450101

fns fps distances log report versions

svanalyzer:

SVanalyzer: tools for the analysis of structural variation in genomes

Build a structural variant database

010

db versions

svdb:

structural variant database software

The merge module merges structural variants within one or more vcf files.

0100

vcf tbi csi versions

svdb:

structural variant database software

Query a structural variant database, using a vcf file as query

01000000

vcf versions

svdb:

structural variant database software

Performs tests on BAF files

01234

metrics versions

svtk:

Utilities for consolidating, filtering, resolving, and annotating structural variants.

Count the instances of each SVTYPE observed in each sample in a VCF.

01

counts versions

svtk:

Utilities for consolidating, filtering, resolving, and annotating structural variants.

Convert an RdTest-formatted bed to the standard VCF format.

0120

vcf tbi versions

svtk:

Utilities for consolidating, filtering, resolving, and annotating structural variants.

Convert SV calls to a standardized format.

0101

vcf versions

svtk:

Utilities for consolidating, filtering, resolving, and annotating structural variants.

Converts VCFs containing structural variants to BED format

012

bed versions

svtk:

Utilities for consolidating, filtering, resolving, and annotating structural variants.

Convert a VCF file to a BEDPE file.

01

bedpe versions

svtools:

Tools for processing and analyzing structural variants

SVTyper performs breakpoint genotyping of structural variants (SVs) using whole genome sequencing data

01230101

json gt_vcf bam versions

svtyper:

Compute genotype of structural variants based on breakpoint depth

SVTyper-sso computes structural variant (SV) genotypes based on breakpoint depth on a SINGLE sample

012301

gt_vcf json versions

svtyper:

Bayesian genotyper for structural variants

A tool to standardize VCF files from structural variant callers

0123

vcf tbi versions

Identify chromosomal rearrangements.

0120101

vcf ploidy versions

sv:

Search for structural variants.

Given baseline and comparison sets of variants, calculate the recall/precision/f-measure

0123450101

fn_vcf fn_tbi fp_vcf fp_tbi tp_base_vcf tp_base_tbi tp_comp_vcf tp_comp_tbi summary versions

truvari:

Structural variant comparison tool for VCFs
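The recall, precision and f-measure mentioned for the benchmarking module follow the standard definitions, sketched below in Python. The function name and the mapping of the four counts onto the module's tp_base, tp_comp, fn and fp outputs are assumptions for illustration; the counts would be the number of records in each of those VCFs.

```python
def benchmark_metrics(tp_base, tp_comp, fn, fp):
    """Compute recall, precision and F1 from benchmark match counts.

    tp_base: baseline variants matched by a comparison variant
    tp_comp: comparison variants matched by a baseline variant
    fn:      baseline variants with no match
    fp:      comparison variants with no match
    """
    recall = tp_base / (tp_base + fn) if (tp_base + fn) else 0.0
    precision = tp_comp / (tp_comp + fp) if (tp_comp + fp) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


metrics = benchmark_metrics(tp_base=90, tp_comp=90, fn=10, fp=30)
```

Recall is measured against the baseline set and precision against the comparison set, which is why true positives are counted separately on each side.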

Over multiple vcfs, calculate their intersection/consistency.

01

consistency versions

truvari:

Structural variant comparison tool for VCFs

Normalization of SVs into disjoint genomic regions

vcf versions

truvari:

Structural variant comparison tool for VCFs

Call variants for a given scenario specified with the varlociraptor calling grammar, preprocessed by varlociraptor preprocessing

bcf_gz vcf_gz bcf vcf versions

varlociraptor:

Flexible, uncertainty-aware variant calling with parameter-free filtration via FDR control.

To evaluate candidate indels and structural variants, Varlociraptor needs to know certain properties of the underlying sequencing experiment in combination with the read aligner used.

alignment_properties_json versions

varlociraptor:

Flexible, uncertainty-aware variant calling with parameter-free filtration via FDR control.

Obtains per-sample observations for the actual calling process with varlociraptor calls

bcf_gz vcf_gz bcf vcf versions

varlociraptor:

Flexible, uncertainty-aware variant calling with parameter-free filtration via FDR control.

Convert VCF with structural variations to CytoSure format

cgh versions

Constructs a graph from a reference and variant calls or a multiple sequence alignment file

graph versions

vg:

Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.

Deconstruct snarls present in a variation graph in GFA format to variants in VCF format

vcf versions

vg:

Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.

write your description here

xg vg_index versions

vg:

Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.

Decomposes multiallelic variants into biallelic variants in a VCF file.

vcf versions

vt:

A tool set for short variant discovery in genetic sequence data
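
As a rough picture of what decomposing a multiallelic record means, the sketch below splits a site-level VCF line into one line per ALT allele. It is a simplified illustration rather than vt's algorithm: the function name `decompose_site` is hypothetical, and the example assumes a record with no genotype columns and no per-allele INFO values, both of which a real tool has to rewrite.

```python
def decompose_site(record_line):
    """Split one tab-separated VCF record into one record per ALT allele.

    Assumption: INFO and any later columns are copied unchanged, so this is
    only valid for records without per-allele annotations or genotypes.
    """
    fields = record_line.rstrip("\n").split("\t")
    biallelic = []
    for alt in fields[4].split(","):
        new_fields = list(fields)
        new_fields[4] = alt  # column 5 (index 4) is ALT
        biallelic.append("\t".join(new_fields))
    return biallelic


# Hypothetical multiallelic site, for illustration only.
site = "chr1\t100\t.\tA\tC,T\t.\tPASS\t."
split_records = decompose_site(site)
```

Each output line keeps the original position and REF allele and carries exactly one ALT allele.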

Decomposes biallelic block substitutions into its constituent SNPs.

vcf versions

vt:

A tool set for short variant discovery in genetic sequence data
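
A block substitution has REF and ALT alleles of equal length that differ at one or more positions. The sketch below shows the basic idea of breaking such a record into its constituent SNPs. The function name `decompose_blocksub` and the example alleles are hypothetical, and the sketch ignores genotypes, phasing, and INFO fields.

```python
def decompose_blocksub(pos, ref, alt):
    """Return (position, ref_base, alt_base) for each mismatching base.

    Assumption: REF and ALT have equal length, so the record is a pure
    block substitution rather than an indel or complex variant.
    """
    if len(ref) != len(alt):
        raise ValueError("block substitutions need REF and ALT of equal length")
    return [
        (pos + offset, ref_base, alt_base)
        for offset, (ref_base, alt_base) in enumerate(zip(ref, alt))
        if ref_base != alt_base
    ]


# Hypothetical record: ACG > TCA at position 100.
snps = decompose_blocksub(100, "ACG", "TCA")
```

In this example the middle base matches between REF and ALT, so only the first and third positions yield SNPs.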

Normalizes variants in a VCF file.

vcf fai versions

vt:

A tool set for short variant discovery in genetic sequence data

A large variant benchmarking tool analogous to hap.py for small variants.

report bench_vcf bench_vcf_tbi versions
