Available Modules

Modules are the building blocks of all DSL2 nf-core pipelines. If you would like to write your own module, you can find more information on the nf-core website.


The script removes features based on a kill list. By default it looks at the feature's ID: if a feature has an ID (case-insensitive) that appears in the kill list, it is removed. Warning: removing a level1 or level2 feature automatically removes all linked subfeatures, and removing all children of a feature automatically removes that feature too.

Outputs: gff, versions

agat:

Another Gff Analysis Toolkit (AGAT). Suite of tools to handle gene annotations in any GTF/GFF format.
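
The kill-list behaviour described for this module can be illustrated with a minimal Python sketch. It is not AGAT's implementation: the `Feature` class, the `apply_kill_list` function and the flat parent/child model are hypothetical names introduced only for illustration. It assumes each feature has a unique ID and at most one parent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feature:
    """Hypothetical in-memory feature: an ID and the ID of its parent (None at top level)."""
    id: str
    parent: Optional[str] = None


def apply_kill_list(features, kill_list):
    """Return the features that survive the kill list (IDs compared case-insensitively)."""
    kill = {k.lower() for k in kill_list}
    known_ids = {f.id for f in features}

    # Map each parent ID to the IDs of its direct children.
    children = {}
    for f in features:
        if f.parent is not None:
            children.setdefault(f.parent, []).append(f.id)

    removed = set()

    def remove_with_descendants(fid):
        # Removing a feature also removes every linked subfeature.
        if fid in removed:
            return
        removed.add(fid)
        for child in children.get(fid, []):
            remove_with_descendants(child)

    for f in features:
        if f.id.lower() in kill:
            remove_with_descendants(f.id)

    # A feature whose children have all been removed is removed too.
    # Repeat until nothing changes so the rule propagates upwards.
    changed = True
    while changed:
        changed = False
        for fid, kids in children.items():
            if fid in known_ids and fid not in removed and all(k in removed for k in kids):
                removed.add(fid)
                changed = True

    return [f for f in features if f.id not in removed]
```

In this sketch, killing the only exon of a transcript also removes the transcript, and then the gene if that transcript was its only child.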

Run the alignment/variant-call/consensus logic of the artic pipeline

Outputs: results, bam, bai, bam_trimmed, bai_trimmed, bam_primertrimmed, bai_primertrimmed, fasta, vcf, tbi, json, versions

artic:

ARTIC pipeline - a bioinformatics pipeline for working with virus sequencing data sequenced with nanopore

Generate a VCF file from a BAM file using various calling methods

Outputs: vcf, versions

atlas:

ATLAS, a suite of methods to accurately genotype and estimate genetic diversity

This command replaces the former bcftools view caller. Some of the original functionality has been temporarily lost in the process of transition under htslib, but will be added back on popular demand. The original calling model can be invoked with the -c option.

Outputs: vcf, tbi, csi, versions

view:

View, subset and filter VCF or BCF files by position and filtering expression. Convert between VCF and BCF

Concatenate VCF files

Outputs: vcf, tbi, csi, versions

concat:

Concatenate VCF files.

Creates a consensus sequence by applying VCF variants to a reference FASTA file

Outputs: fasta, versions

consensus:

Create consensus sequence by applying VCF variants to a reference fasta file.
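
As a rough illustration of what "applying VCF variants to a reference" means, here is a minimal Python sketch. It is not how bcftools implements `consensus`: the function name `apply_variants` and the `(pos, ref, alt)` tuple representation are assumptions made for this example, and it handles only a single sequence with non-overlapping variants and one ALT allele each.

```python
def apply_variants(reference, variants):
    """Apply (pos, ref, alt) variants to a reference string.

    pos is 1-based, as in VCF. Variants must not overlap.
    """
    pieces = []
    cursor = 0  # 0-based index of the next reference base not yet copied
    for pos, ref, alt in sorted(variants):
        start = pos - 1
        if start < cursor:
            raise ValueError(f"variant at {pos} overlaps a previous variant")
        if reference[start:start + len(ref)] != ref:
            raise ValueError(f"REF allele does not match reference at {pos}")
        pieces.append(reference[cursor:start])  # unchanged bases before the variant
        pieces.append(alt)                      # substitute ALT for REF
        cursor = start + len(ref)
    pieces.append(reference[cursor:])           # remaining tail of the reference
    return "".join(pieces)
```

SNPs, deletions and insertions are all handled by the same substitution, because each replaces the REF span with the ALT string.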

Converts certain output formats to VCF

Outputs: vcf_gz, vcf, bcf_gz, bcf, hap, legend, samples, tbi, csi, versions

bcftools:

BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed. Most commands accept VCF, bgzipped VCF and BCF with filetype detected automatically even when streaming from a pipe. Indexed VCF and BCF will work in all situations. Un-indexed VCF and BCF and streams will work in most, but not all situations.

bcftools Haplotype-aware consequence caller

Outputs: vcf, tbi, csi, versions

csq:

Haplotype-aware consequence caller

Filters VCF files

Outputs: vcf, tbi, csi, versions

filter:

Apply fixed-threshold filters to VCF files.

Indexes VCF files

Outputs: csi, tbi, versions

bcftools:

BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed. Most commands accept VCF, bgzipped VCF and BCF with filetype detected automatically even when streaming from a pipe. Indexed VCF and BCF will work in all situations. Un-indexed VCF and BCF and streams will work in most, but not all situations.

Apply set operations to VCF files

Outputs: results, versions

isec:

Computes intersections, unions and complements of VCF files.
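
The set operations named here can be pictured with a small Python sketch. It only illustrates the idea and is not bcftools' matching logic: the helper name `variant_keys` is hypothetical, and treating each `(CHROM, POS, REF, ALT)` combination as one set element is an assumption of this example.

```python
def variant_keys(vcf_lines):
    """Collect one (chrom, pos, ref, alt) key per ALT allele from VCF text lines."""
    keys = set()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        for allele in alt.split(","):
            keys.add((chrom, int(pos), ref, allele))
    return keys


def compare(vcf_a, vcf_b):
    """Return intersection, union and complement (A minus B) of two call sets."""
    a, b = variant_keys(vcf_a), variant_keys(vcf_b)
    return {"intersection": a & b, "union": a | b, "complement": a - b}
```

With two small call sets, `compare` returns the records shared by both files, the records present in either, and the records found only in the first.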

Merge VCF files

Outputs: vcf, index, versions

merge:

Merge VCF files.

Compresses VCF files

012010

vcf tbi stats mpileup versions

mpileup:

Generates genotype likelihoods at each genomic position with coverage.

Normalize VCF file

01201

vcf tbi csi versions

norm:

Normalize VCF files.

Adds imputation information metrics to the INFO field based on selected FORMAT tags. Only the IMPUTE2 INFO metric from FORMAT/GP tags is currently available.

01200

vcf tbi csi versions

bcftools:

BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed. Most commands accept VCF, bgzipped VCF and BCF with filetype detected automatically even when streaming from a pipe. Indexed VCF and BCF will work in all situations. Un-indexed VCF and BCF and streams will work in most, but not all situations.

bcftools plugin impute-info:

Bcftools plugins are tools that can be used with bcftools to manipulate variant calls in Variant Call Format (VCF) and BCF. The impute-info plugin adds imputation information metrics to the INFO field based on selected FORMAT tags. Only the IMPUTE2 INFO metric from FORMAT/GP tags is currently available

Sets genotypes according to the specified criteria and filtering expressions. For example, missing genotypes can be set to ref, but much more than that.

0120000

vcf tbi csi versions

bcftools:

BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed. Most commands accept VCF, bgzipped VCF and BCF with filetype detected automatically even when streaming from a pipe. Indexed VCF and BCF will work in all situations. Un-indexed VCF and BCF and streams will work in most, but not all situations.

bcftools plugin setGT:

Bcftools plugins are tools that can be used with bcftools to manipulate variant calls in Variant Call Format (VCF) and BCF. The setGT plugin sets genotypes according to the specified criteria and filtering expressions. For example, missing genotypes can be set to ref, but much more than that.

Extracts fields from VCF or BCF files and outputs them in user-defined format.

012000

output versions

query:

Extracts fields from VCF or BCF files and outputs them in user-defined format.

Sorts VCF files

01

vcf tbi csi versions

sort:

Sort VCF files by coordinates.

Generates stats from VCF files

0120101010101

stats versions

stats:

Parses VCF or BCF and produces text file stats which is suitable for machine processing and can be plotted using plot-vcfstats.

View, subset and filter VCF or BCF files by position and filtering expression. Convert between VCF and BCF

012000

vcf tbi csi versions

view:

View, subset and filter VCF or BCF files by position and filtering expression. Convert between VCF and BCF

Converts a bam file to a bed12 file.

01

bed versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

For each feature in A, finds the closest feature (upstream or downstream) in B.

0120

output versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

Returns all intervals in a genome that are not covered by at least one interval in the input BED/GFF/VCF file.

010

bed versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.
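The complement operation described above can be sketched for a single chromosome as follows. The sketch assumes 0-based, half-open (start, end) intervals and a known chromosome length; `complement` is a hypothetical function for illustration, not the bedtools implementation.

```python
def complement(intervals: list[tuple[int, int]], chrom_length: int) -> list[tuple[int, int]]:
    """Return the parts of [0, chrom_length) not covered by any interval."""
    gaps = []
    cursor = 0  # first position not yet known to be covered
    for start, end in sorted(intervals):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < chrom_length:
        gaps.append((cursor, chrom_length))
    return gaps


if __name__ == "__main__":
    print(complement([(10, 20), (15, 30), (50, 60)], chrom_length=100))
```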

Computes histograms (default), per-base reports (-d) and BEDGRAPH (-bg) summaries of feature coverage (e.g., aligned sequences) for a given genome.

012000

genomecov versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

extract sequences in a FASTA file based on intervals defined in a feature file.

010

fasta versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

Groups features in a BED file by given column(s) and computes summary statistics for each group to another column.

010

bed versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

Allows one to screen for overlaps between two sets of genomic features.

01201

intersect versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

Calculates the Jaccard statistic between two feature files.

01201

tsv versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.
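The Jaccard statistic is the size of the intersection divided by the size of the union. The sketch below measures both in base pairs on one chromosome, assuming 0-based, half-open intervals; it enumerates bases in a set, which is only sensible for small toy inputs, and the function names are hypothetical.

```python
def covered_bases(intervals: list[tuple[int, int]]) -> set[int]:
    """Collect every base position covered by at least one interval."""
    bases: set[int] = set()
    for start, end in intervals:
        bases.update(range(start, end))
    return bases


def jaccard(a: list[tuple[int, int]], b: list[tuple[int, int]]) -> float:
    """Ratio of shared bases to bases covered by either interval set."""
    bases_a, bases_b = covered_bases(a), covered_bases(b)
    union = bases_a | bases_b
    if not union:
        return 0.0
    return len(bases_a & bases_b) / len(union)


if __name__ == "__main__":
    print(jaccard([(0, 10)], [(5, 15)]))
```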

Makes adjacent or sliding windows across a genome or BED file.

01

bed versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.
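The difference between adjacent and sliding windows is only the step size. A minimal sketch, assuming 0-based, half-open coordinates and a hypothetical `make_windows` helper; windows that run past the end of the sequence are truncated here, which is a design choice of this example.

```python
from typing import Optional


def make_windows(length: int, window: int, step: Optional[int] = None) -> list[tuple[int, int]]:
    """Tile a sequence of the given length with windows.

    With step omitted the windows are adjacent; a step smaller than the
    window size produces overlapping (sliding) windows.
    """
    step = window if step is None else step
    return [(start, min(start + window, length)) for start in range(0, length, step)]


if __name__ == "__main__":
    print(make_windows(10, 4))
    print(make_windows(10, 4, step=2))
```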

Allows one to screen for overlaps between two sets of genomic features.

01201

mapped versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

masks sequences in a FASTA file based on intervals defined in a feature file.

010

fasta versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

combines overlapping or “book-ended” features in an interval file into a single feature which spans all of the combined features.

01

bed versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.
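Merging overlapping or book-ended features amounts to a single pass over sorted intervals. The sketch below assumes 0-based, half-open intervals on one chromosome, so two features are book-ended when one ends exactly where the next starts; `merge_intervals` is a hypothetical name.

```python
def merge_intervals(intervals: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Combine overlapping or book-ended intervals into spanning intervals."""
    merged: list[tuple[int, int]] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous feature: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    print(merge_intervals([(1, 5), (5, 8), (10, 12), (11, 15)]))
```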

Identifies common intervals among multiple (and subsets thereof) sorted BED/GFF/VCF files.

010

bed versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

Profiles the nucleotide content of intervals in a fasta file.

012

bed versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

Shifts each feature by specific number of bases

0101

bed versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

Adds a specified number of bases in each direction (unique values may be specified for either -l or -r)

010

bed versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.
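Extending a feature in each direction is simple arithmetic once the chromosome bounds are known. A minimal sketch, assuming 0-based, half-open coordinates; `slop` here is a hypothetical Python function, and clipping to the chromosome is an assumption of this example.

```python
def slop(start: int, end: int, left: int, right: int, chrom_length: int) -> tuple[int, int]:
    """Grow an interval by `left` bases upstream and `right` bases downstream,
    clipping the result to [0, chrom_length)."""
    return max(0, start - left), min(chrom_length, end + right)


if __name__ == "__main__":
    print(slop(100, 200, left=50, right=25, chrom_length=1000))
```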

Sorts a feature file by chromosome and other criteria.

010

sorted versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

Finds overlaps between two sets of regions (A and B), removes the overlaps from A and reports the remaining portion of A.

012

bed versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.
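The subtraction described above can be sketched for one feature of A against the features of B. It assumes 0-based, half-open intervals on a single chromosome; `subtract` is a hypothetical helper rather than the bedtools code.

```python
def subtract(a: tuple[int, int], b_intervals: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Return the pieces of interval `a` that no interval in B overlaps."""
    pieces = []
    cursor, end = a  # cursor marks the start of the part of `a` still unresolved
    for b_start, b_end in sorted(b_intervals):
        if b_end <= cursor or b_start >= end:
            continue  # no overlap with the remaining part of `a`
        if b_start > cursor:
            pieces.append((cursor, b_start))
        cursor = max(cursor, b_end)
        if cursor >= end:
            break
    if cursor < end:
        pieces.append((cursor, end))
    return pieces


if __name__ == "__main__":
    print(subtract((0, 100), [(10, 20), (50, 120)]))
```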

Combines multiple BedGraph files into a single file

0101

bed versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

Computes cytosine methylation and callable SNV mutations, optionally in reference to a germline BAM to call somatic variants

012340101

vcf versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

Performs alignment of BS-Seq reads using bismark

010101

bam report unmapped versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Relates methylation calls back to genomic cytosine contexts.

010101

coverage report summary versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Removes alignments to the same position in the genome from the Bismark mapping output.

01

bam report versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Converts a specified reference genome into two different bisulfite converted versions and indexes them for alignments.

01

index versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Extracts methylation information for individual cytosines from alignments.

0101

bedgraph methylation_calls coverage report mbias versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Collects bismark alignment reports

01234

report versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Uses Bismark report files of several samples in a run folder to generate a graphical summary HTML report.

00000

summary versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Module to build the VDJ reference needed by the 10x Genomics Cell Ranger tool. Uses the cellranger mkvdjref command.

0000

reference versions

cellranger:

Cell Ranger processes data from 10X Genomics Chromium kits. cellranger vdj takes FASTQ files from cellranger mkfastq or bcl2fastq for V(D)J libraries and performs sequence assembly and paired clonotype calling. It uses the Chromium cellular barcodes and UMIs to assemble V(D)J transcripts per cell. Clonotypes and CDR3 sequences are output as a .vloupe file which can be loaded into Loupe V(D)J Browser.

Module to use Cell Ranger's pipelines to analyze sequencing data produced from Chromium Single Cell Immune Profiling.

010

outs versions

cellranger:

Cell Ranger processes data from 10X Genomics Chromium kits. cellranger vdj takes FASTQ files from cellranger mkfastq or bcl2fastq for V(D)J libraries and performs sequence assembly and paired clonotype calling. It uses the Chromium cellular barcodes and UMIs to assemble V(D)J transcripts per cell. Clonotypes and CDR3 sequences are output as a .vloupe file which can be loaded into Loupe V(D)J Browser.

ClipKIT is a fast and flexible alignment trimming tool that keeps phylogenetically informative sites and removes those that display characteristics of poor phylogenetic signal.

01

clipkit versions

Runs the Clippy CLIP peak caller

0100

peaks summits intergenic_gtf versions

Given segmented log2 ratio estimates (.cns), derive each segment’s absolute integer copy number

012

cns versions

cnvkit:

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.
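One simple way to turn a segment's log2 ratio into an absolute integer copy number is to scale by the reference ploidy and round. This is only a sketch under that assumption: CNVkit offers more than one calling method, and the function below is a hypothetical helper that ignores tumor purity and sex chromosomes.

```python
def absolute_copy_number(log2_ratio: float, ploidy: int = 2) -> int:
    """Round ploidy * 2**log2_ratio to the nearest integer copy number."""
    return max(0, round(ploidy * 2 ** log2_ratio))


if __name__ == "__main__":
    for ratio in (-1.0, 0.0, 0.585, 1.0):
        print(ratio, absolute_copy_number(ratio))
```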

CNVnator is a command line tool for CNV/CNA analysis from depth-of-coverage by mapped reads.

012010101

root tab versions

cnvnator:

Tool for calling copy number variations.

convert2vcf.pl is command line tool to convert CNVnator calls to vcf format.

01

vcf versions

cnvnator:

Tool for calling copy number variations.

Command line tool for calling CNVs in whole genome sequencing data

010

pytor versions

cnvpytor:

calling CNVs using read depth

calculates read depth histograms

010

pytor versions

cnvpytor:

calling CNVs using read depth

command line tool for CNV/CNA analysis. This step imports the read depth data into a root pytor file.

01200

pytor versions

cnvpytor -rd:

calling CNVs using read depth

partitioning read depth histograms

010

pytor versions

cnvpytor:

calling CNVs using read depth

view function to generate vcfs

0100

vcf tsv xls versions

cnvpytor:

calling CNVs using read depth

structural-variant calling with cutesv

01201

vcf versions

DeepSomatic is an extension of deep learning-based variant caller DeepVariant that takes aligned reads (in BAM or CRAM format) from tumor and normal data, produces pileup image tensors from them, classifies each tensor using a convolutional neural network, and finally reports somatic variants in a standard VCF or gVCF file.

0123401010101

vcf vcf_tbi gvcf gvcf_tbi versions

(DEPRECATED - see main.nf) DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

012301010101

vcf vcf_tbi gvcf gvcf_tbi versions

Call variants from the examples produced by make_examples

01

call_variants_tfrecords versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

Transforms the input alignments to a format suitable for the deep neural network variant caller

012301010101

examples gvcf versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

012010101

vcf vcf_tbi gvcf gvcf_tbi versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

012301010101

vcf vcf_tbi gvcf gvcf_tbi versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

01

report versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

Call structural variants

0123450101

bcf csi versions

delly:

Structural variant discovery by integrated paired-end and split-read analysis

SV callers like lumpy look at split-reads and pair distances to find structural variants. This tool is a fast way to add depth information to those calls. This can be used as additional information for filtering variants; for example we will be skeptical of deletion calls that do not have lower than average coverage compared to regions with similar gc-content.

01234500

vcf versions

Dysgu calls structural variants (SVs) from mapped sequencing reads. It is designed for accurate and efficient detection of structural variations.

012012

vcf tbi versions

Convert a file in FASTA format to the ELFASTA format

01

elfasta log versions

elprep:

elPrep is a high-performance tool for preparing .sam/.bam files for variant calling in sequencing pipelines. It can be used as a drop-in replacement for SAMtools/Picard/GATK4.

Filter, sort and markdup sam/bam files, with optional BQSR and variant calling.

012345601010100000

bam logs metrics recall gvcf table activity_profile assembly_regions versions

elprep:

elPrep is a high-performance tool for preparing .sam/.bam files for variant calling in sequencing pipelines. It can be used as a drop-in replacement for SAMtools/Picard/GATK4.

Merge split bam/sam chunks in one file

01

bam versions

elprep:

elPrep is a high-performance tool for preparing .sam/.bam files for variant calling in sequencing pipelines. It can be used as a drop-in replacement for SAMtools/Picard/GATK4.

Split bam file into manageable chunks

01

bam versions

elprep:

elPrep is a high-performance tool for preparing .sam/.bam files for variant calling in sequencing pipelines. It can be used as a drop-in replacement for SAMtools/Picard/GATK4.

Uses FGBIO CallDuplexConsensusReads to call duplex consensus sequences from reads generated from the same double-stranded source molecule.

0100

bam versions

fgbio:

A set of tools for working with genomic and high throughput sequencing data, including UMIs

Calls consensus sequences from reads with the same unique molecular tag.

0100

bam versions

fgbio:

Tools for working with genomic and high throughput sequencing data.

Uses FGBIO FilterConsensusReads to filter consensus reads generated by CallMolecularConsensusReads or CallDuplexConsensusReads.

0101000

bam versions

fgbio:

A set of tools for working with genomic and high throughput sequencing data, including UMIs

A haplotype-based variant detector

0123450101010101

vcf versions

Calls variants and reports sequencing depth information for each variant

010

variants versions

freyja:

Freyja recovers relative lineage abundances from mixed SARS-CoV-2 samples and provides functionality to analyze lineage dynamics.

Gene Allele Mutation Microbial Assessment

010

gamma psl gff fasta versions

gamma:

Tool for Gene Allele Mutation Microbial Assessment

Performs local realignment around indels to correct for mapping errors

012301010101

bam versions

gatk:

The full Genome Analysis Toolkit (GATK) framework, license restricted.

Generates a list of locations that should be considered for local realignment prior to genotyping.

01201010101

intervals versions

gatk:

The full Genome Analysis Toolkit (GATK) framework, license restricted.

SNP and Indel variant caller on a per-locus basis

01201010101010101

vcf versions

gatk:

The full Genome Analysis Toolkit (GATK) framework, license restricted.

Apply a score cutoff to filter variants based on a recalibration table. ApplyVQSR performs the second pass in a two-stage process called Variant Quality Score Recalibration (VQSR). Specifically, it applies filtering to the input variants based on the recalibration table produced in the first step by VariantRecalibrator and a target sensitivity value.

012345000

vcf tbi versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Calculates the fraction of reads from cross-sample contamination based on summary tables from getpileupsummaries. Output to be used with filtermutectcalls.

012

contamination segmentation versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Combine per-sample gVCF files produced by HaplotypeCaller into a multi-sample gVCF file

012000

combined_gvcf versions

gatk4:

Genome Analysis Toolkit (GATK4). Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Determines the baseline contig ploidy for germline samples given counts data

0123010

calls model versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Filters the raw output of mutect2, can optionally use outputs of calculatecontamination and learnreadorientationmodel to improve filtering.

01234567010101

vcf tbi stats versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Perform joint genotyping on one or more samples pre-called with HaplotypeCaller.

012340101010101

vcf tbi versions

gatk4:

Genome Analysis Toolkit (GATK4)

Calls copy-number variants in germline samples given their counts and the output of DetermineGermlineContigPloidy.

01234

cohortcalls cohortmodel casecalls versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Call germline SNPs and indels via local re-assembly of haplotypes

012340101010101

vcf tbi bam versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Call somatic SNVs and indels via local assembly of haplotypes.

01230101010000

vcf tbi stats f1r2 versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Postprocesses the output of GermlineCNVCaller and generates VCFs and denoised copy ratios

0123

intervals segments denoised versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Peak-calling for ChIP-seq and ATAC-seq enrichment experiments

0120

peak versions bedgraph_pvalues bedgraph_pileup bed_intervals duplicates

Compute the r2 correlation between imputed dosages (in MAF bins) and highly-confident genotype calls from the high-coverage dataset.

01234567000

errors_cal errors_grp errors_spl rsquare_grp rsquare_spl versions

glimpse:

GLIMPSE is a phasing and imputation method for large-scale low-coverage sequencing studies.
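The r2 reported by the concordance step is a squared correlation between imputed dosages and trusted genotype calls. The sketch below computes a squared Pearson correlation for a single group of sites; the binning by MAF is omitted, the inputs are made-up toy values, and `r_squared` is a hypothetical helper rather than GLIMPSE code.

```python
from math import sqrt


def r_squared(dosages: list[float], genotypes: list[float]) -> float:
    """Squared Pearson correlation between two equal-length numeric lists."""
    if len(dosages) != len(genotypes) or len(dosages) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    n = len(dosages)
    mean_d = sum(dosages) / n
    mean_g = sum(genotypes) / n
    cov = sum((d - mean_d) * (g - mean_g) for d, g in zip(dosages, genotypes))
    var_d = sum((d - mean_d) ** 2 for d in dosages)
    var_g = sum((g - mean_g) ** 2 for g in genotypes)
    if var_d == 0 or var_g == 0:
        raise ValueError("correlation is undefined when one input is constant")
    r = cov / sqrt(var_d * var_g)
    return r * r


if __name__ == "__main__":
    imputed = [0.1, 0.9, 1.8, 0.2]  # toy dosages, not real data
    truth = [0, 1, 2, 0]            # toy genotype calls
    print(r_squared(imputed, truth))
```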

Generates haplotype calls by sampling haplotype estimates

01

haplo_sampled versions

glimpse:

GLIMPSE is a phasing and imputation method for large-scale low-coverage sequencing studies.

merge gVCF files and perform joint variant calling

0101

bcf versions

Tools for population-scale genotyping using pangenome graphs.

01201010

vcf tbi versions

graphtyper:

A graph-based variant caller capable of genotyping population-scale short read data sets while incorporating previously discovered variants.

Tools for population-scale genotyping using pangenome graphs.

01

vcf tbi versions

graphtyper:

A graph-based variant caller capable of genotyping population-scale short read data sets while incorporating previously discovered variants.

pacbio structural variant calling tool

01201201

vcf csv versions

Call variants from a BAM file using iVar

010000

tsv mpileup versions

ivar:

iVar - a computational package that contains functions broadly useful for viral amplicon-based sequencing.

Filtering VCF with dynamically-compiled java expressions

01230101010101

vcf tbi csi versions

jvarkit:

Java utilities for Bioinformatics.

bcftools:

View, subset and filter VCF or BCF files by position and filtering expression. Convert between VCF and BCF

quantifies scRNA-seq data from fastq files using kb-python.

01000000

count versions matrix

kb:

kallisto and bustools are wrapped in an easy-to-use program called kb

Module that calls normalize-by-median.py from khmer. The module can take a mix of paired end (interleaved) and single end reads. If both types are provided, only a single file with single ends is possible.

000

reads versions

khmer:

khmer k-mer counting library

Lofreq subcommand to insert base and indel alignment qualities

010

bam versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Lofreq subcommand to call low frequency variants from alignments

0120

vcf versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

It predicts variants using multiple processors

01230101

vcf tbi versions

lofreq:

Lofreq is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data. Its call-parallel programme predicts variants using multiple processors

Lofreq subcommand to remove variants with low coverage or strand bias potential

01

vcf versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Inserts indel qualities in a BAM file

0101

bam versions

lofreq:

Lofreq is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data. Its indelqual programme inserts indel qualities in a BAM file

Lofreq subcommand to call low frequency variants from alignments when tumor-normal paired samples are available

0123450101

vcf versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Lofreq subcommand to call low frequency variants from alignments when tumor-normal paired samples are available

0101

bam versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Peak calling of enriched genomic regions of ChIP-seq and ATAC-seq experiments

0120

peak xls versions gapped bed bdg

macs2:

Model Based Analysis for ChIP-Seq data

Peak calling of enriched genomic regions of ChIP-seq and ATAC-seq experiments

0120

peak xls versions gapped bed bdg

macs3:

Model Based Analysis for ChIP-Seq data

Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. This script reformats inversions into single inverted sequence junctions which was the format used in Manta versions <= 1.4.0.

0101

vcf tbi versions

manta:

Structural variant and indel caller for mapped sequencing data

Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs.

0123401010

candidate_small_indels_vcf candidate_small_indels_vcf_tbi candidate_sv_vcf candidate_sv_vcf_tbi diploid_sv_vcf diploid_sv_vcf_tbi versions

manta:

Structural variant and indel caller for mapped sequencing data

Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs.

012345601010

candidate_small_indels_vcf candidate_small_indels_vcf_tbi candidate_sv_vcf candidate_sv_vcf_tbi diploid_sv_vcf diploid_sv_vcf_tbi somatic_sv_vcf somatic_sv_vcf_tbi versions

manta:

Structural variant and indel caller for mapped sequencing data

Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs.

0123401010

candidate_small_indels_vcf candidate_small_indels_vcf_tbi candidate_sv_vcf candidate_sv_vcf_tbi tumor_sv_vcf tumor_sv_vcf_tbi versions

manta:

Structural variant and indel caller for mapped sequencing data

A tool to create consensus sequences and variant calls from nanopore sequencing data

012

assembly versions

Extracts per-base methylation metrics from alignments

01200

bedgraph methylkit versions

methyldackel:

Methylation caller from MethylDackel, a (mostly) universal methylation extractor for methyl-seq experiments.

A tool for quality control and tracing taxonomic origins of microRNA sequencing data

0120

html json tsv all_fa rnatype_unknown_fa versions

mirtrace:

miRTrace is a new quality control and taxonomic tracing tool developed specifically for small RNA sequencing data (sRNA-Seq). Each sample is characterized by profiling sequencing quality, read length, sequencing depth and miRNA complexity and also the amounts of miRNAs versus undesirable sequences (derived from tRNAs, rRNAs and sequencing artifacts). In addition to these routine quality control (QC) analyses, miRTrace can accurately and sensitively resolve taxonomic origins of small RNA-Seq data based on the composition of clade-specific miRNAs. This feature can be used to detect cross-clade contaminations in typical lab settings. It can also be applied for more specific applications in forensics, food quality control and clinical diagnosis, for instance tracing the origins of meat products or detecting parasitic microRNAs in host serum.

pre-filtering and calculating position-specific summary statistics using the Markov substitution model

0123401

txt versions

MuSE:

Somatic point mutation caller based on Markov substitution model for molecular evolution

Computes tier-based cutoffs from a sample-specific error model which is generated by muse/call and reports the finalized variants

01012

vcf versions

MuSE:

Somatic point mutation caller based on Markov substitution model for molecular evolution

Get dataset for SARS-CoV-2 genome clade assignment, mutation calling, and sequence quality checks (C++ implementation)

00

dataset versions

nextclade:

SARS-CoV-2 genome clade assignment, mutation calling, and sequence quality checks

SARS-CoV-2 genome clade assignment, mutation calling, and sequence quality checks (C++ implementation)

010

csv csv_errors csv_insertions tsv json json_auspice ndjson fasta_aligned fasta_translation nwk versions

nextclade:

SARS-CoV-2 genome clade assignment, mutation calling, and sequence quality checks

Calls CNVs in bam files from tumor patients

0123400

png profile summary versions

NVIDIA Clara Parabricks GPU-accelerated variant calls annotation based on dbSNP database

0123

vcf versions

parabricks:

NVIDIA Clara Parabricks GPU-accelerated genomics tools

NVIDIA Clara Parabricks GPU-accelerated germline variant calling, replicating deepvariant.

012301

vcf gvcf versions

parabricks:

NVIDIA Clara Parabricks GPU-accelerated genomics tools

NVIDIA Clara Parabricks GPU-accelerated germline variant calling, replicating GATK haplotypecaller.

012301

vcf versions

parabricks:

NVIDIA Clara Parabricks GPU-accelerated genomics tools

NVIDIA Clara Parabricks GPU-accelerated somatic variant calling, replicating GATK Mutect2.

0123450100

vcf stats versions

parabricks:

NVIDIA Clara Parabricks GPU-accelerated genomics tools

HiFi-based caller for highly homologous genes

0120101

json bam bai vcf vcf_index versions

pbsv/call - PacBio structural variant (SV) calling and analysis tools

0101

vcf versions

pbsv:

pbsv - PacBio structural variant (SV) calling and analysis tools

pbsv - PacBio structural variant (SV) signature discovery tool

0101

svsig versions

pbsv:

pbsv - PacBio structural variant (SV) calling and analysis tools

Automatically improve draft assemblies and find variation among strains, including large event detection

010120

improved_assembly vcf change_record tracks_bed tracks_wig versions

Main caller script for peak calling

0120

divergent_TREs bidirectional_TREs unidirectional_TREs peakcalling_log versions

pints:

Peak Identifier for Nascent Transcripts Starts (PINTS)

Platypus is a tool that efficiently and accurately calls genetic variants from next-generation DNA sequencing data

01234000

vcf tbi log version

Analyses binary variant call format (BCF) files using plink

01

bed bim fam versions

plink:

Whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.

Analyses variant calling files using plink

01

bed bim fam versions

plink:

Whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner

PoolSNP is a heuristic SNP caller, which uses an MPILEUP file and a reference genome in FASTA format as inputs.

0101012

vcf max_cov bad_sites versions

Calculate interval coverage for each sample. N.B. the tool cannot handle input files staged as symlinks, so stageInMode should be set to 'link'.

0120

txt png loess_qc_txt loess_txt versions

purecn:

Copy number calling and SNV classification using targeted short read sequencing

Generate on and off-target intervals for PureCN from a list of targets

01010

txt bed versions

purecn:

Copy number calling and SNV classification using targeted short read sequencing

Build a normal database for coverage normalization from all the (GC-normalized) normal coverage files. N.B. as reported in https://www.bioconductor.org/packages/devel/bioc/vignettes/PureCN/inst/doc/Quick.html, it is advised to provide a normal panel (VCF format) to precompute mapping bias for faster runtimes.

012300

rds png bias_rds bias_bed low_cov_bed versions

purecn:

Copy number calling and SNV classification using targeted short read sequencing

Run PureCN workflow to normalize, segment and determine purity and ploidy

01200

pdf local_optima_pdf seg genes_csv amplification_pvalues_csv vcf_gz variants_csv loh_csv chr_pdf segmentation_pdf multisample_seg versions

purecn:

Copy number calling and SNV classification using targeted short read sequencing

Call SNVs/indels from BAM files for all target genes.

0120100

vcf tbi versions

pypgx:

A Python package for pharmacogenomics research

PyPGx pharmacogenomics genotyping pipeline for NGS data.

012345010

results cnv_calls consolidated_variants versions

pypgx:

A Python package for pharmacogenomics research

Markup VCF file using rho-calls.

012010

vcf versions

rhocall:

Call regions of homozygosity and make tentative UPD calls.

Call regions of homozygosity and make tentative UPD calls

0101

bed wig versions

rhocall:

Call regions of homozygosity and make tentative UPD calls.

The VCFeval tool of RTG tools. It is used to evaluate called variants for agreement with a baseline variant set

012345601

tp_vcf tp_tbi fn_vcf fn_tbi fp_vcf fp_tbi baseline_vcf baseline_tbi snp_roc non_snp_roc weighted_roc summary phasing versions

rtgtools:

RealTimeGenomics Tools -- Utilities for accurate VCF comparison and manipulation

Calling lowest common ancestors from multi-mapped reads in SAM/BAM/CRAM files

0120

csv json bam versions

sam2lca:

Lowest Common Ancestor on SAM/BAM/CRAM alignment files

Call peaks using SEACR on sequenced reads in bedgraph format

0120

bed versions

seacr:

SEACR is intended to call peaks and enriched regions from sparse CUT&RUN or chromatin profiling data in which background is dominated by "zeroes" (i.e. regions with no read coverage).

Apply a score cutoff to filter variants based on a recalibration table. Sentieon's ApplyVarCal performs the second pass in a two-stage process called Variant Quality Score Recalibration (VQSR). Specifically, it filters the input variants based on the recalibration table produced in the previous step (VarCal) and a target sensitivity value. https://support.sentieon.com/manual/usages/general/#applyvarcal-algorithm

0123450101

vcf tbi versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.
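
The sensitivity-based filtering described for the second VQSR pass can be pictured as follows: choose the score cutoff that retains a target fraction of trusted (truth-set) variants, then flag every variant scoring below it. This is a conceptual sketch only, not Sentieon's implementation; the function names, the example scores and the `LowQual` label are invented for illustration.

```python
import math


def cutoff_for_sensitivity(truth_scores, target_sensitivity):
    """Return the score of the lowest-ranked truth variant that must still
    pass so that at least `target_sensitivity` of truth variants are kept."""
    ranked = sorted(truth_scores, reverse=True)
    keep = math.ceil(target_sensitivity * len(ranked))
    keep = max(1, min(keep, len(ranked)))  # clamp to a valid rank
    return ranked[keep - 1]


def apply_cutoff(variant_scores, cutoff):
    """Label each variant PASS or LowQual (hypothetical label) by score."""
    return {
        name: ("PASS" if score >= cutoff else "LowQual")
        for name, score in variant_scores.items()
    }


# Hypothetical recalibration scores for eight truth-set variants.
truth = [9.1, 7.4, 6.8, 5.2, 3.3, 1.5, -0.7, -2.0]
cutoff = cutoff_for_sensitivity(truth, 0.75)  # keep 6 of 8 truth variants
calls = apply_cutoff({"var_a": 4.0, "var_b": -1.5, "var_c": 1.5}, cutoff)
print(cutoff, calls)
```

Raising the target sensitivity lowers the cutoff, so more variants pass at the cost of admitting more low-scoring calls.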

Create BWA index for reference genome

01

index versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Performs fastq alignment to a fasta reference using Sentieon's BWA MEM

01010101

bam_and_bai versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Accelerated implementation of the Picard CollectVariantCallingMetrics tool.

012012010101

metrics summary versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Accelerated implementation of the GATK DepthOfCoverage tool.

01201010101

per_locus sample_summary statistics coverage_counts coverage_proportions interval_summary versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Collects multiple quality metrics from a bam file

01201010

mq_metrics qd_metrics gc_summary gc_metrics aln_metrics is_metrics mq_plot qd_plot is_plot gc_plot versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Runs the sentieon tool LocusCollector followed by Dedup. LocusCollector collects read information that is used by Dedup which in turn marks or removes duplicate reads.

0120101

cram crai bam bai score metrics metrics_multiqc_tsv versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Modifies the input VCF file by adding the MLrejected FILTER to the variants

012010101

vcf index versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

DNAscope algorithm performs an improved version of Haplotype variant calling.

01230101010101000

vcf vcf_tbi gvcf gvcf_tbi versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Perform joint genotyping on one or more samples pre-called with Sentieon's Haplotyper.

012301010101

vcf_gz vcf_gz_tbi versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Runs Sentieon's haplotyper for germline variant calling.

012340101010100

vcf vcf_tbi gvcf gvcf_tbi versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Generate recalibration table and optionally perform base quality recalibration

01201010101010

table table_post recal_alignment csv pdf versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Merges BAM files and/or converts them into CRAM files. Also outputs the result of applying Base Quality Score Recalibration to a file.

0120101

output index output_index versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Filters the raw output of sentieon/tnhaplotyper2.

01234560101

vcf vcf_tbi stats versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Tnhaplotyper2 performs somatic variant calling on the tumor-normal matched pairs.

01230101010101010100

orientation_data contamination_data contamination_segments stats vcf index versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

TNscope algorithm performs somatic variant calling on the tumor-normal matched pair or the tumor only data, using a Haplotyper algorithm.

012010101201201201

vcf index versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Module for Sentieon's VarCal, the first pass of Variant Quality Score Recalibration (VQSR). VarCal builds a recalibration model for scoring variant quality. https://support.sentieon.com/manual/usages/general/#varcal-algorithm

01200000

recal idx tranches plots versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Collects whole genome quality metrics from a bam file

012010101

wgs_metrics versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

PileupCaller is a tool to create genotype calls from bam files using read-sampling methods

0100

eigenstrat plink freqsum versions

sequencetools:

Tools for population genetics on sequencing data
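
One way to picture a read-sampling genotype call is to draw a single read at random from those covering a site and report its allele as a pseudo-haploid call. The sketch below illustrates that idea only; it is not PileupCaller's code, and the function name, the input layout, the 0/1 allele encoding and the seed are assumptions made for the example.

```python
import random


def random_haploid_call(pileup_bases, ref, alt, rng):
    """Sample one informative read base at a site.

    Returns 0 if the sampled base is the reference allele, 1 if it is the
    alternate allele, or None when no read carries either allele.
    """
    informative = [base for base in pileup_bases if base in (ref, alt)]
    if not informative:
        return None
    return 0 if rng.choice(informative) == ref else 1


rng = random.Random(42)  # fixed seed so repeated runs give the same calls

# Hypothetical sites: (bases observed in the pileup, ref allele, alt allele)
sites = {
    "site_1": ("AAAG", "A", "G"),  # mixed evidence, call depends on the draw
    "site_2": ("TTTT", "T", "C"),  # only reference reads
    "site_3": ("", "G", "A"),      # no coverage
}
calls = {
    name: random_haploid_call(bases, ref, alt, rng)
    for name, (bases, ref, alt) in sites.items()
}
print(calls)
```

Sampling a single read treats low- and high-coverage samples alike, which is why this style of calling is often chosen when coverage is too low for diploid genotyping.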

Severus is a somatic structural variation (SV) caller for long reads (both PacBio and ONT)

01234501

log read_qual breakpoints_double read_alignments read_ids collapsed_dup loh all_vcf all_breakpoints_clusters_list all_breakpoints_clusters all_plots somatic_vcf somatic_breakpoints_clusters_list somatic_breakpoints_clusters somatic_plots versions

Ligate multiple phased BCF/VCF files into a single whole chromosome file. Typically run to ligate multiple chunks of phased common variants.

012

merged_variants versions

shapeit5:

Fast and accurate method for estimation of haplotypes (phasing)

Tool to phase common sites, typically SNP array data, or the first step of WES/WGS data.

0123401201201

phased_variant versions

shapeit5:

Fast and accurate method for estimation of haplotypes (phasing)

Tool to call the copy number of full-length SMN1, full-length SMN2, as well as SMN2Δ7–8 (SMN2 with a deletion of Exon7-8) from a whole-genome sequencing (WGS) BAM file.

012

smncopynumber run_metrics versions

smoove simplifies and speeds calling and genotyping SVs for short reads. It also improves specificity by removing many spurious alignment signals that are indicative of low-level noise and often contribute to spurious calls. Developed by Brent Pedersen.

01230101

vcf versions

smoove:

structural variant calling and genotyping with existing tools, but, smoothly

Structural variant calling with Sniffles

012010100

vcf tbi snf versions

Core-SNP alignment from Snippy outputs

0120

aln full_aln tab vcf txt versions

snippy:

Rapid bacterial SNP calling and core genome alignments

Rapid haploid variant calling

010

tab csv html vcf bed gff bam bai log aligned_fa consensus_fa consensus_subs_fa raw_vcf filt_vcf vcf_gz vcf_csi txt versions

snippy:

Rapid bacterial SNP calling and core genome alignments

Annotate a VCF file with another VCF file

012012

vcf versions

snpsift:

SnpSift is a toolbox that allows you to filter and manipulate annotated files

The dbNSFP is an integrated database of functional predictions from multiple algorithms

012012

vcf versions

snpsift:

SnpSift is a toolbox that allows you to filter and manipulate annotated files

Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs

01012

tsv html versions

somalier:

Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs

Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs

012010101

extract versions

somalier:

Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs

Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs

0120

html pairs_tsv samples_tsv versions

somalier:

Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs

Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation

0123400

vcf vcf_tbi genome_vcf genome_vcf_tbi versions

strelka:

Strelka calls somatic and germline small variants from mapped sequencing reads

Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation in small cohorts and somatic variation in tumor/normal sample pairs

01234567800

vcf_indels vcf_indels_tbi vcf_snvs vcf_snvs_tbi versions

strelka:

Strelka calls somatic and germline small variants from mapped sequencing reads

SummarizedExperiment container

010101

rds log versions

summarizedexperiment:

The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.

SvABA is an efficient and accurate method for detecting SVs from short-read sequencing data using genome-wide local assembly with low memory and computing requirements

01234010101010101

sv indel germ_indel germ_sv som_indel som_sv unfiltered_sv unfiltered_indel unfiltered_germ_indel unfiltered_germ_sv unfiltered_som_indel unfiltered_som_sv raw_calls discordants log versions

Convert SV calls to a standardized format.

010

standardized_vcf versions

svtk:

Utilities for consolidating, filtering, resolving, and annotating structural variants.

A tool to standardize VCF files from structural variant callers

0123

vcf versions

Estimating poly(A)-tail lengths from basecalled fast5 files produced by Nanopore sequencing of RNA and DNA

01

csv_gz versions

Computes the coverage of different regions from the bam file.

0101

cov wig versions

tiddit:

TIDDIT - structural variant calling.

Given baseline and comparison sets of variants, calculate the recall/precision/f-measure

0123450101

fn_vcf fn_tbi fp_vcf fp_tbi tp_base_vcf tp_base_tbi tp_comp_vcf tp_comp_tbi summary versions

truvari:

Structural variant comparison tool for VCFs
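
For orientation, the recall, precision and f-measure of a baseline-versus-comparison benchmark can be derived from true-positive, false-negative and false-positive counts. The sketch below uses the standard definitions and assumes recall is measured against the baseline set and precision against the comparison set; the function name and the example counts are made up for illustration and are not Truvari's own code or output.

```python
def benchmark_metrics(tp_base, tp_comp, fn, fp):
    """Compute precision, recall and F1 from match counts.

    tp_base: baseline variants matched by a comparison call
    tp_comp: comparison calls matching a baseline variant
    fn:      baseline variants with no matching call
    fp:      comparison calls with no matching baseline variant
    """
    recall = tp_base / (tp_base + fn) if (tp_base + fn) else 0.0
    precision = tp_comp / (tp_comp + fp) if (tp_comp + fp) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 90 matches, 10 missed baseline variants, 30 extra calls.
metrics = benchmark_metrics(tp_base=90, tp_comp=90, fn=10, fp=30)
print(metrics)
```

The f-measure is the harmonic mean of precision and recall, so it stays low unless both are high.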

Simple software to call UPD regions from germline exome/wgs trios.

01

bed versions

The Java port of the VarDict variant caller

01230101

vcf versions

Call variants for a given scenario specified with the varlociraptor calling grammar, preprocessed by varlociraptor preprocessing

01200

bcf_gz vcf_gz bcf vcf versions

varlociraptor:

Flexible, uncertainty-aware variant calling with parameter free filtration via FDR control.

To assess candidate indels and structural variants, Varlociraptor needs to know certain properties of the underlying sequencing experiment in combination with the read aligner used.

010101

alignment_properties_json versions

varlociraptor:

Flexible, uncertainty-aware variant calling with parameter free filtration via FDR control.

Obtains per-sample observations for the actual calling process with varlociraptor calls

012340101

bcf_gz vcf_gz bcf vcf versions

varlociraptor:

Flexible, uncertainty-aware variant calling with parameter free filtration via FDR control.

Constructs a graph from a reference and variant calls or a multiple sequence alignment file

01230101

graph versions

vg:

Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.

Deconstruct snarls present in a variation graph in GFA format to variants in VCF format

0100

vcf versions

vg:

Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.


01

xg vg_index versions

vg:

Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.

calculate locally stable secondary structures of RNAs

0

rnalfold_txt versions

viennarna:

calculate locally stable secondary structures of RNAs

Compute locally stable RNA secondary structure with a maximal base pair span. For a sequence of length n and a base pair span of L, the algorithm uses only O(n+L*L) memory and O(n*L*L) CPU time. Thus it is practical to “scan” very large genomes for short RNA structures. Output consists of a list of secondary structure components of size <= L, one entry per line. Each output line contains the predicted local structure, its energy in kcal/mol, and the starting position of the local structure.
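
To get a feel for the scaling quoted here, read as memory growing with n + L·L and CPU time with n·L·L, the sketch below evaluates those leading terms for a hypothetical genome length and span. The results are unitless growth terms, not measured RNALfold figures, and the function name is illustrative.

```python
def local_fold_cost(n, span):
    """Leading growth terms for span-limited folding.

    Returns (memory_term, time_term) for a sequence of length n and a
    maximal base pair span `span`.
    """
    memory_term = n + span * span
    time_term = n * span * span
    return memory_term, time_term


# Hypothetical inputs: a 1 Mb sequence scanned with a span of 150 bases.
mem, cpu = local_fold_cost(n=1_000_000, span=150)
mem_2x, cpu_2x = local_fold_cost(n=2_000_000, span=150)

print(mem, cpu)
print(cpu_2x / cpu)  # doubling n doubles the time term for a fixed span
```

With the span held fixed, both terms grow linearly in sequence length, which is what makes genome-scale scans feasible.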

The wham suite consists of two programs, wham and whamg. wham, the original tool, is a very sensitive method with a high false discovery rate. The second program, whamg, is more accurate and better suited for general structural variant (SV) discovery.

01200

vcf tbi graph versions
