Available Modules

Modules are the building blocks of all DSL2 nf-core pipelines. If you would like to write your own module, you can find more information on the nf-core website.

Post-processing script of the MaltExtract component of the HOPS package

json summary_pdf tsv candidate_pdfs versions

Estimate the post-mortem damage patterns of DNA

empiric exponential counts table versions

atlas:

ATLAS, a suite of methods to accurately genotype and estimate genetic diversity
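As a rough illustration of what a post-mortem damage pattern looks like, the sketch below counts how often a reference C is read as T at each position from the 5' end of a read, which is the typical signature of cytosine deamination in ancient DNA. It is not how ATLAS estimates damage: the function name `c_to_t_by_position`, the gap-free alignment assumption, and the toy read pairs are all made up for the example.

```python
from collections import Counter


def c_to_t_by_position(pairs, max_pos=5):
    """Return the fraction of reference C bases read as T per 5' position.

    `pairs` is an iterable of (reference, read) strings of equal length that
    are already aligned without gaps (a simplifying assumption).
    """
    ref_c = Counter()   # reference C observed at each position
    c_to_t = Counter()  # of those, how many were read as T
    for ref, read in pairs:
        for pos, (r, q) in enumerate(zip(ref[:max_pos], read[:max_pos])):
            if r == "C":
                ref_c[pos] += 1
                if q == "T":
                    c_to_t[pos] += 1
    # Positions with no reference C are omitted rather than reported as 0.
    return {pos: c_to_t[pos] / ref_c[pos] for pos in sorted(ref_c)}


# Hypothetical (reference, read) pairs, 5' end first.
toy_pairs = [
    ("CACGT", "TACGT"),
    ("CCAGT", "TCAGT"),
    ("CACCT", "CACCT"),
    ("ACCGT", "ATCGT"),
]
profile = c_to_t_by_position(toy_pairs)
for pos, freq in profile.items():
    print(f"position {pos}: C->T frequency {freq:.2f}")
```

In damaged libraries this frequency is usually highest at the first position and decays further into the read, which is why the profile is reported per position rather than as a single number.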

Generate tables of feature metadata from GTF files

feature_annotation filtered_cdna versions

atlasgeneannotationmanipulation:

Scripts for manipulating gene annotation

Use deamination patterns to estimate contamination in single-stranded libraries

txt versions

authentict:

Estimates present-day DNA contamination in ancient DNA single-stranded libraries.

Tool for converting 10x BAMs produced by Cell Ranger, Space Ranger, Cell Ranger ATAC, Cell Ranger DNA, and Long Ranger back to FASTQ files that can be used as inputs to re-run analysis

fastq versions

Aligns single- or paired-end reads from bisulfite-converted libraries to a reference genome using Biscuit.

bam bai versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

A fast, compact one-liner to produce duplicate-marked, sorted, and indexed BAM files using Biscuit

bam bai versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

samblaster:

samblaster is a fast and flexible program for marking duplicates in read-id grouped paired-end SAM files. It can also optionally output discordant read pairs and/or split read mappings to separate SAM files, and/or unmapped/clipped reads to a separate FASTQ file. By default, samblaster reads SAM input from stdin and writes SAM to stdout.

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Summarize and/or filter reads based on bisulfite conversion rate

01010101

bam versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

Summarizes read-level methylation (and optionally SNV) information from a Biscuit BAM file in a standard-compliant BED format.

0101010101

bed versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

Indexes a reference genome for use with Biscuit

01

index versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

Merges methylation information for opposite-strand C's in a CpG context

010101

bed versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

Computes cytosine methylation and callable SNV mutations, optionally in reference to a germline BAM to call somatic variants

012340101

vcf versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

Perform basic quality control on a BAM file generated with Biscuit

010101

reports versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

Summarizes methylation or SNV information from a Biscuit VCF in a standard-compliant BED file.

01

bed versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

Queries a BLAST DNA database

0101

txt versions

blast:

BLAST finds regions of similarity between biological sequences.

Queries a BLAST DNA database

0101

txt versions

blast:

Protein to Translated Nucleotide BLAST.

Align reads to a reference genome using bowtie

01010

bam log fastq versions

bowtie:

bowtie is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Create bowtie index for reference genome

01

index versions

bowtie:

bowtie is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Re-estimate taxonomic abundance of metagenomic samples analyzed by kraken.

010

reports txt versions

bracken:

Bracken (Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.

Extends a Kraken2 database to be compatible with Bracken

01

db bracken_files versions

bracken:

Bracken (Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.

Combine output of metagenomic samples analyzed by bracken.

01

txt versions

bracken:

Bracken (Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.

Find SA coordinates of the input reads for bwa short-read mapping

0101

sai versions

bwa:

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Create BWA index for reference genome

01

index versions

bwa:

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Performs fastq alignment to a fasta reference using BWA

0101010

bam cram csi crai versions

bwa:

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Convert paired-end bwa SA coordinate files to SAM format

01201

bam versions

bwa:

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Convert bwa SA coordinate file to SAM format

01201

bam versions

bwa:

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Create BWA-mem2 index for reference genome

01

index versions

bwamem2:

BWA-mem2 is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Performs fastq alignment to a fasta reference using BWA

0101010

sam bam cram crai csi versions

bwa:

BWA-mem2 is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Taxonomic classification of long DNA sequences and metagenome assembled genomes (e.g. MAGs / bins).

0101

txt versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Taxonomic classification of long DNA sequences and metagenome assembled genomes (e.g. MAGs / bins).

0101010101

orf2lca bin2classification log diamond faa gff versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Taxonomic classification of long DNA sequences and metagenome assembled genomes (e.g. contigs, MAGs / bins).

0101010101

orf2lca contig2classification log diamond faa gff versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Taxonomic classification plus read-based abundance estimation from long DNA sequences and metagenome assembled genomes (e.g. contigs, MAGs / bins).

0101010101001010101010101

rat_log complete_abundance contig_abundance read2classification alignment_diamond contig2classification cat_log orf2lca faa gff unmapped_diamond unmapped_fasta unmapped2classification versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Align sequences using Clustal Omega

010100000

alignment versions

clustalo:

Latest version of Clustal: a multiple sequence alignment program for DNA or proteins

pigz:

Parallel implementation of the gzip algorithm.

Renders a guidetree in clustalo

01

tree versions

clustalo:

Latest version of Clustal: a multiple sequence alignment program for DNA or proteins

Calculate the sequence-accessible coordinates in chromosomes from the given reference genome, output as a BED file.

0101

bed versions

cnvkit:

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Derive off-target (“antitarget”) bins from target regions.

01

bed versions

cnvkit:

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Copy number variant detection from high-throughput sequencing data

012010101010

bed cnn cnr cns pdf png versions

cnvkit:

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Given segmented log2 ratio estimates (.cns), derive each segment’s absolute integer copy number

012

cns versions

cnvkit:

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.
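
As a rough illustration of what "deriving an absolute integer copy number from a log2 ratio" means, the sketch below rounds `ploidy * 2**log2_ratio` to the nearest integer. This is a simplified textbook calculation that assumes a pure sample with no tumor-purity correction; it is not CNVkit's implementation, and the function name `absolute_copy_number` is hypothetical.

```python
def absolute_copy_number(log2_ratio: float, ploidy: int = 2) -> int:
    """Convert a segment's log2 ratio to an integer copy number.

    Assumption: the sample is pure and the reference state is `ploidy`
    copies, so the expected copy number is ploidy * 2**log2_ratio.
    """
    estimate = ploidy * 2.0 ** log2_ratio
    # Copy numbers cannot be negative; clamp after rounding.
    return max(0, round(estimate))


if __name__ == "__main__":
    # Hypothetical log2 ratios for a neutral, a single-copy-loss and a gained segment.
    for ratio in (0.0, -1.0, 0.585):
        print(ratio, absolute_copy_number(ratio))
```

A log2 ratio of 0 maps back to the reference ploidy, while each step of -1 halves the estimated copy number.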

Convert copy number ratio tables (.cnr files) or segments (.cns) to another format.

01

output versions

cnvkit:

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Copy number variant detection from high-throughput sequencing data

012

tsv cnn versions

cnvkit:

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Compile a coverage reference from the given files (normal samples).

000

cnn versions

cnvkit:

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Transform bait intervals into targets more suitable for CNVkit.

0101

bed versions

cnvkit:

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Map reads to contigs and estimate coverage

010100

coverage versions

coverm:

CoverM aims to be a configurable, easy to use and fast DNA read coverage and relative abundance calculator focused on metagenomics applications

A Java based tool to determine damage patterns on ancient DNA as a replacement for mapDamage

01000

results versions

DeDup is a tool for read deduplication in paired-end read merging (e.g. for ancient DNA experiments).

01

bam json hist log versions

(DEPRECATED - see main.nf) DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

012301010101

vcf vcf_tbi gvcf gvcf_tbi versions

Call variants from the examples produced by make_examples

01

call_variants_tfrecords versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

Transforms the input alignments to a format suitable for the deep neural network variant caller

012301010101

examples gvcf small_model_calls versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

01234010101

vcf vcf_index gvcf gvcf_index versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

012301010101

vcf vcf_tbi gvcf gvcf_tbi versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

01

report versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

Queries a DIAMOND database using blastp mode

010100

blast xml txt daa sam tsv paf versions

diamond:

Accelerated BLAST compatible local sequence aligner

Queries a DIAMOND database using blastx mode

010100

blast xml txt daa sam tsv paf log versions

diamond:

Accelerated BLAST compatible local sequence aligner

Tool for detection and quantification of large mtDNA rearrangements.

0120

deletions genes circos versions

endorS.py calculates endogenous DNA from samtools flagstat files and prints the result to the screen

0123

json versions
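
The endogenous DNA figure is commonly reported as the percentage of sequenced reads that map to the reference. A minimal sketch of that arithmetic is shown below; it assumes the mapped and total read counts have already been extracted from a flagstat report, and it is not the code used by endorS.py. The function name `endogenous_percent` is hypothetical.

```python
def endogenous_percent(mapped_reads: int, total_reads: int) -> float:
    """Return mapped reads as a percentage of all sequenced reads.

    Assumption: both counts were taken from the same flagstat report.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= mapped_reads <= total_reads:
        raise ValueError("mapped_reads must lie between 0 and total_reads")
    return 100.0 * mapped_reads / total_reads


if __name__ == "__main__":
    # Hypothetical counts: 250 of 1000 reads mapped to the reference.
    print(f"{endogenous_percent(250, 1000):.2f}%")
```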

Build ganon database using custom reference sequences.

01000

db info versions

ganon:

ganon classifies short DNA sequences against large sets of genomic reference sequences efficiently

Classify FASTQ files against ganon database

010

tre report one all unc log versions

ganon:

ganon classifies short DNA sequences against large sets of genomic reference sequences efficiently

Generate a ganon report file from the output of ganon classify

010

tre versions

ganon:

ganon classifies short DNA sequences against large sets of genomic reference sequences efficiently

Generate a multi-sample report file from the output of ganon report runs

01

txt versions

ganon:

ganon classifies short DNA sequences against large sets of genomic reference sequences efficiently

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.

0100

cram bam crai bai metrics versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.

meta bam fasta fai dict

meta versions output bam_index

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

This tool locates and unmarks the marked duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.

01

bam bai versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.

01000

output bam_index metrics versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Haplocheck detects contamination patterns in mtDNA and WGS sequencing studies by analyzing the mitochondrial DNA. Haplocheck also works as a proxy tool for nDNA studies and provides users a graphical report to investigate the contamination further. Internally, it uses the Haplogrep tool, which supports rCRS and RSRS mitochondrial versions.

01

txt html versions

Classification of mtDNA sequences into haplogroups

010

txt versions

haplogrep2:

A tool for mtDNA haplogroup classification.

Classification of mtDNA sequences into haplogroups

01

txt versions

haplogrep3:

A tool for mtDNA haplogroup classification.

Align RNA-Seq reads to a reference with HISAT2

010101

bam summary fastq versions

hisat2:

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.

Builds HISAT2 index for reference genome

010101

index versions

hisat2:

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.

Extracts splice sites from a GTF file

01

txt versions

hisat2:

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.

Create a tag directory with the HOMER suite

010

tagdir taginfo versions

homer:

HOMER (Hypergeometric Optimization of Motif EnRichment) is a suite of tools for Motif Discovery and next-gen sequencing analysis.

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

DESeq2:

Differential gene expression analysis based on the negative binomial distribution

edgeR:

Empirical Analysis of Digital Gene Expression Data in R

ichorCNA is an R package for calculating copy number alteration from (low-pass) whole genome sequencing, particularly for use in cell-free DNA. This module generates a panel of normals

000000

rds txt versions

ichorcna:

Estimating tumor fraction in cell-free DNA from ultra-low-pass whole genome sequencing.

ichorCNA is an R package for calculating copy number alteration from (low-pass) whole genome sequencing, particularly for use in cell-free DNA

010000000

rdata seg cna_seg seg_txt corrected_depth ichorcna_params plots genome_plot versions

ichorcna:

Estimating tumor fraction in cell-free DNA from ultra-low-pass whole genome sequencing.

Search covariance models against a sequence database

01200

output alignments target_summary versions

infernal:

Infernal is for searching DNA sequence databases for RNA structure and sequence similarities.

Detect integrons in DNA sequences

01

gbk integrons summary out versions

Produces protein annotations and predictions from an amino acids FASTA file

010

tsv xml gff3 json versions

Index creation for kb count quantification of single-cell data.

000

versions index t2g cdna intron cdna_t2c intron_t2c

kb:

kallisto|bustools (kb) is a tool developed for fast and efficient processing of single-cell OMICS data.

Adds fasta files to a Kraken2 taxonomic database

010000

db versions

kraken2:

Kraken2 is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.

Builds Kraken2 database

010

db versions

kraken2:

Kraken2 is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.

Downloads and builds Kraken2 standard database

0

db versions

kraken2:

Kraken2 is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.

Bayesian reconstruction of ancient DNA fragments

01

bam fq_pass fq_fail unmerged_r1_fq_pass unmerged_r1_fq_fail unmerged_r2_fq_pass unmerged_r2_fq_fail log versions

MALT, an acronym for MEGAN alignment tool, is a sequence alignment and analysis tool designed for processing high-throughput sequencing data, especially in the context of metagenomics.

0000

index versions log

malt:

A tool for mapping metagenomic data

MALT, an acronym for MEGAN alignment tool, is a sequence alignment and analysis tool designed for processing high-throughput sequencing data, especially in the context of metagenomics.

010

rma6 alignments log versions

malt:

A tool for mapping metagenomic data

Tool for evaluation of MALT results for true positives of ancient metagenomic taxonomic screening

0100

results versions

Create mapAD index for reference genome

01

index versions

mapad:

An aDNA aware short-read mapper

Map short-reads to an indexed reference genome

01010000000

bam versions

mapad:

An aDNA aware short-read mapper

Computational framework for tracking and quantifying DNA damage patterns among ancient DNA sequencing reads generated by Next-Generation Sequencing platforms.

010

runtime_log fragmisincorporation_plot length_plot misincorporation lgdistribution dnacomp stats_out_mcmc_hist stats_out_mcmc_iter stats_out_mcmc_trace stats_out_mcmc_iter_summ_stat stats_out_mcmc_post_pred stats_out_mcmc_correct_prob dnacomp_genome rescaled pctot_freq pgtoa_freq fasta folder versions

mdust from DFCI Gene Indices Software Tools for masking low-complexity DNA sequences

01

fasta versions

Analyses a DAA file and exports information in text format

010

txt_gz megan versions

megan:

A tool for studying the taxonomic content of a set of DNA reads

Analyses an RMA file and exports information in text format

010

txt megan_summary versions

megan:

A tool for studying the taxonomic content of a set of DNA reads

msisensor2 detection of MSI regions.

01234500

msi distribution somatic versions

msisensor2:

MSIsensor2 is a novel algorithm based on machine learning, featuring a large upgrade in microsatellite instability (MSI) detection for tumor-only sequencing data, including cell-free DNA (cfDNA), formalin-fixed paraffin-embedded (FFPE) and other sample types. The original MSIsensor is specially designed for tumor/normal paired sequencing data.

msisensor2 detection of MSI regions.

00

scan versions

msisensor2:

MSIsensor2 is a novel algorithm based on machine learning, featuring a large upgrade in microsatellite instability (MSI) detection for tumor-only sequencing data, including cell-free DNA (cfDNA), formalin-fixed paraffin-embedded (FFPE) and other sample types. The original MSIsensor is specially designed for tumor/normal paired sequencing data.

SNP table generator from GATK UnifiedGenotyper with functionality geared for aDNA

010101010000001

full_alignment info_txt snp_alignment snp_genome_alignment snpstatistics snptable snptable_snpeff snptable_uncertainty structure_genotypes structure_genotypes_nomissing json versions

DNA contaminant removal using NanoLyse

010

fastq log versions

Determining whether sequencing data comes from the same individual by using SNP matching. This module generates vaf files for individual fastq file(s), ready for the vafncm module.

0101

vaf versions

ngscheckmate:

NGSCheckMate is a software package for identifying next generation sequencing (NGS) data files from the same individual, including matching between DNA and RNA.

Determining whether sequencing data comes from the same individual by using SNP matching. Designed for humans on vcf or bam files.

010101

corr_matrix matched all pdf vcf versions

ngscheckmate:

NGSCheckMate is a software package for identifying next generation sequencing (NGS) data files from the same individual, including matching between DNA and RNA.

Determining whether sequencing data comes from the same individual by using SNP matching. This module generates PT files from a bed file containing individual positions.

010101

pt versions

ngscheckmate:

NGSCheckMate is a software package for identifying next generation sequencing (NGS) data files from the same individual, including matching between DNA and RNA.

Determining whether sequencing data comes from the same individual by using SNP matching. This module generates PT files from a bed file containing individual positions.

01

pdf corr_matrix all matched versions

ngscheckmate:

NGSCheckMate is a software package for identifying next generation sequencing (NGS) data files from the same individual, including matching between DNA and RNA.

NVIDIA Clara Parabricks GPU-accelerated fast, accurate algorithm for mapping methylated DNA sequence reads to a reference genome, performing local alignment, and producing alignment for different parts of the query sequence

0101010

bam bai qc_metrics bqsr_table duplicate_metrics versions

parabricks:

NVIDIA Clara Parabricks GPU-accelerated genomics tools

"This package computes informative enrichment and quality measures for ChIP-seq/DNase-seq/FAIRE-seq/MNase-seq data. It can also be used to obtain robust estimates of the predominant fragment length or characteristic tag shift values in these assays."

01

spp pdf rdata versions

Platypus is a tool that efficiently and accurately calls genetic variants from next-generation DNA sequencing data

01234000

vcf tbi log version

pmdtools command to filter ancient DNA molecules from others

01200

bam versions

pmdtools:

Compute postmortem damage patterns and decontaminate ancient genomes

Calculate interval coverage for each sample. N.B. the tool cannot handle staging files with symlinks, so stageInMode should be set to 'link'.

0120

txt png loess_qc_txt loess_txt versions

purecn:

Copy number calling and SNV classification using targeted short read sequencing

Generate on and off-target intervals for PureCN from a list of targets

01010

txt bed versions

purecn:

Copy number calling and SNV classification using targeted short read sequencing

Build a normal database for coverage normalization from all the (GC-normalized) normal coverage files. N.B. as reported in https://www.bioconductor.org/packages/devel/bioc/vignettes/PureCN/inst/doc/Quick.html, it is advised to provide a normal panel (VCF format) to precompute mapping bias for faster runtimes.

012300

rds png bias_rds bias_bed low_cov_bed versions

purecn:

Copy number calling and SNV classification using targeted short read sequencing

Run PureCN workflow to normalize, segment and determine purity and ploidy

01200

pdf local_optima_pdf seg genes_csv amplification_pvalues_csv vcf_gz variants_csv loh_csv chr_pdf segmentation_pdf multisample_seg versions

purecn:

Copy number calling and SNV classification using targeted short read sequencing

Identify, orient and trim nanopore cDNA reads

01

fastq versions

gzip:

Gzip reduces the size of the named files using Lempel-Ziv coding (LZ77).

Damage parameter estimation for ancient DNA

012

csv versions

pydamage:

Damage parameter estimation for ancient DNA

Damage parameter estimation for ancient DNA

01

csv versions

pydamage:

Damage parameter estimation for ancient DNA

Consensus module for raw de novo DNA assembly of long uncorrected reads

0123

improved_assembly versions

Extract exon-exon junctions from an RNAseq BAM file. The output is a BED file in the BED12 format.

012

junc versions

regtools:

RegTools is a set of tools that integrate DNA-seq and RNA-seq data to help interpret mutations in a regulatory and splicing context.

Screening DNA sequences for interspersed repeats and low complexity DNA sequences

010

masked out tbl gff versions

repeatmasker:

RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences

Predict antibiotic resistance from protein or nucleotide data

0100

json tsv tmp tool_version db_version versions

rgi:

This tool provides a preliminary annotation of your DNA sequence(s) based upon the data available in The Comprehensive Antibiotic Resistance Database (CARD). Hits to genes tagged with Antibiotic Resistance ontology terms will be highlighted. As CARD expands to include more pathogens, genomes, plasmids, and ontology terms this tool will grow increasingly powerful in providing first-pass detection of antibiotic resistance associated genes. See license at CARD website

Clips read alignments where they match BED file defined regions

01000

bam stats rejects_bam versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

calculates MD and NM tags

0101

bam versions

samtoolscalmd:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Concatenate BAM or CRAM file

01

bam cram versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Produces a consensus FASTA/FASTQ/PILEUP

01

fasta fastq pileup versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

convert and then index CRAM -> BAM or BAM -> CRAM file

0120101

bam cram bai crai versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

produces a histogram or table of coverage per chromosome

0120101

coverage versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

List CRAM Content-ID and Data-Series sizes

01

size versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Create a sequence dictionary file from a FASTA file

01

dict versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Index FASTA file, and optionally generate a file of chromosome sizes

01010

fa fai sizes gzi versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
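
To make the "file of chromosome sizes" concrete, the sketch below tallies sequence lengths from FASTA text and prints them as tab-separated name and length pairs. It is a plain-Python illustration and not how samtools builds its index; the function name `chromosome_sizes` and the example records are hypothetical.

```python
import io
from typing import Dict, TextIO


def chromosome_sizes(fasta: TextIO) -> Dict[str, int]:
    """Return {sequence name: length} for an uncompressed FASTA stream.

    Assumption: the sequence name is the first whitespace-delimited token
    of each '>' header line.
    """
    sizes: Dict[str, int] = {}
    name = None
    for raw in fasta:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            sizes[name] = 0
        elif name is not None:
            # Sequence lines may be wrapped, so lengths accumulate.
            sizes[name] += len(line)
    return sizes


if __name__ == "__main__":
    # Hypothetical two-record FASTA held in memory.
    example = io.StringIO(">chr1 test\nACGT\nAC\n>chr2\nGG\n")
    for seq_name, length in chromosome_sizes(example).items():
        print(f"{seq_name}\t{length}")
```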

Converts a SAM/BAM/CRAM file to FASTQ

010

fastq interleaved singleton other versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Samtools fixmate is a tool that can fill in information (insert size, cigar, mapq) about paired end reads onto the corresponding other read. Also has options to remove secondary/unmapped alignments and recalculate whether reads are proper pairs.

01

bam cram sam versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Counts the number of alignments in a BAM/CRAM/SAM file for each FLAG type

012

flagstat versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
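
Counting alignments per FLAG type relies on the bitwise FLAG field defined by the SAM format. The sketch below tallies how many records have each bit set. It only illustrates the bit tests, and the categories printed by samtools flagstat differ from this raw per-bit tally. The names `FLAG_BITS` and `count_flags` are hypothetical.

```python
from collections import Counter
from typing import Iterable

# FLAG bit values as defined in the SAM format specification.
FLAG_BITS = {
    "paired": 0x1,
    "proper_pair": 0x2,
    "unmapped": 0x4,
    "mate_unmapped": 0x8,
    "reverse": 0x10,
    "mate_reverse": 0x20,
    "read1": 0x40,
    "read2": 0x80,
    "secondary": 0x100,
    "qc_fail": 0x200,
    "duplicate": 0x400,
    "supplementary": 0x800,
}


def count_flags(flags: Iterable[int]) -> Counter:
    """Count how many FLAG values have each bit set."""
    counts: Counter = Counter()
    for flag in flags:
        for name, bit in FLAG_BITS.items():
            if flag & bit:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical FLAG values taken from four alignment records.
    for name, n in sorted(count_flags([99, 147, 4, 1024]).items()):
        print(f"{name}\t{n}")
```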

filter/convert SAM/BAM/CRAM file

01

readgroup versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Reports alignment summary statistics for a BAM/CRAM/SAM file

012

idxstats versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

converts FASTQ files to unmapped SAM/BAM/CRAM

01

sam bam cram versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Index SAM/BAM/CRAM file

01

bai csi crai versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Merge BAM or CRAM file

010101

bam cram csi crai versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

BAM

0120

mpileup versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Replace the header in the BAM file with the header generated by the command. This command is much faster than replacing the header with a BAM→SAM→BAM conversion.

01

bam versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Collate/Fixmate/Sort/Markdup SAM/BAM/CRAM file

0101

bam cram csi crai metrics versions

samtools_cat:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_collate:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_fixmate:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_sort:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_markdup:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Sort SAM/BAM/CRAM file

0101

bam cram crai csi versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Produces comprehensive statistics from SAM/BAM/CRAM file

01201

stats versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

filter/convert SAM/BAM/CRAM file

0120100

bam cram sam bai csi crai unselected unselected_index versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Apply a score cutoff to filter variants based on a recalibration table. Sentieon's ApplyVarCal performs the second pass in a two-stage process called Variant Quality Score Recalibration (VQSR). Specifically, it filters the input variants based on the recalibration table produced in the previous step, VarCal, and a target sensitivity value. https://support.sentieon.com/manual/usages/general/#applyvarcal-algorithm

0123450101

vcf tbi versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Create BWA index for reference genome

01

index versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Performs fastq alignment to a fasta reference using Sentieon's BWA MEM

01010101

bam_and_bai versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Accelerated implementation of the Picard CollectVariantCallingMetrics tool.

012012010101

metrics summary versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Accelerated implementation of the GATK DepthOfCoverage tool.

01201010101

per_locus sample_summary statistics coverage_counts coverage_proportions interval_summary versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Collects multiple quality metrics from a bam file

01201010

mq_metrics qd_metrics gc_summary gc_metrics aln_metrics is_metrics mq_plot qd_plot is_plot gc_plot versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Runs the sentieon tool LocusCollector followed by Dedup. LocusCollector collects read information that is used by Dedup which in turn marks or removes duplicate reads.

0120101

cram crai bam bai score metrics metrics_multiqc_tsv versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Modifies the input VCF file by adding the MLrejected FILTER to the variants

012010101

vcf index versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

DNAscope algorithm performs an improved version of Haplotype variant calling.

01230101010101000

vcf vcf_tbi gvcf gvcf_tbi versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Perform joint genotyping on one or more samples pre-called with Sentieon's Haplotyper.

012301010101

vcf_gz vcf_gz_tbi versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Runs Sentieon's haplotyper for germline variant calling.

012340101010100

vcf vcf_tbi gvcf gvcf_tbi versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Generate recalibration table and optionally perform base quality recalibration

01201010101010

table table_post recal_alignment csv pdf versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Merges BAM files and/or converts them into CRAM files. Also outputs the result of applying Base Quality Score Recalibration to a file.

0120101

output index output_index versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Filters the raw output of sentieon/tnhaplotyper2.

01234560101

vcf vcf_tbi stats versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Tnhaplotyper2 performs somatic variant calling on the tumor-normal matched pairs.

01230101010101010100

orientation_data contamination_data contamination_segments stats vcf index versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

TNscope algorithm performs somatic variant calling on the tumor-normal matched pair or the tumor only data, using a Haplotyper algorithm.

012010101201201201

vcf index versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Module for Sentieon's VarCal. The VarCal algorithm performs the first stage of Variant Quality Score Recalibration (VQSR): it builds a recalibration model for scoring variant quality. https://support.sentieon.com/manual/usages/general/#varcal-algorithm

01200000

recal idx tranches plots versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Collects whole genome quality metrics from a bam file

012010101

wgs_metrics versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Dereplicate FASTX sequences, removing duplicate sequences and printing the number of identical sequences in the sequence header. Can dereplicate already dereplicated FASTA files, summing the numbers found in the headers.

01

fasta versions

seqfu:

DNA sequence utilities for FASTX files
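
The dereplication described above can be sketched in a few lines. This is a minimal illustration, not seqfu's implementation: it assumes the copy number is stored in the header as `;size=N` (an assumed convention for this example), keeps the first header seen for each distinct sequence, and treats sequences case-insensitively. The function name `dereplicate` and the records are hypothetical.

```python
import re

# Assumed header annotation for copy numbers, e.g. "seq1;size=3".
SIZE_RE = re.compile(r";size=(\d+)")


def dereplicate(records):
    """Collapse identical sequences and sum their copy numbers.

    records: iterable of (header, sequence) tuples.
    Returns a list of (header_with_size, sequence) in first-seen order.
    """
    totals = {}  # upper-cased sequence -> [first header without size, count]
    for header, seq in records:
        match = SIZE_RE.search(header)
        # An already dereplicated record carries its count in the header.
        size = int(match.group(1)) if match else 1
        name = SIZE_RE.sub("", header)
        key = seq.upper()
        if key in totals:
            totals[key][1] += size
        else:
            totals[key] = [name, size]
    return [(f"{name};size={size}", seq) for seq, (name, size) in totals.items()]


# Hypothetical input mixing raw and previously dereplicated records.
records = [
    ("a", "ACGT"),
    ("b", "acgt"),
    ("c;size=3", "GGCC"),
    ("d", "GGCC"),
]
unique = dereplicate(records)
```

Here `a` and `b` collapse into one record with size 2, and `c` (already counted three times) absorbs `d` to reach size 4, which is the summing behaviour the description mentions for already dereplicated files.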

Translate DNA/RNA to protein sequence

01

fastx versions

seqkit:

A cross-platform and ultrafast toolkit for FASTA/Q file manipulation
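
To make the translation step concrete, here is a small sketch that translates a DNA or RNA string in reading frame 1 with the standard genetic code. It is an illustration only and does not reflect seqkit's options or implementation; the helper name `translate` and the choice to emit `X` for codons containing ambiguous bases are assumptions of this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence):
    """Translate a DNA/RNA sequence in frame 1; '*' marks stop codons."""
    seq = sequence.upper().replace("U", "T")  # treat RNA as DNA
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        # Unknown or ambiguous codons become 'X' (an assumption of this sketch).
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


peptide = translate("ATGGCCTAA")
```

In this example `ATG`, `GCC` and `TAA` map to methionine, alanine and a stop codon respectively.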

Calculate the relative coverage on the Gonosomes vs Autosomes from the output of samtools depth, with error bars.

010

json tsv versions
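
A simplified view of the relative-coverage calculation: take the mean depth on each gonosome and divide it by the mean depth across the autosomes. The sketch below works on the three-column, tab-separated output of samtools depth (chromosome, position, depth). It leaves out the error bars the module reports, and the chromosome names, the function name `relative_coverage` and the example values are assumptions for illustration.

```python
from collections import defaultdict


def relative_coverage(depth_lines, gonosomes=("chrX", "chrY")):
    """Mean depth per gonosome divided by mean autosomal depth.

    depth_lines: iterable of 'chrom<TAB>pos<TAB>depth' strings.
    Every chromosome not listed in `gonosomes` is treated as an autosome.
    """
    depth_sum = defaultdict(int)
    n_sites = defaultdict(int)
    for line in depth_lines:
        chrom, _pos, depth = line.rstrip("\n").split("\t")[:3]
        depth_sum[chrom] += int(depth)
        n_sites[chrom] += 1

    autosomes = [c for c in depth_sum if c not in gonosomes]
    auto_sites = sum(n_sites[c] for c in autosomes)
    if auto_sites == 0:
        raise ValueError("no autosomal positions in input")
    auto_mean = sum(depth_sum[c] for c in autosomes) / auto_sites

    # Report only gonosomes that actually appear in the input.
    return {
        g: (depth_sum[g] / n_sites[g]) / auto_mean
        for g in gonosomes
        if g in n_sites
    }


# Hypothetical depth records.
depth_lines = [
    "chr1\t1\t10",
    "chr1\t2\t10",
    "chr2\t1\t20",
    "chr2\t2\t20",
    "chrX\t1\t15",
    "chrX\t2\t15",
    "chrY\t1\t3",
]
ratios = relative_coverage(depth_lines)
```

With an autosomal mean of 15, the example gives a relative coverage of 1.0 for chrX and 0.2 for chrY.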

The goal of the Shasta long read assembler is to rapidly produce accurate assembled sequence using DNA reads generated by Oxford Nanopore flow cells as input. Please note that the assembler is designed to focus on speed, so the assembly may be considered somewhat non-deterministic, as the final assembly may vary across executions. See https://github.com/chanzuckerberg/shasta/issues/296.

01

assembly gfa results versions

Compare many FracMinHash signatures generated by sourmash sketch.

01000

matrix labels csv versions

sourmash:

Compute and compare FracMinHash signatures for DNA and protein data sets.

Search a metagenome sourmash signature against one or many reference databases and return the minimum set of genomes that contain the k-mers in the metagenome.

0100000

result unassigned matches prefetch prefetchcsv versions

sourmash:

Compute and compare FracMinHash signatures for DNA data sets.

Create a database of sourmash signatures (a group of FracMinHash sketches) to be used as references.

010

signature_index versions

sourmash:

Compute and compare FracMinHash signatures for DNA data sets.

Create a signature (a group of FracMinHash sketches) of a sequence using sourmash

01

signatures versions

sourmash:

Compute and compare FracMinHash signatures for DNA and protein data sets.

Annotate list of metagenome members (based on sourmash signature matches) with taxonomic information.

010

result versions

sourmash:

Compute and compare FracMinHash signatures for DNA data sets.

Advanced sequence file format conversions

01000

cram gzi versions

scramble:

Staden Package 'io_lib' (sometimes referred to as libstaden-read by distributions). This contains code for reading and writing a variety of Bioinformatics / DNA Sequence formats.

Align reads to a reference genome using STAR

010101000

log_final log_out log_progress versions bam bam_sorted bam_sorted_aligned bam_transcript bam_unsorted fastq tab spl_junc_tab read_per_gene_tab junction sam wig bedgraph

star:

STAR is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Create index for STAR

0101

index versions

star:

STAR is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Get the minimal allowed index version from STAR

NO input

index_version versions

star:

STAR is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Count reads that map to genomic features

012

counts summary versions

featurecounts:

featureCounts is a highly efficient general-purpose read summarization program that counts mapped reads for genomic features such as genes, exons, promoter, gene bodies, genomic bins and chromosomal locations. It can be used to count both RNA-seq and genomic DNA-seq reads.

Estimating poly(A)-tail lengths from basecalled fast5 files produced by Nanopore sequencing of RNA and DNA

01

csv_gz versions

Aligns sequences using T_COFFEE

01010120

alignment lib versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

pigz:

Parallel implementation of the gzip algorithm.

Compares 2 alternative MSAs to evaluate them.

012

scores versions

tcoffee:

A collection of tools for Multiple Alignments of DNA, RNA, Protein Sequence

pigz:

Parallel implementation of the gzip algorithm.

Computes a consensus alignment using T_COFFEE

01010

alignment eval versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

pigz:

Parallel implementation of the gzip algorithm.

Reformats the header of PDB files with t-coffee

01

formatted_pdb versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

Computes the irmsd score for a given alignment and the structures.

01012

irmsd versions

tcoffee:

A collection of tools for Multiple Alignments of DNA, RNA, Protein Sequence

pigz:

Parallel implementation of the gzip algorithm.

Aligns sequences using the regressive algorithm as implemented in the T_COFFEE package

01010120

alignment versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

pigz:

Parallel implementation of the gzip algorithm.

Reformats files with t-coffee

01

formatted_file versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

Compute the TCS score for a MSA or for a MSA plus a library file. Outputs the tcs as it is and a csv with just the total TCS score.

0101

tcs scores versions

tcoffee:

A collection of tools for Multiple Alignments of DNA, RNA, Protein Sequence

pigz:

Parallel implementation of the gzip algorithm.

Create fasta consensus with TOPAS toolkit with options to penalize substitutions for typical DNA damage present in ancient DNA

010101010

fasta vcf ccf log versions

topas:

This toolkit allows the efficient manipulation of sequence data in various ways. It is organized into modules: The FASTA processing modules, the FASTQ processing modules, the GFF processing modules and the VCF processing modules.

Detecting and estimating inter-sample DNA contamination has become a crucial quality assessment step to ensure high-quality sequence reads and reliable downstream analysis.

0120

log selfsm depthsm selfrg depthrg bestsm bestrg versions

verifybamid:

verifyBamID is software that verifies whether the reads in a particular file match previously known genotypes for an individual (or group of individuals), and checks whether the reads are contaminated as a mixture of two samples.

Detecting and estimating inter-sample DNA contamination has become a crucial quality assessment step to ensure high-quality sequence reads and reliable downstream analysis.

01201200

log ud bed mu self_sm ancestry versions

verifybamid2:

A robust tool for DNA contamination estimation from sequence reads using an ancestry-agnostic method.

Masks out highly repetitive DNA sequences with low complexity in a genome

01

converted versions

windowmasker:

A program to mask highly repetitive and low complexity DNA sequences within a genome.

A program to generate frequency counts of repetitive units.

01

counts versions

windowmasker:

A program to mask highly repetitive and low complexity DNA sequences within a genome.

A program that takes a counts file and creates a file of genomic coordinates to be masked.

0101

intervals versions

windowmasker:

A program to mask highly repetitive and low complexity DNA sequences within a genome.

Builds a YARA index for a reference genome

01

index versions

yara:

Yara is an exact tool for aligning DNA sequencing reads to reference genomes.

Align reads to a reference genome using YARA

0101

bam bai versions

yara:

Yara is an exact tool for aligning DNA sequencing reads to reference genomes.
