Available Modules

Modules are the building blocks of all DSL2 nf-core pipelines. If you would like to write your own module, you can find more information on the nf-core website.

A tool to parse and summarise results from antimicrobial peptide tools and present functional classification.

sample_dir txt csv faa summary_csv summary_html log results_db results_db_dmnd results_db_fasta results_db_tsv versions

A submodule that clusters the merged AMP hits generated from ampcombi2/parsetables and ampcombi2/complete using MMseqs2 cluster.

cluster_tsv rep_cluster_tsv log versions

ampcombi2/cluster:

A tool for clustering all AMP hits found across many samples and supporting many AMP prediction tools.

A submodule that merges all output summary tables from ampcombi/parsetables into one summary file.

tsv log versions

ampcombi2/complete:

This merges the per-sample AMPcombi summaries generated by running 'ampcombi2/parsetables'.

A submodule that parses and standardizes the results from various antimicrobial peptide identification tools.

sample_dir contig_gbks db_tsv tsv faa sample_log full_log db db_txt db_fasta db_mmseqs versions

ampcombi2/parsetables:

A parsing tool to convert and summarise the outputs from multiple AMP detection tools in a standardized format.

Post-processing script of the MaltExtract component of the HOPS package

json summary_pdf tsv candidate_pdfs versions

Run the alignment/variant-call/consensus logic of the artic pipeline

results bam bai bam_trimmed bai_trimmed bam_primertrimmed bai_primertrimmed fasta vcf tbi json versions

artic:

ARTIC pipeline - a bioinformatics pipeline for working with viral sequencing data generated with nanopore

Alignment by Simultaneous Harmonization of Layer/Adjacency Registration

tif versions

Removes unused references from the header of sorted BAM/CRAM files.

bam versions

This module is used to clip primer sequences from your alignments.

bam bai versions

Align short or PacBio reads to a reference genome using BBMap

bam log versions

bbmap:

BBMap is a short read aligner, as well as various other bioinformatic tools.

Adapter and quality trimming of sequencing reads

010

reads log versions

bbmap:

BBMap is a short read aligner, as well as various other bioinformatic tools.

Merging overlapping paired reads into a single read.

010

merged unmerged ihist versions log

bbmap:

BBMap is a short read aligner, as well as various other bioinformatic tools.

BBNorm is designed to normalize coverage by down-sampling reads over high-depth areas of a genome, to result in a flat coverage distribution.

01

fastq log versions

bbmap:

BBMap is a short read aligner, as well as various other bioinformatic tools.

Split sequencing reads by mapping them to multiple references simultaneously

0100010

index primary_fastq all_fastq stats log versions

bbmap:

BBMap is a short read aligner, as well as various other bioinformatic tools.

Creates 30% smaller, faster gzipped FASTQ files and removes duplicates.

01

reads log versions

bbmap:

BBMap is a short read aligner, as well as various other bioinformatic tools.

Filter out sequences by sequence header name(s)

01000

reads log versions

bbmap:

BBMap is a short read aligner, as well as various other bioinformatic tools.

Creates an index from a fasta file, ready to be used by bbmap.sh in mapping mode.

0

index versions

bbmap:

BBMap is a short read aligner, as well as various other bioinformatic tools.

Calculates per-scaffold or per-base coverage information from an unsorted sam or bam file.

01

covstats hist versions

bbmap:

BBMap is a short read aligner, as well as various other bioinformatic tools.

Compares query sketches to reference sketches hosted on a remote server via the Internet.

01

hits versions

bbmap:

BBMap is a short read aligner, as well as various other bioinformatic tools.

Computes histograms (default), per-base reports (-d) and BEDGRAPH (-bg) summaries of feature coverage (e.g., aligned sequences) for a given genome.

012000

genomecov versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

Locate and tag duplicate reads in a BAM file

01

bam metrics versions

biobambam:

biobambam is a set of tools for early stage alignment file processing.

Merge a list of sorted bam files

01

bam bam_index checksum versions

biobambam:

biobambam is a set of tools for early stage alignment file processing.

Parallel sorting and duplicate marking

0101

bam bam_index cram metrics versions

biobambam:

biobambam is a set of tools for early stage alignment file processing.

Aligns single- or paired-end reads from bisulfite-converted libraries to a reference genome using Biscuit.

010101

bam bai versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

A fast, compact one-liner to produce duplicate-marked, sorted, and indexed BAM files using Biscuit

010101

bam bai versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

samblaster:

samblaster is a fast and flexible program for marking duplicates in read-id grouped paired-end SAM files. It can also optionally output discordant read pairs and/or split read mappings to separate SAM files, and/or unmapped/clipped reads to a separate FASTQ file. By default, samblaster reads SAM input from stdin and writes SAM to stdout.

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Summarize and/or filter reads based on bisulfite conversion rate

01010101

bam versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

Summarizes read-level methylation (and optionally SNV) information from a Biscuit BAM file in a standard-compliant BED format.

0101010101

bed versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

Merges methylation information for opposite-strand C's in a CpG context

010101

bed versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

Summarizes methylation or SNV information from a Biscuit VCF in a standard-compliant BED file.

01

bed versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

Performs alignment of BS-Seq reads using bismark

010101

bam report unmapped versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Removes alignments to the same position in the genome from the Bismark mapping output.

01

bam report versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Converts a specified reference genome into two different bisulfite converted versions and indexes them for alignments.

01

index versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Extracts methylation information for individual cytosines from alignments.

0101

bedgraph methylation_calls coverage report mbias versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Collects bismark alignment reports

01234

report versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

BLASTP (Basic Local Alignment Search Tool - Protein) compares an amino acid (protein) query sequence against a protein database

01010

xml tsv csv versions

blast:

BLAST+ is a new suite of BLAST tools that utilizes the NCBI C++ Toolkit.

Align reads to a reference genome using bowtie

01010

bam log fastq versions

bowtie:

bowtie is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Align reads to a reference genome using bowtie2

01010100

sam bam cram csi crai log fastq versions

bowtie2:

Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences.

Builds a Bowtie 2 index for a reference genome

01

index versions

bowtie2:

Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences.

Construct species phylogenies using BUSCO proteins

01

gene_trees supermatrix versions

busco:

Construct species phylogenies using BUSCO proteins

Find SA coordinates of the input reads for bwa short-read mapping

0101

sai versions

bwa:

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Performs fastq alignment to a fasta reference using BWA

0101010

bam cram csi crai versions

bwa:

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Convert paired-end bwa SA coordinate files to SAM format

01201

bam versions

bwa:

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Convert bwa SA coordinate file to SAM format

01201

bam versions

bwa:

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Performs fastq alignment to a fasta reference using BWA

0101010

sam bam cram crai csi versions

bwa:

BWA-mem2 is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Performs fastq alignment to a fasta reference using BWA-MEME

010101000

sam bam cram crai csi versions

bwameme:

Faster BWA-MEM2 using learned-index

Performs alignment of BS-Seq reads using bwameth

010101

bam versions

bwameth:

Fast and accurate alignment of BS-Seq reads using bwa-mem and a 3-letter genome.

Performs indexing of c2t converted reference genome

01

index versions

bwameth:

Fast and accurate alignment of BS-Seq reads using bwa-mem and a 3-letter genome.

Taxonomic classification plus read-based abundance estimation from long DNA sequences and metagenome assembled genomes (e.g. contigs, MAGs / bins).

0101010101001010101010101

rat_log complete_abundance contig_abundance read2classification alignment_diamond contig2classification cat_log orf2lca faa gff unmapped_diamond unmapped_fasta unmapped2classification versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Cluster protein sequences using sequence similarity

01

fasta clusters versions

cdhit:

Clusters and compares protein or nucleotide sequences

Cluster nucleotide sequences using sequence similarity

01

fasta clusters versions

cdhit:

Clusters and compares protein or nucleotide sequences

Module to use Cell Ranger's pipelines to analyze sequencing data produced from Chromium Single Cell Gene Expression.

010

outs versions

cellranger:

Cell Ranger by 10x Genomics is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.

Module to create FASTQs needed by the 10x Genomics Cell Ranger tool. Uses the cellranger mkfastq command.

012

fastq undetermined_fastq reports stats interop versions

cellranger:

Cell Ranger by 10x Genomics is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.

Module to build a filtered GTF needed by the 10x Genomics Cell Ranger tool. Uses the cellranger mkgtf command.

0

gtf versions

cellranger:

Cell Ranger by 10x Genomics is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.

Module to build the reference needed by the 10x Genomics Cell Ranger tool. Uses the cellranger mkref command.

000

reference versions

cellranger:

Cell Ranger by 10x Genomics is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.

Module to use Cell Ranger's pipelines to analyze sequencing data produced from various Chromium technologies, including Single Cell Gene Expression, Single Cell Immune Profiling, Feature Barcoding, and Cell Multiplexing.

00101010101010000000000000

config outs versions

cellranger:

Cell Ranger by 10x Genomics is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.

Module to use Cell Ranger's pipelines to analyze sequencing data produced from Chromium Single Cell Immune Profiling.

010

outs versions

cellranger:

Cell Ranger processes data from 10X Genomics Chromium kits. cellranger vdj takes FASTQ files from cellranger mkfastq or bcl2fastq for V(D)J libraries and performs sequence assembly and paired clonotype calling. It uses the Chromium cellular barcodes and UMIs to assemble V(D)J transcripts per cell. Clonotypes and CDR3 sequences are output as a .vloupe file which can be loaded into Loupe V(D)J Browser.

Module to use Cell Ranger's ARC pipelines to analyze sequencing data produced from Chromium Single Cell ARC. Uses the cellranger-arc count command.

01230

outs lib versions

cellrangerarc:

Cell Ranger ARC is a set of analysis pipelines that process Chromium Single Cell ARC data.

Module to create fastqs needed by the 10x Genomics Cell Ranger Arc tool. Uses the cellranger-arc mkfastq command.

00

versions fastq

cellrangerarc:

Cell Ranger Arc by 10x Genomics is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.

Module to build a filtered gtf needed by the 10x Genomics Cell Ranger Arc tool. Uses the cellranger-arc mkgtf command.

0

gtf versions

cellrangerarc:

Cell Ranger Arc by 10x Genomics is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.

Module to use Cell Ranger's ATAC pipelines to analyze sequencing data produced from Chromium Single Cell ATAC.

010

outs versions

cellranger-atac:

Cell Ranger ATAC is a set of analysis pipelines that process Chromium Single Cell ATAC data.

Module to create fastqs needed by the 10x Genomics Cell Ranger ATAC tool. Uses the cellranger-atac mkfastq command.

00

versions fastq

cellranger-atac:

Cell Ranger ATAC by 10x Genomics is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.

Cellsnp-lite is a C/C++ tool for efficiently genotyping bi-allelic SNPs on single cells. You can use mode A of cellsnp-lite after read alignment to obtain the SNP x cell pileup UMI or read count matrices for each allele of given or detected SNPs for droplet-based single-cell data.

01234

base cell sample allele_depth depth_coverage depth_other versions

cellsnp:

Efficient genotyping of bi-allelic SNPs on single cells

Performs preprocessing and alignment of chromatin fastq files to fasta reference files using chromap.

0101010000

bed bam tagAlign pairs versions

chromap:

Fast alignment and preprocessing of chromatin profiles

Indexes a fasta reference genome ready for chromatin profiling.

01

index versions

chromap:

Fast alignment and preprocessing of chromatin profiles

CIRCexplorer2 parses fusion junction files from multiple aligners to prepare them for CIRCexplorer2 annotate.

01

junction versions

circexplorer2:

Circular RNA analysis toolkit

Realign reads mapped with BWA to elongated reference genome

01010101

bam versions

circularmapper:

A method to improve mappings on circular genomes such as Mitochondria.

ClipKIT is a fast and flexible alignment trimming tool that keeps phylogenetically informative sites and removes those that display characteristics of poor phylogenetic signal.

010

clipkit log versions

Predict recombination events in bacterial genomes

012

emsim em status newick fasta pos_ref versions

Align sequences using Clustal Omega

010100000

alignment versions

clustalo:

Latest version of Clustal: a multiple sequence alignment program for DNA or proteins

pigz:

Parallel implementation of the gzip algorithm.

Renders a guidetree in clustalo

01

tree versions

clustalo:

Latest version of Clustal: a multiple sequence alignment program for DNA or proteins

Make a transcript/gene mapping from a GTF and cross-reference with transcript quantifications.

0101000

tx2gene versions

custom:

Custom module to create a transcript to gene mapping from a GTF and check it against transcript quantifications

DeepSomatic is an extension of deep learning-based variant caller DeepVariant that takes aligned reads (in BAM or CRAM format) from tumor and normal data, produces pileup image tensors from them, classifies each tensor using a convolutional neural network, and finally reports somatic variants in a standard VCF or gVCF file.

0123401010101

vcf vcf_tbi gvcf gvcf_tbi versions

This tool filters alignments in a BAM/CRAM file according to the specified parameters.

012

bam logs versions

deeptools:

A set of user-friendly tools for normalization and visualization of deep-sequencing data

This tool takes an alignment of reads or fragments as input (BAM file) and generates a coverage track (bigWig or bedGraph) as output.

01200

bigwig bedgraph versions

deeptools:

A set of user-friendly tools for normalization and visualization of deep-sequencing data

Transforms the input alignments to a format suitable for the deep neural network variant caller

012301010101

examples gvcf small_model_calls versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

Queries a DIAMOND database using blastp mode

010100

blast xml txt daa sam tsv paf versions

diamond:

Accelerated BLAST compatible local sequence aligner

Queries a DIAMOND database using blastx mode

010100

blast xml txt daa sam tsv paf log versions

diamond:

Accelerated BLAST compatible local sequence aligner

Calculates clusters of highly similar sequences

01

tsv versions

diamond:

Accelerated BLAST compatible local sequence aligner

Builds a DIAMOND database

01000

db versions

diamond:

Accelerated BLAST compatible local sequence aligner

Performs fastq alignment to a reference using DRAGMAP

0101010

sam bam cram crai csi log versions

dragmap:

Dragmap is the Dragen mapper/aligner Open Source Software.

Create DRAGEN hashtable for reference genome

01

hashmap versions

dragmap:

Dragmap is the Dragen mapper/aligner Open Source Software.

Export assembly segment sequences in GFA 1.0 format to FASTA format

01

fasta versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Filter features in gzipped BED format

01

bed versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Filter features in gzipped GFF3 format

01

gff3 versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Split features in gzipped BED format

01

bed versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Split features in gzipped GFF3 format

01

gff3 versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

cons calculates a consensus sequence from a multiple sequence alignment. To obtain the consensus, the sequence weights and a scoring matrix are used to calculate a score for each amino acid residue or nucleotide at each position in the alignment.

01

consensus versions

emboss:

The European Molecular Biology Open Software Suite

Splits an alignment into reference and query parts

012

query reference versions

epang:

Massively parallel phylogenetic placement of genetic sequences

Aligns sequences using FAMSA

01010

alignment versions

famsa:

Algorithm for large-scale multiple sequence alignments

Renders a guidetree in famsa

01

tree versions

famsa:

Algorithm for large-scale multiple sequence alignments

Alignment-free computation of average nucleotide Identity (ANI)

010

ani versions

Align reads to multiple reference genomes using fastq-screen

010

txt png html fastq versions

fastqscreen:

FastQ Screen allows you to screen a library of sequences in FastQ format against a set of sequence databases so you can see if the composition of the library matches with what you expect.

Produces a Newick format phylogeny from a multiple sequence alignment. Capable of bacterial genome size alignments.

0

phylogeny versions

Runs FCS-GX (Foreign Contamination Screen - Genome eXtractor) to remove foreign contamination from genome assemblies

012

cleaned contaminants versions

fcsgx:

The NCBI Foreign Contamination Screen. Genomic cross-species aligner, for contamination detection.

Fetches the NCBI FCS-GX database using a provided manifest URL

0

database versions

fcsgx:

The NCBI Foreign Contamination Screen. Genomic cross-species aligner, for contamination detection.

Runs FCS-GX (Foreign Contamination Screen - Genome eXtractor) to screen and remove foreign contamination from genome assemblies

01200

fcsgx_report taxonomy_report log hits versions

fcsgx:

The NCBI Foreign Contamination Screen. Genomic cross-species aligner, for contamination detection.

Using the fgbio tools, converts FASTQ files into unaligned BAM or CRAM files, optionally moving the UMI barcode into the RX field of the reads

01

bam cram versions

fgbio:

A set of tools for working with genomic and high throughput sequencing data, including UMIs

Creates a database for Foldmason.

01

db versions

foldmason:

Multiple Protein Structure Alignment at Scale with FoldMason

Aligns protein structures using foldmason

01010

msa_3di msa_aa versions

foldmason:

Multiple Protein Structure Alignment at Scale with FoldMason

Renders a visualization report using foldmason

01010101

html versions

foldmason:

Multiple Protein Structure Alignment at Scale with FoldMason

fq generate is a FASTQ file pair generator. It creates two reads, formatting names as described by Illumina. While generate creates "valid" FASTQ reads, the content of the files is completely random. The sequences do not align to any genome. This requires a seed (--seed) to be supplied in ext.args.

0

fastq versions

fq:

fq is a library to generate and validate FASTQ file pairs.

Performs local realignment around indels to correct for mapping errors

012301010101

bam versions

gatk:

The full Genome Analysis Toolkit (GATK) framework, license restricted.

Generates a list of locations that should be considered for local realignment prior to genotyping.

01201010101

intervals versions

gatk:

The full Genome Analysis Toolkit (GATK) framework, license restricted.

Left align and trim variants using GATK4 LeftAlignAndTrimVariants.

0123000

vcf tbi versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Merge unmapped with mapped BAM files

0120101

bam versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Performs fastq alignment to a fasta reference using gem3-mapper

01010

bam versions

gem3:

The GEM indexer (v3).

A versatile pairwise aligner for genomic and spliced nucleotide sequences

0100

sam versions

graphmap2:

A versatile pairwise aligner for genomic and spliced nucleotide sequences.

A versatile pairwise aligner for genomic and spliced nucleotide sequences

0

index versions

graphmap2:

A versatile pairwise aligner for genomic and spliced nucleotide sequences.

Gubbins (Genealogies Unbiased By recomBinations In Nucleotide Sequences) is an algorithm that iteratively identifies loci containing elevated densities of base substitutions while concurrently constructing a phylogeny based on the putative point mutations outside of these regions.

0

fasta gff vcf stats phylip embl_predicted embl_branch tree tree_labelled versions

Reformat a Multiple Sequence Alignment (MSA) file

0100

msa versions

hhsuite:

HH-suite3 for fast remote homology detection and deep protein annotation

Align RNA-Seq reads to a reference with HISAT2

010101

bam summary fastq versions

hisat2:

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.

Builds HISAT2 index for reference genome

010101

index versions

hisat2:

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.

Extracts splicing sites from a GTF file

01

txt versions

hisat2:

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.

Performs HLA typing based on a population reference graph and employs a new linear projection method to align reads to the graph.

0123

results extraction extraction_mapped extraction_unmpapped hla fastq reads_per_level remapped versions

hlala:

HLA typing from short and long reads

Mask multiple sequence alignments

012345670

maskedaln fmask_rf fmask_all gmask_rf gmask_all pmask_rf pmask_all versions

hmmer:

Biosequence analysis using profile hidden Markov models

hmmalign from the HMMER suite aligns a number of sequences to an HMM profile

010

sto versions

hmmer:

Biosequence analysis using profile hidden Markov models

Creates an HMM profile from a multiple sequence alignment

010

hmm hmmbuildout versions

hmmer:

Biosequence analysis using profile hidden Markov models

Searches profile(s) against a sequence database

012345

output alignments target_summary domain_summary versions

hmmer:

Biosequence analysis using profile hidden Markov models

Create a tag directory with the HOMER suite

010

tagdir taginfo versions

homer:

HOMER (Hypergeometric Optimization of Motif EnRichment) is a suite of tools for Motif Discovery and next-gen sequencing analysis.

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

DESeq2:

Differential gene expression analysis based on the negative binomial distribution

edgeR:

Empirical Analysis of Digital Gene Expression Data in R

igv.js is an embeddable interactive genome visualization component

012

browser align_files index_files versions

igv:

Create an embeddable interactive genome browser component. Output files are expected to be present in the same directory as the genome browser html file. To visualise it, files have to be served. Check the documentation at: https://github.com/igvteam/igv-webapp for an example and https://github.com/igvteam/igv.js/wiki/Data-Server-Requirements for server requirements

Search covariance models against a sequence database

01200

output alignments target_summary versions

infernal:

Infernal is for searching DNA sequence databases for RNA structure and sequence similarities.

Strain-level comparisons across multiple inStrain profiles

0120

compare comparisons_table pooled_snv snv_keys snv_info versions

instrain:

Calculation of strain-level metrics

Produces a Newick format phylogeny from a multiple sequence alignment using the maximum likelihood algorithm. Capable of bacterial genome size alignments.

012000000000000

phylogeny report mldist lmap_svg lmap_eps lmap_quartetlh sitefreq_out bootstrap state contree nex splits suptree alninfo partlh siteprob sitelh treels rate mlrate exch_matrix log versions

Aligns sequences using kalign

010

alignment versions

kalign:

Kalign is a fast and accurate multiple sequence alignment algorithm.

Computes equivalence classes for reads and quantifies abundances

01010000

results json_info log versions

kallisto:

Quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads.

This module wraps the index module of the KMA alignment tool.

01

index versions

kma:

Rapid and precise alignment of raw reads against redundant databases with KMA

Makes a dotplot (Oxford Grid) of pair-wise sequence alignments

0120100

gif png versions

last:

LAST finds & aligns related regions of sequences.

Aligns query sequences to target sequences indexed with lastdb

0120

maf multiqc versions

last:

LAST finds & aligns related regions of sequences.

Prepare sequences for subsequent alignment with lastal.

01

index versions

last:

LAST finds & aligns related regions of sequences.

Converts MAF alignments into another format.

012010101

axt_gz bam blast_gz blasttab_gz chain_gz cram gff_gz html_gz psl_gz sam_gz tab_gz versions

last:

LAST finds & aligns related regions of sequences.

Reorder alignments in a MAF file

01

maf versions

last:

LAST finds & aligns related regions of sequences.

Post-alignment masking

01

maf versions

last:

LAST finds & aligns related regions of sequences.

Find split or spliced alignments in a MAF file

01

maf multiqc versions

last:

LAST finds & aligns related regions of sequences.

Find suitable score parameters for sequence alignment

010

param_file multiqc versions

last:

LAST finds & aligns related regions of sequences.

Align sequences using learnMSA

01

alignment versions

learnmsa:

learnMSA: Learning and Aligning large Protein Families

Converts aligned short- and long-read records from one reference to another

0101

bam versions

leviosam2:

Fast and accurate coordinate conversion between assemblies

Lofreq subcommand to insert base and indel alignment qualities

010

bam versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Lofreq subcommand to call low frequency variants from alignments

0120

vcf versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Lofreq subcommand to call low frequency variants from alignments when tumor-normal paired samples are available

0123450101

vcf versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Lofreq subcommand to call low frequency variants from alignments when tumor-normal paired samples are available

0101

bam versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Peak calling of enriched genomic regions of ChIP-seq and ATAC-seq experiments

0120

peak xls versions gapped bed bdg

macs2:

Model Based Analysis for ChIP-Seq data

Peak calling of enriched genomic regions of ChIP-seq and ATAC-seq experiments

0120

peak xls versions gapped bed bdg

macs3:

Model Based Analysis for ChIP-Seq data

Multiple sequence alignment using MAFFT

0101010101010

fas versions

pigz:

Parallel implementation of the gzip algorithm.

Multiple sequence alignment using MAFFT

0101010101010

fas versions

mafft:

Multiple alignment program for amino acid or nucleotide sequences based on fast Fourier transform

pigz:

Parallel implementation of the gzip algorithm.

Guide tree rendering using MAFFT

01

tree versions

mafft:

Multiple alignment program for amino acid or nucleotide sequences based on fast Fourier transform

Multiple Sequence Alignment using Graph Clustering

01010

alignment versions

magus:

Multiple Sequence Alignment using Graph Clustering

Multiple Sequence Alignment using Graph Clustering

01

tree versions

magus:

Multiple Sequence Alignment using Graph Clustering

MALT, an acronym for MEGAN alignment tool, is a sequence alignment and analysis tool designed for processing high-throughput sequencing data, especially in the context of metagenomics.

0000

index versions log

malt:

A tool for mapping metagenomic data

MALT, an acronym for MEGAN alignment tool, is a sequence alignment and analysis tool designed for processing high-throughput sequencing data, especially in the context of metagenomics.

010

rma6 alignments log versions

malt:

A tool for mapping metagenomic data

Tool for evaluation of MALT results for true positives of ancient metagenomic taxonomic screening

0100

results versions

Map short-reads to an indexed reference genome

01010000000

bam versions

mapad:

An aDNA aware short-read mapper

Mashmap is an approximate long read or contig mapper based on Jaccard similarity

0101

paf versions

Extracts per-base methylation metrics from alignments

01200

bedgraph methylkit versions

methyldackel:

Methylation caller from MethylDackel, a (mostly) universal methylation extractor for methyl-seq experiments.

Generates methylation bias plots from alignments

01200

txt versions

methyldackel:

Read position methylation bias tools from MethylDackel, a (mostly) universal extractor for methyl-seq experiments.

A versatile pairwise aligner for genomic and spliced nucleotide sequences

01010000

paf bam index versions

minimap2:

A versatile pairwise aligner for genomic and spliced nucleotide sequences.

Provides fasta index required by minimap2 alignment.

01

index versions

minimap2:

A versatile pairwise aligner for genomic and spliced nucleotide sequences.

A versatile pairwise aligner for genomic and spliced nucleotide sequences

0101

paf gff versions

miniprot:

A versatile pairwise aligner for genomic and protein sequences.

Provides fasta index required by miniprot alignment.

01

index versions

miniprot:

A versatile pairwise aligner for genomic and protein sequences.

Aligns protein structures using mTM-align

010

alignment structure versions

mTM-align:

Algorithm for structural multiple sequence alignments

pigz:

Parallel implementation of the gzip algorithm.

SNP table generator from GATK UnifiedGenotyper with functionality geared for aDNA

010101010000001

full_alignment info_txt snp_alignment snp_genome_alignment snpstatistics snptable snptable_snpeff snptable_uncertainty structure_genotypes structure_genotypes_nomissing json versions

MUMmer is a system for rapidly aligning entire genomes

012

coords versions

MUSCLE is a program for creating multiple alignments of amino acid or nucleotide sequences. A range of options are provided that give you the choice of optimizing accuracy, speed, or some compromise between the two

01

aligned_fasta phyi phys clustalw html msf tree log versions

Muscle is a program for creating multiple alignments of amino acid or nucleotide sequences. This particular module uses the super5 algorithm for very big alignments. It can permute the guide tree according to a set of flags.

010

alignment versions

muscle -super5:

Muscle v5 is a major re-write of MUSCLE based on new algorithms.

pigz:

Parallel implementation of the gzip algorithm.

Compare multiple runs of long read sequencing data and alignments

01

report_html lengths_violin_html log_length_violin_html n50_html number_of_reads_html overlay_histogram_html overlay_histogram_normalized_html overlay_log_histogram_html overlay_log_histogram_normalized_html total_throughput_html quals_violin_html overlay_histogram_identity_html overlay_histogram_phredscore_html percent_identity_violin_html active_pores_over_time_html cumulative_yield_plot_gigabases_html sequencing_speed_over_time_html stats_txt versions

Performs fastq alignment to a reference using NARFMAP

0101010

bam log versions

narfmap:

narfmap is a fork of the Dragen mapper/aligner Open Source Software.

Create DRAGEN hashtable for reference genome

01

hashmap versions

narfmap:

narfmap is a fork of the Dragen mapper/aligner Open Source Software.

SARS-CoV-2 genome clade assignment, mutation calling, and sequence quality checks (C++ implementation)

010

csv csv_errors csv_insertions tsv json json_auspice ndjson fasta_aligned fasta_translation nwk versions

nextclade:

SARS-CoV-2 genome clade assignment, mutation calling, and sequence quality checks

Performs fastq alignment to a fasta reference using NextGenMap

010

bam versions

bwa:

NextGenMap is a flexible highly sensitive short read mapping tool that handles much higher mismatch rates than comparable algorithms while still outperforming them in terms of runtime

NUCmer is a pipeline for the alignment of multiple closely related nucleotide sequences.

012

delta coords versions

A fast and scalable tool for bacterial pangenome analysis

01

results aln versions

panaroo:

panaroo - an updated pipeline for pangenome investigation

NVIDIA Clara Parabricks GPU-accelerated alignment, sorting, BQSR calculation, and duplicate marking. Note this nf-core module requires files to be copied into the working directory and not symlinked.

01010101010

bam bai cram crai bqsr_table qc_metrics duplicate_metrics versions

parabricks:

NVIDIA Clara Parabricks GPU-accelerated genomics tools

NVIDIA Clara Parabricks GPU-accelerated fast, accurate algorithm for mapping methylated DNA sequence reads to a reference genome, performing local alignment, and producing alignments for different parts of the query sequence

0101010

bam bai qc_metrics bqsr_table duplicate_metrics versions

parabricks:

NVIDIA Clara Parabricks GPU-accelerated genomics tools

Determines the depth in a BAM/CRAM file

0120101

depth binned_depth versions

paragraph:

Graph realignment tools for structural variants

Genotype structural variants using paragraph and grmpy

0123450101

vcf json versions

paragraph:

Graph realignment tools for structural variants

Convert a VCF file to a JSON graph

0101

graph versions

paragraph:

Graph realignment tools for structural variants

Alignment with PacBio's minimap2 frontend

0101

bam versions

pbmm2:

A minimap2 frontend for PacBio native data formats

Cleans the provided BAM, soft-clipping beyond-end-of-reference alignments and setting MAPQ to 0 for unmapped reads

01

bam versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Collects hybrid-selection (HS) metrics for a SAM or BAM file.

01234010101

metrics versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Collect metrics about the insert size distribution of a paired-end library.

01

metrics histogram versions

picard:

Java tools for working with NGS data in the BAM format

Collect multiple metrics from a BAM file

0120101

metrics pdf versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Collect metrics from a RNAseq BAM file

01000

metrics pdf versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Collect metrics about coverage and performance of whole genome sequencing (WGS) experiments.

01201010

metrics versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Checks that all data in the set of input files appear to come from the same individual

01234501

crosscheck_metrics versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Converts a FASTQ file to an unaligned BAM or SAM file.

01

bam versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Filters SAM/BAM files to include/exclude either aligned/unaligned reads or based on a read list

0120

bam versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Merges multiple BAM files into a single file

01

bam versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Pangenome toolbox for bacterial genomes

01

results aln versions

Run all Portcullis steps in one go

010101

log pass_junctions_bed pass_junctions_tab intron_gff exon_gff spliced_bam spliced_bai versions

portcullis:

Portcullis is a tool that filters out invalid splice junctions from RNA-seq alignment data. It accepts BAM files from various RNA-seq mappers, analyzes splice junctions and removes likely false positives, outputting filtered results in multiple formats for downstream analysis.

Split fasta file by 'N's to aid in self alignment for duplicate purging

01

split_fasta versions

purgedups:

Purge_dups is a package used to purge haplotigs and overlaps in an assembly based on read depth

Evaluate alignment data

010

results versions

qualimap:

Qualimap 2 is a platform-independent application written in Java and R that provides both a Graphical User Interface and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.

Evaluate alignment data

012000

results versions

qualimap:

Qualimap 2 is a platform-independent application written in Java and R that provides both a Graphical User Interface and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.

Evaluate alignment data

0101

results versions

qualimap:

Qualimap 2 is a platform-independent application written in Java and R that provides both a Graphical User Interface and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.

Quality Assessment Tool for Genome Assemblies

010101

results tsv transcriptome misassemblies unaligned versions

Produces a Newick format phylogeny from a multiple sequence alignment using a Neighbour-Joining algorithm. Capable of bacterial genome size alignments.

0

stockholm_alignment phylogeny versions

Calculate pan-genome from annotated bacterial assemblies in GFF3 format

01

results aln versions

Calling lowest common ancestors from multi-mapped reads in SAM/BAM/CRAM files

0120

csv json bam versions

sam2lca:

Lowest Common Ancestor on SAM/BAM/CRAM alignment files

Clips read alignments where they match BED file defined regions

01000

bam stats rejects_bam versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Calculates MD and NM tags

0101

bam versions

samtoolscalmd:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Concatenate BAM or CRAM file

01

bam cram versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Produces a consensus FASTA/FASTQ/PILEUP

01

fasta fastq pileup versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Convert and then index CRAM -> BAM or BAM -> CRAM file

0120101

bam cram bai crai versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Produces a histogram or table of coverage per chromosome

0120101

coverage versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

List CRAM Content-ID and Data-Series sizes

01

size versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Create a sequence dictionary file from a FASTA file

01

dict versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Index FASTA file, and optionally generate a file of chromosome sizes

01010

fa fai sizes gzi versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Converts a SAM/BAM/CRAM file to FASTQ

010

fastq interleaved singleton other versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Samtools fixmate is a tool that can fill in information (insert size, cigar, mapq) about paired end reads onto the corresponding other read. Also has options to remove secondary/unmapped alignments and recalculate whether reads are proper pairs.

01

bam cram sam versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Counts the number of alignments in a BAM/CRAM/SAM file for each FLAG type

012

flagstat versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

filter/convert SAM/BAM/CRAM file

01

readgroup versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Reports alignment summary statistics for a BAM/CRAM/SAM file

012

idxstats versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Converts FASTQ files to unmapped SAM/BAM/CRAM

01

sam bam cram versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Index SAM/BAM/CRAM file

01

bai csi crai versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Mark duplicate alignments in a coordinate-sorted file

0101

bam cram sam versions

samtools:

Tools for dealing with SAM, BAM and CRAM files

Merge BAM or CRAM file

010101

bam cram csi crai versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

BAM

0120

mpileup versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Replace the header in the bam file with the header generated by the command. This command is much faster than replacing the header with a BAM→SAM→BAM conversion.

01

bam versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Collate/Fixmate/Sort/Markdup SAM/BAM/CRAM file

0101

bam cram csi crai metrics versions

samtools_cat:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_collate:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_fixmate:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_sort:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_markdup:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Sort SAM/BAM/CRAM file

0101

bam cram crai csi versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Produces comprehensive statistics from SAM/BAM/CRAM file

01201

stats versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

filter/convert SAM/BAM/CRAM file

0120100

bam cram sam bai csi crai unselected unselected_index versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

The Cluster Analysis tool of Scramble analyses and interprets the soft-clipped clusters found by cluster_identifier

0100

meis_tab dels_tab vcf versions

scramble:

Soft Clipped Read Alignment Mapper

The cluster_identifier tool of Scramble identifies soft clipped clusters

0120

clusters versions

scramble:

Soft Clipped Read Alignment Mapper

A multi-split mapping algorithm for circular RNA, splicing, trans-splicing and fusion detection

0100

alignment trans_alignments multi_bed single_bed versions

segemehl:

A multi-split mapping algorithm for circular RNA, splicing, trans-splicing and fusion detection

Generate genome indices for segemehl align

0

index versions

segemehl:

A multi-split mapping algorithm for circular RNA, splicing, trans-splicing and fusion detection

Performs fastq alignment to a fasta reference using Sentieon's BWA MEM

01010101

bam_and_bai versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Generate recalibration table and optionally perform base quality recalibration

01201010101010

table table_post recal_alignment csv pdf versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Induce a variation graph in GFA format from alignments in PAF format

012

gfa versions

seqwish:

seqwish implements a lossless conversion from pairwise alignments between sequences to a variation graph encoding the sequences and their alignments.

Severus is a somatic structural variation (SV) caller for long reads (both PacBio and ONT)

01234501

log read_qual breakpoints_double read_alignments read_ids collapsed_dup loh all_vcf all_breakpoints_clusters_list all_breakpoints_clusters all_plots somatic_vcf somatic_breakpoints_clusters_list somatic_breakpoints_clusters somatic_plots versions

Simple ANI calculation between reference and query genomes.

0101

dist versions

skani:

skani is a fast and robust tool for calculating ANI between metagenome assembled genomes and contigs.

Memory-efficient ANI database queries with skani.

0101

search versions

skani:

skani is a fast and robust tool for calculating ANI between metagenome assembled genomes and contigs.

Storing skani sketches/indices on disk.

01

sketch_dir sketch markers versions

skani:

skani is a fast and robust tool for calculating ANI between metagenome assembled genomes and contigs.

All-to-all ANI computation.

01

triangle versions

skani:

skani is a fast and robust tool for calculating ANI between metagenome assembled genomes and contigs.

Linearize and simplify variation graph in GFA format using blocked partial order alignment

01

gfa maf versions

smoove simplifies and speeds calling and genotyping SVs for short reads. It also improves specificity by removing many spurious alignment signals that are indicative of low-level noise and often contribute to spurious calls. Developed by Brent Pedersen.

01230101

vcf versions

smoove:

structural variant calling and genotyping with existing tools, but, smoothly

Performs fastq alignment to a fasta reference using SNAP

0101

bam bai versions

snapaligner:

Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data

Create a SNAP index for reference genome

01234

index versions

snapaligner:

Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data

Core-SNP alignment from Snippy outputs

0120

aln full_aln tab vcf txt versions

snippy:

Rapid bacterial SNP calling and core genome alignments

Rapid haploid variant calling

010

tab csv html vcf bed gff bam bai log aligned_fa consensus_fa consensus_subs_fa raw_vcf filt_vcf vcf_gz vcf_csi txt versions

snippy:

Rapid bacterial SNP calling and core genome alignments

Pairwise SNP distance matrix from a FASTA sequence alignment

01

tsv versions

Rapidly extracts SNPs from a multi-FASTA alignment.

0

fasta constant_sites versions constant_sites_string

Local sequence alignment tool for filtering, mapping and clustering.

010101

reads log index versions

SortMeRNA:

The core algorithm is based on approximate seeds and allows for sensitive analysis of NGS reads. The main application of SortMeRNA is filtering rRNA from metatranscriptomic data. SortMeRNA takes as input files of reads (fasta, fastq, fasta.gz, fastq.gz) and one or multiple rRNA database file(s), and sorts apart aligned and rejected reads into two files. Additional applications include clustering and taxonomy assignation available through QIIME v1.9.1. SortMeRNA works with Illumina, Ion Torrent and PacBio data, and can produce SAM and BLAST-like alignments.

Module to use the 10x Space Ranger pipeline to process 10x spatial transcriptomics data

012345678900

outs versions

spaceranger:

Visium Spatial Gene Expression is a next-generation molecular profiling solution for classifying tissue based on total mRNA. Space Ranger is a set of analysis pipelines that process Visium Spatial Gene Expression data with brightfield and fluorescence microscope images. Space Ranger allows users to map the whole transcriptome in formalin fixed paraffin embedded (FFPE) and fresh frozen tissues to discover novel insights into normal development, disease pathology, and clinical translational research. Space Ranger provides pipelines for end to end analysis of Visium Spatial Gene Expression experiments.

Align reads to a reference genome using STAR

010101000

log_final log_out log_progress versions bam bam_sorted bam_sorted_aligned bam_transcript bam_unsorted fastq tab spl_junc_tab read_per_gene_tab junction sam wig bedgraph

star:

STAR is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Create a counts matrix for single-cell data using STARSolo, handling cell barcodes and UMI information.

012001

counts log_final log_out log_progress summary versions

Aligns sequences using T_COFFEE

01010120

alignment lib versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

pigz:

Parallel implementation of the gzip algorithm.

Compares 2 alternative MSAs to evaluate them.

012

scores versions

tcoffee:

A collection of tools for Multiple Alignments of DNA, RNA, Protein Sequence

pigz:

Parallel implementation of the gzip algorithm.

Computes a consensus alignment using T_COFFEE

01010

alignment eval versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

pigz:

Parallel implementation of the gzip algorithm.

Reformats the header of PDB files with t-coffee

01

formatted_pdb versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

Computes the irmsd score for a given alignment and the structures.

01012

irmsd versions

tcoffee:

A collection of tools for Multiple Alignments of DNA, RNA, Protein Sequence

pigz:

Parallel implementation of the gzip algorithm.

Aligns sequences using the regressive algorithm as implemented in the T_COFFEE package

01010120

alignment versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

pigz:

Parallel implementation of the gzip algorithm.

Reformats files with t-coffee

01

formatted_file versions

tcoffee:

A collection of tools for Computing, Evaluating and Manipulating Multiple Alignments of DNA, RNA, Protein Sequences and Structures.

Compute the TCS score for an MSA or for an MSA plus a library file. Outputs the TCS file as it is and a CSV with just the total TCS score.

0101

tcs scores versions

tcoffee:

A collection of tools for Multiple Alignments of DNA, RNA, Protein Sequence

pigz:

Parallel implementation of the gzip algorithm.

TransDecoder identifies candidate coding regions within transcript sequences. It is used to build a GFF file.

01

pep gff3 cds dat folder versions

transdecoder:

TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks.

TransDecoder identifies candidate coding regions within transcript sequences. It is used to build a GFF file. You can use this module after transdecoder_longorf.

010

pep gff3 cds bed versions

transdecoder:

TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks.

Cluster contigs from multiple assemblies by similarity

012

cluster_dir versions

trycycler:

Trycycler is a tool for generating consensus long-read assemblies for bacterial genomes

Import transcript-level abundances and estimated counts for gene-level analysis packages

01010

tpm_gene counts_gene counts_gene_length_scaled counts_gene_scaled lengths_gene tpm_transcript counts_transcript lengths_transcript versions

tximeta:

Transcript Quantification Import with Automatic Metadata

uLTRA aligner - A wrapper around minimap2 to improve small exon detection - Map reads on genome

01001

bam versions

ultra:

Splice aligner of long transcriptomic reads to genome.

uLTRA aligner - A wrapper around minimap2 to improve small exon detection - Index gtf file for reads alignment

00

index versions

ultra:

Splice aligner of long transcriptomic reads to genome.

uLTRA aligner - A wrapper around minimap2 to improve small exon detection

0100

sam versions

ultra:

Splice aligner of long transcriptomic reads to genome.

Module to run UniverSC an open-source pipeline to demultiplex and process single-cell RNA-Seq data

010

outs versions

Aligns protein structures using UPP

01010

alignment versions

upp:

SATe-enabled phylogenetic placement

Filtering, downsampling and profiling alignments in BAM/CRAM formats

01

bam versions

To evaluate candidate indels and structural variants, Varlociraptor needs to know certain properties of the underlying sequencing experiment in combination with the read aligner used.

010101

alignment_properties_json versions

varlociraptor:

Flexible, uncertainty-aware variant calling with parameter free filtration via FDR control.

Constructs a graph from a reference and variant calls or a multiple sequence alignment file

01230101

graph versions

vg:

Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.

Deconstruct snarls present in a variation graph in GFA format to variants in VCF format

0100

vcf versions

vg:

Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.

Builds indexes (such as XG) for a variation graph

01

xg vg_index versions

vg:

Variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods.

Cluster sequences using a single-pass, greedy centroid-based clustering algorithm.

01

aln biom mothur otu bam out blast uc centroids clusters profile msa versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Performs quality filtering and / or conversion of a FASTQ file to FASTA format.

01

fasta log versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Taxonomic classification using the sintax algorithm.

010

tsv versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Sort fasta entries by decreasing abundance (--sortbysize) or sequence length (--sortbylength).

010

fasta versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Compare target sequences to fasta-formatted query sequences using global pairwise alignment.

010000

aln biom lca mothur otu sam tsv txt uc versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

A pangenome-scale aligner

0123400

paf versions

Convert and filter aligned reads to .npz

0120101

npz versions

wisecondorx:

WIthin-SamplE COpy Number aberration DetectOR, including sex chromosomes

Builds a YARA index for a reference genome

01

index versions

yara:

Yara is an exact tool for aligning DNA sequencing reads to reference genomes.

Align reads to a reference genome using YARA

0101

bam bai versions

yara:

Yara is an exact tool for aligning DNA sequencing reads to reference genomes.
