Available Modules

Modules are the building blocks of all DSL2 nf-core pipelines. If you would like to write your own module, you can find more information on the nf-core website.

  • assembly 67
  • genome 15
  • quality control 9
  • fasta 8
  • long reads 8
  • genomics 7
  • nanopore 7
  • binning 7
  • mags 7
  • bacteria 5
  • coverage 5
  • classification 5
  • contamination 5
  • taxonomic classification 5
  • de novo 5
  • de novo assembly 5
  • bam 4
  • fastq 4
  • metagenomics 4
  • sort 4
  • contigs 4
  • evaluation 4
  • genome assembler 4
  • reference 3
  • download 3
  • gfa 3
  • pacbio 3
  • illumina 3
  • table 3
  • bins 3
  • transcript 3
  • transcriptome 3
  • prokaryote 3
  • NCBI 3
  • scaffold 3
  • scaffolding 3
  • chimeras 3
  • polishing 3
  • genome assembly 3
  • PacBio 3
  • organelle 3
  • bacterial 3
  • das tool 3
  • das_tool 3
  • gatk4 2
  • annotation 2
  • variant calling 2
  • filter 2
  • gff 2
  • gtf 2
  • k-mer 2
  • quality 2
  • single-cell 2
  • ancient DNA 2
  • visualisation 2
  • depth 2
  • aDNA 2
  • haplotype 2
  • filtering 2
  • palaeogenomics 2
  • damage 2
  • archaeogenomics 2
  • mitochondria 2
  • deamination 2
  • reference-free 2
  • summary 2
  • archaeogenetics 2
  • palaeogenetics 2
  • miscoding lesions 2
  • resistance 2
  • HiFi 2
  • fcs-gx 2
  • virulence 2
  • cellranger 2
  • C to T 2
  • proteome 2
  • purge duplications 2
  • contig 2
  • polish 2
  • Duplication purging 2
  • immunoprofiling 2
  • Read depth 2
  • duplicate 2
  • assembly evaluation 2
  • mitochondrion 2
  • small genome 2
  • de novo assembler 2
  • cleaning 2
  • screening 2
  • ragtag 2
  • vcf 1
  • index 1
  • bed 1
  • sam 1
  • structural variants 1
  • database 1
  • align 1
  • merge 1
  • statistics 1
  • qc 1
  • split 1
  • somatic 1
  • count 1
  • kmer 1
  • sv 1
  • consensus 1
  • graph 1
  • picard 1
  • stats 1
  • phage 1
  • antimicrobial resistance 1
  • repeat 1
  • phasing 1
  • metagenome 1
  • virus 1
  • completeness 1
  • sequence 1
  • checkm 1
  • ucsc 1
  • plasmid 1
  • json 1
  • compare 1
  • ont 1
  • indels 1
  • mutect2 1
  • read depth 1
  • haplotypecaller 1
  • hic 1
  • retrotransposon 1
  • quantification 1
  • normalization 1
  • abundance 1
  • clean 1
  • typing 1
  • comparison 1
  • mlst 1
  • hi-c 1
  • eukaryotes 1
  • vdj 1
  • k-mer frequency 1
  • rgfa 1
  • screen 1
  • serogroup 1
  • long terminal retrotransposon 1
  • estimation 1
  • detecting svs 1
  • short-read sequencing 1
  • chloroplast 1
  • assembly polishing 1
  • genome polishing 1
  • Read coverage histogram 1
  • genome graph 1
  • patch 1
  • ucsc/liftover 1
  • transcroder 1
  • cds 1
  • coding 1
  • eucaryotes 1
  • long-reads 1
  • yahs 1
  • redundant 1
  • busco 1
  • metagenome assembler 1
  • taxonomic composition 1
  • plastid 1
  • vector 1
  • drep 1
  • dereplication 1
  • microbial genomics 1
  • cobra 1
  • extension 1
  • trio binning 1
  • reference-independent 1
  • single molecule 1
  • genome statistics 1
  • genome manipulation 1
  • gfastats 1
  • gunc 1
  • genome summary 1
  • snvs 1
  • haplotype resolution 1
  • antimicrobial reistance 1
  • contiguate 1
  • segment 1
  • mkvdjref 1
  • hifi 1
  • Assembly 1
  • cmseq 1
  • protein coding genes 1
  • polymorphic sites 1
  • polymorphic 1
  • polymut 1
  • haplotype purging 1
  • Assembly curation 1
  • False duplications 1
  • Haplotype purging 1
  • assembly curation 1
  • false duplications 1
  • duplicate purging 1
  • cutoff 1
  • quast 1
  • purging 1
  • long uncorrected reads 1
  • liftovervcf 1
  • longread 1
  • de-novo 1
  • salsa 1
  • salsa2 1
  • assembly-binning 1
  • metagenome-assembled genomes 1
  • maxbin2 1
  • denovo 1
  • pneumoniae 1
  • Klebsiella 1
  • megahit 1
  • debruijn 1
  • pbp 1
  • Merqury 1

Contiguate draft genome assembly

Outputs: results, versions

Screen assemblies for antimicrobial resistance against multiple databases

Outputs: report, versions

abricate: Mass screening of contigs for antibiotic resistance genes

Screen assemblies for antimicrobial resistance against multiple databases

Outputs: report, versions

abricate: Mass screening of contigs for antibiotic resistance genes

ALE: assembly likelihood estimator.

Outputs: ale, versions

Download and prepare database for Ariba analysis

Outputs: db, versions

ariba: ARIBA: Antibiotic Resistance Identification By Assembly

Query input FASTQs against Ariba formatted databases

Outputs: results, versions

ariba: ARIBA: Antibiotic Resistance Identification By Assembly

Assembly summary statistics in JSON format

Outputs: json, versions

Render an assembly graph in GFA 1.0 format to PNG and SVG image formats

Outputs: png, svg, versions

bandage: Bandage - a Bioinformatics Application for Navigating De novo Assembly Graphs Easily

BBNorm normalizes coverage by down-sampling reads over high-depth areas of a genome, resulting in a flat coverage distribution.

Outputs: fastq, log, versions

bbmap: BBMap is a short-read aligner, bundled with various other bioinformatics tools.

Benchmarking Universal Single Copy Orthologs

Outputs: batch_summary, short_summaries_txt, short_summaries_json, log, full_table, missing_busco_list, single_copy_proteins, seq_dir, translated_dir, busco_dir, downloaded_lineages, single_copy_faa, single_copy_fna, versions

busco: BUSCO provides measures for quantitative assessment of genome assembly, gene set, and transcriptome completeness based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs selected from OrthoDB.

Download database for BUSCO

0

download_dir versions

busco:

BUSCO provides measures for quantitative assessment of genome assembly, gene set, and transcriptome completeness based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs selected from OrthoDB.

BUSCO plot generation tool

0

png versions

busco:

BUSCO provides measures for quantitative assessment of genome assembly, gene set, and transcriptome completeness based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs selected from OrthoDB.

Accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads.

0100

report assembly contigs corrected_reads corrected_trimmed_reads metadata contig_position contig_info versions

Taxonomic classification of long DNA sequences and metagenome assembled genomes (e.g. MAGs / bins).

0101

txt versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Taxonomic classification of long DNA sequences and metagenome assembled genomes (e.g. MAGs / bins).

0101010101

orf2lca bin2classification log diamond faa gff versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Taxonomic classification of long DNA sequences and metagenome assembled genomes (e.g. contigs, MAGs / bins).

0101010101

orf2lca contig2classification log diamond faa gff versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Taxonomic classification plus read-based abundance estimation from long DNA sequences and metagenome assembled genomes (e.g. contigs, MAGs / bins).

0101010101001010101010101

rat_log complete_abundance contig_abundance read2classification alignment_diamond contig2classification cat_log orf2lca faa gff unmapped_diamond unmapped_fasta unmapped2classification versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Summarises results from CAT/BAT/RAT classification steps

0101

txt versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Module to build the VDJ reference needed by the 10x Genomics Cell Ranger tool. Uses the cellranger mkvdjref command.

0000

reference versions

cellranger:

Cell Ranger processes data from 10X Genomics Chromium kits. cellranger vdj takes FASTQ files from cellranger mkfastq or bcl2fastq for V(D)J libraries and performs sequence assembly and paired clonotype calling. It uses the Chromium cellular barcodes and UMIs to assemble V(D)J transcripts per cell. Clonotypes and CDR3 sequences are output as a .vloupe file which can be loaded into Loupe V(D)J Browser.

Module to use Cell Ranger's pipelines to analyze sequencing data produced from Chromium Single Cell Immune Profiling.

010

outs versions

cellranger:

Cell Ranger processes data from 10X Genomics Chromium kits. cellranger vdj takes FASTQ files from cellranger mkfastq or bcl2fastq for V(D)J libraries and performs sequence assembly and paired clonotype calling. It uses the Chromium cellular barcodes and UMIs to assemble V(D)J transcripts per cell. Clonotypes and CDR3 sequences are output as a .vloupe file which can be loaded into Loupe V(D)J Browser.

Calculates polymorphic site rates over protein coding genes

01234

polymut versions

cmseq:

Set of utilities on sequences and BAM files

A tool to raise the quality of viral genomes assembled from short-read metagenomes via resolving and joining of contigs fragmented during de novo assembly.

01010101000

self_circular extended_circular extended_partial extended_failed orphan_end all_cobra_assemblies joining_summary log versions

cobra-meta:

COBRA is a tool to get higher quality viral genomes assembled from metagenomes.

DAS Tool binning step.

01230

log summary contig2bin eval bins pdfs fasta_proteins candidates_faa fasta_archaea_scg fasta_bacteria_scg b6 seqlength versions

dastool:

DAS Tool is an automated method that integrates the results of a flexible number of binning algorithms to calculate an optimized, non-redundant set of bins from a single assembly.

Helper script to convert a set of bins in fasta format to tabular scaffolds2bin format

010

fastatocontig2bin versions

dastool:

DAS Tool is an automated method that integrates the results of a flexible number of binning algorithms to calculate an optimized, non-redundant set of bins from a single assembly.

Helper script to convert a set of bins in fasta format to tabular scaffolds2bin format

010

scaffolds2bin versions

dastool:

DAS Tool is an automated method that integrates the results of a flexible number of binning algorithms to calculate an optimized, non-redundant set of bins from a single assembly.

Assemble bacterial isolate genomes from Nanopore reads

012

contigs log raw_contigs gfa txt versions

Performs rapid genome comparisons for a group of genomes and visualizes their relatedness

01

directory versions

drep:

De-replication of microbial genomes assembled from multiple samples

Export assembly segment sequences in GFA 1.0 format to FASTA format

01

fasta versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.

Filter, sort and markdup sam/bam files, with optional BQSR and variant calling.

012345601010100000

bam logs metrics recall gvcf table activity_profile assembly_regions versions

elprep:

elPrep is a high-performance tool for preparing .sam/.bam files for variant calling in sequencing pipelines. It can be used as a drop-in replacement for SAMtools/Picard/GATK4.

Uses evigene/scripts/prot/tr2aacds.pl to filter a transcript assembly

01

dropset okayset versions

evigene:

EvidentialGene is a genome informatics project for "Evidence Directed Gene Construction for Eukaryotes", for constructing high quality, accurate gene sets for animals and plants (any eukaryotes), being developed by Don Gilbert at Indiana University, gilbertd at indiana edu.

Run NCBI's FCS adaptor on assembled genomes

01

cleaned_assembly adaptor_report log pipeline_args skipped_trims versions

fcs:

The Foreign Contamination Screening (FCS) tool rapidly detects contaminants from foreign organisms in genome assemblies to prepare your data for submission. Therefore, the submission process to NCBI is faster and fewer contaminated genomes are submitted. This reduces errors in analyses and conclusions, not just for the original data submitter but for all subsequent users of the assembly.

Run FCS-GX on assembled genomes. The contigs of the assembly are searched against a reference database excluding the given taxid.

010

fcs_gx_report taxonomy_report versions

fcs:

"The Foreign Contamination Screening (FCS) tool rapidly detects contaminants from foreign organisms in genome assemblies to prepare your data for submission. Therefore, the submission process to NCBI is faster and fewer contaminated genomes are submitted. This reduces errors in analyses and conclusions, not just for the original data submitter but for all subsequent users of the assembly."

Runs FCS-GX (Foreign Contamination Screen - Genome eXtractor) to remove foreign contamination from genome assemblies

012

cleaned contaminants versions

fcsgx:

The NCBI Foreign Contamination Screen. Genomic cross-species aligner, for contamination detection.

Runs FCS-GX (Foreign Contamination Screen - Genome eXtractor) to screen and remove foreign contamination from genome assemblies

01200

fcsgx_report taxonomy_report log hits versions

fcsgx:

The NCBI Foreign Contamination Screen. Genomic cross-species aligner, for contamination detection.

De novo assembler for single molecule sequencing reads

010

fasta gfa gv txt log json versions

Call germline SNPs and indels via local re-assembly of haplotypes

012340101010101

vcf tbi bam versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Call somatic SNVs and indels via local assembly of haplotypes.

01230101010000

vcf tbi stats f1r2 versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Downloads databases needed for running getorganelle

0

db versions

getorganelle:

Get organelle genomes from genome skimming data

Assembles organelle genomes from genomic data

0101

fasta etc versions

getorganelle:

Get organelle genomes from genome skimming data

A single fast and exhaustive tool for summary statistics and simultaneous fa (fasta, fastq, gfa [.gz]) genome assembly file manipulation.

0100001010101

assembly_summary assembly versions

Converts GFA or rGFA files to FASTA

01

fasta versions

gfatools:

Tools for manipulating sequence graphs in the GFA and rGFA formats

Download database for GUNC detection of Chimerism and Contamination in Prokaryotic Genomes

0

db versions

gunc:

Python package for detection of chimerism and contamination in prokaryotic genomes.

Merging of CheckM and GUNC results in one summary table

012

tsv versions

gunc:

Python package for detection of chimerism and contamination in prokaryotic genomes.

Detection of Chimerism and Contamination in Prokaryotic Genomes

010

maxcss_level_tsv all_levels_tsv versions

gunc:

Python package for detection of chimerism and contamination in prokaryotic genomes.

Whole-genome assembly using PacBio HiFi reads

01201201201

raw_unitigs bin_files processed_unitigs primary_contigs alternate_contigs hap1_contigs hap2_contigs corrected_reads read_overlaps log versions

Assembly polisher using short (and long) reads

0101000

fasta versions

Kleborate is a tool to screen genome assemblies of Klebsiella pneumoniae and the Klebsiella pneumoniae species complex (KpSC).

01

txt versions

LINKS is a genomics application for scaffolding genome assemblies with long reads, such as those produced by Oxford Nanopore Technologies Ltd. It can be used to scaffold high-quality draft genome assemblies with any long sequences (e.g. ONT reads, PacBio reads, other draft genomes). It is also used to scaffold contig pairs linked by ARCS/ARKS. This module is for LINKS >=2.0.0 and does not support MPET input.

0101

log pairing_distribution pairing_issues scaffolds_csv scaffolds_fasta bloom scaffolds_graph assembly_correspondence simplepair_checkpoint tigpair_checkpoint versions

Estimates the mean LTR sequence identity in the genome. The input genome FASTA should have short alphanumeric IDs without comments.

01000

log lai_out versions

lai:

Assessing genome assembly quality using the LTR Assembly Index (LAI)

MaxBin is software for clustering metagenomic contigs

0123

binned_fastas summary abundance log marker_counts unbinned_fasta tooshort_fasta marker_bins marker_genes versions

A tool to create consensus sequences and variant calls from nanopore sequencing data

012

assembly versions

An ultra-fast metagenomic assembler for large and complex metagenomics

012

contigs k_contigs addi_contigs local_contigs kfinal_contigs log versions

pigz:

Parallel implementation of the gzip algorithm.

Compare k-mer frequency in reads and assembly to devise the metrics K and QV

0101000

hist log_stderr versions

merfin:

Merfin (k-mer based finishing tool) is a suite of subtools for variant filtering, assembly evaluation and polishing via k-mer validation. The subtool -hist estimates the QV (quality value of Merqury) for each scaffold/contig, along with genome-wide averages. In addition, Merfin produces a QV* estimate, which also accounts for k-mers that are seen in excess of their expected multiplicity as predicted from the reads.

k-mer based assembly evaluation.

meta meryl_db assembly

meta versions assembly_only_kmers_bed assembly_only_kmers_wig stats dist_hist spectra_cn_fl_png spectra_cn_ln_png spectra_cn_st_png spectra_cn_hist spectra_asm_fl_png spectra_asm_ln_png spectra_asm_st_png spectra_asm_hist assembly_qv scaffold_qv read_ploidy

k-mer based assembly evaluation.

012

assembly_only_kmers_bed assembly_only_kmers_wig stats dist_hist spectra_cn_fl_png spectra_cn_hist spectra_cn_ln_png spectra_cn_st_png spectra_asm_fl_png spectra_asm_hist spectra_asm_ln_png spectra_asm_st_png assembly_qv scaffold_qv read_ploidy hapmers_blob_png versions

merqury:

Evaluate genome assemblies with k-mers and more.

Produces maternal and paternal FastK kmer tables from maternal, paternal and child FastK tables

010101

mat_hap_ktab pat_hap_ktab versions

merquryfk:

FastK based version of Merqury

FastK based version of Merqury

012340101

stats bed assembly_qv spectra_cn_fl spectra_cn_ln spectra_cn_st qv spectra_asm_fl spectra_asm_ln spectra_asm_st phased_block_bed phased_block_stats continuity_N block_N block_blob hapmers_blob versions

merquryfk:

FastK based version of Merqury

Depth computation per contig step of metabat2

012

depth versions

metabat2:

Metagenome binning

Metagenome binning of contigs

012

tooshort lowdepth unbinned membership fasta versions

metabat2:

Metagenome binning

Metagenome assembler for long-read sequences (HiFi and ONT).

010

contigs log versions

metamdbg:

MetaMDBG: a lightweight assembler for long and accurate metagenomics reads.

A very fast OLC-based de novo assembler for noisy long reads

012

gfa assembly versions

A Python workflow that assembles mitogenomes from PacBio HiFi reads

010000

fasta stats gb gff all_potential_contigs contigs_annotations contigs_circularization contigs_filtering coverage_mapping coverage_plot final_mitogenome_annotation final_mitogenome_choice final_mitogenome_coverage potential_contigs reads_mapping_and_assembly shared_genes versions

mitohifi.py:

A Python workflow that assembles mitogenomes from PacBio HiFi reads

Run Torsten Seemann's classic MLST on a genome assembly

01

tsv versions

A tool to quickly download assemblies from NCBI's Assembly database

0000

gbk fna rm features gff faa gpff wgs_gbk cds rna rna_fna report stats versions

NCBI tool for detecting vector contamination in nucleic acid sequences. This tool is older than NCBI's FCS-adaptor, which serves the same purpose.

0101

vecscreen_output versions

ncbitools:

"NCBI libraries for biology applications (text-based utilities)"

An nf-core module for the OATK

010123401234

mito_fasta pltd_fasta mito_bed pltd_bed mito_gfa pltd_gfa annot_mito_txt annot_pltd_txt clean_gfa final_gfa initial_gfa multiplex_gfa unzip_gfa versions

Serogroup Pseudomonas aeruginosa assemblies

01

tsv blast details versions

Assign PBP type of Streptococcus pneumoniae assemblies

010

tsv blast versions

Lifts over a VCF file from one reference build to another.

01010101

vcf_lifted vcf_unlifted versions

picard:

Move annotations from one assembly to another

Automatically improve draft assemblies and find variation among strains, including large event detection

010120

improved_assembly vcf change_record tracks_bed tracks_wig versions

Assembles bacterial plasmids

010

html tab images logs data database fasta_files kmer versions

Polishing genome assemblies with short reads.

01010

fasta versions debug

polypolish:

Polishing genome assemblies with short reads.

Calculate coverage cutoffs to determine when to purge duplicated sequence.

01

cutoff log versions

purgedups:

Purge_dups is a package used to purge haplotigs and overlaps in an assembly based on read depth

Separates out sequences purged of falsely duplicated sequences.

012

haplotigs purged versions

purgedups:

Purge_dups is a package used to purge haplotigs and overlaps in an assembly based on read depth

Plots the read coverage from a purge dups statistics file and cutoffs.

012

png versions

purgedups:

Purge_dups is a package used to purge haplotigs and overlaps in an assembly based on read depth

Create read depth histogram and base-level read depth for an assembly based on PacBio data

01

stat basecov versions

purgedups:

Purge_dups is a package used to purge haplotigs and overlaps in an assembly based on read depth

Purge haplotigs and overlaps for an assembly

0123

bed log versions

purgedups:

Purge_dups is a package used to purge haplotigs and overlaps in an assembly based on read depth

Split fasta file by 'N's to aid in self alignment for duplicate purging

01

split_fasta versions

purgedups:

Purge_dups is a package used to purge haplotigs and overlaps in an assembly based on read depth
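
The idea behind splitting a FASTA sequence at its 'N's can be pictured with a minimal Python sketch. This is not the purge_dups implementation; the function name `split_on_n` and the coordinate-based piece names are assumptions made purely for illustration.

```python
import re


def split_on_n(name, sequence):
    """Split one sequence into pieces at every run of N/n characters.

    Returns a list of (piece_name, piece_sequence) tuples. Each piece name
    records the 1-based, inclusive coordinates of the piece in the original
    sequence (a naming scheme chosen here for illustration only).
    """
    pieces = []
    # Each match is one maximal run of non-N characters.
    for match in re.finditer(r"[^Nn]+", sequence):
        start, end = match.start() + 1, match.end()
        pieces.append((f"{name}:{start}-{end}", match.group()))
    return pieces


if __name__ == "__main__":
    for piece_name, piece in split_on_n("contig1", "ACGTNNNNGGCCNTT"):
        print(piece_name, piece)
```

Splitting at gaps means that each contiguous stretch can be aligned against the others independently, which is what the self-alignment step needs.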

Damage parameter estimation for ancient DNA

012

csv versions

pydamage:

Damage parameter estimation for ancient DNA

Damage parameter estimation for ancient DNA

01

csv versions

pydamage:

Damage parameter estimation for ancient DNA

Quality Assessment Tool for Genome Assemblies

010101

results tsv transcriptome misassemblies unaligned versions

Consensus module for raw de novo DNA assembly of long uncorrected reads

0123

improved_assembly versions

Homology-based assembly patching: Make continuous joins and fill gaps in 'target.fa' using sequences from 'query.fa'

01010101

patch_fasta patch_agp patch_components_fasta assembly_alignments target_splits_agp target_splits_fasta qry_rename_agp qry_rename_fasta stderr versions

ragtag:

Fast reference-guided genome assembly scaffolding

Scaffolding is the process of ordering and orienting draft assembly (query) sequences into longer sequences. Gaps (stretches of "N" characters) are placed between adjacent query sequences to indicate the presence of unknown sequence. RagTag uses whole-genome alignments to a reference assembly to scaffold query sequences. RagTag does not alter input query sequence in any way and only orders and orients sequences, joining them with gaps.

010101012

corrected_assembly corrected_agp corrected_stats versions

ragtag:

Fast reference-guided genome assembly scaffolding
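
The scaffolding description above (order, orient, then join with gaps of "N") can be sketched in a few lines of Python. This is an illustrative sketch only, not RagTag's code: in practice the order and orientation come from whole-genome alignments, whereas here they are passed in by hand. The function names and the default `gap_size` of 100 are assumptions.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (N stays N)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def join_with_gaps(placements, gap_size=100):
    """Join already-ordered query sequences into one scaffold.

    placements: list of (sequence, strand) tuples in scaffold order, where
    strand is "+" (keep as is) or "-" (reverse-complement before joining).
    The query sequences themselves are not altered; they are only oriented
    and separated by runs of "N" of length gap_size.
    """
    oriented = [
        seq if strand == "+" else reverse_complement(seq)
        for seq, strand in placements
    ]
    return ("N" * gap_size).join(oriented)


if __name__ == "__main__":
    print(join_with_gaps([("AAC", "+"), ("GGT", "-")], gap_size=3))
```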

De novo genome assembler for long uncorrected reads.

01

fasta gfa versions

SALSA, A tool to scaffold long read assemblies with HiC

0120000

fasta agp agp_original_coordinates versions

Metagenomic binning with self-supervised learning

012

csv model output_fasta recluster_fasta tsv versions

semibin:

Metagenomic binning with semi-supervised siamese neural network

The goal of the Shasta long read assembler is to rapidly produce accurate assembled sequence using DNA reads generated by Oxford Nanopore flow cells as input. Please note that the assembler is designed to focus on speed, so assembly may be considered somewhat non-deterministic, as the final assembly may vary across executions. See https://github.com/chanzuckerberg/shasta/issues/296.

01

assembly gfa results versions

Assemble bacterial isolate genomes from Illumina paired-end reads

01

contigs corrections log raw_contigs gfa versions

Assembles a small genome (bacterial, fungal, viral)

012300

scaffolds contigs transcripts gene_clusters gfa warnings log versions

Merges the annotation gtf file and the stringtie output gtf files

00

gtf versions

stringtie2:

Transcript assembly and quantification for RNA-Seq

Transcript assembly and quantification for RNA-Seq

010

transcript_gtf abundance coverage_gtf ballgown versions

stringtie2:

Transcript assembly and quantification for RNA-Seq

SvABA is an efficient and accurate method for detecting SVs from short-read sequencing data using genome-wide local assembly with low memory and computing requirements

01234010101010101

sv indel germ_indel germ_sv som_indel som_sv unfiltered_sv unfiltered_indel unfiltered_germ_indel unfiltered_germ_sv unfiltered_som_indel unfiltered_som_sv raw_calls discordants log versions

TransDecoder identifies candidate coding regions within transcript sequences. It is used to build a GFF file.

01

pep gff3 cds dat folder versions

transdecoder:

TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks.

TransDecoder identifies candidate coding regions within transcript sequences. It is used to build a GFF file. You can use this module after transdecoder_longorf.

010

pep gff3 cds bed versions

transdecoder:

TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks.

Assembles a de novo transcriptome from RNAseq reads

01

transcript_fasta log versions

Convert between genome builds

010

lifted unlifted versions

ucsc:

Move annotations from one assembly to another

Assembles bacterial genomes

012

scaffolds gfa log versions

Performs assembly scaffolding using YaHS

0100

scaffolds_fasta scaffolds_agp binary versions

A tool to build a k-mer hash table for FASTA and FASTQ files

01

yak versions

yak:

Yet another k-mer analyzer
