Available Modules

Modules are the basic building blocks of DSL2 nf-core pipelines. If you would like to write your own module, you can find more information on the nf-core website.

  • assembly 67
  • genome 15
  • quality control 9
  • fasta 8
  • long reads 8
  • genomics 7
  • nanopore 7
  • binning 7
  • mags 7
  • bacteria 5
  • coverage 5
  • classification 5
  • contamination 5
  • taxonomic classification 5
  • de novo assembly 5
  • de novo 5
  • bam 4
  • fastq 4
  • metagenomics 4
  • sort 4
  • contigs 4
  • evaluation 4
  • genome assembler 4
  • reference 3
  • download 3
  • gfa 3
  • pacbio 3
  • table 3
  • illumina 3
  • bins 3
  • transcript 3
  • transcriptome 3
  • prokaryote 3
  • NCBI 3
  • bacterial 3
  • polishing 3
  • scaffolding 3
  • chimeras 3
  • PacBio 3
  • genome assembly 3
  • das tool 3
  • organelle 3
  • scaffold 3
  • das_tool 3
  • gatk4 2
  • annotation 2
  • variant calling 2
  • filter 2
  • gff 2
  • gtf 2
  • k-mer 2
  • quality 2
  • ancient DNA 2
  • single-cell 2
  • visualisation 2
  • depth 2
  • haplotype 2
  • filtering 2
  • aDNA 2
  • palaeogenomics 2
  • archaeogenomics 2
  • damage 2
  • mitochondria 2
  • summary 2
  • reference-free 2
  • deamination 2
  • miscoding lesions 2
  • palaeogenetics 2
  • HiFi 2
  • archaeogenetics 2
  • resistance 2
  • virulence 2
  • proteome 2
  • fcs-gx 2
  • cellranger 2
  • C to T 2
  • small genome 2
  • duplicate 2
  • de novo assembler 2
  • Read depth 2
  • mitochondrion 2
  • contig 2
  • assembly evaluation 2
  • purge duplications 2
  • Duplication purging 2
  • polish 2
  • immunoprofiling 2
  • screening 2
  • cleaning 2
  • ragtag 2
  • vcf 1
  • index 1
  • bed 1
  • sam 1
  • structural variants 1
  • database 1
  • align 1
  • merge 1
  • statistics 1
  • qc 1
  • split 1
  • somatic 1
  • count 1
  • consensus 1
  • sv 1
  • kmer 1
  • graph 1
  • picard 1
  • stats 1
  • phage 1
  • antimicrobial resistance 1
  • repeat 1
  • completeness 1
  • phasing 1
  • checkm 1
  • virus 1
  • metagenome 1
  • sequence 1
  • ucsc 1
  • plasmid 1
  • json 1
  • compare 1
  • indels 1
  • mutect2 1
  • ont 1
  • haplotypecaller 1
  • quantification 1
  • read depth 1
  • hic 1
  • retrotransposon 1
  • clean 1
  • abundance 1
  • normalization 1
  • typing 1
  • eukaryotes 1
  • hi-c 1
  • comparison 1
  • mlst 1
  • k-mer frequency 1
  • rgfa 1
  • serogroup 1
  • long terminal retrotransposon 1
  • screen 1
  • vdj 1
  • estimation 1
  • short-read sequencing 1
  • yahs 1
  • detecting svs 1
  • chloroplast 1
  • patch 1
  • taxonomic composition 1
  • transcroder 1
  • cds 1
  • coding 1
  • eucaryotes 1
  • drep 1
  • ucsc/liftover 1
  • microbial genomics 1
  • dereplication 1
  • assembly polishing 1
  • trio binning 1
  • genome polishing 1
  • busco 1
  • reference-independent 1
  • redundant 1
  • cobra 1
  • extension 1
  • metagenome assembler 1
  • vector 1
  • Read coverage histogram 1
  • genome graph 1
  • plastid 1
  • single molecule 1
  • genome statistics 1
  • genome manipulation 1
  • genome summary 1
  • gunc 1
  • gfastats 1
  • snvs 1
  • antimicrobial reistance 1
  • contiguate 1
  • segment 1
  • mkvdjref 1
  • hifi 1
  • Assembly 1
  • cmseq 1
  • protein coding genes 1
  • polymorphic sites 1
  • polymorphic 1
  • polymut 1
  • haplotype resolution 1
  • assembly curation 1
  • false duplications 1
  • duplicate purging 1
  • haplotype purging 1
  • cutoff 1
  • False duplications 1
  • Haplotype purging 1
  • Assembly curation 1
  • purging 1
  • long uncorrected reads 1
  • quast 1
  • liftovervcf 1
  • longread 1
  • de-novo 1
  • salsa2 1
  • salsa 1
  • assembly-binning 1
  • metagenome-assembled genomes 1
  • maxbin2 1
  • pneumoniae 1
  • Klebsiella 1
  • pbp 1
  • megahit 1
  • Merqury 1
  • debruijn 1
  • denovo 1
  • long-reads 1

Contiguate draft genome assembly

Outputs: results, versions

Screen assemblies for antimicrobial resistance against multiple databases

Outputs: report, versions

abricate: Mass screening of contigs for antibiotic resistance genes

Screen assemblies for antimicrobial resistance against multiple databases

Outputs: report, versions

abricate: Mass screening of contigs for antibiotic resistance genes

ALE: assembly likelihood estimator.

Outputs: ale, versions

Download and prepare database for Ariba analysis

Outputs: db, versions

ariba: ARIBA: Antibiotic Resistance Identification By Assembly

Query input FASTQs against Ariba formatted databases

Outputs: results, versions

ariba: ARIBA: Antibiotic Resistance Identification By Assembly

Assembly summary statistics in JSON format

Outputs: json, versions

Render an assembly graph in GFA 1.0 format to PNG and SVG image formats

Outputs: png, svg, versions

bandage: Bandage - a Bioinformatics Application for Navigating De novo Assembly Graphs Easily

BBNorm normalizes coverage by down-sampling reads over high-depth areas of a genome, resulting in a flat coverage distribution.

Outputs: fastq, log, versions

bbmap: BBMap is a short-read aligner that also ships with various other bioinformatics tools.
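To make the BBNorm description above concrete, here is a minimal Python sketch of coverage normalization by down-sampling. It is an illustration only, not BBNorm's implementation: it assumes a simple median k-mer count as the depth estimate (the digital-normalization approach), and the function names `kmers` and `normalize_reads`, the `k` and `target_depth` parameters, and the toy reads are all hypothetical.

```python
from collections import Counter
from statistics import median


def kmers(read, k):
    """Return all overlapping k-mers of a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def normalize_reads(reads, k=4, target_depth=2):
    """Keep a read only while its median k-mer count is below target_depth.

    Hypothetical helper for illustration; BBNorm's real algorithm differs.
    """
    counts = Counter()  # k-mer counts over the reads kept so far
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read shorter than k: no depth estimate possible
        # Median k-mer count approximates the depth already covering this read.
        if median(counts[km] for km in read_kmers) < target_depth:
            kept.append(read)
            counts.update(read_kmers)
    return kept


if __name__ == "__main__":
    # Five copies of a "high-depth" read plus one "low-depth" read.
    toy_reads = ["ACGTTGCAAG"] * 5 + ["TTGACCAGTA"]
    for read in normalize_reads(toy_reads, k=4, target_depth=2):
        print(read)
```

With this toy input, two of the five identical high-depth reads are kept along with the single low-depth read, which flattens the depth toward the target.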

Benchmarking Universal Single Copy Orthologs

Outputs: batch_summary, short_summaries_txt, short_summaries_json, log, full_table, missing_busco_list, single_copy_proteins, seq_dir, translated_dir, busco_dir, downloaded_lineages, single_copy_faa, single_copy_fna, versions

busco: BUSCO provides measures for quantitative assessment of genome assembly, gene set, and transcriptome completeness based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs selected from OrthoDB.
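As a rough illustration of the completeness measure described above, the following Python sketch turns counts of ortholog categories into percentages. It assumes the usual four categories (complete single-copy, complete duplicated, fragmented, missing); the function name `busco_summary` is hypothetical and the output string only imitates the style of a BUSCO short summary line. It is not produced by BUSCO itself.

```python
def busco_summary(single, duplicated, fragmented, missing):
    """Format ortholog category counts as percentages of all searched orthologs.

    Hypothetical helper for illustration; not part of BUSCO.
    """
    total = single + duplicated + fragmented + missing
    if total == 0:
        raise ValueError("at least one ortholog must be counted")

    def pct(n):
        # Percentage of the total, fixed to one decimal place.
        return f"{100.0 * n / total:.1f}%"

    complete = single + duplicated  # complete = single-copy + duplicated
    return (
        f"C:{pct(complete)}[S:{pct(single)},D:{pct(duplicated)}],"
        f"F:{pct(fragmented)},M:{pct(missing)},n:{total}"
    )


if __name__ == "__main__":
    print(busco_summary(single=90, duplicated=4, fragmented=3, missing=3))
```

A high complete fraction with few duplicated, fragmented, or missing orthologs suggests a more complete assembly or gene set.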

Download database for BUSCO

0

download_dir versions

busco:

BUSCO provides measures for quantitative assessment of genome assembly, gene set, and transcriptome completeness based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs selected from OrthoDB.

BUSCO plot generation tool

0

png versions

busco:

BUSCO provides measures for quantitative assessment of genome assembly, gene set, and transcriptome completeness based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs selected from OrthoDB.

Accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads.

0100

report assembly contigs corrected_reads corrected_trimmed_reads metadata contig_position contig_info versions

Taxonomic classification of long DNA sequences and metagenome assembled genomes (e.g. MAGs / bins).

0101

txt versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Taxonomic classification of long DNA sequences and metagenome assembled genomes (e.g. MAGs / bins).

0101010101

orf2lca bin2classification log diamond faa gff versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Taxonomic classification of long DNA sequences and metagenome assembled genomes (e.g. contigs, MAGs / bins).

0101010101

orf2lca contig2classification log diamond faa gff versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Taxonomic classification plus read-based abundance estimation from long DNA sequences and metagenome assembled genomes (e.g. contigs, MAGs / bins).

0101010101001010101010101

rat_log complete_abundance contig_abundance read2classification alignment_diamond contig2classification cat_log orf2lca faa gff unmapped_diamond unmapped_fasta unmapped2classification versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Summarises results from CAT/BAT/RAT classification steps

0101

txt versions

catpack:

CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

Module to build the VDJ reference needed by the 10x Genomics Cell Ranger tool. Uses the cellranger mkvdjref command.

0000

reference versions

cellranger:

Cell Ranger processes data from 10X Genomics Chromium kits. cellranger vdj takes FASTQ files from cellranger mkfastq or bcl2fastq for V(D)J libraries and performs sequence assembly and paired clonotype calling. It uses the Chromium cellular barcodes and UMIs to assemble V(D)J transcripts per cell. Clonotypes and CDR3 sequences are output as a .vloupe file which can be loaded into Loupe V(D)J Browser.

Module to use Cell Ranger's pipelines to analyze sequencing data produced from Chromium Single Cell Immune Profiling.

010

outs versions

cellranger:

Cell Ranger processes data from 10X Genomics Chromium kits. cellranger vdj takes FASTQ files from cellranger mkfastq or bcl2fastq for V(D)J libraries and performs sequence assembly and paired clonotype calling. It uses the Chromium cellular barcodes and UMIs to assemble V(D)J transcripts per cell. Clonotypes and CDR3 sequences are output as a .vloupe file which can be loaded into Loupe V(D)J Browser.

Calculates polymorphic site rates over protein coding genes

01234

polymut versions

cmseq:

Set of utilities on sequences and BAM files

A tool to raise the quality of viral genomes assembled from short-read metagenomes via resolving and joining of contigs fragmented during de novo assembly.

01010101000

self_circular extended_circular extended_partial extended_failed orphan_end all_cobra_assemblies joining_summary log versions

cobra-meta:

COBRA is a tool to get higher quality viral genomes assembled from metagenomes.

DAS Tool binning step.

01230

log summary contig2bin eval bins pdfs fasta_proteins candidates_faa fasta_archaea_scg fasta_bacteria_scg b6 seqlength versions

dastool:

DAS Tool is an automated method that integrates the results of a flexible number of binning algorithms to calculate an optimized, non-redundant set of bins from a single assembly.

Helper script to convert a set of bins in fasta format to tabular scaffolds2bin format

010

fastatocontig2bin versions

dastool:

DAS Tool is an automated method that integrates the results of a flexible number of binning algorithms to calculate an optimized, non-redundant set of bins from a single assembly.

Helper script to convert a set of bins in fasta format to tabular scaffolds2bin format

010

scaffolds2bin versions

dastool:

DAS Tool is an automated method that integrates the results of a flexible number of binning algorithms to calculate an optimized, non-redundant set of bins from a single assembly.

Assemble bacterial isolate genomes from Nanopore reads

012

contigs log raw_contigs gfa txt versions

Performs rapid genome comparisons for a group of genomes and visualizes their relatedness

01

directory versions

drep:

De-replication of microbial genomes assembled from multiple samples

Export assembly segment sequences in GFA 1.0 format to FASTA format

01

fasta versions

dshbio:

Reads, features, variants, assemblies, alignments, genomic range trees, pangenome graphs, and a bunch of random command line tools for bioinformatics. LGPL version 3 or later.
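As a rough illustration of what a GFA-to-FASTA export involves, the sketch below reads GFA 1.0 text and writes one FASTA record per segment (`S`) line. It is not the dshbio implementation; the function name `gfa_segments_to_fasta` and the example data are hypothetical, and segments whose sequence is the `*` placeholder are simply skipped.

```python
def gfa_segments_to_fasta(gfa_text: str) -> str:
    """Return FASTA text for every GFA 1.0 segment line that carries a sequence."""
    records = []
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        # Segment lines have the form: S <name> <sequence> [optional tags...]
        # A sequence of "*" means "not stored here", so there is nothing to export.
        if len(fields) >= 3 and fields[0] == "S" and fields[2] != "*":
            records.append(f">{fields[1]}\n{fields[2]}")
    return "\n".join(records) + ("\n" if records else "")


# Hypothetical example graph: a header, two segments (one without sequence) and a link.
example = "H\tVN:Z:1.0\nS\tctg1\tACGT\nS\tctg2\t*\tLN:i:10\nL\tctg1\t+\tctg2\t+\t0M\n"
fasta = gfa_segments_to_fasta(example)
print(fasta, end="")
```

Only segment lines contribute to the output; header, link and path lines are ignored because they carry no sequence of their own.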

Filter, sort and markdup sam/bam files, with optional BQSR and variant calling.

012345601010100000

bam logs metrics recall gvcf table activity_profile assembly_regions versions

elprep:

elPrep is a high-performance tool for preparing .sam/.bam files for variant calling in sequencing pipelines. It can be used as a drop-in replacement for SAMtools/Picard/GATK4.

Uses evigene/scripts/prot/tr2aacds.pl to filter a transcript assembly

01

dropset okayset versions

evigene:

EvidentialGene is a genome informatics project for "Evidence Directed Gene Construction for Eukaryotes", for constructing high quality, accurate gene sets for animals and plants (any eukaryotes), being developed by Don Gilbert at Indiana University, gilbertd at indiana edu.

Run NCBI's FCS adaptor on assembled genomes

01

cleaned_assembly adaptor_report log pipeline_args skipped_trims versions

fcs:

The Foreign Contamination Screening (FCS) tool rapidly detects contaminants from foreign organisms in genome assemblies to prepare your data for submission. Therefore, the submission process to NCBI is faster and fewer contaminated genomes are submitted. This reduces errors in analyses and conclusions, not just for the original data submitter but for all subsequent users of the assembly.

Run FCS-GX on assembled genomes. The contigs of the assembly are searched against a reference database excluding the given taxid.

010

fcs_gx_report taxonomy_report versions

fcs:

The Foreign Contamination Screening (FCS) tool rapidly detects contaminants from foreign organisms in genome assemblies to prepare your data for submission. Therefore, the submission process to NCBI is faster and fewer contaminated genomes are submitted. This reduces errors in analyses and conclusions, not just for the original data submitter but for all subsequent users of the assembly.

Runs FCS-GX (Foreign Contamination Screen - Genome eXtractor) to remove foreign contamination from genome assemblies

012

cleaned contaminants versions

fcsgx:

The NCBI Foreign Contamination Screen. Genomic cross-species aligner, for contamination detection.

Runs FCS-GX (Foreign Contamination Screen - Genome eXtractor) to screen and remove foreign contamination from genome assemblies

01200

fcsgx_report taxonomy_report log hits versions

fcsgx:

The NCBI Foreign Contamination Screen. Genomic cross-species aligner, for contamination detection.

De novo assembler for single molecule sequencing reads

010

fasta gfa gv txt log json versions

Call germline SNPs and indels via local re-assembly of haplotypes

012340101010101

vcf tbi bam versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Call somatic SNVs and indels via local assembly of haplotypes.

01230101010000

vcf tbi stats f1r2 versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Downloads databases needed for running getorganelle

0

db versions

getorganelle:

Get organelle genomes from genome skimming data

Assembles organelle genomes from genomic data

0101

fasta etc versions

getorganelle:

Get organelle genomes from genome skimming data

A single fast and exhaustive tool for summary statistics and simultaneous fa (fasta, fastq, gfa [.gz]) genome assembly file manipulation.

0100001010101

assembly_summary assembly versions

Converts GFA or rGFA files to FASTA

01

fasta versions

gfatools:

Tools for manipulating sequence graphs in the GFA and rGFA formats

Download database for GUNC detection of Chimerism and Contamination in Prokaryotic Genomes

0

db versions

gunc:

Python package for detection of chimerism and contamination in prokaryotic genomes.

Merging of CheckM and GUNC results in one summary table

012

tsv versions

gunc:

Python package for detection of chimerism and contamination in prokaryotic genomes.

Detection of Chimerism and Contamination in Prokaryotic Genomes

010

maxcss_level_tsv all_levels_tsv versions

gunc:

Python package for detection of chimerism and contamination in prokaryotic genomes.

Whole-genome assembly using PacBio HiFi reads

01201201201

raw_unitigs bin_files processed_unitigs primary_contigs alternate_contigs hap1_contigs hap2_contigs corrected_reads read_overlaps log versions

Assembly polisher using short (and long) reads

0101000

fasta versions

Kleborate is a tool to screen genome assemblies of Klebsiella pneumoniae and the Klebsiella pneumoniae species complex (KpSC).

01

txt versions

LINKS is a genomics application for scaffolding genome assemblies with long reads, such as those produced by Oxford Nanopore Technologies Ltd. It can be used to scaffold high-quality draft genome assemblies with any long sequences (e.g. ONT reads, PacBio reads, other draft genomes). It is also used to scaffold contig pairs linked by ARCS/ARKS. This module is for LINKS >=2.0.0 and does not support MPET input.

0101

log pairing_distribution pairing_issues scaffolds_csv scaffolds_fasta bloom scaffolds_graph assembly_correspondence simplepair_checkpoint tigpair_checkpoint versions

Estimates the mean LTR sequence identity in the genome. The input genome fasta should have short alphanumeric IDs without comments

01000

log lai_out versions

lai:

Assessing genome assembly quality using the LTR Assembly Index (LAI)

MaxBin is software for clustering metagenomic contigs

0123

binned_fastas summary abundance log marker_counts unbinned_fasta tooshort_fasta marker_bins marker_genes versions

A tool to create consensus sequences and variant calls from nanopore sequencing data

012

assembly versions

An ultra-fast metagenomic assembler for large and complex metagenomics

012

contigs k_contigs addi_contigs local_contigs kfinal_contigs log versions

pigz:

Parallel implementation of the gzip algorithm.

Compare k-mer frequency in reads and assembly to devise the metrics K and QV

0101000

hist log_stderr versions

merfin:

Merfin (k-mer based finishing tool) is a suite of subtools for variant filtering, assembly evaluation and polishing via k-mer validation. The subtool -hist estimates the QV (quality value of Merqury) for each scaffold/contig and genome-wide averages. In addition, Merfin produces a QV* estimate, which also accounts for k-mers that are seen in excess with respect to their expected multiplicity predicted from the reads.

k-mer based assembly evaluation.

meta meryl_db assembly

meta versions assembly_only_kmers_bed assembly_only_kmers_wig stats dist_hist spectra_cn_fl_png spectra_cn_ln_png spectra_cn_st_png spectra_cn_hist spectra_asm_fl_png spectra_asm_ln_png spectra_asm_st_png spectra_asm_hist assembly_qv scaffold_qv read_ploidy

k-mer based assembly evaluation.

012

assembly_only_kmers_bed assembly_only_kmers_wig stats dist_hist spectra_cn_fl_png spectra_cn_hist spectra_cn_ln_png spectra_cn_st_png spectra_asm_fl_png spectra_asm_hist spectra_asm_ln_png spectra_asm_st_png assembly_qv scaffold_qv read_ploidy hapmers_blob_png versions

merqury:

Evaluate genome assemblies with k-mers and more.

Produces maternal and paternal FastK kmer tables from maternal, paternal and child FastK tables

010101

mat_hap_ktab pat_hap_ktab versions

merquryfk:

FastK based version of Merqury

FastK based version of Merqury

012340101

stats bed assembly_qv spectra_cn_fl spectra_cn_ln spectra_cn_st qv spectra_asm_fl spectra_asm_ln spectra_asm_st phased_block_bed phased_block_stats continuity_N block_N block_blob hapmers_blob versions

merquryfk:

FastK based version of Merqury

Depth computation per contig step of metabat2

012

depth versions

metabat2:

Metagenome binning

Metagenome binning of contigs

012

tooshort lowdepth unbinned membership fasta versions

metabat2:

Metagenome binning

Metagenome assembler for long-read sequences (HiFi and ONT).

010

contigs log versions

metamdbg:

MetaMDBG: a lightweight assembler for long and accurate metagenomics reads.

A very fast OLC-based de novo assembler for noisy long reads

012

gfa assembly versions

A Python workflow that assembles mitogenomes from PacBio HiFi reads

010000

fasta stats gb gff all_potential_contigs contigs_annotations contigs_circularization contigs_filtering coverage_mapping coverage_plot final_mitogenome_annotation final_mitogenome_choice final_mitogenome_coverage potential_contigs reads_mapping_and_assembly shared_genes versions

mitohifi.py:

A Python workflow that assembles mitogenomes from PacBio HiFi reads

Run Torsten Seemann's classic MLST on a genome assembly

01

tsv versions

A tool to quickly download assemblies from NCBI's Assembly database

0000

gbk fna rm features gff faa gpff wgs_gbk cds rna rna_fna report stats versions

NCBI tool for detecting vector contamination in nucleic acid sequences. This tool is older than NCBI's FCS-adaptor, which serves the same purpose.

0101

vecscreen_output versions

ncbitools:

NCBI libraries for biology applications (text-based utilities)

An nf-core module for the OATK

010123401234

mito_fasta pltd_fasta mito_bed pltd_bed mito_gfa pltd_gfa annot_mito_txt annot_pltd_txt clean_gfa final_gfa initial_gfa multiplex_gfa unzip_gfa versions

Serogroup Pseudomonas aeruginosa assemblies

01

tsv blast details versions

Assign PBP type of Streptococcus pneumoniae assemblies

010

tsv blast versions

Lifts over a VCF file from one reference build to another.

01010101

vcf_lifted vcf_unlifted versions

picard:

Move annotations from one assembly to another

Automatically improve draft assemblies and find variation among strains, including large event detection

010120

improved_assembly vcf change_record tracks_bed tracks_wig versions

Assembles bacterial plasmids

010

html tab images logs data database fasta_files kmer versions

Polishing genome assemblies with short reads.

01010

fasta versions debug

polypolish:

Polishing genome assemblies with short reads.

Calculate coverage cutoffs to determine when to purge duplicated sequence.

01

cutoff log versions

purgedups:

Purge_dups is a package used to purge haplotigs and overlaps in an assembly based on read depth

Separates out sequences purged of falsely duplicated sequences.

012

haplotigs purged versions

purgedups:

Purge_dups is a package used to purge haplotigs and overlaps in an assembly based on read depth

Plots the read coverage from a purge dups statistics file and cutoffs.

012

png versions

purgedups:

Purge_dups is a package used to purge haplotigs and overlaps in an assembly based on read depth

Create a read depth histogram and base-level read depth for an assembly based on PacBio data

01

stat basecov versions

purgedups:

Purge_dups is a package used to purge haplotigs and overlaps in an assembly based on read depth

Purge haplotigs and overlaps for an assembly

0123

bed log versions

purgedups:

Purge_dups is a package used to purge haplotigs and overlaps in an assembly based on read depth

Split fasta file by 'N's to aid in self alignment for duplicate purging

01

split_fasta versions

purgedups:

Purge_dups is a package used to purge haplotigs and overlaps in an assembly based on read depth
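To make the "split by 'N's" step concrete, here is a minimal sketch that cuts one sequence at every run of `N` and keeps the non-gap pieces. It is not the purge_dups implementation; the function name `split_on_n_runs` and the `name:start-end` naming of the pieces (1-based, inclusive) are assumptions chosen for illustration.

```python
import re


def split_on_n_runs(name: str, sequence: str) -> list[tuple[str, str]]:
    """Split a sequence at runs of N/n and return (piece_name, piece_sequence) pairs."""
    pieces = []
    # Each match is a maximal stretch of non-N bases, i.e. one gap-free piece.
    for match in re.finditer(r"[^Nn]+", sequence):
        start, end = match.start(), match.end()
        # Hypothetical naming scheme: original name plus 1-based inclusive coordinates.
        pieces.append((f"{name}:{start + 1}-{end}", match.group()))
    return pieces


pieces = split_on_n_runs("scaf1", "ACGTNNNNGGCCNAT")
for piece_name, piece_seq in pieces:
    print(f">{piece_name}\n{piece_seq}")
```

Keeping the original coordinates in each piece name makes it possible to map self-alignment hits back onto the unsplit scaffold.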

Damage parameter estimation for ancient DNA

012

csv versions

pydamage:

Damage parameter estimation for ancient DNA

Damage parameter estimation for ancient DNA

01

csv versions

pydamage:

Damage parameter estimation for ancient DNA

Quality Assessment Tool for Genome Assemblies

010101

results tsv transcriptome misassemblies unaligned versions

Consensus module for raw de novo DNA assembly of long uncorrected reads

0123

improved_assembly versions

Homology-based assembly patching: Make continuous joins and fill gaps in 'target.fa' using sequences from 'query.fa'

01010101

patch_fasta patch_agp patch_components_fasta assembly_alignments target_splits_agp target_splits_fasta qry_rename_agp qry_rename_fasta stderr versions

ragtag:

Fast reference-guided genome assembly scaffolding

Scaffolding is the process of ordering and orienting draft assembly (query) sequences into longer sequences. Gaps (stretches of "N" characters) are placed between adjacent query sequences to indicate the presence of unknown sequence. RagTag uses whole-genome alignments to a reference assembly to scaffold query sequences. RagTag does not alter input query sequence in any way and only orders and orients sequences, joining them with gaps.

010101012

corrected_assembly corrected_agp corrected_stats versions

ragtag:

Fast reference-guided genome assembly scaffolding
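The scaffolding description above (order and orient query sequences, then join them with gaps of "N") can be sketched in a few lines. This is an illustration only, not RagTag's code: the placement list is given by hand rather than derived from whole-genome alignments, and the names `join_scaffold`, `placements` and the default `gap_size` of 100 are assumptions.

```python
# Translation table for complementing DNA, including N and lowercase bases.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def join_scaffold(placements, contigs, gap_size=100):
    """Join contigs in the given order and orientation, separated by runs of N.

    placements: list of (contig_name, strand) with strand "+" or "-".
    contigs:    dict mapping contig_name to its sequence (left unmodified).
    """
    oriented = []
    for name, strand in placements:
        seq = contigs[name]
        oriented.append(seq if strand == "+" else reverse_complement(seq))
    return ("N" * gap_size).join(oriented)


# Hypothetical example: contig "b" is placed first on the reverse strand, then "a".
contigs = {"a": "AAC", "b": "GGT"}
scaffold = join_scaffold([("b", "-"), ("a", "+")], contigs, gap_size=2)
print(scaffold)
```

The query bases themselves are never edited; only their order, orientation and the gaps between them change, which matches the behaviour described for scaffolding.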

De novo genome assembler for long uncorrected reads.

01

fasta gfa versions

SALSA, a tool to scaffold long-read assemblies with Hi-C

0120000

fasta agp agp_original_coordinates versions

Metagenomic binning with self-supervised learning

012

csv model output_fasta recluster_fasta tsv versions

semibin:

Metagenomic binning with semi-supervised siamese neural network

The goal of the Shasta long read assembler is to rapidly produce accurate assembled sequence using DNA reads generated by Oxford Nanopore flow cells as input. Please note that the assembler is designed to focus on speed, so assembly may be considered somewhat non-deterministic: the final assembly may vary across executions. See https://github.com/chanzuckerberg/shasta/issues/296.

01

assembly gfa results versions

Assemble bacterial isolate genomes from Illumina paired-end reads

01

contigs corrections log raw_contigs gfa versions

Assembles a small genome (bacterial, fungal, viral)

012300

scaffolds contigs transcripts gene_clusters gfa warnings log versions

Merges the annotation gtf file and the stringtie output gtf files

00

gtf versions

stringtie2:

Transcript assembly and quantification for RNA-Seq

Transcript assembly and quantification for RNA-Seq

010

transcript_gtf abundance coverage_gtf ballgown versions

stringtie2:

Transcript assembly and quantification for RNA-Seq

SvABA is an efficient and accurate method for detecting SVs from short-read sequencing data using genome-wide local assembly with low memory and computing requirements

01234010101010101

sv indel germ_indel germ_sv som_indel som_sv unfiltered_sv unfiltered_indel unfiltered_germ_indel unfiltered_germ_sv unfiltered_som_indel unfiltered_som_sv raw_calls discordants log versions

TransDecoder identifies candidate coding regions within transcript sequences. It is used to build a GFF file.

01

pep gff3 cds dat folder versions

transdecoder:

TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks.

TransDecoder identifies candidate coding regions within transcript sequences. It is used to build a GFF file. You can use this module after transdecoder_longorf.

010

pep gff3 cds bed versions

transdecoder:

TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks.

Assembles a de novo transcriptome from RNAseq reads

01

transcript_fasta log versions

Convert between genome builds

010

lifted unlifted versions

ucsc:

Move annotations from one assembly to another

Assembles bacterial genomes

012

scaffolds gfa log versions

Performs assembly scaffolding using YaHS

0100

scaffolds_fasta scaffolds_agp binary versions

A tool to build a k-mer hash table for FASTA and FASTQ files

01

yak versions

yak:

Yet another k-mer analyzer
