Available Modules

Modules are the building blocks of all DSL2 nf-core pipelines. If you would like to write your own module, you can find more information on the nf-core website.

Extracts reads mapped to chromosome 6 and any HLA decoys or chromosome 6 alternates.

01

extracted_reads_fastq log intermediate_sam intermediate_bam intermediate_sorted_bam versions

arcashla:

arcasHLA performs high resolution genotyping for HLA class I and class II genes from RNA sequencing, supporting both paired and single-end samples.

Removes unused references from the header of sorted BAM/CRAM files.

01

bam versions

Calculates per-scaffold or per-base coverage information from an unsorted SAM or BAM file.

01

covstats hist versions

bbmap:

BBMap is a short read aligner, as well as various other bioinformatic tools.

Sorts VCF files

01

vcf tbi csi versions

sort:

Sort VCF files by coordinates.
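
As a rough illustration of what sorting a VCF "by coordinates" means, the sketch below orders records by contig and then by numeric position. It is a minimal in-memory example, not the module's implementation: the function name `sort_vcf_lines` is hypothetical, and taking the contig order from the `##contig` header lines (with unlisted contigs placed last, alphabetically) is an assumption made for this example.

```python
def sort_vcf_lines(lines):
    """Return header lines followed by records sorted by (contig, position).

    Assumption for this sketch: contig order follows the ##contig header
    lines; contigs missing from the header sort last, alphabetically.
    """
    header = [line for line in lines if line.startswith("#")]
    records = [line for line in lines if line and not line.startswith("#")]

    # Build contig -> rank from header lines such as ##contig=<ID=chr1,length=...>
    prefix = "##contig=<ID="
    contig_order = {}
    for line in header:
        if line.startswith(prefix):
            contig_id = line[len(prefix):].split(",")[0].rstrip(">\n")
            contig_order.setdefault(contig_id, len(contig_order))

    def sort_key(record):
        chrom, pos = record.split("\t", 2)[:2]
        # POS is compared numerically, so position 3 sorts before position 10.
        return (contig_order.get(chrom, len(contig_order)), chrom, int(pos))

    return header + sorted(records, key=sort_key)
```

Comparing positions as integers matters here: a plain text sort would place position 10 before position 3.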

Splits a VCF file into one file per chromosome.

012

split_vcf versions

bcftools:

Sort VCF files by coordinates.

Converts GTF format to BED format.

01

bed versions

gtf2bed:

The gtf2bed script converts 1-based, closed [start, end] Gene Transfer Format v2.2 (GTF2.2) to sorted, 0-based, half-open [start-1, end) extended BED-formatted data.
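
The coordinate change described above can be sketched in a few lines. This is only an illustration of the 1-based closed to 0-based half-open conversion, not the gtf2bed script itself: the function names are hypothetical, and the six-column output layout is a simplification chosen for this example rather than the extended BED layout the real tool writes.

```python
def gtf_interval_to_bed(start, end):
    """Convert a 1-based, closed [start, end] interval to 0-based, half-open [start-1, end)."""
    if start < 1 or end < start:
        raise ValueError(f"invalid 1-based closed interval: [{start}, {end}]")
    # Only the start shifts; the end is unchanged because the half-open
    # end coordinate is exclusive.
    return start - 1, end


def gtf_line_to_bed_fields(line):
    """Turn one tab-separated GTF line into simplified BED-like fields.

    Assumed output layout for this sketch:
    chrom, start, end, feature, score, strand.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, _source, feature, start, end, score, strand = fields[:7]
    bed_start, bed_end = gtf_interval_to_bed(int(start), int(end))
    return [chrom, str(bed_start), str(bed_end), feature, score, strand]
```

The interval length is preserved by the conversion: `end - start + 1` in GTF coordinates equals `bed_end - bed_start` in BED coordinates.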

Identifies common intervals among multiple (and subsets thereof) sorted BED/GFF/VCF files.

010

bed versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

Sorts a feature file by chromosome and other criteria.

010

sorted versions

bedtools:

A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.

Merges a list of sorted BAM files.

01

bam bam_index checksum versions

biobambam:

biobambam is a set of tools for early stage alignment file processing.

Parallel sorting and duplicate marking

0101

bam bam_index cram metrics versions

biobambam:

biobambam is a set of tools for early stage alignment file processing.

A fast, compact one-liner to produce duplicate-marked, sorted, and indexed BAM files using Biscuit

010101

bam bai versions

biscuit:

A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data

samblaster:

samblaster is a fast and flexible program for marking duplicates in read-id grouped paired-end SAM files. It can also optionally output discordant read pairs and/or split read mappings to separate SAM files, and/or unmapped/clipped reads to a separate FASTQ file. By default, samblaster reads SAM input from stdin and writes SAM to stdout.

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

A method to improve mappings on circular genomes, using the BWA mapper.

010101

fasta elongated versions

circulargenerator:

Creates a modified reference genome, elongated by a specified number of bases

binning of metagenomic sequences

01

fasta bins fm index links result versions

ClipKIT is a fast and flexible alignment trimming tool that keeps phylogenetically informative sites and removes those that display characteristics of poor phylogenetic signal.

010

clipkit log versions

remove false positives of functional crispr genomics due to CNVs

01200

norm_count_file versions

crisprcleanr:

Analysis of CRISPR functional genomics, remove false positive due to CNVs.

Annotate a VEP annotated VCF with the most severe consequence field

0101

vcf versions

custom:

Custom module to annotate a VEP annotated VCF with the most severe consequence field

Annotate a VEP annotated VCF with the most severe pLi field

01

vcf versions

custom:

Custom module to annotate a VEP annotated VCF with the most severe pLi field

SV callers like lumpy look at split-reads and pair distances to find structural variants. This tool is a fast way to add depth information to those calls. This can be used as additional information for filtering variants; for example we will be skeptical of deletion calls that do not have lower than average coverage compared to regions with similar gc-content.

01234500

vcf versions

Filter, sort and markdup sam/bam files, with optional BQSR and variant calling.

012345601010100000

bam logs metrics recall gvcf table activity_profile assembly_regions versions

elprep:

elPrep is a high-performance tool for preparing .sam/.bam files for variant calling in sequencing pipelines. It can be used as a drop-in replacement for SAMtools/Picard/GATK4.

Groups reads together that appear to have come from the same original molecule. Reads are grouped by template, and then templates are sorted by the 5' mapping positions of the reads from the template, used from earliest mapping position to latest. Reads that have the same end positions are then sub-grouped by UMI sequence. (!) Note: the MQ tag is required on reads with mapped mates (!) This can be added using samblaster with the optional argument --addMateTags.

010

bam histogram versions

fgbio:

A set of tools for working with genomic and high throughput sequencing data, including UMIs
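The grouping logic in the description above (sort templates by 5' positions, then sub-group by UMI) can be sketched as follows. This is a simplified illustration, not fgbio's implementation: the `Template` record and the function name are hypothetical, and UMIs are compared by exact match only, with no error tolerance.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Template(NamedTuple):
    """Hypothetical summary of one read pair (template)."""
    name: str
    chrom: str
    start5: int  # 5' mapping position of the first read
    end5: int    # 5' mapping position of its mate
    umi: str


GroupKey = Tuple[str, int, int, str]


def group_by_position_and_umi(templates: Iterable[Template]) -> Dict[GroupKey, List[str]]:
    """Sort templates by 5' positions, then sub-group identical positions by UMI."""
    groups: Dict[GroupKey, List[str]] = defaultdict(list)
    ordered = sorted(templates, key=lambda t: (t.chrom, t.start5, t.end5))
    for t in ordered:
        # Exact UMI match only; mismatch-tolerant assignment is not modelled here.
        groups[(t.chrom, t.start5, t.end5, t.umi)].append(t.name)
    return dict(groups)


if __name__ == "__main__":
    demo = [
        Template("r1", "chr1", 100, 300, "AAC"),
        Template("r2", "chr1", 100, 300, "AAC"),
        Template("r3", "chr1", 100, 300, "GGT"),
        Template("r4", "chr1", 50, 250, "AAC"),
    ]
    for key, names in group_by_position_and_umi(demo).items():
        print(key, names)
```

Templates r1 and r2 share both positions and UMI and therefore land in one group, while r3 has the same positions but a different UMI and forms its own.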

Sorts a SAM or BAM file. Several sort orders are available, including coordinate, queryname, random, and randomquery.

01

bam versions

fgbio:

Tools for working with genomic and high throughput sequencing data.

Perform merging of mate paired-end sequencing reads

01

merged notcombined histogram versions

Grafts query sequences from phylogenetic placement on the reference tree

01

newick versions

gappa:

Genesis Applications for Phylogenetic Placement Analysis

Generate recalibration table for Base Quality Score Recalibration (BQSR)

01230101010101

table versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Generate recalibration table for Base Quality Score Recalibration (BQSR)

meta input input_index intervals fasta fai dict known_sites known_sites_tbi

meta versions table

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

write your description here

010

table versions

gatk4:

Genome Analysis Toolkit (GATK4)

Splits the interval list file into unique, equally-sized interval files and places them in a directory

01

interval_list versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.

0100

cram bam crai bai metrics versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.

meta bam fasta fai dict

meta versions output bam_index

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Generate recalibration table for Base Quality Score Recalibration (BQSR)

012300000

table versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.

01000

output bam_index metrics versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Writes a sorted concatenation of file/s

01

sorted versions

sort:

Writes a sorted concatenation of file/s

Sort GTF files in chr/pos/feature order

0

gtf versions

pacbio structural variant calling tool

01201201

vcf csv versions

Reformats sequence files; see the HMMER documentation for details. The module requires that the format is specified in ext.args in a config file, and that this comes last. See the tool's help for possible values.

01

seqreformated versions

hmmer:

Biosequence analysis using profile hidden Markov models

mageck count for functional genomics, reads are usually mapped to a specific sgRNA

010

count norm versions

mageck:

MAGeCK (Model-based Analysis of Genome-wide CRISPR-Cas9 Knockout), an algorithm to process, QC, analyze and visualize CRISPR screening data.

maximum-likelihood analysis of gene essentialities computation

010

gene_summary sgrna_summary versions

mageck:

MAGeCK (Model-based Analysis of Genome-wide CRISPR-Cas9 Knockout), an algorithm to process, QC, analyze and visualize CRISPR screening data.

Mageck test performs a robust ranking aggregation (RRA) to identify positively or negatively selected genes in functional genomics screens.

01

gene_summary sgrna_summary r_script versions

mageck:

MAGeCK (Model-based Analysis of Genome-wide CRISPR-Cas9 Knockout), an algorithm to process, QC, analyze and visualize CRISPR screening data.

Run standard proteomics data analysis with MaxQuant, mostly dedicated to label-free. Paths to fasta and raw files need to be marked by "PLACEHOLDER"

0120

maxquant_txt versions

maxquant:

MaxQuant is a quantitative proteomics software package designed for analyzing large mass-spectrometric data sets. License restricted.

Depth computation per contig step of metabat2

012

depth versions

metabat2:

Metagenome binning

Metagenome binning of contigs

012

tooshort lowdepth unbinned membership fasta versions

metabat2:

Metagenome binning

Merging paired-end reads and removing sequencing adapters.

01

merged_reads unstitched_read1 unstitched_read2 versions

Establish 2D layouts of the graph using path-guided stochastic gradient descent. The graph must be sorted and id-compacted.

01

lay tsv versions

odgi:

An optimized dynamic genome/graph implementation

Apply different kinds of sorting algorithms to a graph. The most prominent one is the PG-SGD sorting algorithm.

01

sorted_graph versions

odgi:

An optimized dynamic genome/graph implementation

Sort a .pairs/.pairsam file

01

sorted versions

pairtools:

CLI tools to process mapped Hi-C data

NVIDIA Clara Parabricks GPU-accelerated alignment, sorting, BQSR calculation, and duplicate marking. Note this nf-core module requires files to be copied into the working directory and not symlinked.

01010101010

bam bai cram crai bqsr_table qc_metrics duplicate_metrics versions

parabricks:

NVIDIA Clara Parabricks GPU-accelerated genomics tools

NVIDIA Clara Parabricks GPU-accelerated fast, accurate algorithm for mapping methylated DNA sequence reads to a reference genome, performing local alignment, and producing alignments for different parts of the query sequence

0101010

bam bai qc_metrics bqsr_table duplicate_metrics versions

parabricks:

NVIDIA Clara Parabricks GPU-accelerated genomics tools

Paraclu finds clusters in data attached to sequences.

010

bed versions

This tool takes in a coordinate-sorted SAM or BAM and calculates the NM, MD, and UQ tags by comparing with the reference.

0101

bam bai versions

picard:

Java tools for working with NGS data in the BAM format

Sorts BAM/SAM files based on a variety of picard specific criteria

010

bam versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Sorts vcf files

010101

vcf versions

picard:

Java tools for working with NGS data in the BAM/CRAM/SAM and VCF format

Create read depth histogram and base-level read depth for an assembly based on pacbio data

01

stat basecov versions

purgedups:

Purge_dups is a package used to purge haplotigs and overlaps in an assembly based on read depth

Identify, orient and trim nanopore cDNA reads

01

fastq versions

gzip:

Gzip reduces the size of the named files using Lempel-Ziv coding (LZ77).

Pyrodigal is a Python module that provides bindings to Prodigal, a fast, reliable protein-coding gene prediction for prokaryotic genomes.

010

annotations fna faa score versions

Calculation of optimal P-site offsets, diagnostic analysis and visual inspection of ribosome profiling data

010101

best_offset offset offset_plot psites codon_coverage_rpf codon_coverage_psite cds_coverage cds_window_coverage ribowaltz_qc versions

This module combines samtools and samblaster in order to use samblaster's ability to filter or tag SAM files while keeping both input and output in BAM format. Samblaster input must contain a sequence header, so the input is piped through the "samtools view -h" command. Additional arguments for samtools can be passed using options.args2 for the input bam file and options.args3 for the output bam file.

01

bam versions

mark duplicate alignments in a coordinate sorted file

0101

bam cram sam versions

samtools:

Tools for dealing with SAM, BAM and CRAM files

Collate/Fixmate/Sort/Markdup SAM/BAM/CRAM file

0101

bam cram csi crai metrics versions

samtools_cat:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_collate:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_fixmate:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_sort:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

samtools_markdup:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Sort SAM/BAM/CRAM file

0101

bam cram crai csi versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

SCIMAP is a suite of tools that enables spatial single-cell analyses

01

csv h5ad versions

scimap:

Scimap is a scalable toolkit for analyzing spatial molecular data.

Sorts sequences by id/name/sequence/length

01

fastx versions

seqkit:

A cross-platform and ultrafast toolkit for FASTA/Q file manipulation

Local sequence alignment tool for filtering, mapping and clustering.

010101

reads log index versions

SortMeRNA:

The core algorithm is based on approximate seeds and allows for sensitive analysis of NGS reads. The main application of SortMeRNA is filtering rRNA from metatranscriptomic data. SortMeRNA takes as input files of reads (fasta, fastq, fasta.gz, fastq.gz) and one or multiple rRNA database file(s), and sorts apart aligned and rejected reads into two files. Additional applications include clustering and taxonomy assignation available through QIIME v1.9.1. SortMeRNA works with Illumina, Ion Torrent and PacBio data, and can produce SAM and BLAST-like alignments.

Short Read Sequence Typing for Bacterial Pathogens is a program designed to take Illumina sequence data, a MLST database and/or a database of gene sequences (e.g. resistance genes, virulence genes, etc) and report the presence of STs and/or reference genes.

012

gene_results fullgene_results mlst_results pileup sorted_bam versions

srst2:

Short Read Sequence Typing for Bacterial Pathogens

Align reads to a reference genome using STAR

010101000

log_final log_out log_progress versions bam bam_sorted bam_sorted_aligned bam_transcript bam_unsorted fastq tab spl_junc_tab read_per_gene_tab junction sam wig bedgraph

star:

STAR is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

bgzip a sorted tab-delimited genome file and then create tabix index

01

gz_tbi gz_csi versions

tabix:

Generic indexer for TAB-delimited genome position files.

create tabix index from a sorted bgzip tab-delimited genome file

01

tbi csi versions

tabix:

Generic indexer for TAB-delimited genome position files.

Run TRUST4 on RNA-seq data

01201010101

tsv airr_files airr_tsv report_tsv fasta out fq outs versions

A set of tools written in Perl and C++ for working with VCF files

0100

vcf bcf frq frq_count idepth ldepth ldepth_mean gdepth hap_ld geno_ld geno_chisq list_hap_ld list_geno_ld interchrom_hap_ld interchrom_geno_ld tstv tstv_summary tstv_count tstv_qual filter_summary sites_pi windowed_pi weir_fst heterozygosity hwe tajima_d freq_burden lroh relatedness relatedness2 lqual missing_individual missing_site snp_density kept_sites removed_sites singeltons indel_hist hapcount mendel format info genotypes_matrix genotypes_matrix_individual genotypes_matrix_position impute_hap impute_hap_legend impute_hap_indv ldhat_sites ldhat_locs beagle_gl beagle_pl ped map_ tped tfam diff_sites_in_files diff_indv_in_files diff_sites diff_indv diff_discd_matrix diff_switch_error versions

Velocyto is a library for the analysis of RNA velocity. The velocyto.py CLI uses Path(resolve_path=True), which breaks the Nextflow logic of symbolic links. If velocyto finds a file in the work dir named EXACTLY cellsorted_[ORIGINAL_BAM_NAME], it skips the samtools sort step. The cellsorted bam file should be cell sorted with:

    samtools sort -t CB -O BAM -o cellsorted_input.bam input.bam

See module test for an example with the SAMTOOLS_SORT nf-core module. Config example to cellsort input bam using SAMTOOLS_SORT:

    withName: SAMTOOLS_SORT {
        ext.prefix = { "cellsorted_${bam.baseName}" }
        ext.args = '-t CB -O BAM'
    }

An optional mask must be passed through ext.args using the --mask option. Because velocyto looks for the cellsorted file alongside the original, two bam files (cellsorted and original) have to be staged in the work dir. See also the velocyto tutorial.
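The naming rule above is easy to get wrong, so here is a small sketch of the file name the description says must match exactly. Both helper names are hypothetical, and the check only mirrors the rule as stated in this description, not velocyto's own code.

```python
from pathlib import Path


def cellsorted_name(bam_path: str) -> str:
    """Return the file name expected for the pre-sorted BAM: cellsorted_<original name>."""
    return f"cellsorted_{Path(bam_path).name}"


def has_cellsorted_sibling(bam_path: str) -> bool:
    """True if a file with the expected name sits in the same directory as the BAM."""
    bam = Path(bam_path)
    return (bam.parent / cellsorted_name(bam_path)).is_file()


if __name__ == "__main__":
    print(cellsorted_name("work/sample1.bam"))
```

With the config shown earlier, an input named `sample1.bam` has the base name `sample1`, so the prefix becomes `cellsorted_sample1`, which matches the name computed here once the `.bam` extension is added.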

01230

loom versions

Cluster sequences using a single-pass, greedy centroid-based clustering algorithm.

01

aln biom mothur otu bam out blast uc centroids clusters profile msa versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)
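To make "single-pass, greedy centroid-based clustering" concrete, here is a minimal sketch of the general idea. It is not VSEARCH's implementation: the `identity` function is a deliberately crude, position-by-position stand-in for alignment-based identity, and all names are hypothetical.

```python
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Crude stand-in for pairwise identity: matching positions over the longer length."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def greedy_centroid_clustering(seqs: List[str], threshold: float) -> Dict[str, List[str]]:
    """Assign each sequence to the first centroid it matches, else make it a new centroid."""
    clusters: Dict[str, List[str]] = {}
    for seq in seqs:  # single pass over the input, in the order given
        for centroid in clusters:
            if identity(seq, centroid) >= threshold:
                clusters[centroid].append(seq)
                break
        else:
            # No existing centroid is similar enough: this sequence seeds a new cluster.
            clusters[seq] = [seq]
    return clusters


if __name__ == "__main__":
    reads = ["ACGTACGT", "ACGTACGA", "TTTTTTTT", "TTTTTTTA"]
    for centroid, members in greedy_centroid_clustering(reads, 0.8).items():
        print(centroid, members)
```

Because each sequence is compared only against existing centroids and never revisited, the result depends on input order, which is why such tools usually process sequences in a defined sort order.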

Performs quality filtering and / or conversion of a FASTQ file to FASTA format.

01

fasta log versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Taxonomic classification using the sintax algorithm.

010

tsv versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Sort fasta entries by decreasing abundance (--sortbysize) or sequence length (--sortbylength).

010

fasta versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)
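The two sort orders named above can be sketched in a few lines. This is only an illustration under stated assumptions: abundance is read from a `;size=N` annotation in the header, a missing annotation is treated as abundance 1, decreasing length is assumed for the length sort, and ties keep their input order, which may differ from the tool's tie-breaking. The function names are hypothetical.

```python
import re
from typing import List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)

_SIZE = re.compile(r"(?:^|;)size=(\d+)")


def abundance(header: str) -> int:
    """Read the ';size=N' annotation; assume abundance 1 when it is absent."""
    match = _SIZE.search(header)
    return int(match.group(1)) if match else 1


def sort_by_size(records: List[Record]) -> List[Record]:
    """Order records by decreasing abundance."""
    return sorted(records, key=lambda r: abundance(r[0]), reverse=True)


def sort_by_length(records: List[Record]) -> List[Record]:
    """Order records by decreasing sequence length."""
    return sorted(records, key=lambda r: len(r[1]), reverse=True)


if __name__ == "__main__":
    recs = [("a;size=2", "ACGT"), ("b;size=10", "AC"), ("c", "ACGTAC")]
    print([h for h, _ in sort_by_size(recs)])
    print([h for h, _ in sort_by_length(recs)])
```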

Compare target sequences to fasta-formatted query sequences using global pairwise alignment.

010000

aln biom lca mothur otu sam tsv txt uc versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Fast lightweight accurate xenograft sorting

00000

hash info versions

xengsort:

A fast xenograft read sorter based on space-efficient k-mer hashing
