Available Modules

Modules are the building blocks of all DSL2 nf-core pipelines. If you would like to write your own module, you can find more information on the nf-core website.

ADMIXTURE is a program for estimating ancestry in a model-based manner from large autosomal SNP genotype datasets, where the individuals are unrelated (for example, the individuals in a case-control association study).

Outputs: ancestry_fractions, allele_frequencies, versions

A tool to parse and summarise results from antimicrobial peptide tools and present a functional classification.

Outputs: sample_dir, txt, csv, faa, summary_csv, summary_html, log, results_db, results_db_dmnd, results_db_fasta, results_db_tsv, versions

A submodule that clusters the merged AMP hits generated from ampcombi2/parsetables and ampcombi2/complete using MMseqs2 cluster.

Outputs: cluster_tsv, rep_cluster_tsv, log, versions

ampcombi2/cluster:

A tool for clustering all AMP hits found across many samples and supporting many AMP prediction tools.

A submodule that merges all output summary tables from ampcombi2/parsetables into one summary file.

Outputs: tsv, log, versions

ampcombi2/complete:

Merges the per-sample AMPcombi summaries generated by running 'ampcombi2/parsetables'.
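The merge step described above can be pictured with a minimal Python sketch. It is not AMPcombi's implementation: the function name `merge_summaries`, the added `sample` column, and the toy column names are assumptions made purely for illustration.

```python
import csv
import io


def merge_summaries(tables):
    """Concatenate per-sample TSV tables (given as text) into one TSV string.

    `tables` maps a sample name to the TSV text of that sample's summary.
    A hypothetical `sample` column records where each row came from, and
    columns missing from a sample are left empty in the merged table.
    """
    rows = []
    columns = ["sample"]
    for sample, text in tables.items():
        reader = csv.DictReader(io.StringIO(text), delimiter="\t")
        # Collect the union of all column names, in first-seen order.
        for name in reader.fieldnames or []:
            if name not in columns:
                columns.append(name)
        for row in reader:
            row["sample"] = sample
            rows.append(row)

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=columns, delimiter="\t", restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    demo = {
        "s1": "id\tscore\nA\t0.9\n",
        "s2": "id\ttool\nB\tampir\n",
    }
    print(merge_summaries(demo), end="")
```

Taking the union of columns rather than requiring identical headers keeps the sketch tolerant of samples that were processed with different prediction tools.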

A submodule that parses and standardizes the results from various antimicrobial peptide identification tools.

Outputs: sample_dir, contig_gbks, db_tsv, tsv, faa, sample_log, full_log, db, db_txt, db_fasta, db_mmseqs, versions

ampcombi2/parsetables:

A parsing tool to convert and summarise the outputs from multiple AMP detection tools in a standardized format.

A fast and user-friendly method to predict antimicrobial peptides (AMPs) from protein datasets of any size. ampir uses a supervised statistical machine learning approach to predict AMPs.

Outputs: amps_faa, amps_tsv, versions

AMPlify is an attentive deep learning model for antimicrobial peptide prediction.

Outputs: tsv, versions

amplify:

Attentive deep learning model for antimicrobial peptide prediction

Post-processing script of the MaltExtract component of the HOPS package

000

json summary_pdf tsv candidate_pdfs versions

Calculates base frequency statistics across reference positions from BAM.

0123

depth_sample depth_global qs pos counts icounts versions

angsd:

ANGSD: Analysis of next generation Sequencing Data

Extracts reads mapped to chromosome 6 and any HLA decoys or chromosome 6 alternates.

01

extracted_reads_fastq log intermediate_sam intermediate_bam intermediate_sorted_bam versions

arcashla:

arcasHLA performs high resolution genotyping for HLA class I and class II genes from RNA sequencing, supporting both paired and single-end samples.

Demultiplex Element Biosciences bases files

012

sample_fastq sample_json qc_report run_stats generated_run_manifest metrics unassigned versions

BBNorm is designed to normalize coverage by down-sampling reads over high-depth areas of a genome, to result in a flat coverage distribution.

01

fastq log versions

bbmap:

BBMap is a short read aligner, as well as various other bioinformatic tools.

Converts certain output formats to VCF

012010

vcf_gz vcf bcf_gz bcf hap legend samples tbi csi versions

bcftools:

BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed. Most commands accept VCF, bgzipped VCF and BCF with filetype detected automatically even when streaming from a pipe. Indexed VCF and BCF will work in all situations. Un-indexed VCF and BCF and streams will work in most, but not all situations.

Sets genotypes according to the specified criteria and filtering expressions. For example, missing genotypes can be set to ref, but much more than that.

0120000

vcf tbi csi versions

bcftools:

BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed. Most commands accept VCF, bgzipped VCF and BCF with filetype detected automatically even when streaming from a pipe. Indexed VCF and BCF will work in all situations. Un-indexed VCF and BCF and streams will work in most, but not all situations.

bcftools plugin setGT:

Bcftools plugins are tools that can be used with bcftools to manipulate variant calls in Variant Call Format (VCF) and BCF. The setGT plugin sets genotypes according to the specified criteria and filtering expressions. For example, missing genotypes can be set to ref, but much more than that.

Split VCF by sample, creating single- or multi-sample VCFs.

0120000

vcf tbi csi versions

pluginsplit:

Split VCF by sample, creating single- or multi-sample VCFs.

Reheader a VCF file

012301

vcf index versions

reheader:

Modify header of VCF/BCF files, change sample names.

Uses Bismark report files of several samples in a run folder to generate a graphical summary HTML report.

00000

summary versions

bismark:

Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.

Re-estimate taxonomic abundance of metagenomic samples analyzed by kraken.

010

reports txt versions

bracken:

Bracken (Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.

Extends a Kraken2 database to be compatible with Bracken

01

db bracken_files versions

bracken:

Bracken (Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.

Combine output of metagenomic samples analyzed by bracken.

01

txt versions

bracken:

Bracken (Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.

Convert paired-end bwa SA coordinate files to SAM format

01201

bam versions

bwa:

BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome.

Cellsnp-lite is a C/C++ tool for efficient genotyping of bi-allelic SNPs on single cells. You can use mode A of cellsnp-lite after read alignment to obtain the SNP x cell pileup UMI or read count matrices for each allele of given or detected SNPs for droplet-based single cell data.

01234

base cell sample allele_depth depth_coverage depth_other versions

cellsnp:

Efficient genotyping of bi-allelic SNPs on single cells

A method to improve mappings on circular genomes, using the BWA mapper.

010101

fasta elongated versions

circulargenerator:

Creates a modified reference genome, elongated by a specified number of bases
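The elongation idea can be illustrated with a short Python sketch. It assumes the simplest reading of the description above: the first N bases of a circular sequence are appended to its end, so that reads spanning the artificial break point can still align. The function name and the capping behaviour are illustrative assumptions, not the tool's actual implementation.

```python
def elongate_circular(sequence: str, elongation: int) -> str:
    """Append the first `elongation` bases to the end of a circular sequence.

    Hypothetical helper for illustration only; not CircularGenerator's code.
    """
    if elongation < 0:
        raise ValueError("elongation must be non-negative")
    # Assumption: never copy more than one full turn of the genome.
    elongation = min(elongation, len(sequence))
    return sequence + sequence[:elongation]


if __name__ == "__main__":
    # A read covering the junction "...GCA|ACG..." now has a linear target.
    print(elongate_circular("ACGTTGCA", 3))  # ACGTTGCAACG
```

After mapping against the elongated reference, coordinates beyond the original length would need to be wrapped back to the start of the genome.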

binning of metagenomic sequences

01

fasta bins fm index links result versions

ClipKIT is a fast and flexible alignment trimming tool that keeps phylogenetically informative sites and removes those that display characteristics of poor phylogenetic signal.

010

clipkit log versions

Compile a coverage reference from the given files (normal samples).

000

cnn versions

cnvkit:

CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.

Unsupervised binning of metagenomic contigs by using nucleotide composition - kmer frequencies - and coverage data for multiple samples

012

args_txt clustering_csv log_txt original_data_csv pca_components_csv pca_transformed_csv versions

concoct:

Clustering cONtigs with COverage and ComposiTion

Copy number and genotype annotation from whole genome and whole exome sequencing data

0123456000000000

bedgraph control_cpn sample_cpn gcprofile_cpn BAF CNV info ratio config versions

controlfreec/freec:

Copy number and genotype annotation from whole genome and whole exome sequencing data.

Annotate a VEP annotated VCF with the most severe consequence field

0101

vcf versions

custom:

Custom module to annotate a VEP annotated VCF with the most severe consequence field
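As a rough illustration of what "most severe consequence" means, the Python sketch below picks the highest-ranked term from a set of per-transcript consequence strings. The ranking list is a deliberately shortened, hypothetical subset, and the function name is made up for this example. It assumes that multiple consequences on one transcript are joined with "&", as in VEP's CSQ annotations. It is not the module's actual code.

```python
# Hypothetical, heavily shortened severity ranking (most severe first).
# A real ranking would contain many more consequence terms.
SEVERITY_ORDER = [
    "stop_gained",
    "frameshift_variant",
    "missense_variant",
    "synonymous_variant",
    "intron_variant",
]


def most_severe(transcript_consequences):
    """Return the most severe known term, or None if no term is recognised."""
    rank = {term: position for position, term in enumerate(SEVERITY_ORDER)}
    # Each transcript may carry several terms joined by '&'.
    terms = [
        term
        for entry in transcript_consequences
        for term in entry.split("&")
        if term in rank
    ]
    if not terms:
        return None
    return min(terms, key=rank.__getitem__)


if __name__ == "__main__":
    print(most_severe(["intron_variant", "missense_variant&splice_region_variant"]))
```

Terms that are missing from the ranking are ignored here. A stricter variant could raise an error instead so that unknown annotations do not pass silently.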

Annotate a VEP annotated VCF with the most severe pLi field

01

vcf versions

custom:

Custom module to annotate a VEP annotated VCF with the most severe pLi field

filter a matrix based on a minimum value and numbers of samples that must pass.

0101

filtered tests session_info versions

matrixfilter:

filter a matrix based on a minimum value and numbers of samples
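The filtering rule described above can be sketched in a few lines of Python: a feature (row) is kept only if at least a given number of samples reach a minimum value. The parameter names `minimum_abundance` and `minimum_samples` and the dictionary layout are assumptions for illustration, and the actual module may apply further options.

```python
def filter_matrix(matrix, minimum_abundance, minimum_samples):
    """Keep rows where at least `minimum_samples` values are >= `minimum_abundance`.

    `matrix` maps a feature name to its list of per-sample values.
    """
    kept = {}
    for feature, values in matrix.items():
        passing = sum(1 for value in values if value >= minimum_abundance)
        if passing >= minimum_samples:
            kept[feature] = values
    return kept


if __name__ == "__main__":
    counts = {
        "geneA": [0, 5, 12],
        "geneB": [0, 0, 1],
        "geneC": [10, 11, 0],
    }
    # geneB is dropped: no sample reaches the minimum of 5.
    print(filter_matrix(counts, minimum_abundance=5, minimum_samples=2))
```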

decoupler is a package containing different statistical methods to extract biological activities from omics data within a unified framework. It allows flexible testing of any enrichment method with any prior knowledge resource and incorporates methods that take into account the sign and weight. It can be used with any omic, as long as its features can be linked to a biological process based on prior knowledge, for example gene sets regulated by a transcription factor in transcriptomics, or phosphosites targeted by a kinase in phospho-proteomics.

0100

dc_estimate dc_pvals versions

Visualises sample correlations using a compressed matrix generated by multibamsummary or multibigwigsummary as input.

0100

pdf matrix versions

deeptools:

A set of user-friendly tools for normalization and visualization of deep-sequencing data

Call variants from the examples produced by make_examples

01

call_variants_tfrecords versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

Transforms the input alignments to a format suitable for the deep neural network variant caller

012301010101

examples gvcf small_model_calls versions

deepvariant:

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

Performs rapid genome comparisons for a group of genomes and visualizes their relatedness

01

directory versions

drep:

De-replication of microbial genomes assembled from multiple samples

SV callers like lumpy look at split-reads and pair distances to find structural variants. This tool is a fast way to add depth information to those calls. This can be used as additional information for filtering variants; for example we will be skeptical of deletion calls that do not have lower than average coverage compared to regions with similar gc-content.

01234500

vcf versions

Convert any PEP project or Nextflow samplesheet to any format

000

versions samplesheet_converted

eido:

Convert any PEP project or Nextflow samplesheet to any format

Validate samplesheet or PEP config against a schema

000

versions log

validate:

Validate samplesheet or PEP config against a schema.

Merge STR profiles into a multi-sample STR profile

010101

merged_profiles versions

expansionhunterdenovo:

ExpansionHunter Denovo (EHdn) is a suite of tools for detecting novel expansions of short tandem repeats (STRs).

fq subsample outputs a subset of records from single or paired FASTQ files. This requires a seed (--seed) to be set in ext.args.

01

fastq versions

fq:

fq is a library to generate and validate FASTQ file pairs.
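The reason a seed matters for subsampling can be shown with a small Python sketch. It keeps each read pair with a fixed probability, using a seeded random generator so that repeated runs select the same pairs and R1/R2 stay in sync. The function name and the per-record probability approach are assumptions for illustration. This is not fq's implementation.

```python
import random


def subsample_pairs(r1_records, r2_records, probability, seed):
    """Keep each read pair with the given probability, reproducibly."""
    if len(r1_records) != len(r2_records):
        raise ValueError("R1 and R2 must contain the same number of records")
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    kept_r1, kept_r2 = [], []
    for record_1, record_2 in zip(r1_records, r2_records):
        # One draw per pair, so both mates are kept or dropped together.
        if rng.random() < probability:
            kept_r1.append(record_1)
            kept_r2.append(record_2)
    return kept_r1, kept_r2


if __name__ == "__main__":
    r1 = [f"read{i}/1" for i in range(10)]
    r2 = [f"read{i}/2" for i in range(10)]
    first = subsample_pairs(r1, r2, probability=0.5, seed=42)
    second = subsample_pairs(r1, r2, probability=0.5, seed=42)
    print(first == second)  # identical selection for an identical seed
```

Without a fixed seed, a resumed or re-run pipeline could produce a different subset each time, which is presumably why the module asks for the seed to be set explicitly.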

Demultiplex fastq files

0123

sample_fastq metrics most_frequent_unmatched versions

Bootstrap sample demixing by resampling each site based on a multinomial distribution of read depth across all sites, where the event probabilities are determined by the fraction of the total sample reads found at each site, followed by a secondary resampling at each site according to a multinomial distribution (that is, binomial when there is only one SNV at a site), where the event probabilities are determined by the frequencies of each base at the site, and the number of trials is given by the sequencing depth.

012000

lineages summarized versions

freyja:

Freyja recovers relative lineage abundances from mixed SARS-CoV-2 samples and provides functionality to analyze lineage dynamics.
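The two-stage resampling described for the bootstrap module can be sketched in Python as follows. This is a simplified reading of the description, not Freyja's code: stage one redistributes the total read count across sites in proportion to observed depth, and stage two redraws base counts at each site from the observed base frequencies, using the resampled depth as the number of trials. The data layout and function name are assumptions.

```python
import random


def bootstrap_sample(site_base_counts, rng):
    """Resample per-site base counts in two multinomial stages.

    `site_base_counts` maps a site to a dict of base -> observed read count.
    """
    sites = list(site_base_counts)
    depths = [sum(site_base_counts[site].values()) for site in sites]
    total_reads = sum(depths)
    if total_reads == 0:
        return {site: dict(site_base_counts[site]) for site in sites}

    # Stage 1: multinomial draw of depth per site, weighted by observed depth.
    new_depth = {site: 0 for site in sites}
    for site in rng.choices(sites, weights=depths, k=total_reads):
        new_depth[site] += 1

    # Stage 2: multinomial draw of bases per site, weighted by base frequency.
    resampled = {}
    for site in sites:
        bases = list(site_base_counts[site])
        weights = [site_base_counts[site][base] for base in bases]
        counts = {base: 0 for base in bases}
        if new_depth[site] > 0:
            for base in rng.choices(bases, weights=weights, k=new_depth[site]):
                counts[base] += 1
        resampled[site] = counts
    return resampled


if __name__ == "__main__":
    observed = {100: {"A": 90, "G": 10}, 200: {"C": 50}}
    print(bootstrap_sample(observed, random.Random(0)))
```

Repeating this draw many times and demixing each replicate gives a spread of lineage abundance estimates, from which summary intervals can be derived.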

specify the relative abundance of each known haplotype

01200

demix versions

freyja:

Freyja recovers relative lineage abundances from mixed SARS-CoV-2 samples and provides functionality to analyze lineage dynamics.

downloads new versions of the curated SARS-CoV-2 lineage file and barcodes

0

barcodes lineages_topology lineages_meta versions

freyja:

Freyja recovers relative lineage abundances from mixed SARS-CoV-2 samples and provides functionality to analyze lineage dynamics.

Calls variants and reports sequencing depth information for each variant

010

variants versions

freyja:

Freyja recovers relative lineage abundances from mixed SARS-CoV-2 samples and provides functionality to analyze lineage dynamics.

GangSTR is a tool for genome-wide profiling of tandem repeats from short reads.

012300

vcf samplestats versions

Generate a multi-sample report file from the output of ganon report runs

01

txt versions

ganon:

ganon classifies short DNA sequences against large sets of genomic reference sequences efficiently

Calculates the fraction of reads from cross-sample contamination based on summary tables from getpileupsummaries. Output to be used with filtermutectcalls.

012

contamination segmentation versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Combine per-sample gVCF files produced by HaplotypeCaller into a multi-sample gVCF file

012000

combined_gvcf versions

gatk4:

Genome Analysis Toolkit (GATK4). Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

This tool looks for low-complexity STR sequences along the reference that are later used to estimate the Dragstr model during single sample auto calibration CalibrateDragstrModel.

000

str_table versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Creates a panel of normals (PoN) for read-count denoising given the read counts for samples in the panel.

01

pon versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Determines the baseline contig ploidy for germline samples given counts data

0123010

calls model versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Merges GVCFs from multiple samples, for use in joint genotyping or somatic panel of normals creation.

012345000

genomicsdb updatedb intervallist versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Perform joint genotyping on one or more samples pre-called with HaplotypeCaller.

012340101010101

vcf tbi versions

gatk4:

Genome Analysis Toolkit (GATK4)

Calls copy-number variants in germline samples given their counts and the output of DetermineGermlineContigPloidy.

01234

cohortcalls cohortmodel casecalls versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Condenses homRef blocks in a single-sample GVCF

012300000

vcf versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Build a recalibration model to score variant quality for filtering purposes. It is highly recommended to follow GATK best practices when using this module, because the Gaussian mixture model requires a large number of samples to produce optimal results (for example, 30 samples for exome data). For more details see https://gatk.broadinstitute.org/hc/en-us/articles/4402736812443-Which-training-sets-arguments-should-I-use-for-running-VQSR-

012000000

recal idx tranches plots versions

gatk4:

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

geofetch is a command-line tool that downloads and organizes data and metadata from GEO and SRA

0

samples versions

Generates haplotype calls by sampling haplotype estimates

01

haplo_sampled versions

glimpse:

GLIMPSE is a phasing and imputation method for large-scale low-coverage sequencing studies.

Program to compute the genotyping error rate at the sample or marker level.

0123456780123456

errors_cal errors_grp errors_spl rsquare_grp rsquare_spl rsquare_per_site versions

glimpse2:

GLIMPSE2 is a phasing and imputation method for large-scale low-coverage sequencing studies.

GMM-Demux is a Gaussian-Mixture-Model-based software for processing sample barcoding data (cell hashing and MULTI-seq).

0120000

barcodes matrix features classification_report config_report summary_report versions

Sort GTF files in chr/pos/feature order

0

gtf versions

The hap-ibd program detects identity-by-descent (IBD) segments and homozygosity-by-descent (HBD) segments in phased genotype data. The hap-ibd program can analyze data sets with hundreds of thousands of samples.

0100

hbd ibd log versions

pacbio structural variant calling tool

01201201

vcf csv versions

igv.js is an embeddable interactive genome visualization component

012

browser align_files index_files versions

igv:

Create an embeddable interactive genome browser component. Output files are expected to be present in the same directory as the genome browser html file. To visualise it, files have to be served. Check the documentation at: https://github.com/igvteam/igv-webapp for an example and https://github.com/igvteam/igv.js/wiki/Data-Server-Requirements for server requirements

Remove polyA tail and artificial concatemers

meta bam primers

meta bam pbi consensusreadset summary report versions

isoseq3:

IsoSeq3 - Scalable De Novo Isoform Discovery

Generate a consensus sequence from a BAM file using iVar

0100

fasta qual mpileup versions

ivar:

iVar - a computational package that contains functions broadly useful for viral amplicon-based sequencing.

Trim primer sequences from a BAM file with iVar

0120

bam log versions

ivar:

iVar - a computational package that contains functions broadly useful for viral amplicon-based sequencing.

Call variants from a BAM file using iVar

010000

tsv mpileup versions

ivar:

iVar - a computational package that contains functions broadly useful for viral amplicon-based sequencing.

Jointly Accurate Sv Merging with Intersample Network Edges

012301010

vcf versions

Lofreq subcommand to call low frequency variants from alignments when tumor-normal paired samples are available

0123450101

vcf versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

Lofreq subcommand to call low frequency variants from alignments when tumor-normal paired samples are available

0101

bam versions

lofreq:

A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data

A tool that mines antimicrobial peptides (AMPs) from (meta)genomes by predicting peptides from genomes (provided as contigs) and outputs all the predicted anti-microbial peptides found.

01

smorfs all_orfs amp_prediction readme_file log_file versions

macrel:

A pipeline for AMP (antimicrobial peptide) prediction

Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs.

0123401010

candidate_small_indels_vcf candidate_small_indels_vcf_tbi candidate_sv_vcf candidate_sv_vcf_tbi diploid_sv_vcf diploid_sv_vcf_tbi versions

manta:

Structural variant and indel caller for mapped sequencing data

Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs.

012345601010

candidate_small_indels_vcf candidate_small_indels_vcf_tbi candidate_sv_vcf candidate_sv_vcf_tbi diploid_sv_vcf diploid_sv_vcf_tbi somatic_sv_vcf somatic_sv_vcf_tbi versions

manta:

Structural variant and indel caller for mapped sequencing data

Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs.

0123401010

candidate_small_indels_vcf candidate_small_indels_vcf_tbi candidate_sv_vcf candidate_sv_vcf_tbi tumor_sv_vcf tumor_sv_vcf_tbi versions

manta:

Structural variant and indel caller for mapped sequencing data

Build MetaPhlAn database for taxonomic profiling.

NO input

db versions

metaphlan:

Identify clades (phyla to species) present in the metagenome obtained from a microbiome sample and their relative abundance

Merges output abundance tables from MetaPhlAn4

01

txt versions

metaphlan4:

Identify clades (phyla to species) present in the metagenome obtained from a microbiome sample and their relative abundance
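Merging per-sample abundance tables amounts to an outer join on the clade name. The Python sketch below shows one way this could look. It assumes each sample's profile is available as a mapping from clade to relative abundance and fills clades missing from a sample with 0.0. The function name and data layout are hypothetical and unrelated to MetaPhlAn's own merge script.

```python
def merge_abundance_tables(tables):
    """Combine per-sample clade abundances into one header plus rows.

    `tables` maps a sample name to a dict of clade -> relative abundance.
    """
    samples = list(tables)
    # Union of all clades seen in any sample, sorted for a stable row order.
    clades = sorted({clade for profile in tables.values() for clade in profile})
    header = ["clade"] + samples
    rows = [
        [clade] + [tables[sample].get(clade, 0.0) for sample in samples]
        for clade in clades
    ]
    return header, rows


if __name__ == "__main__":
    profiles = {
        "sample1": {"k__Bacteria": 100.0},
        "sample2": {"k__Bacteria": 80.0, "k__Archaea": 20.0},
    }
    header, rows = merge_abundance_tables(profiles)
    print(header)
    for row in rows:
        print(row)
```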

MetaPhlAn is a tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data.

010

profile biom bt2out versions

metaphlan:

Identify clades (phyla to species) present in the metagenome obtained from a microbiome sample and their relative abundance

Merges output abundance tables from MetaPhlAn3

01

txt versions

metaphlan3:

Identify clades (phyla to species) present in the metagenome obtained from a microbiome sample and their relative abundance

MetaPhlAn is a tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data.

010

profile biom bt2out versions

metaphlan3:

Identify clades (phyla to species) present in the metagenome obtained from a microbiome sample and their relative abundance

Demultiplex MGI fastq files

012

fastq undetermined ambiguous undetermined_reports ambiguous_reports general_info_reports index_reports sample_stat_reports qc_reports versions

mgikit demultiplex:

Demultiplex MGI fastq files

mirtop counts generates a file with the minimal information about each sequence and the count data in columns for each sample.

0101012

tsv versions

mirtop:

Small RNA-seq annotation

A tool for quality control and tracing taxonomic origins of microRNA sequencing data

0120

html json tsv all_fa rnatype_unknown_fa versions

mirtrace:

miRTrace is a new quality control and taxonomic tracing tool developed specifically for small RNA sequencing data (sRNA-Seq). Each sample is characterized by profiling sequencing quality, read length, sequencing depth and miRNA complexity and also the amounts of miRNAs versus undesirable sequences (derived from tRNAs, rRNAs and sequencing artifacts). In addition to these routine quality control (QC) analyses, miRTrace can accurately and sensitively resolve taxonomic origins of small RNA-Seq data based on the composition of clade-specific miRNAs. This feature can be used to detect cross-clade contaminations in typical lab settings. It can also be applied for more specific applications in forensics, food quality control and clinical diagnosis, for instance tracing the origins of meat products or detecting parasitic microRNAs in host serum.

msisensor2 detection of MSI regions.

01234500

msi distribution somatic versions

msisensor2:

MSIsensor2 is a novel algorithm based on machine learning, featuring a large upgrade in microsatellite instability (MSI) detection for tumor-only sequencing data, including Cell-Free DNA (cfDNA), Formalin-Fixed Paraffin-Embedded (FFPE) and other sample types. The original MSIsensor is specially designed for tumor/normal paired sequencing data.

msisensor2 detection of MSI regions.

00

scan versions

msisensor2:

MSIsensor2 is a novel algorithm based on machine learning, featuring a large upgrade in microsatellite instability (MSI) detection for tumor-only sequencing data, including Cell-Free DNA (cfDNA), Formalin-Fixed Paraffin-Embedded (FFPE) and other sample types. The original MSIsensor is specially designed for tumor/normal paired sequencing data.

Aggregate results from bioinformatics analyses across many samples into a single report

000000

report data plots versions

multiqc:

MultiQC searches a given directory for analysis logs and compiles a HTML report. It's a general use tool, perfect for summarising the output from numerous bioinformatics tools.

Computes tier-based cutoffs from a sample-specific error model which is generated by muse/call and reports the finalized variants

01012

vcf tbi versions

MuSE:

Somatic point mutation caller based on Markov substitution model for molecular evolution

NACHO (NAnostring quality Control dasHbOard) is developed for NanoString nCounter data. NanoString nCounter data is a messenger-RNA/micro-RNA (mRNA/miRNA) expression assay and works with fluorescent barcodes. Each barcode is assigned a mRNA/miRNA, which can be counted after bonding with its target. As a result each count of a specific barcode represents the presence of its target mRNA/miRNA.

0101

normalized_counts normalized_counts_wo_HK versions

NACHO:

R package that uses two main functions to summarize and visualize NanoString RCC files, namely: load_rcc() and visualise(). It also includes a function normalise(), which (re)calculates sample specific size factors and normalises the data. For more information vignette("NACHO") and vignette("NACHO-analysis")

NACHO (NAnostring quality Control dasHbOard) is developed for NanoString nCounter data. NanoString nCounter data is a messenger-RNA/micro-RNA (mRNA/miRNA) expression assay and works with fluorescent barcodes. Each barcode is assigned a mRNA/miRNA, which can be counted after bonding with its target. As a result each count of a specific barcode represents the presence of its target mRNA/miRNA.

0101

nacho_qc_reports nacho_qc_png nacho_qc_txt versions

NACHO:

R package that uses two main functions to summarize and visualize NanoString RCC files, namely: load_rcc() and visualise(). It also includes a function normalise(), which (re)calculates sample specific size factors and normalises the data. For more information vignette("NACHO") and vignette("NACHO-analysis")

Determines the gender of a sample from the BAM/CRAM file.

01201010

tsv versions

ngsbits:

Short-read sequencing tools

Samples a SAM/BAM/CRAM file using flowcell position information for the best approximation of having sequenced fewer reads

012

bam bai num_reads versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

changes name of sample in the vcf file

01

vcf versions

picard:

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Main caller script for peak calling

0120

divergent_TREs bidirectional_TREs unidirectional_TREs peakcalling_log versions

pints:

Peak Identifier for Nascent Transcripts Starts (PINTS)

Recodes plink bfiles into a new text fileset applying different modifiers

0123

ped map txt raw traw beagledat chrdat chrmap geno pheno pos phase info lgen list gen gengz sample rlist strctin tped tfam vcf vcfgz versions

plink:

Whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.

Filters plink bfiles or pfiles with maf filters

01230

bed bim fam pgen pvar psam versions

plink2:

Whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner

Remove samples from a plink2 dataset

01230

remove_bim remove_bed remove_fam remove_pgen remove_psam remove_pvar versions

plink2:

Whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner

Apply a scoring system to each sample in a plink 2 fileset

01230

score versions

plink2:

Whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner

Convert from VCF file to BGEN file version 1.2 format preserving dosages.

01234

bgen_file sample_file log_file versions

plink2:

Whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner

Software to deconvolute sample identity and identify multiplets when multiple samples are pooled by barcoded single cell sequencing and external genotyping data for each sample is available.

0123

demuxlet_result versions

popscle:

A suite of population scale analysis tools for single-cell genomics data including implementation of Demuxlet / Freemuxlet methods and auxiliary tools

Software to deconvolute sample identity and identify multiplets when multiple samples are pooled by barcoded single cell sequencing and external genotyping data for each sample is not available.

012

result vcf lmix singlet_result singlet_vcf versions

popscle:

A suite of population scale analysis tools for single-cell genomics data including implementation of Demuxlet / Freemuxlet methods and auxiliary tools

Performs logratio-based correlation analysis to obtain proportionality and basis shrinkage partial correlation coefficients. Standard correlation coefficients can also be computed if required.

01

propr matrix fdr adj warnings session_info versions

propr:

Logratio methods for omics data

corpcor:

Efficient Estimation of Covariance and (Partial) Correlation

Calculate interval coverage for each sample. N.B. The tool cannot handle files staged as symlinks; stageInMode should be set to 'link'.

0120

txt png loess_qc_txt loess_txt versions

purecn:

Copy number calling and SNV classification using targeted short read sequencing

Run PureCN workflow to normalize, segment and determine purity and ploidy

01200

pdf local_optima_pdf seg genes_csv amplification_pvalues_csv vcf_gz variants_csv loh_csv chr_pdf segmentation_pdf multisample_seg versions

purecn:

Copy number calling and SNV classification using targeted short read sequencing

Demultiplexer for Nanopore samples

010

reads versions

Randomly subsample sequencing reads to a specified coverage

0120

reads versions
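Subsampling to a target coverage boils down to keeping randomly chosen reads until their combined length reaches genome size times coverage. The Python sketch below illustrates that idea. The function name, its parameters, and the shuffle-then-accumulate strategy are assumptions made for this example. It is not the tool's implementation.

```python
import random


def subsample_to_coverage(read_lengths, genome_size, coverage, seed):
    """Return indices of reads kept to reach roughly `coverage` x `genome_size` bases."""
    target_bases = genome_size * coverage
    order = list(range(len(read_lengths)))
    random.Random(seed).shuffle(order)  # seeded so the selection is repeatable
    kept, total_bases = [], 0
    for index in order:
        if total_bases >= target_bases:
            break
        kept.append(index)
        total_bases += read_lengths[index]
    # Sorted so the kept reads can be written in their original file order.
    return sorted(kept)


if __name__ == "__main__":
    lengths = [100] * 50  # fifty reads of 100 bases each
    kept = subsample_to_coverage(lengths, genome_size=1000, coverage=2, seed=1)
    print(len(kept), sum(lengths[i] for i in kept))
```

If the input holds fewer bases than the target, every read is kept, so the achieved coverage can be lower than requested.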

Calculation of optimal P-site offsets, diagnostic analysis and visual inspection of ribosome profiling data

010101

best_offset offset offset_plot psites codon_coverage_rpf codon_coverage_psite cds_coverage cds_window_coverage ribowaltz_qc versions

Module to validate Illumina® Sample Sheet v2 files.

010

samplesheet versions

Clips read alignments where they match BED file defined regions

01000

bam stats rejects_bam versions

samtools:

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.

Call peaks using SEACR on sequenced reads in bedgraph format

0120

bed versions

seacr:

SEACR is intended to call peaks and enriched regions from sparse CUT&RUN or chromatin profiling data in which background is dominated by "zeroes" (i.e. regions with no read coverage).

Accelerated implementation of the GATK DepthOfCoverage tool.

01201010101

per_locus sample_summary statistics coverage_counts coverage_proportions interval_summary versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Perform joint genotyping on one or more samples pre-called with Sentieon's Haplotyper.

012301010101

vcf_gz vcf_gz_tbi versions

sentieon:

Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.

Seqcluster collapse reduces computational complexity by collapsing identical sequences in a FASTQ file.

01

fastq versions

seqcluster:

Small RNA analysis from NGS data. Seqcluster generates a list of clusters of small RNA sequences, their genome location, their annotation and their abundance in all the samples of the project.

Subsample reads from FASTQ files

012

reads versions

seqtk:

Seqtk is a fast and lightweight tool for processing sequences in the FASTA or FASTQ format. The seqtk sample command subsamples sequences.
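As an illustration of what subsampling reads means, here is a minimal Python sketch of seeded reservoir sampling over FASTQ records. It is not seqtk's code; the names `read_fastq` and `subsample` and the default seed are hypothetical and chosen only for this example.

```python
import random
from typing import Iterable, Iterator, List, Tuple

# One FASTQ record: header, sequence, separator line, quality string.
FastqRecord = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield 4-line FASTQ records from an iterable of text lines."""
    block: List[str] = []
    for line in lines:
        block.append(line.rstrip("\n"))
        if len(block) == 4:
            yield (block[0], block[1], block[2], block[3])
            block = []


def subsample(records: Iterable[FastqRecord], k: int, seed: int = 11) -> List[FastqRecord]:
    """Keep k records chosen uniformly at random (reservoir sampling)."""
    rng = random.Random(seed)  # private generator, so results are reproducible
    reservoir: List[FastqRecord] = []
    for i, rec in enumerate(records):
        if i < k:
            reservoir.append(rec)  # fill the reservoir first
        else:
            j = rng.randint(0, i)  # inclusive bounds
            if j < k:
                reservoir[j] = rec  # replace with probability k / (i + 1)
    return reservoir
```

In this sketch the selection depends only on record position and the seed, so running it with the same seed over the R1 and R2 files of a paired-end sample keeps mates together.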

PileupCaller is a tool to create genotype calls from bam files using read-sampling methods

0100

eigenstrat plink freqsum versions

sequencetools:

Tools for population genetics on sequencing data

Demultiplex bgzip'd fastq files

012

sample_fastq metrics most_frequent_unmatched per_project_metrics per_sample_metrics sample_barcode_hop_metrics versions

Validate consistency of feature and sample annotations with matrices and contrasts

0120101

sample_meta feature_meta assays contrasts versions

shinyngs:

Provides Shiny applications for various array and NGS applications. Currently very RNA-seq centric, with plans for expansion.

Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs

0120

html pairs_tsv samples_tsv versions

somalier:

Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs

Classifies and predicts the origin of metagenomic samples

010000

report versions

Serotype STEC samples from paired-end reads or assemblies

01

tsv versions

STITCH is an R program for reference panel free, read aware, low coverage sequencing genotype imputation. STITCH runs on a set of samples with sequencing reads in BAM format, as well as a list of positions to genotype, and outputs imputed genotypes in VCF format.

0123456789100120

input rdata plots vcf bgen versions

Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation in small cohorts and somatic variation in tumor/normal sample pairs

01234567800

vcf_indels vcf_indels_tbi vcf_snvs vcf_snvs_tbi versions

strelka:

Strelka calls somatic and germline small variants from mapped sequencing reads

SummarizedExperiment container

010101

rds log versions

summarizedexperiment:

The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.

Compare or merge VCF files to generate a consensus or multi-sample VCF file.

01000000

vcf versions

survivor:

Toolset for SV simulation, comparison and filtering

Count the instances of each SVTYPE observed in each sample in a VCF.

01

counts versions

svtk:

Utilities for consolidating, filtering, resolving, and annotating structural variants.

SVTyper-sso computes structural variant (SV) genotypes based on breakpoint depth on a SINGLE sample

012301

gt_vcf json versions

svtyper:

Bayesian genotyper for structural variants

Merge TRGT VCFs from multiple samples

0120101

vcf versions

trgt:

Tandem repeat genotyping and visualization from PacBio HiFi data

Run TRUST4 on RNA-seq data

01201010101

tsv airr_files airr_tsv report_tsv fasta out fq outs versions

Subsample a long-read sequencing fastq file for multiple assemblies

01

subreads versions

trycycler:

Trycycler is a tool for generating consensus long-read assemblies for bacterial genomes

Extracts the UMI barcode from a read and adds it to the read name, leaving any sample barcode in place

01

reads log versions

umi_tools:

UMI-tools contains tools for dealing with Unique Molecular Identifiers (UMIs)/Random Molecular Tags (RMTs) and single cell RNA-Seq cell barcodes

Filtering, downsampling and profiling alignments in BAM/CRAM formats

01

bam versions

Obtains per-sample observations for the actual calling process with varlociraptor calls

012340101

bcf_gz vcf_gz bcf vcf versions

varlociraptor:

Flexible, uncertainty-aware variant calling with parameter free filtration via FDR control.

Generates a VCF stream where AC and NS have been generated for each record using sample genotypes.

012

vcf versions

vcflib:

Command-line tools for manipulating VCF files
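To make the AC and NS fields concrete, the following Python sketch derives them from a list of sample genotype strings, using the VCF definitions (AC: allele count per ALT allele; NS: number of samples with data). It is an illustration only, not vcflib's implementation; the function name `ac_and_ns` is hypothetical, and counting a partially missing genotype as "with data" is an assumption of this sketch.

```python
from typing import Dict, List


def ac_and_ns(genotypes: List[str], n_alt: int) -> Dict[str, object]:
    """Compute AC (one count per ALT allele) and NS from GT strings."""
    ac = [0] * n_alt
    ns = 0
    for gt in genotypes:
        # Phased ("|") and unphased ("/") genotypes are counted the same way.
        alleles = gt.replace("|", "/").split("/")
        called = [a for a in alleles if a != "."]
        if called:
            ns += 1  # assumption: any called allele counts as "with data"
        for a in called:
            idx = int(a)
            if idx > 0:  # 0 is the reference allele
                ac[idx - 1] += 1
    return {"AC": ac, "NS": ns}


if __name__ == "__main__":
    print(ac_and_ns(["0/1", "1|1", "./.", "0/2"], n_alt=2))
```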

Velocyto is a library for the analysis of RNA velocity. The velocyto.py CLI uses Path(resolve_path=True), which breaks Nextflow's symbolic-link logic. If velocyto finds a file in the work dir named EXACTLY cellsorted_[ORIGINAL_BAM_NAME], it skips the samtools sort step. The cellsorted BAM file should be sorted by cell barcode with:

    samtools sort -t CB -O BAM -o cellsorted_input.bam input.bam

See the module test for an example with the SAMTOOLS_SORT nf-core module. Config example to cell-sort the input BAM using SAMTOOLS_SORT:

    process {
        // Name the sorted output cellsorted_<original BAM base name> and sort by the CB tag
        withName: SAMTOOLS_SORT {
            ext.prefix = { "cellsorted_${bam.baseName}" }
            ext.args = '-t CB -O BAM'
        }
    }

The optional mask must be passed via ext.args with the --mask option. Because velocyto resolves symbolic links, two BAM files (cellsorted and original) must be staged in the work dir. See also the velocyto tutorial.

01230

loom versions

Detecting and estimating inter-sample DNA contamination has become a crucial quality assessment step to ensure high quality sequence reads and reliable downstream analysis.

0120

log selfsm depthsm selfrg depthrg bestsm bestrg versions

verifybamid:

verifyBamID is software that verifies whether the reads in a particular file match previously known genotypes for an individual (or group of individuals), and checks whether the reads are contaminated as a mixture of two samples.

Detecting and estimating inter-sample DNA contamination has become a crucial quality assessment step to ensure high quality sequence reads and reliable downstream analysis.

01201200

log ud bed mu self_sm ancestry versions

verifybamid2:

A robust tool for DNA contamination estimation from sequence reads using an ancestry-agnostic method.

Calculate secondary structures of two RNAs with dimerization

01

rnacofold_csv rnacofold_ps versions

viennarna:

Calculate secondary structures of two RNAs with dimerization

The program works much like RNAfold, but allows one to specify two RNA sequences which are then allowed to form a dimer structure. RNA sequences are read from stdin in the usual format, i.e. each line of input corresponds to one sequence, except for lines starting with > which contain the name of the next sequence. To compute the hybrid structure of two molecules, the two sequences must be concatenated using the & character as separator. RNAcofold can compute minimum free energy (mfe) structures, as well as partition function (pf) and base pairing probability matrix (using the -p switch). Since dimer formation is concentration dependent, RNAcofold can be used to compute equilibrium concentrations for all five monomer and (homo/hetero)-dimer species, given input concentrations for the monomers. Output consists of the mfe structure in bracket notation as well as PostScript structure plots and “dot plot” files containing the pair probabilities, see the RNAfold man page for details. In the dot plots a cross marks the chain break between the two concatenated sequences. The program will continue to read new sequences until a line consisting of the single character @ or an end of file condition is encountered.
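The input convention described above (a name line starting with >, two strands joined by &, and an optional lone @ to end the input) can be sketched in Python as follows. The helper names `cofold_input` and `cofold_batch` are hypothetical; the sketch only builds the text that would be written to the program's stdin and does not call RNAcofold.

```python
from typing import Iterable, Tuple


def cofold_input(name: str, seq_a: str, seq_b: str) -> str:
    """Return one named record with the two strands joined by '&'."""
    for seq in (seq_a, seq_b):
        if not seq or "&" in seq:
            raise ValueError("each strand must be non-empty and contain no '&'")
    return f">{name}\n{seq_a}&{seq_b}\n"


def cofold_batch(pairs: Iterable[Tuple[str, str, str]]) -> str:
    """Join several records and finish the input with a lone '@' line."""
    return "".join(cofold_input(n, a, b) for n, a, b in pairs) + "@\n"


if __name__ == "__main__":
    # Example strands are arbitrary and used only to show the layout.
    print(cofold_batch([("dimer1", "GGGAAA", "UUUCCC")]), end="")
```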

Cluster sequences using a single-pass, greedy centroid-based clustering algorithm.

01

aln biom mothur otu bam out blast uc centroids clusters profile msa versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Merge strictly identical sequences contained in filename. Identical sequences are defined as having the same length and the same string of nucleotides (case insensitive, T and U are considered the same).

01

fasta clustering log versions

vsearch:

A versatile open source tool for metagenomics (USEARCH alternative)
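The identity rule given for this dereplication module (same length, same nucleotide string, case insensitive, T and U equivalent) can be sketched in Python as below. This is a simplified illustration, not VSEARCH's implementation; `normalize` and `dereplicate` are hypothetical names, and returning groups ordered by decreasing count is a choice made for this sketch.

```python
from typing import Dict, Iterable, List, Tuple


def normalize(seq: str) -> str:
    """Upper-case the sequence and treat U as T."""
    return seq.upper().replace("U", "T")


def dereplicate(records: Iterable[Tuple[str, str]]) -> List[Tuple[str, str, int]]:
    """Merge identical sequences; return (first_name, first_sequence, count)."""
    groups: Dict[str, list] = {}
    for name, seq in records:
        key = normalize(seq)  # equal keys imply equal length as well
        if key in groups:
            groups[key][2] += 1
        else:
            groups[key] = [name, seq, 1]
    # Most abundant first; sorted() is stable, so ties keep input order.
    ordered = sorted(groups.values(), key=lambda g: -g[2])
    return [(g[0], g[1], g[2]) for g in ordered]


if __name__ == "__main__":
    reads = [("a", "ACGT"), ("b", "acgu"), ("c", "ACG"), ("d", "ACGU")]
    print(dereplicate(reads))
```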

Performs quality filtering and / or conversion of a FASTQ file to FASTA format.

01

fasta log versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Taxonomic classification using the sintax algorithm.

010

tsv versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Sort fasta entries by decreasing abundance (--sortbysize) or sequence length (--sortbylength).

010

fasta versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)
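The two orderings named for this sorting module, by decreasing abundance and by decreasing sequence length, can be illustrated with a short Python sketch. It assumes abundance is carried in the header as a `;size=N` annotation and treats a missing annotation as abundance 1; both the function names and that default are assumptions of this sketch rather than a description of VSEARCH internals.

```python
import re
from typing import List, Tuple

# Matches "size=<integer>" at the start of the header or after a ';'.
SIZE_RE = re.compile(r"(?:^|;)size=(\d+)(?:;|$)")

Record = Tuple[str, str]  # (header, sequence)


def abundance(header: str) -> int:
    """Read the size annotation; assume 1 when it is absent."""
    match = SIZE_RE.search(header)
    return int(match.group(1)) if match else 1


def sort_by_size(records: List[Record]) -> List[Record]:
    """Order records by decreasing abundance (stable for ties)."""
    return sorted(records, key=lambda r: -abundance(r[0]))


def sort_by_length(records: List[Record]) -> List[Record]:
    """Order records by decreasing sequence length (stable for ties)."""
    return sorted(records, key=lambda r: -len(r[1]))
```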

Compare target sequences to fasta-formatted query sequences using global pairwise alignment.

010000

aln biom lca mothur otu sam tsv txt uc versions

vsearch:

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Convert and filter aligned reads to .npz

0120101

npz versions

wisecondorx:

WIthin-SamplE COpy Number aberration DetectOR, including sex chromosomes

Returns the gender of a .npz resulting from convert, based on a Gaussian mixture model trained during the newref phase

0101

gender versions

wisecondorx:

WIthin-SamplE COpy Number aberration DetectOR, including sex chromosomes

Create a new reference using healthy reference samples

01

npz versions

wisecondorx:

WIthin-SamplE COpy Number aberration DetectOR, including sex chromosomes

Find copy number aberrations

010101

aberrations_bed bins_bed segments_bed chr_statistics chr_plots genome_plot versions

wisecondorx:

WIthin-SamplE COpy Number aberration DetectOR, including sex chromosomes

The xeniumranger rename module allows you to change the sample region_name and cassette_name throughout all the Xenium Onboard Analysis output files that contain this information.

0100

outs versions

xeniumranger:

Xenium Ranger is a set of analysis pipelines that process Xenium In Situ Gene Expression data to relabel, resegment, or import new segmentation results from community-developed tools. Xenium Ranger provides flexible off-instrument reanalysis of Xenium In Situ data. Relabel transcripts, resegment cells with the latest 10x segmentation algorithms, or import your own segmentation data to assign transcripts to cells.
