Available Modules
Modules are the building blocks of all DSL2 nf-core pipelines. If you would like to write your own module, you can find more information on the nf-core website.
Calculates base frequency statistics across reference positions from BAM.
0
1
2
3
depth_sample
depth_global
qs
pos
counts
icounts
versions
ANGSD: Analysis of next generation Sequencing Data
Calculates genotype likelihoods from BAM files.
0
1
0
1
0
1
genotype_likelihood
versions
ANGSD: Analysis of next generation Sequencing Data
Extracts reads mapped to chromosome 6 and any HLA decoys or chromosome 6 alternates.
0
1
extracted_reads_fastq
log
intermediate_sam
intermediate_bam
intermediate_sorted_bam
versions
arcasHLA performs high resolution genotyping for HLA class I and class II genes from RNA sequencing, supporting both paired and single-end samples.
Run the alignment/variant-call/consensus logic of the artic pipeline
0
1
0
1
2
0
1
2
results
bam
bai
bam_trimmed
bai_trimmed
bam_primertrimmed
bai_primertrimmed
fasta
vcf
tbi
json
versions
ARTIC pipeline - a bioinformatics pipeline for working with virus sequencing data sequenced with nanopore
Copy number profiles of tumour cells.
0
1
2
3
4
0
0
0
0
0
0
allelefreqs
bafs
cnvs
logrs
metrics
png
purityploidy
segments
versions
Generates a VCF file from a BAM file using various calling methods
0
1
2
3
4
0
0
0
0
vcf
versions
ATLAS, a suite of methods to accurately genotype and estimate genetic diversity
Estimate the post-mortem damage patterns of DNA
0
1
2
3
0
0
empiric
exponential
counts
table
versions
ATLAS, a suite of methods to accurately genotype and estimate genetic diversity
Splits single-end read groups by length and merges paired-end reads
0
1
2
3
4
bam
txt
versions
ATLAS, a suite of methods to accurately genotype and estimate genetic diversity
Conversion of PacBio BAM files into gzipped fastq files, including splitting of barcoded data
0
1
2
fastq
versions
Converting and demultiplexing of PacBio BAM files into gzipped fasta and fastq files
Removes unused references from the header of sorted BAM/CRAM files.
0
1
bam
versions
This module is used to clip primer sequences from your alignments.
0
1
2
3
bam
bai
versions
Bamcmp (Bam Compare) is a tool for assigning reads between a primary genome and a contamination genome. For instance, filtering out mouse reads from patient derived xenograft mouse models (PDX).
0
1
2
primary_filtered_bam
contamination_bam
versions
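As a rough illustration of assigning reads between a primary and a contamination genome, the sketch below compares per-read alignment scores from the two alignments. The function name `assign_reads`, the score dictionaries, and the rule that ties stay with the primary genome are all assumptions made for this example; they are not taken from Bamcmp itself.

```python
def assign_reads(primary_scores, contam_scores):
    """Split read names into (primary, contamination) lists.

    Both arguments map read name -> alignment score (hypothetical inputs).
    A read absent from one mapping is treated as unaligned to that genome.
    Assumption: ties are kept with the primary genome.
    """
    primary, contamination = [], []
    for read in sorted(set(primary_scores) | set(contam_scores)):
        p = primary_scores.get(read)
        c = contam_scores.get(read)
        if c is None or (p is not None and p >= c):
            primary.append(read)
        else:
            contamination.append(read)
    return primary, contamination


human = {"r1": 60, "r2": 20, "r3": 40}
mouse = {"r2": 55, "r3": 40, "r4": 30}
kept, removed = assign_reads(human, mouse)
print(kept, removed)
```

In a PDX setting, `kept` would correspond to reads retained for the patient genome and `removed` to reads attributed to mouse.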
0
1
json
versions
A command line tool to compute mapping statistics from a BAM file
Tool for converting 10x BAMs produced by Cell Ranger, Space Ranger, Cell Ranger ATAC, Cell Ranger DNA, and Long Ranger back to FASTQ files that can be used as inputs to re-run analysis
0
1
fastq
versions
BamTools provides both a programmer's API and an end-user's toolkit for handling BAM files.
0
1
bam
versions
C++ API & command-line toolkit for working with BAM data
BamTools provides both a programmer's API and an end-user's toolkit for handling BAM files.
0
1
stats
versions
C++ API & command-line toolkit for working with BAM data
Calculates per-scaffold or per-base coverage information from an unsorted sam or bam file.
0
1
covstats
hist
versions
BBMap is a short read aligner, as well as various other bioinformatic tools.
Convert BAM/GFF/GTF/GVF/PSL files to bed
0
1
bed
versions
High-performance genomic feature operations.
Computes histograms (default), per-base reports (-d) and BEDGRAPH (-bg) summaries of feature coverage (e.g., aligned sequences) for a given genome.
0
1
2
0
0
0
genomecov
versions
A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types.
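The difference between the per-base report and the histogram summary described for the genome coverage module can be sketched in a few lines. This is a minimal illustration only: the function names are hypothetical, features are assumed to be 0-based half-open `(start, end)` pairs on a single chromosome that lie within its length, and no claim is made that bedtools computes coverage this way.

```python
from collections import Counter


def per_base_depth(chrom_length, features):
    """Return a list with the depth at every base of one chromosome.

    Assumes 0-based, half-open (start, end) features within the chromosome.
    Uses a difference array so each feature costs O(1) to record.
    """
    diff = [0] * (chrom_length + 1)
    for start, end in features:
        diff[start] += 1
        diff[end] -= 1
    depth, running = [], 0
    for i in range(chrom_length):
        running += diff[i]
        depth.append(running)
    return depth


def depth_histogram(depth):
    """Return {depth value: number of bases at that depth}, sorted by depth."""
    return dict(sorted(Counter(depth).items()))


depth = per_base_depth(10, [(0, 5), (3, 8)])
print(depth)
print(depth_histogram(depth))
```

The per-base list corresponds conceptually to a `-d` style report, while the histogram collapses it into counts of bases per depth value.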
Locate and tag duplicate reads in a BAM file
0
1
bam
metrics
versions
biobambam is a set of tools for early stage alignment file processing.
Merge a list of sorted bam files
0
1
bam
bam_index
checksum
versions
biobambam is a set of tools for early stage alignment file processing.
Parallel sorting and duplicate marking
0
1
0
1
bam
bam_index
cram
metrics
versions
biobambam is a set of tools for early stage alignment file processing.
Aligns single- or paired-end reads from bisulfite-converted libraries to a reference genome using Biscuit.
0
1
0
1
0
1
bam
bai
versions
A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data
A fast, compact one-liner to produce duplicate-marked, sorted, and indexed BAM files using Biscuit
0
1
0
1
0
1
bam
bai
versions
A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data
samblaster is a fast and flexible program for marking duplicates in read-id grouped paired-end SAM files. It can also optionally output discordant read pairs and/or split read mappings to separate SAM files, and/or unmapped/clipped reads to a separate FASTQ file. By default, samblaster reads SAM input from stdin and writes SAM to stdout.
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
Summarize and/or filter reads based on bisulfite conversion rate
0
1
0
1
0
1
0
1
bam
versions
A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data
Summarizes read-level methylation (and optionally SNV) information from a Biscuit BAM file in a standard-compliant BED format.
0
1
0
1
0
1
0
1
0
1
bed
versions
A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data
Computes cytosine methylation and callable SNV mutations, optionally in reference to a germline BAM to call somatic variants
0
1
2
3
4
0
1
0
1
vcf
versions
A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data
Perform basic quality control on a BAM file generated with Biscuit
0
1
0
1
0
1
reports
versions
A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data
Performs alignment of BS-Seq reads using bismark
0
1
0
1
0
1
bam
report
unmapped
versions
Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.
Relates methylation calls back to genomic cytosine contexts.
0
1
0
1
0
1
coverage
report
summary
versions
Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.
Removes alignments to the same position in the genome from the Bismark mapping output.
0
1
bam
report
versions
Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.
Extracts methylation information for individual cytosines from alignments.
0
1
0
1
bedgraph
methylation_calls
coverage
report
mbias
versions
Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.
Performs alignment of BS-Seq reads using bwameth
0
1
0
1
0
1
bam
versions
Fast and accurate alignment of BS-Seq reads using bwa-mem and a 3-letter genome.
Performs preprocessing and alignment of chromatin fastq files to fasta reference files using chromap.
0
1
0
1
0
1
0
0
0
0
bed
bam
tagAlign
pairs
versions
Fast alignment and preprocessing of chromatin profiles
Realign reads mapped with BWA to elongated reference genome
0
1
0
1
0
1
0
1
bam
versions
A method to improve mappings on circular genomes such as Mitochondria.
Calculates polymorphic site rates over protein coding genes
0
1
2
3
4
polymut
versions
Set of utilities on sequences and BAM files
Copy number variant detection from high-throughput sequencing data
0
1
2
0
1
0
1
0
1
0
1
0
bed
cnn
cnr
cns
pdf
png
versions
CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.
Given segmented log2 ratio estimates (.cns), derive each segment's absolute integer copy number
0
1
2
cns
versions
CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.
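To make the step from a segment's log2 ratio to an integer copy number concrete, here is a minimal sketch. It assumes the simplest possible model, `ploidy * 2 ** log2_ratio` rounded to the nearest integer, with no correction for tumour purity or sex chromosomes; the function name and the default ploidy of 2 are illustrative assumptions, not CNVkit's API.

```python
def absolute_copy_number(log2_ratio, ploidy=2):
    """Convert a segment's log2 ratio into an integer copy number.

    Assumption: a pure sample and a plain rounding rule. Real callers may
    apply purity rescaling or fixed thresholds instead.
    """
    return max(0, round(ploidy * 2 ** log2_ratio))


for ratio in (-1.0, 0.0, 0.585, 1.0):
    print(ratio, absolute_copy_number(ratio))
```

Under this model a log2 ratio of 0 maps back to the reference ploidy, and each unit of log2 ratio doubles or halves the estimate.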
Copy number variant detection from high-throughput sequencing data
0
1
2
tsv
cnn
versions
CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.
Generate the input coverage table for CONCOCT using a BEDFile
0
1
2
3
tsv
versions
Clustering cONtigs with COverage and ComposiTion
Merge reads that were mapped to multiple indices
0
1
bam
versions
Accurate and robust inference of microbial growth dynamics from metagenomic sequencing reads.
Generates a FASTA file of chromosome sizes and a fasta index file
0
1
sizes
fai
gzi
versions
Tools for dealing with SAM, BAM and CRAM files
DeDup is a tool for read deduplication in paired-end read merging (e.g. for ancient DNA experiments).
0
1
bam
json
hist
log
versions
DeepSomatic is an extension of deep learning-based variant caller DeepVariant that takes aligned reads (in BAM or CRAM format) from tumor and normal data, produces pileup image tensors from them, classifies each tensor using a convolutional neural network, and finally reports somatic variants in a standard VCF or gVCF file.
0
1
2
3
4
0
1
0
1
0
1
0
1
vcf
vcf_tbi
gvcf
gvcf_tbi
versions
This tool filters alignments in a BAM/CRAM file according to the specified parameters.
0
1
2
bam
logs
versions
A set of user-friendly tools for normalization and visualization of deep-sequencing data
This tool takes an alignment of reads or fragments as input (BAM file) and generates a coverage track (bigWig or bedGraph) as output.
0
1
2
0
0
bigwig
bedgraph
versions
A set of user-friendly tools for normalization and visualization of deep-sequencing data
Computes read coverage for genomic regions (bins) across the entire genome.
0
1
2
3
matrix
versions
A set of user-friendly tools for normalization and visualization of deep-sequencing data
Visualises sample correlations using a compressed matrix generated by multibamsummary or multibigwigsummary as input.
0
1
0
0
pdf
matrix
versions
A set of user-friendly tools for normalization and visualization of deep-sequencing data
Plots cumulative read coverage for each BAM file
0
1
2
pdf
matrix
metrics
versions
A set of user-friendly tools for normalization and visualization of deep-sequencing data
Generates principal component analysis (PCA) plot using a compressed matrix generated by multibamsummary or multibigwigsummary as input.
0
1
pdf
tab
versions
A set of user-friendly tools for normalization and visualization of deep-sequencing data
Convert a file in FASTA format to the ELFASTA format
0
1
elfasta
log
versions
elPrep is a high-performance tool for preparing .sam/.bam files for variant calling in sequencing pipelines. It can be used as a drop-in replacement for SAMtools/Picard/GATK4.
Filter, sort and markdup sam/bam files, with optional BQSR and variant calling.
0
1
2
3
4
5
6
0
1
0
1
0
1
0
0
0
0
0
bam
logs
metrics
recall
gvcf
table
activity_profile
assembly_regions
versions
elPrep is a high-performance tool for preparing .sam/.bam files for variant calling in sequencing pipelines. It can be used as a drop-in replacement for SAMtools/Picard/GATK4.
Merge split bam/sam chunks in one file
0
1
bam
versions
elPrep is a high-performance tool for preparing .sam/.bam files for variant calling in sequencing pipelines. It can be used as a drop-in replacement for SAMtools/Picard/GATK4.
Split bam file into manageable chunks
0
1
bam
versions
elPrep is a high-performance tool for preparing .sam/.bam files for variant calling in sequencing pipelines. It can be used as a drop-in replacement for SAMtools/Picard/GATK4.
Estimate repeat sizes using NGS data
0
1
2
0
1
0
1
0
1
vcf
json
bam
versions
Compute genome-wide STR profile
0
1
2
0
1
0
1
locus_tsv
motif_tsv
str_profile
versions
ExpansionHunter Denovo (EHdn) is a suite of tools for detecting novel expansions of short tandem repeats (STRs).
Uses FGBIO CallDuplexConsensusReads to call duplex consensus sequences from reads generated from the same double-stranded source molecule.
0
1
0
0
bam
versions
A set of tools for working with genomic and high throughput sequencing data, including UMIs
Calls consensus sequences from reads with the same unique molecular tag.
0
1
0
0
bam
versions
Tools for working with genomic and high throughput sequencing data.
Collects a suite of metrics to QC duplex sequencing data.
0
1
0
family_sizes
duplex_family_sizes
duplex_yield_metrics
umi_counts
duplex_qc
duplex_umi_counts
versions
A set of tools for working with genomic and high throughput sequencing data, including UMIs
ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics.
Using the fgbio tools, converts FASTQ files sequenced into unaligned BAM or CRAM files possibly moving the UMI barcode into the RX field of the reads
0
1
bam
cram
versions
A set of tools for working with genomic and high throughput sequencing data, including UMIs
Uses FGBIO FilterConsensusReads to filter consensus reads generated by CallMolecularConsensusReads or CallDuplexConsensusReads.
0
1
0
1
0
0
0
bam
versions
A set of tools for working with genomic and high throughput sequencing data, including UMIs
Groups reads together that appear to have come from the same original molecule. Reads are grouped by template, and then templates are sorted by the 5' mapping positions of the reads from the template, used from earliest mapping position to latest. Reads that have the same end positions are then sub-grouped by UMI sequence. (!) Note: the MQ tag is required on reads with mapped mates (!) This can be added using samblaster with the optional argument --addMateTags.
0
1
0
bam
histogram
versions
A set of tools for working with genomic and high throughput sequencing data, including UMIs
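The two-level grouping described for the UMI grouping module (first by template end positions, then by UMI within each position group) can be sketched as follows. This is a simplified illustration under stated assumptions: templates are plain dictionaries with hypothetical keys `name`, `start`, `end` and `umi`, and UMIs are compared by exact match only, whereas a real implementation may tolerate mismatches between UMIs.

```python
from collections import defaultdict


def group_templates(templates):
    """Group templates by (start, end), then sub-group by exact UMI.

    Returns a list of ((start, end), umi, [template names]) tuples ordered
    from the earliest mapping position to the latest.
    """
    by_position = defaultdict(list)
    for t in templates:
        by_position[(t["start"], t["end"])].append(t)

    groups = []
    for pos in sorted(by_position):
        by_umi = defaultdict(list)
        for t in by_position[pos]:
            by_umi[t["umi"]].append(t["name"])
        for umi in sorted(by_umi):
            groups.append((pos, umi, by_umi[umi]))
    return groups


example = [
    {"name": "a", "start": 100, "end": 250, "umi": "ACGT"},
    {"name": "b", "start": 100, "end": 250, "umi": "ACGT"},
    {"name": "c", "start": 100, "end": 250, "umi": "TTTT"},
    {"name": "d", "start": 90, "end": 300, "umi": "ACGT"},
]
for group in group_templates(example):
    print(group)
```

Templates `a` and `b` end up in one group because they share both end positions and UMI, while `c` shares the positions but not the UMI and `d` shares the UMI but not the positions.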
Sorts a SAM or BAM file. Several sort orders are available, including coordinate, queryname, random, and randomquery.
0
1
bam
versions
Tools for working with genomic and high throughput sequencing data.
FGBIO tool to zip together an unmapped and mapped BAM to transfer metadata into the output BAM
0
1
0
1
0
1
0
1
bam
versions
A set of tools for working with genomic and high throughput sequencing data, including UMIs
Performs local realignment around indels to correct for mapping errors
0
1
2
3
0
1
0
1
0
1
0
1
bam
versions
The full Genome Analysis Toolkit (GATK) framework, license restricted.
Generates a list of locations that should be considered for local realignment prior to genotyping.
0
1
2
0
1
0
1
0
1
0
1
intervals
versions
The full Genome Analysis Toolkit (GATK) framework, license restricted.
SNP and Indel variant caller on a per-locus basis
0
1
2
0
1
0
1
0
1
0
1
0
1
0
1
0
1
vcf
versions
The full Genome Analysis Toolkit (GATK) framework, license restricted.
Assigns all the reads in a file to a single new read-group
0
1
0
1
0
1
bam
bai
cram
versions
Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
Apply base quality score recalibration (BQSR) to a bam file
0
1
2
3
4
0
0
0
bam
cram
versions
Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
Apply base quality score recalibration (BQSR) to a bam file
meta
input
input_index
bqsr_table
intervals
fasta
fai
dict
meta
versions
bam
cram
Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
Estimates the parameters for the DRAGstr model
0
1
2
0
0
0
0
dragstr_model
versions
Genome Analysis Toolkit (GATK4). Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
Collects read counts at specified intervals. The count for each interval is calculated by counting the number of read starts that lie in the interval.
0
1
2
3
0
1
0
1
0
1
hdf5
tsv
versions
Genome Analysis Toolkit (GATK4). Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
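The counting rule stated for the read-count collection module, where each interval's count is the number of read starts that fall inside it, can be sketched as below. The function name is hypothetical, intervals are assumed to be inclusive `(start, end)` pairs on a single contig, and read filtering is ignored entirely.

```python
from bisect import bisect_left, bisect_right


def count_read_starts(intervals, read_starts):
    """Count, for each interval, the read starts lying within it.

    Assumption: intervals are inclusive (start, end) pairs on one contig.
    Read starts are sorted once so each interval is answered by two
    binary searches.
    """
    starts = sorted(read_starts)
    return [
        bisect_right(starts, hi) - bisect_left(starts, lo)
        for lo, hi in intervals
    ]


print(count_read_starts([(1, 100), (101, 200)], [5, 100, 101, 150, 250]))
```

A read that starts in one interval and extends into the next is counted only once, in the interval containing its start.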
Converts FastQ file to SAM/BAM format
0
1
bam
versions
Genome Analysis Toolkit (GATK4). Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
Call germline SNPs and indels via local re-assembly of haplotypes
0
1
2
3
4
0
1
0
1
0
1
0
1
0
1
vcf
tbi
bam
versions
Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.
0
1
0
0
cram
bam
crai
bai
metrics
versions
Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.
meta
bam
fasta
fai
dict
meta
versions
output
bam_index
Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
Merge unmapped with mapped BAM files
0
1
2
0
1
0
1
bam
versions
Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
Print reads in the SAM/BAM/CRAM file
0
1
2
0
1
0
1
0
1
bam
cram
sam
versions
Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
Reverts SAM or BAM files to a previous state.
0
1
bam
versions
Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
Converts BAM/SAM file to FastQ format
0
1
fastq
versions
Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
Splits reads that contain Ns in their cigar string
0
1
2
3
0
1
0
1
0
1
bam
versions
Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
This tool locates and unmarks the marked duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.
0
1
bam
bai
versions
Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
Apply base quality score recalibration (BQSR) to a bam file
0
1
2
3
4
0
0
0
bam
cram
versions
Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.
0
1
0
0
0
output
bam_index
metrics
versions
Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
Performs fastq alignment to a fasta reference using gem3-mapper
0
1
0
1
0
bam
versions
The GEM indexer (v3).
Tool for imputation and phasing from vcf file or directly from bam files.
0
1
2
3
4
5
6
7
8
9
0
1
2
phased_variants
stats_coverage
versions
GLIMPSE2 is a phasing and imputation method for large-scale low-coverage sequencing studies.
Quickly estimate coverage from a whole-genome bam or cram index. A bam index has 16KB resolution so that's what this gives, but it provides what appears to be a high-quality coverage estimate in seconds per genome.
0
1
2
0
1
output
ped
bed
bed_index
roc
html
png
versions
goleft is a collection of bioinformatics tools distributed under MIT license in a single static binary
GRIDSS is a modular software suite containing tools useful for the detection of genomic rearrangements.
0
1
0
1
0
1
0
1
vcf
versions
GRIDSS: the Genomic Rearrangement IDentification Software Suite
Align RNA-Seq reads to a reference with HISAT2
0
1
0
1
0
1
bam
summary
fastq
versions
HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.
gcCounter function from HMMcopy utilities, used to generate GC content in non-overlapping windows from a fasta reference
0
1
wig
versions
C++ based programs for analyzing BAM files and preparing read counts -- used with bioconductor-hmmcopy
Perl script (generateMap.pl) that generates the mappability of a genome given a certain size of reads, for input to hmmcopy mapcounter. It takes a very long time on large genomes and is not parallelised at all.
0
1
bigwig
versions
C++ based programs for analyzing BAM files and preparing read counts -- used with bioconductor-hmmcopy
mapCounter function from HMMcopy utilities, used to generate mappability in non-overlapping windows from a bigwig file
0
1
wig
versions
C++ based programs for analyzing BAM files and preparing read counts -- used with bioconductor-hmmcopy
readCounter function from HMMcopy utilities, used to generate read counts in windows
0
1
2
wig
versions
C++ based programs for analyzing BAM files and preparing read counts -- used with bioconductor-hmmcopy
Create a tag directory with the HOMER suite
0
1
0
tagdir
taginfo
versions
HOMER (Hypergeometric Optimization of Motif EnRichment) is a suite of tools for Motif Discovery and next-gen sequencing analysis.
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
Differential gene expression analysis based on the negative binomial distribution
Empirical Analysis of Digital Gene Expression Data in R
IsoSeq - Cluster - Cluster trimmed consensus sequences
0
1
bam
pbi
cluster
cluster_report
transcriptset
hq_bam
hq_pbi
lq_bam
lq_pbi
singletons_bam
singletons_pbi
versions
IsoSeq - Cluster - Cluster trimmed consensus sequences
Remove polyA tail and artificial concatemers
0
1
0
bam
pbi
consensusreadset
summary
report
versions
IsoSeq - Scalable De Novo Isoform Discovery
IsoSeq3 - Cluster - Cluster trimmed consensus sequences
meta
bam
meta
version
bam
pbi
cluster
cluster_report
transcriptset
hq_bam
hq_pbi
lq_bam
lq_pbi
singletons_bam
singletons_pbi
IsoSeq3 - Cluster - Cluster trimmed consensus sequences
Remove polyA tail and artificial concatemers
meta
bam
primers
meta
bam
pbi
consensusreadset
summary
report
versions
IsoSeq3 - Scalable De Novo Isoform Discovery
Extract UMI and cell barcodes
0
1
0
bam
pbi
versions
Iso-Seq - Scalable De Novo Isoform Discovery
Generate a consensus sequence from a BAM file using iVar
0
1
0
0
fasta
qual
mpileup
versions
iVar - a computational package that contains functions broadly useful for viral amplicon-based sequencing.
Trim primer sequences from a BAM file with iVar
0
1
2
0
bam
log
versions
iVar - a computational package that contains functions broadly useful for viral amplicon-based sequencing.
Call variants from a BAM file using iVar
0
1
0
0
0
0
tsv
mpileup
versions
iVar - a computational package that contains functions broadly useful for viral amplicon-based sequencing.
Extract BED file from HTS files containing a dictionary (VCF, BAM, CRAM, DICT, etc.)
0
1
bed
versions
Java utilities for Bioinformatics.
Plot whole genome coverage from BAM/CRAM file as SVG
0
1
2
0
1
0
1
0
1
output
versions
Java utilities for Bioinformatics.
Converts MAF alignments into another format.
0
1
2
0
1
0
1
0
1
axt_gz
bam
blast_gz
blasttab_gz
chain_gz
cram
gff_gz
html_gz
psl_gz
sam_gz
tab_gz
versions
LAST finds & aligns related regions of sequences.
Bayesian reconstruction of ancient DNA fragments
0
1
bam
fq_pass
fq_fail
unmerged_r1_fq_pass
unmerged_r1_fq_fail
unmerged_r2_fq_pass
unmerged_r2_fq_fail
log
versions
Converts aligned short- and long-read records from one reference to another
0
1
0
1
bam
versions
Fast and accurate coordinate conversion between assemblies
Lofreq subcommand to insert base and indel alignment qualities
0
1
0
bam
versions
A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data
Inserts indel qualities in a BAM file
0
1
0
1
bam
versions
Lofreq is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data. Its indelqual programme inserts indel qualities in a BAM file.
Lofreq subcommand to call low frequency variants from alignments when tumor-normal paired samples are available
0
1
0
1
bam
versions
A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data
LongPhase is an ultra-fast program for simultaneously co-phasing SNPs, small indels, large SVs, and (5mC) modifications for Nanopore and PacBio platforms.
0
1
2
3
4
5
0
1
0
1
bam
log
versions
LongPhase is an ultra-fast program for simultaneously co-phasing SNPs, small indels, large SVs, and (5mC) modifications for Nanopore and PacBio platforms.
Map short-reads to an indexed reference genome
0
1
0
1
0
0
0
0
0
0
0
bam
versions
An aDNA aware short-read mapper
Computational framework for tracking and quantifying DNA damage patterns among ancient DNA sequencing reads generated by Next-Generation Sequencing platforms.
0
1
0
runtime_log
fragmisincorporation_plot
length_plot
misincorporation
lgdistribution
dnacomp
stats_out_mcmc_hist
stats_out_mcmc_iter
stats_out_mcmc_trace
stats_out_mcmc_iter_summ_stat
stats_out_mcmc_post_pred
stats_out_mcmc_correct_prob
dnacomp_genome
rescaled
pctot_freq
pgtoa_freq
fasta
folder
versions
Depth computation per contig step of metabat2
0
1
2
depth
versions
Metagenome binning of contigs
0
1
2
tooshort
lowdepth
unbinned
membership
fasta
versions
MetaPhlAn is a tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data.
0
1
0
profile
biom
bt2out
versions
Extracts per-base methylation metrics from alignments
0
1
2
0
0
bedgraph
methylkit
versions
Methylation caller from MethylDackel, a (mostly) universal methylation extractor for methyl-seq experiments.
Generates methylation bias plots from alignments
0
1
2
0
0
txt
versions
Read position methylation bias tools from MethylDackel, a (mostly) universal extractor for methyl-seq experiments.
Taxonomic meta-omics profiling using universal marker genes
0
1
0
out
bam
mgc
log
versions
Marker gene-based OTU (mOTU) profiling
A small Java tool to calculate ratios between MT and nuclear sequencing reads in a given BAM file.
0
1
0
mtnucratio
json
versions
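The ratio between mitochondrial and nuclear reads mentioned for the MT/nuclear ratio module can be illustrated with a plain count ratio. This sketch assumes per-contig read counts are already available as a dictionary and that the mitochondrial contig is named `MT`; both the function name and the unnormalised ratio are assumptions for illustration, and the actual tool may normalise differently, for example by contig length.

```python
def mt_nuclear_ratio(reads_per_contig, mt_name="MT"):
    """Return mitochondrial reads divided by nuclear reads.

    reads_per_contig maps contig name -> read count (hypothetical input).
    Every contig other than mt_name is treated as nuclear.
    """
    mt_reads = reads_per_contig.get(mt_name, 0)
    nuclear_reads = sum(
        count for contig, count in reads_per_contig.items() if contig != mt_name
    )
    if nuclear_reads == 0:
        raise ValueError("no nuclear reads; ratio is undefined")
    return mt_reads / nuclear_reads


counts = {"1": 6000, "2": 4000, "MT": 500}
print(mt_nuclear_ratio(counts))
```

Raising an error when there are no nuclear reads is a design choice of this sketch, made so that an undefined ratio is never silently reported as a number.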
Convert genomic BAM/SAM files to transcriptomic BAM/RAD files.
0
1
0
0
0
bam
rad
versions
mudskipper is a tool for converting genomic BAM/SAM files to transcriptomic BAM/RAD files.
Build and store a gtf index, which is useful for converting genomic BAM/SAM files to transcriptomic BAM/SAM files.
0
index
versions
mudskipper is a tool for converting genomic BAM/SAM files to transcriptomic BAM/RAD files.
AMR predictions for supported species
0
1
0
csv
json
versions
Antibiotic resistance prediction in minutes
Compare multiple runs of long read sequencing data and alignments
0
1
report_html
lengths_violin_html
log_length_violin_html
n50_html
number_of_reads_html
overlay_histogram_html
overlay_histogram_normalized_html
overlay_log_histogram_html
overlay_log_histogram_normalized_html
total_throughput_html
quals_violin_html
overlay_histogram_identity_html
overlay_histogram_phredscore_html
percent_identity_violin_html
active_pores_over_time_html
cumulative_yield_plot_gigabases_html
sequencing_speed_over_time_html
stats_txt
versions
Performs fastq alignment to a fasta reference using NextGenMap
0
1
0
bam
versions
NextGenMap is a flexible highly sensitive short read mapping tool that handles much higher mismatch rates than comparable algorithms while still outperforming them in terms of runtime
Determines the gender of a sample from the BAM/CRAM file.
0
1
2
0
1
0
1
0
tsv
versions
Short-read sequencing tools
Determining whether sequencing data comes from the same individual by using SNP matching. Designed for humans on vcf or bam files.
0
1
0
1
0
1
corr_matrix
matched
all
pdf
vcf
versions
NGSCheckMate is a software package for identifying next generation sequencing (NGS) data files from the same individual, including matching between DNA and RNA.
Calls CNVs in bam files from tumor patients
0
1
2
3
4
0
0
png
profile
summary
versions
A program to convert bam into paf.
0
1
paf
versions
A program to manipulate paf files / convert to and from paf.
Split a .pairsam file into .pairs and .sam.
0
1
pairs
bam
versions
CLI tools to process mapped Hi-C data
NVIDIA Clara Parabricks GPU-accelerated apply Base Quality Score Recalibration (BQSR).
0
1
0
1
0
1
0
1
0
1
bam
bai
versions
NVIDIA Clara Parabricks GPU-accelerated genomics tools
NVIDIA Clara Parabricks GPU-accelerated alignment, sorting, BQSR calculation, and duplicate marking. Note this nf-core module requires files to be copied into the working directory and not symlinked.
0
1
0
1
0
1
0
1
0
1
0
bam
bai
cram
crai
bqsr_table
qc_metrics
duplicate_metrics
versions
NVIDIA Clara Parabricks GPU-accelerated genomics tools
NVIDIA Clara Parabricks GPU-accelerated fast, accurate algorithm for mapping methylated DNA sequence reads to a reference genome, performing local alignment, and producing alignment for different parts of the query sequence
0
1
0
1
0
1
0
bam
bai
qc_metrics
bqsr_table
duplicate_metrics
versions
NVIDIA Clara Parabricks GPU-accelerated genomics tools
Determines the depth in a BAM/CRAM file
0
1
2
0
1
0
1
depth
binned_depth
versions
Graph realignment tools for structural variants
The pbbam software package provides components to create, query, & edit PacBio BAM files and associated indices. These components include a core C++ library, bindings for additional languages, and command-line utilities.
0
1
bam
pbi
versions
PacBio BAM C++ library
Alignment with PacBio's minimap2 frontend
0
1
0
1
bam
versions
A minimap2 frontend for PacBio native data formats
converts pacbio bam files to fastq.gz using PacBioToolKit (pbtk) bam2fastq
0
1
2
fastq
versions
pbtk - PacBio BAM toolkit
Minimalistic tool which creates an index file that enables random access into PacBio BAM files
0
1
pbi
versions
pbtk - PacBio BAM toolkit
Assigns all the reads in a file to a single new read-group
0
1
0
1
0
1
bam
bai
cram
versions
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Cleans the provided BAM, soft-clipping beyond-end-of-reference alignments and setting MAPQ to 0 for unmapped reads
0
1
bam
versions
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Collects hybrid-selection (HS) metrics for a SAM or BAM file.
0
1
2
3
4
0
1
0
1
0
1
metrics
versions
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Collect metrics about the insert size distribution of a paired-end library.
0
1
metrics
histogram
versions
Java tools for working with NGS data in the BAM format
Collect multiple metrics from a BAM file
0
1
2
0
1
0
1
metrics
pdf
versions
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Collect metrics from a RNAseq BAM file
0
1
0
0
0
metrics
pdf
versions
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Collect metrics about coverage and performance of whole genome sequencing (WGS) experiments.
0
1
2
0
1
0
1
0
metrics
versions
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Checks that all data in the set of input files appear to come from the same individual
0
1
2
3
4
5
0
1
crosscheck_metrics
versions
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Computes/Extracts the fingerprint genotype likelihoods from the supplied file. It is given as a list of PLs at the fingerprinting sites.
0
1
2
0
0
0
0
vcf
tbi
versions
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Converts a FASTQ file to an unaligned BAM or SAM file.
0
1
bam
versions
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Filters SAM/BAM files to include/exclude either aligned/unaligned reads or based on a read list
0
1
2
0
bam
versions
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Verify mate-pair information between mates and fix if needed
0
1
bam
versions
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Locate and tag duplicate reads in a BAM file
0
1
0
1
0
1
bam
bai
cram
metrics
versions
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Merges multiple BAM files into a single file
0
1
bam
versions
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Samples a SAM/BAM/CRAM file using flowcell position information for the best approximation of having sequenced fewer reads
0
1
2
bam
bai
num_reads
versions
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
changes name of sample in the vcf file
0
1
vcf
versions
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Writes an interval list created by splitting a reference at Ns. A program for breaking up a reference into intervals of alternating regions of N and ACGT bases
0
1
0
1
0
1
intervals
versions
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
This tool takes in a coordinate-sorted SAM or BAM and calculates the NM, MD, and UQ tags by comparing with the reference.
0
1
0
1
bam
bai
versions
Sorts BAM/SAM files based on a variety of picard specific criteria
0
1
0
bam
versions
A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Sorts vcf files
0
1
0
1
0
1
vcf
versions
Java tools for working with NGS data in the BAM/CRAM/SAM and VCF format
pmdtools command to filter ancient DNA molecules from others
0
1
2
0
0
bam
versions
Compute postmortem damage patterns and decontaminate ancient genomes
Run all Portcullis steps in one go
0
1
0
1
0
1
log
pass_junctions_bed
pass_junctions_tab
intron_gff
exon_gff
spliced_bam
spliced_bai
versions
Portcullis is a tool that filters out invalid splice junctions from RNA-seq alignment data. It accepts BAM files from various RNA-seq mappers, analyzes splice junctions and removes likely false positives, outputting filtered results in multiple formats for downstream analysis.
converts sam/bam/cram/pairs into genome contact map
0
1
0
1
2
pretext
versions
Compute summary statistics for control gene from BAM files.
0
1
2
0
0
control_stats
versions
A Python package for pharmacogenomics research
Call SNVs/indels from BAM files for all target genes.
0
1
2
0
1
0
0
vcf
tbi
versions
A Python package for pharmacogenomics research
Prepare a depth of coverage file for all target genes with SV from BAM files.
0
1
2
0
0
coverage
versions
A Python package for pharmacogenomics research
Evaluate alignment data
0
1
0
results
versions
Qualimap 2 is a platform-independent application written in Java and R that provides both a Graphical User Interface and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.
Evaluate alignment data
0
1
2
0
0
0
results
versions
Qualimap 2 is a platform-independent application written in Java and R that provides both a Graphical User Interface and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.
Extract exon-exon junctions from an RNAseq BAM file. The output is a BED file in the BED12 format.
0
1
2
junc
versions
RegTools is a set of tools that integrate DNA-seq and RNA-seq data to help interpret mutations in a regulatory and splicing context.
Quality control of riboseq bam data
0
1
2
0
1
2
0
1
2
0
1
0
1
0
1
predictions
all
transprofile
versions
Ribo TIS Hunter (Ribo-TISH) identifies translation activities using ribosome profiling data.
Quality control of riboseq bam data
0
1
2
0
1
distribution
pdf
offset
versions
Ribo TIS Hunter (Ribo-TISH) identifies translation activities using ribosome profiling data.
Accurate detection of short and long active ORFs using Ribo-seq data
0
1
2
0
1
protocol
bam_summary
read_length_dist
metagene_profile_5p
metagene_profile_3p
metagene_plots
psite_offsets
pos_wig
neg_wig
orfs
versions
Python package to detect translating ORF from Ribo-seq data
Calculate expression with RSEM
0
1
0
counts_gene
counts_transcript
stat
logs
versions
bam_star
bam_genome
bam_transcript
RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome
Generate statistics from a bam file
0
1
txt
versions
RSeQC package provides a number of useful modules that can comprehensively evaluate high throughput sequence data especially RNA-seq data.
Calling lowest common ancestors from multi-mapped reads in SAM/BAM/CRAM files
0
1
2
0
csv
json
bam
versions
Lowest Common Ancestor on SAM/BAM/CRAM alignment files
Outputs some statistics drawn from read flags.
0
1
stats
versions
Tools for working with SAM/BAM data
find and mark duplicate reads in BAM file
0
1
bam
bai
versions
process your BAM data faster!
This module combines samtools and samblaster in order to use samblaster's ability to filter or tag SAM files, with the advantage of keeping both input and output in BAM format. Samblaster input must contain a sequence header, so it is piped from the "samtools view -h" command. Additional arguments for samtools can be passed using options.args2 for the input BAM file and options.args3 for the output BAM file.
0
1
bam
versions
Clips read alignments where they match BED file defined regions
0
1
0
0
0
bam
stats
rejects_bam
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
The module uses bam2fq method from samtools to convert a SAM, BAM or CRAM file to FASTQ format
0
1
0
reads
versions
Tools for dealing with SAM, BAM and CRAM files
calculates MD and NM tags
0
1
0
1
bam
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
Concatenate BAM or CRAM file
0
1
bam
cram
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
shuffles and groups reads together by their names
0
1
0
1
bam
cram
sam
versions
Tools for dealing with SAM, BAM and CRAM files
The module uses collate and then fastq methods from samtools to convert a SAM, BAM or CRAM file to FASTQ format
0
1
0
1
0
fastq
fastq_interleaved
fastq_other
fastq_singleton
versions
Tools for dealing with SAM, BAM and CRAM files
Produces a consensus FASTA/FASTQ/PILEUP
0
1
fasta
fastq
pileup
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
convert and then index CRAM -> BAM or BAM -> CRAM file
0
1
2
0
1
0
1
bam
cram
bai
crai
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
produces a histogram or table of coverage per chromosome
0
1
2
0
1
0
1
coverage
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
List CRAM Content-ID and Data-Series sizes
0
1
size
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
Computes the depth at each position or region.
0
1
0
1
tsv
versions
Tools for dealing with SAM, BAM and CRAM files; samtools depth – computes the read depth at each position or region
Create a sequence dictionary file from a FASTA file
0
1
dict
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
Index FASTA file, and optionally generate a file of chromosome sizes
0
1
0
1
0
fa
fai
sizes
gzi
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
Converts a SAM/BAM/CRAM file to FASTA
0
1
0
fasta
interleaved
singleton
other
versions
Tools for dealing with SAM, BAM and CRAM files
Converts a SAM/BAM/CRAM file to FASTQ
0
1
0
fastq
interleaved
singleton
other
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
Samtools fixmate is a tool that can fill in information (insert size, cigar, mapq) about paired end reads onto the corresponding other read. Also has options to remove secondary/unmapped alignments and recalculate whether reads are proper pairs.
0
1
bam
cram
sam
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
Counts the number of alignments in a BAM/CRAM/SAM file for each FLAG type
0
1
2
flagstat
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
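For context, the FLAG field that flagstat tallies is a bitmask defined in the SAM format specification. The Python sketch below decodes a FLAG value into its named bits. The function name and the short labels are illustrative choices, and this is not how samtools itself is implemented.

```python
from typing import List

# Bit values as defined in the SAM format specification;
# the short labels are illustrative names, not official identifiers.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag: int) -> List[str]:
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


print(decode_flag(99))
```

A FLAG of 99 (64 + 32 + 2 + 1) decodes to a paired, properly paired first-in-pair read whose mate is on the reverse strand.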
filter/convert SAM/BAM/CRAM file
0
1
readgroup
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
Reports alignment summary statistics for a BAM/CRAM/SAM file
0
1
2
idxstats
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
converts FASTQ files to unmapped SAM/BAM/CRAM
0
1
sam
bam
cram
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
Index SAM/BAM/CRAM file
0
1
bai
csi
crai
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
mark duplicate alignments in a coordinate sorted file
0
1
0
1
bam
cram
sam
versions
Tools for dealing with SAM, BAM and CRAM files
Merge BAM or CRAM file
0
1
0
1
0
1
bam
cram
csi
crai
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
BAM
0
1
2
0
mpileup
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
Replace the header in the bam file with the header generated by the command. This command is much faster than replacing the header with a BAM→SAM→BAM conversion.
0
1
bam
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
Collate/Fixmate/Sort/Markdup SAM/BAM/CRAM file
0
1
0
1
bam
cram
csi
crai
metrics
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
Sort SAM/BAM/CRAM file
0
1
0
1
bam
cram
crai
csi
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
Produces comprehensive statistics from SAM/BAM/CRAM file
0
1
2
0
1
stats
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
filter/convert SAM/BAM/CRAM file
0
1
2
0
1
0
0
bam
cram
sam
bai
csi
crai
unselected
unselected_index
versions
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA.
The cluster_identifier tool of Scramble identifies soft clipped clusters
0
1
2
0
clusters
versions
Soft Clipped Read Alignment Mapper
Performs fastq alignment to a fasta reference using Sentieon's BWA MEM
0
1
0
1
0
1
0
1
bam_and_bai
versions
Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.
Collects multiple quality metrics from a bam file
0
1
2
0
1
0
1
0
mq_metrics
qd_metrics
gc_summary
gc_metrics
aln_metrics
is_metrics
mq_plot
qd_plot
is_plot
gc_plot
versions
Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.
Runs the sentieon tool LocusCollector followed by Dedup. LocusCollector collects read information that is used by Dedup which in turn marks or removes duplicate reads.
0
1
2
0
1
0
1
cram
crai
bam
bai
score
metrics
metrics_multiqc_tsv
versions
Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.
Merges BAM files and/or converts them into CRAM files. Also outputs the result of applying Base Quality Score Recalibration to a file.
0
1
2
0
1
0
1
output
index
output_index
versions
Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.
Collects whole genome quality metrics from a bam file
0
1
2
0
1
0
1
0
1
wgs_metrics
versions
Sentieon® provides complete solutions for secondary DNA/RNA analysis for a variety of sequencing platforms, including short and long reads. Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system.
Sequence quality metrics for FASTQ and uBAM files.
0
1
json
html
versions
PileupCaller is a tool to create genotype calls from bam files using read-sampling methods
0
1
0
0
eigenstrat
plink
freqsum
versions
Tools for population genetics on sequencing data
Sequenza-utils bam2seqz processes BAM and Wiggle files to produce a seqz file
0
1
2
0
0
seqz
versions
Sequenza-utils provides 3 main command line programs to transform common NGS file formats, such as FASTA and BAM, into input files for the Sequenza R package. The bam2seqz program processes a paired set of BAM/pileup files (tumour and matching normal), together with genome-wide GC-content information, to extract the common positions with A and B allele frequencies.
Sequenza-utils gc_wiggle computes the GC percentage across the sequences, and returns a file in the UCSC wiggle format, given a fasta file and a window size.
0
1
wig
versions
Sequenza-utils provides 3 main command line programs to transform common NGS file formats, such as FASTA and BAM, into input files for the Sequenza R package. The gc_wiggle program takes a FASTA file as input, computes the GC percentage across the sequences, and returns a file in the UCSC wiggle format.
Tool to call the copy number of full-length SMN1, full-length SMN2, as well as SMN2Δ7–8 (SMN2 with a deletion of Exon7-8) from a whole-genome sequencing (WGS) BAM file.
0
1
2
smncopynumber
run_metrics
versions
Rapid haploid variant calling
0
1
0
tab
csv
html
vcf
bed
gff
bam
bai
log
aligned_fa
consensus_fa
consensus_subs_fa
raw_vcf
filt_vcf
vcf_gz
vcf_csi
txt
versions
Rapid bacterial SNP calling and core genome alignments
Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs
0
1
0
1
2
tsv
html
versions
Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs
Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs
0
1
2
0
1
0
1
0
1
extract
versions
Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs
Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs
0
1
2
0
html
pairs_tsv
samples_tsv
versions
Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs
Short Read Sequence Typing for Bacterial Pathogens is a program designed to take Illumina sequence data, a MLST database and/or a database of gene sequences (e.g. resistance genes, virulence genes, etc) and report the presence of STs and/or reference genes.
0
1
2
gene_results
fullgene_results
mlst_results
pileup
sorted_bam
versions
Short Read Sequence Typing for Bacterial Pathogens
Advanced sequence file format conversions
0
1
0
0
0
cram
gzi
versions
Staden Package 'io_lib' (sometimes referred to as libstaden-read by distributions). This contains code for reading and writing a variety of Bioinformatics / DNA Sequence formats.
Align reads to a reference genome using STAR
0
1
0
1
0
1
0
0
0
log_final
log_out
log_progress
versions
bam
bam_sorted
bam_sorted_aligned
bam_transcript
bam_unsorted
fastq
tab
spl_junc_tab
read_per_gene_tab
junction
sam
wig
bedgraph
STAR is a software package for mapping DNA sequences against a large reference genome, such as the human genome.
STITCH is an R program for reference panel free, read aware, low coverage sequencing genotype imputation. STITCH runs on a set of samples with sequencing reads in BAM format, as well as a list of positions to genotype, and outputs imputed genotypes in VCF format.
0
1
2
3
4
5
6
7
8
9
10
0
1
2
0
input
rdata
plots
vcf
bgen
versions
SVTyper performs breakpoint genotyping of structural variants (SVs) using whole genome sequencing data
0
1
2
3
0
1
0
1
json
gt_vcf
bam
versions
Compute genotype of structural variants based on breakpoint depth
A tool to detect resistance and lineages of M. tuberculosis genomes
0
1
bam
csv
json
txt
vcf
versions
Profiling tool for Mycobacterium tuberculosis to detect drug resistance and lineage from WGS data
Computes the coverage of different regions from the bam file.
0
1
0
1
cov
wig
versions
TIDDIT - structural variant calling.
Tandem repeat genotyping from PacBio HiFi data
0
1
2
3
0
1
0
1
0
1
vcf
bam
versions
Tandem repeat genotyping and visualization from PacBio HiFi data
Deduplicate reads based on the mapping co-ordinate and the UMI attached to the read.
0
1
2
0
bam
fastq
log
versions
Deduplicate reads based on the mapping co-ordinate and the UMI attached to the read.
0
1
2
0
bam
log
tsv_edit_distance
tsv_per_umi
tsv_umi_per_position
versions
UMI-tools contains tools for dealing with Unique Molecular Identifiers (UMIs)/Random Molecular Tags (RMTs) and single cell RNA-Seq cell barcodes
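As a rough illustration of what deduplicating on mapping coordinate plus UMI means, the toy Python sketch below keeps one read per (chromosome, position, strand, UMI) key, using exact UMI matches only. This is a simplification: the tuple layout and the function name are assumptions made for the example, and UMI-tools itself offers more elaborate methods that also account for sequencing errors in UMIs.

```python
from typing import Iterable, List, Tuple

# Hypothetical record layout for this example:
# (read name, chromosome, position, strand, UMI)
Read = Tuple[str, str, int, str, str]


def dedup_exact(reads: Iterable[Read]) -> List[Read]:
    """Keep the first read seen for each (chrom, pos, strand, UMI) key."""
    seen = set()
    kept = []
    for read in reads:
        _name, chrom, pos, strand, umi = read
        key = (chrom, pos, strand, umi)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


reads = [
    ("r1", "chr1", 100, "+", "ACGT"),
    ("r2", "chr1", 100, "+", "ACGT"),  # same position and UMI as r1: duplicate
    ("r3", "chr1", 100, "+", "TTGA"),  # same position, different UMI: kept
    ("r4", "chr1", 250, "+", "ACGT"),  # same UMI, different position: kept
]
print([r[0] for r in dedup_exact(reads)])
```

Reads that share a position but carry different UMIs are treated as distinct molecules, which is what separates UMI-aware deduplication from purely coordinate-based duplicate marking.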
Group reads based on their UMI and mapping coordinates
0
1
2
0
0
log
bam
tsv
versions
UMI-tools contains tools for dealing with Unique Molecular Identifiers (UMIs)/Random Molecular Tags (RMTs) and single cell RNA-Seq cell barcodes
Make the output from umi_tools dedup or group compatible with RSEM
0
1
2
bam
log
versions
UMI-tools contains tools for dealing with Unique Molecular Identifiers (UMIs)/Random Molecular Tags (RMTs) and single cell RNA-Seq cell barcodes
The Java port of the VarDict variant caller
0
1
2
3
0
1
0
1
vcf
versions
Filtering, downsampling and profiling alignments in BAM/CRAM formats
0
1
bam
versions
Velocyto is a library for the analysis of RNA velocity. The velocyto.py CLI uses
Path(resolve_path=True),
which breaks the Nextflow logic of symbolic links.
If velocyto finds a file named EXACTLY cellsorted_[ORIGINAL_BAM_NAME] in the work dir,
it will skip the samtools sort step.
The cellsorted BAM file should be produced with:
samtools sort -t CB -O BAM -o cellsorted_input.bam input.bam
See the module test for an example with the SAMTOOLS_SORT nf-core module. Config example to cellsort the input BAM using SAMTOOLS_SORT:
withName: SAMTOOLS_SORT {
ext.prefix = { "cellsorted_${bam.baseName}" }
ext.args = '-t CB -O BAM'
}
An optional mask must be passed through ext.args
with the --mask option.
For the reasons above, two BAM files (cellsorted and original) must be staged in the work dir.
See also the velocyto tutorial.
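The naming rule described above can be sketched in Python. This is a minimal illustration, assuming that [ORIGINAL_BAM_NAME] means the full file name including the .bam extension, as the cellsorted_input.bam / input.bam example suggests. Both helper functions are hypothetical and are not part of velocyto or nf-core.

```python
import shlex
from pathlib import Path


def cellsorted_name(bam: str) -> str:
    """Return the file name velocyto is described as looking for (hypothetical helper)."""
    # Assumption: the prefix is applied to the full file name, extension included.
    return f"cellsorted_{Path(bam).name}"


def cellsort_command(bam: str) -> str:
    """Build the samtools sort command quoted above for a given BAM (hypothetical helper)."""
    # The output is placed next to the input so both files sit in the same directory.
    out = Path(bam).with_name(cellsorted_name(bam))
    args = ["samtools", "sort", "-t", "CB", "-O", "BAM", "-o", str(out), bam]
    return " ".join(shlex.quote(a) for a in args)


print(cellsort_command("input.bam"))
```

For input.bam this reproduces the command quoted in the description. The SAMTOOLS_SORT config shown above arrives at the same name through ext.prefix.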
0
1
2
3
0
loom
versions
Detecting and estimating inter-sample DNA contamination has become a crucial quality assessment step to ensure high quality sequence reads and reliable downstream analysis.
0
1
2
0
log
selfsm
depthsm
selfrg
depthrg
bestsm
bestrg
versions
verifyBamID is software that verifies whether the reads in a particular file match previously known genotypes for an individual (or group of individuals), and checks whether the reads are contaminated as a mixture of two samples.
Detecting and estimating inter-sample DNA contamination has become a crucial quality assessment step to ensure high quality sequence reads and reliable downstream analysis.
0
1
2
0
1
2
0
0
log
ud
bed
mu
self_sm
ancestry
versions
A robust tool for DNA contamination estimation from sequence reads using ancestry-agnostic method.
Cluster sequences using a single-pass, greedy centroid-based clustering algorithm.
0
1
aln
biom
mothur
otu
bam
out
blast
uc
centroids
clusters
profile
msa
versions
VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)
The wham suite consists of two programs, wham and whamg. wham, the original tool, is a very sensitive method with a high false discovery rate. The second program, whamg, is more accurate and better suited for general structural variant (SV) discovery.
0
1
2
0
0
vcf
tbi
graph
versions
Convert and filter aligned reads to .npz
0
1
2
0
1
0
1
npz
versions
WIthin-SamplE COpy Number aberration DetectOR, including sex chromosomes
Align reads to a reference genome using YARA
0
1
0
1
bam
bai
versions
Yara is an exact tool for aligning DNA sequencing reads to reference genomes.