Description

The DRAGEN DNA Germline Pipeline accelerates the secondary analysis of next-generation sequencing (NGS) data on the DRAGEN Platform. The pipeline includes highly optimized algorithms for mapping, aligning, sorting, duplicate marking, and haplotype variant calling. Beyond haplotype variant calling, it also supports calling copy number and structural variants, detecting repeat expansions, and making targeted calls.

Input

Name
Description
Pattern

0 ()

1 ()

checkfingerprint_expected_vcf (file)

Input expected genotypes (VCF) for checkfingerprint comparison

*.vcf

cnv_combined_counts (file)

Specify combined PON file

*.combined.counts.txt.gz

cnv_exclude_bed (file)

Regions to exclude for CNV processing

*.bed

cnv_population_b_allele_vcf (file)

CNV population SNP input VCF file

*.vcf

cnv_segmentation_bed (file)

Intervals to limit segmentation to

*.bed

cnv_target_bed (file)

CNV target BED file

*.bed

cram_reference (file)

Reference file in FASTA format (only used for decompression)

*.{fasta,fa}

dbsnp (file)

Variant annotation database VCF (or .vcf.gz) file

*.vcf{,.gz}

fastqc_adapter_file (file)

FASTA file containing adapter sequences

*.{fasta,fa}

fastqc_kmer_file (file)

FASTA file containing kmers of interest

*.{fasta,fa}

ora_reference (directory)

Path to the directory that contains the compression reference and index file

qc_coverage_region (file)

BED files to report coverage on (maximum of 3)

*.bed

qc_cross_cont_vcf (file)

Variant file (.vcf/.vcf.gz) with population allele frequencies to estimate sample contamination

*.vcf{,.gz}

ref_dir (directory)

Directory with reference and hash tables

repeat_genotype_ref_fasta (file)

FASTA file containing repeat genotypes

*.{fasta,fa}

repeat_genotype_specs (file)

Repeat variant catalog file

sv_call_regions_bed (file)

BED file containing the set of regions to call (optionally gzip or bgzip compressed)

*.bed{,.gz}

sv_exclusion_bed (file)

BED file containing the set of exclusion regions for SV calling (optionally gzip or bgzip compressed)

*.bed

sv_forcegt_vcf (file)

Specify a VCF of structural variants for forced genotyping, meaning these variants will be scored and emitted in the output VCF even if not found in the sample data. These variants will be merged with any additional variants discovered directly from the sample data.

*.vcf

sv_systematic_noise (file)

Systematic noise BEDPE file containing the set of noisy paired regions for SV calling (optionally gzip or bgzip compressed)

*.bed{,.gz}

trim_adapter_read (file)

FASTA file of adapter sequences to trim from the 3' end of Read 1 and Read 2

*.{fasta,fa}

trim_adapter_read_5prime (file)

FASTA file that contains adapter sequences to trim from the 5' end of Read 1 and Read 2. The sequences should be in reverse order (with respect to their appearance in the FASTQ) but not complemented

*.{fasta,fa}

variant_annotation_data (directory)

Location of downloaded Nirvana annotation files

vc_combine_phased_variants_distance_bed (file)

BED file for combining variants in the same phase set

*.bed

vc_excluded_regions_bed (file)

BED file of excluded regions in which variants will be hard filtered

*.bed

vc_forcegt_vcf (file)

List of small variants to force genotype. Can be a .vcf or .vcf.gz file

*.vcf{,.gz}

vc_log_bed (file)

Log information for regions in this BED file

*.bed

vc_mapping_metrics (file)

File containing mapping metrics

vc_ml_dir (directory)

Directory containing the machine learning package

vc_ntd_error_params (file)

Params file for per-nucleotide error rate calibration

vc_roh_blacklist_bed (file)

Blacklist BED file for ROH

*.bed

vc_snp_error_cal_bed (file)

BED file containing regions from which to estimate nucleotide substitution biases

*.bed

vc_systematic_noise (file)

Site-specific noise file. This file enables the systematic-noise filter and improves specificity during somatic variant calling

vc_target_bed (file)

Target BED file

*.bed

vd_eh_vcf (file)

ExpansionHunter VCF (optionally gzip compressed)

*.vcf{,.gz}

vd_small_variant_vcf (file)

Small variant VCF (optionally gzip compressed)

*.vcf{,.gz}

vd_sv_vcf (file)

Structural variant VCF (optionally gzip compressed)

*.vcf{,.gz}

vntr_catalog_bed (file)

BED file specifying the TR regions for the VNTR Caller to act upon

*.bed
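The Pattern column above uses glob syntax with brace alternation, for example `*.vcf{,.gz}` for a VCF that may or may not be gzip compressed. As a rough sketch of how a file name could be checked against such a pattern before a run, the Python below expands the braces and then applies an ordinary glob match. The helper names `expand_braces` and `matches_pattern` are hypothetical and not part of the module, and the assumption that the patterns are meant as shell-style globs is ours.

```python
import fnmatch
import re


def expand_braces(pattern: str) -> list[str]:
    """Expand {a,b,...} groups in a glob pattern into plain glob patterns.

    Hypothetical helper: it expands one innermost brace group per call and
    recurses until no brace groups remain.
    """
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def matches_pattern(filename: str, pattern: str) -> bool:
    """Return True if filename matches any brace-expanded form of pattern."""
    return any(
        fnmatch.fnmatchcase(filename, plain)
        for plain in expand_braces(pattern)
    )


if __name__ == "__main__":
    # Example file names are made up for illustration.
    print(matches_pattern("dbsnp.vcf.gz", "*.vcf{,.gz}"))
    print(matches_pattern("targets.bed.gz", "*.bed"))
```

A check like this only looks at the file name. It says nothing about whether the file content is valid for the corresponding input.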

Output

Name
Description
Pattern

0 ()


Tools

dragen Documentation

The Illumina DRAGEN™ Bio-IT Platform is based on the highly reconfigurable DRAGEN Bio-IT Processor, which is integrated on a Field Programmable Gate Array (FPGA) card and is available in a preconfigured server that can be seamlessly integrated into bioinformatics workflows. The platform can be loaded with highly optimized algorithms for many different NGS secondary analysis pipelines.