Introduction
nf-core/nfmicrofinder is a bioinformatics pipeline that aids in the curation of bird genome assemblies by identifying putative microchromosome scaffolds and moving them to the start of the genome assembly FASTA file.
The pipeline is built using Nextflow, a workflow tool that runs tasks across multiple compute infrastructures in a portable manner. It uses Docker/Singularity containers, which makes installation straightforward and results highly reproducible. The Nextflow DSL2 implementation of this pipeline uses one container per process, which makes software dependencies much easier to maintain and update.
Pipeline Summary
- Input validation and parameter checks
- Index reference genome using Miniprot
- Align protein sequences to genome using Miniprot
- Filter alignments based on quality thresholds (identity ≥70%, score ≥60)
- Sort FASTA file based on filtered alignments to prioritize microchromosomes
- Generate final reordered assembly and pipeline reports
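The filtering and reordering steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline's actual implementation: the `Alignment` record, the function names, and the ranking rule (scaffolds with more passing alignments come first) are hypothetical and chosen for the example. Only the thresholds (identity ≥70%, score ≥60) come from the summary.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Alignment(NamedTuple):
    """Hypothetical, simplified view of one protein-to-genome alignment."""
    scaffold: str    # name of the scaffold the protein aligned to
    identity: float  # percent identity, 0-100
    score: float     # alignment score


MIN_IDENTITY = 70.0  # threshold from the pipeline summary
MIN_SCORE = 60.0     # threshold from the pipeline summary


def filter_alignments(alignments: Iterable[Alignment]) -> list[Alignment]:
    """Keep alignments that meet both quality thresholds (inclusive)."""
    return [a for a in alignments
            if a.identity >= MIN_IDENTITY and a.score >= MIN_SCORE]


def reorder_scaffolds(scaffold_names: list[str],
                      kept: Iterable[Alignment]) -> list[str]:
    """Move scaffolds with passing alignments to the front.

    Assumed ranking: more passing alignments first. Ties, and scaffolds
    without any passing alignment, keep their original relative order.
    """
    hits = Counter(a.scaffold for a in kept)
    # sorted() is stable, so equal hit counts preserve the input order.
    return sorted(scaffold_names, key=lambda name: -hits[name])


if __name__ == "__main__":
    alignments = [
        Alignment("scaffold_3", 85.0, 120.0),
        Alignment("scaffold_3", 72.5, 61.0),
        Alignment("scaffold_2", 90.0, 45.0),   # fails the score threshold
        Alignment("scaffold_1", 65.0, 200.0),  # fails the identity threshold
        Alignment("scaffold_4", 70.0, 60.0),   # passes: thresholds are inclusive
    ]
    order = reorder_scaffolds(
        ["scaffold_1", "scaffold_2", "scaffold_3", "scaffold_4"],
        filter_alignments(alignments),
    )
    print(order)  # ['scaffold_3', 'scaffold_4', 'scaffold_1', 'scaffold_2']
```

In a real run, the FASTA records would then be written out in the returned order, so that putative microchromosome scaffolds appear at the start of the assembly file.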
Quick Start
- Install Nextflow (>=24.10.5)
- Install any of Docker, Singularity, Podman, Shifter or Charliecloud for full pipeline reproducibility (please only use Conda as a last resort; see docs)
- Download the pipeline and test it on a minimal dataset with a single command:
    nextflow run main.nf -profile test,docker
  Note that it is recommended to use the -profile parameter to specify the container technology of your choice. See the nf-core pipeline documentation for more information.
- Start running your own analysis!
    nextflow run main.nf \
      --input genome.fa \
      --pep_file proteins.fa \
      --output_prefix my_analysis \
      --outdir <OUTDIR>
Documentation
The nfmicrofinder pipeline comes with documentation about the pipeline usage and output.
Credits
nfmicrofinder was originally written by Yumi Sims and Will Eagle (@weaglesBio).
We thank the following people for their extensive assistance in the development of this pipeline:
- Jim Downie (@prototaxites)
Contributions and Support
If you would like to contribute to this pipeline, please see the contributing guidelines.
Citations
An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.
You can cite the nf-core publication as follows:
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.