I am a geneticist and bioinformatician with a PhD in Genetics and an MSc in Bioinformatics & Biostatistics. I have hands-on experience building and maintaining reproducible workflows for RNA-seq, ATAC-seq, ChIP-seq, and Hi-C analysis in SLURM-based HPC environments.
My background is in collaborative academic research, where I worked at the interface of computational and experimental biology. In practice, this meant designing analyses, building pipelines, integrating multi-omics datasets, and translating results into biologically meaningful conclusions.
I am currently strengthening my software engineering, Python, and data pipeline skills while applying for roles in bioinformatics, bioinformatics engineering, and scientific data analysis.
- NGS pipeline development with Nextflow, Bash, and HPC execution
- Bulk omics analysis: RNA-seq, ATAC-seq, ChIP-seq, Hi-C
- Multi-omics integration for regulatory genomics
- R-based statistical analysis with Tidyverse and Bioconductor
- Python-based ETL/data workflows for genomics datasets
- Reproducible research practices: documentation, environment management, workflow standardization
A Nextflow DSL2 ChIP-seq pipeline developed for reproducible analysis in HPC environments.
Includes workflow modularization, metadata-driven execution, and standard ChIP-seq processing steps.
Focus: Nextflow, ChIP-seq, Conda, reproducible workflows
A four-step pipeline for comparing chromatin loop strength between two Hi-C conditions. It generates distance-matched random loops as a background for computing empirical p-values and log2 fold changes, and is designed for experiments with few replicates.
Focus: Hi-C, 3D genomics, statistics, low-replicate designs
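
A minimal sketch of the statistical idea behind this kind of analysis, not the pipeline's actual code. All function names, the pseudocount, and the two-sided test centered on zero are assumptions made for illustration. Random controls keep the same anchor separation as the loop, so the background shares its distance-dependent contact decay.

```python
import math
import random


def log2_fold_change(signal_a, signal_b, pseudocount=1.0):
    """log2 ratio of condition B over condition A, stabilised by a pseudocount."""
    return math.log2((signal_b + pseudocount) / (signal_a + pseudocount))


def distance_matched_controls(anchor1, anchor2, chrom_length, n, rng):
    """Draw n random anchor pairs with the same genomic separation as the loop."""
    distance = anchor2 - anchor1
    if distance <= 0 or distance >= chrom_length:
        raise ValueError("expected 0 < anchor2 - anchor1 < chrom_length")
    # Pick a random left anchor so the right anchor stays on the chromosome.
    starts = (rng.randrange(0, chrom_length - distance) for _ in range(n))
    return [(start, start + distance) for start in starts]


def empirical_p_value(observed, null_values):
    """Two-sided empirical p-value with a +1 correction (never exactly zero).

    Assumes the null log2 fold changes are roughly centered on zero.
    """
    extreme = sum(1 for value in null_values if abs(value) >= abs(observed))
    return (extreme + 1) / (len(null_values) + 1)
```

In use, the log2 fold change would be computed for the loop and for each control pair from the two contact matrices, and the control values passed to `empirical_p_value` as `null_values`. A seeded `random.Random` instance keeps the control set reproducible between runs.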
Utility workflow for preparing Hi-C-derived data for metaloops / loop-calling analyses, including format conversion and preprocessing.
Focus: Hi-C preprocessing, genomics formats, SLURM-based workflows
Python-based ETL workflow for retrieving RNA-seq data and the associated metadata from the Genomic Data Commons (GDC) and integrating them.
This project reflects my current effort to build more industry-oriented, Python-first genomics tooling.
Focus: Python, ETL, public cancer genomics data, metadata integration
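
A hedged sketch of what the extract and transform steps of such a workflow can look like, not the project's actual code. The helper names `build_rnaseq_query` and `flatten_hit` are hypothetical, and the selected fields are one possible choice. The filter structure follows the public GDC API `files` endpoint.

```python
import json

GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_rnaseq_query(project_id, size=100):
    """Build query parameters selecting RNA-seq expression files for one project."""
    filters = {
        "op": "and",
        "content": [
            {"op": "in", "content": {"field": "cases.project.project_id",
                                     "value": [project_id]}},
            {"op": "in", "content": {"field": "data_type",
                                     "value": ["Gene Expression Quantification"]}},
            {"op": "in", "content": {"field": "experimental_strategy",
                                     "value": ["RNA-Seq"]}},
        ],
    }
    return {
        "filters": json.dumps(filters),  # GET requests take the filter as a JSON string
        "fields": "file_id,file_name,cases.submitter_id,cases.samples.sample_type",
        "format": "JSON",
        "size": str(size),
    }


def flatten_hit(hit):
    """Flatten one nested hit into a flat metadata row (first case and sample only)."""
    case = (hit.get("cases") or [{}])[0]
    sample = (case.get("samples") or [{}])[0]
    return {
        "file_id": hit.get("file_id"),
        "file_name": hit.get("file_name"),
        "case_id": case.get("submitter_id"),
        "sample_type": sample.get("sample_type"),
    }
```

The parameters can be sent with any HTTP client, for example `requests.get(GDC_FILES_ENDPOINT, params=build_rnaseq_query("TCGA-BRCA"))`. The hits are then read from `["data"]["hits"]` in the JSON response and passed through `flatten_hit` to produce rows that join to expression files on `file_id`.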
Languages & scripting
- Python: genomics/data-processing workflows
- R: Tidyverse, Bioconductor, ggplot2
- Bash: workflow scripting and HPC execution
Workflow & infrastructure
- Nextflow
- SLURM HPC environments
- Conda / Mamba
- Linux / Unix
NGS / genomics
- RNA-seq
- ATAC-seq
- ChIP-seq
- Hi-C
- Multi-omics integration
- QC, differential analysis, regulatory genomics
Tools I have worked with
STAR · RSEM · DESeq2 · samtools · FastQC · MACS2 · Cooler · deepTools · IGV · BioMart · Ensembl · UCSC · GEO
Most repositories here were built to solve real research problems in an academic lab, where I was often the main person developing bioinformatics workflows and analysis utilities.
They reflect practical work in NGS analysis, HPC execution, multi-omics integration, and custom genomics tooling. I am currently improving these repositories to make them more portable, reusable, and better aligned with software engineering best practices.
Selected publications from my research work:
- Camilleri-Robles, C., et al. (2024). Long non-coding RNAs involved in Drosophila development and regeneration. NAR Genomics and Bioinformatics
- Camilleri-Robles, C., et al. (2024). A shift in chromatin binding of phosphorylated p38 precedes transcriptional changes upon oxidative stress. FEBS Letters
- Camilleri-Robles, C., et al. (2022). Genomic and functional conservation of lncRNAs: lessons from flies. Mammalian Genome
- Llorens-Giralt, P., Camilleri-Robles, C., et al. 3D genome organization in tissue regeneration: functional requirement of long-range chromatin loops. Science Advances (in press)
Full publication list: ORCID
- ORCID: 0000-0001-7103-8354
- GitHub: ccarloscr
- Location: Barcelona, Spain