Bioinformatics Development & Statistical Genomics Team

CNAG-CRG

TEAM LEADER:
Simon Heath

STAFF SCIENTIST:
Emanuele Raineri

POST DOC:
Angelika Merkel, Ron Schuyler (until June 2017)

DATA ANALYST:
Anna Esteve-Codina

ENGINEER:
Marc Dabad

SOFTWARE ENGINEER:
Marcos Fernandez

Summary

The team's research focuses on developing and applying methods, efficient from both statistical and computational perspectives, for the large-scale processing and integrative analysis of omics datasets. These methods have been developed into analysis pipelines at CNAG-CRG and applied to both in-house and collaborative projects (notably with the International Human Epigenome Consortium, IHEC). The largest dataset we are currently working on comes from the IBD-CHARACTER project. This is a large multi-omics dataset from patients with Inflammatory Bowel Disease (IBD), consisting of expression, DNA methylation and microbiome data from biopsies of inflamed and non-inflamed gut tissue. We are also re-analyzing WGBS (DNA methylation) datasets from TCGA and ICGC in the context of the Pan Cancer project to enable investigation of common epigenetic signatures across different cancer types.

The team also has a production focus in the processing, analysis and interpretation of epigenetic datasets (mostly DNA methylation studies from WGBS experiments).

Research lines

  • Further development of GEMBS, our WGBS analysis pipeline, to improve efficiency and accuracy. Major improvements include on-the-fly generation of QC metrics from the alignment and calling stages, which avoids the need for a separate statistics collection job, and better standards compliance in the output VCF files.
  • Active participation in the IHEC assay standards committee to determine minimum reporting standards for WGBS analyses, which were then implemented in GEMBS.
  • Making and releasing a documented portable version of GEMBS that can be installed and used on single workstations or HPC clusters.
  • Continued investigation into the prediction of chromatin domains from DNA methylation data across different cell types using the full set of haematopoietic reference epigenomes from the BLUEPRINT project.
  • Continued characterization of partially methylated domains across cell types during differentiation and oncogenesis.
  • Integrative analysis of the IBD-CHARACTER multi-omics dataset with the aim of identifying epigenetic signatures of disease from the tissue biopsies.

Services offered (production)

  • WGBS QC and alignment, SNP and DNA methylation calling
  • Differential methylation analysis
  • Hydroxymethylation analysis

Selected Publications

Ecker S, Chen L, Pancaldi V, Bagger FO, Fernández JM, Carrillo de Santa Pau E, Juan D, Mann AL, Watt S, Casale FP, Sidiropoulos N, Rapin N, Merkel A; BLUEPRINT Consortium, Stunnenberg HG, Stegle O, Frontini M, Downes K, Pastinen T, Kuijpers TW, Rico D, Valencia A, Beck S, Soranzo N, Paul D.
Genome-wide analysis of differential transcriptional and epigenetic variability across human immune cell types.
Genome Biol, 18(1):18 (2017).

M Duran-Ferrer, G Clot, R Beekman, A Merkel, G Castellano, M Kulis, A Queiros, R Vilarrasa-Blasi, S Bea, R Royo, M Puiggros, D Torrents, X Agirre, F Prosper, E Ballester, L Seung-Tae, JL Wiemels, S Hoffman, R Siebert, A Lopez-Guillermo, S Heath, I Gut, E Campo, JI Martin-Subero.
A comprehensive portrait of the DNA methylome of 866 samples from different B cell neoplasms: biological insights and clinical applications.
Haematologica, 102:93 (2017).

R Beekman, N Russinol, V Chapaprieta, N Verdaguer-Dot, R Vilarrasa-Blasi, G Clot, M Duran-Ferrer, M Kulis, G Castellano, BM Javierre, SW Wingett, J Blanc, F Serra, A Merkel, S Ullrich, A Vlasova, E Palumbo, M Pinyol, S Bea, R Royo, M Puiggros, A Datta, P Flicek, E Lowy, M Kostadima, L Clarke, J Delgado, A Lopez-Guillermo, XS Puente, C Lopez-Otin, D Torrents, ML Yaspo, M Aymerich, S Heath, R Guigo, M Gut, P Fraser, M Marti-Renom, I Gut, J Martens, H Stunnenberg, E Campo, JI Martin-Subero.
Integrative analysis of the genome, epigenome, transcriptome and three-dimensional chromatin structure in chronic lymphocytic leukemia.
Haematologica, 102:9-10 (2017).

Jungfleisch J, Nedialkova DD, Dotu I, Sloan KE, Martinez-Bosch N, Brüning L, Raineri E, Navarro P, Bohnsack MT, Leidel SA, Díez J.
A novel translational control mechanism involving RNA structures within coding sequences.
Genome Res, 27(1):95-106 (2017).

Merkel A, Fernandez-Callejo M, Casals E, Marco-Sola S, Schuyler R, Gut IG, Heath SC.
GEMBS - high throughput processing for DNA methylation data from Whole Genome Bisulfite Sequencing (WGBS).
bioRxiv (2017).