Senior Genomics Data Engineer

Centre / Institution:
Center for Genomic Regulation
Bioinformatics expertise:
Biomedical Informatics

Job description

The Institute

The Centro Nacional de Análisis Genómico (CNAG-CRG) is one of the largest Genome Sequencing Centers in Europe. CNAG-CRG researchers participate in major International Genomic Initiatives such as the International Cancer Genome Consortium (ICGC), the International Human Epigenome Consortium (IHEC), the International Rare Diseases Research Consortium (IRDiRC) and the European Infrastructure for life-science information (ELIXIR), as well as in several EU-funded projects.

It is integrated with the Centre for Genomic Regulation (CRG), an international biomedical research institute of excellence, based in Barcelona, Spain, with more than 400 scientists from 44 countries. The CRG is composed of an interdisciplinary, motivated and creative scientific team, supported both by a flexible and efficient administration and by high-end, innovative technologies.

In April 2021, the Centre for Genomic Regulation (CRG) received the renewal of the 'HR Excellence in Research' logo from the European Commission. This is a recognition of the Institute's commitment to developing an HR Strategy for Researchers, designed to bring the practices and procedures in line with the principles of the European Charter for Researchers and the Code of Conduct for the Recruitment of Researchers (Charter and Code).

Please check out our Recruitment Policy.

The role

The CNAG-CRG is looking for a genomics data engineer to participate in the development tasks of the RD-Connect Genome-Phenome Analysis Platform (GPAP), using a mix of languages and tools to process and analyse multi-omics data. Technologies currently in use include Spark/Hail, Python, Scala, Hadoop/Ceph, Elasticsearch, Postgres, DataSHIELD and Opal. The work will be mostly geared towards the infrastructure needs of the 3TR project, a large-scale European initiative towards solving autoimmune, inflammatory and allergic diseases.

The selected candidate will work on automating the dataflow from sequencers and user submissions to the HDFS/Ceph and Elasticsearch clusters. She/he will build data infrastructure to support the analysis of clinical, genomics and other omics data within a personalized medicine framework. The candidate will also work on implementing a federated platform to support federated learning and federated analysis.


  • Participate in the design, development and maintenance of the RD-Connect GPAP data engineering
  • Automate workflows
  • Integrate the platform with other databases/platforms
  • Implement federated learning and federated analysis
  • Collaborate with data analysts and software engineers
  • Interact with national and international partners

About the team

The selected candidate will join the team of engineers coordinated by Dr. Davide Piscia within the Bioinformatics Analysis Unit led by Dr. Sergi Beltran. The multidisciplinary 20-member Unit is focused on NGS data analysis and software development, mostly related to human health. The Unit develops the RD-Connect GPAP and participates, among others, in Solve-RD, EJP-RD, Genomed4All, 3TR, ELIXIR, MatchMaker Exchange, GA4GH and Clúster de Valorització d'EGA per a la Indústria i la Societat (VEIS).

Desired skills and expertise

Professional experience

Must Have

  • You have at least 2 years of working experience as a data engineer or in a similar role on Unix operating systems
  • You have experience with at least two of these programming languages: Python, Scala, Java, Clojure, Groovy

Desirable but not required / Nice to have

  • You have experience with data version control
  • You have experience with federated learning
  • You have experience with some of these frameworks (PyTorch, TensorFlow, scikit-learn)
  • You have experience with data governance
  • You have knowledge of Bioinformatics / High Throughput Sequencing and/or human genetics
  • You have an interest in functional programming

Education and training

  • You hold a degree in Software Engineering, Information Technology, Computer Science or similar


  • You are fluent in English

Technical skills

  • You have experience with machine learning
  • You have experience with task automation


  • You have highly developed organization skills
  • You are proactive
  • You are a team player

Contract duration and other benefits

The Offer - Working Conditions

  • Contract duration: 3 years.
  • Estimated annual gross salary: Salary is commensurate with qualifications and consistent with our pay scales.
  • Target start date: As soon as possible.

We provide a highly stimulating environment with state-of-the-art infrastructure and unique professional career development opportunities. To check out our training and development portfolio, please visit the training section of our website.

We offer and promote a diverse and inclusive environment and welcome applicants regardless of age, disability, gender, nationality, ethnicity, religion, sexual orientation or gender identity.

The CRG is committed to reconciling the work and family life of its employees and offers an extended vacation period and the possibility of flexible working hours.

Required information and contact

All applications must include:

  1. A motivation letter addressed to Dr. Davide Piscia.
  2. A complete CV including contact details.
  3. Contact details of two referees.

All applications must be addressed to Dr. Davide Piscia and be submitted online on the CRG Career site.

Selection Process

  • Pre-selection: The pre-selection process will be merit-based, drawing on the qualifications and expertise reflected in the candidates' CVs.
  • Interview: Preselected candidates will be interviewed by the Hiring Manager of the position and, if required, a selection panel.
  • Offer Letter: Once the successful candidate is identified, the Human Resources department will send a job offer specifying the start date, salary and working conditions, among other important details.

Suggestions: The CRG believes in ongoing improvement and promotes a culture of feedback. This is one of the reasons we have put in place, at your disposal as a candidate, a mechanism to gather your suggestions and complaints concerning your experience of our recruitment processes. Your feedback really matters to us in our aim to create a positive candidate journey. You can make a difference and help us improve by letting us know your suggestions through the following form.