Data Engineer - Bioinformatics

Engineering · Full-time · England, United Kingdom

Job description

We are looking for a Data Engineer to help solve some of the key challenges in a programme of work at industrial scale and with global significance. The successful Data Engineer will know how to communicate with and between technical and non-technical stakeholders, and how to facilitate discussions within a multidisciplinary team of scientists, software engineers, product managers and other data engineers.

You will contribute to the delivery of data releases that will be used worldwide, and you will have experience with genetic data.

Our Future Health will be the UK’s largest ever health research programme, bringing people together to develop new ways to detect, prevent and treat diseases. We are a charity, supported by the UK Government, in partnership with charities and industry. We work closely with the NHS and with public authorities across all nations and regions of the UK.

Our plan is to bring together 5 million volunteers from right across the UK who will be asked to contribute information to help build one of the most detailed pictures we have ever had of people’s health. Researchers will be able to use this information to make new discoveries about human health and diseases, so that future generations can live in good health for longer.

What you’ll be doing

You’ll be part of a multidisciplinary team that’s creating pipelines that didn’t exist before, owning them in production and improving them over time. Your key responsibilities will include but not be limited to:

  • Supporting the build of data pipelines from data providers to our primary data store and trusted research environment.
  • Producing logic for data transformation steps as code that meets the requirements of our end users and builds well-curated, accessible and quality-controlled data for analysis.
  • Developing prototype pipelines for complex transformations, drawing on existing workflows developed in industry and academia.
  • Keeping abreast of best practice in data engineering across industry, research and Government and facilitating the adoption of standards.
  • Providing technical input into the upstream parts of the data pipeline, including the specification and transfer of data from data providers.
  • Carrying out routine ad hoc data curation activities that require hands-on development of bespoke ETL cleaning scripts in languages such as Python.
  • Working with researchers to understand their data requirements and deliver the data needed for their projects.
