We are working to build a first-of-its-kind AI/ML capability to transform how we extract biological insight from complex proteomic data, with a focus on developing foundation models for mass spectrometry that go beyond existing analytical pipelines.
The opportunity:
This will be a founding role in Engitix’s AI/ML research programme. You will take a leading role in the design and development of foundation models for proteomics analysis. This will involve developing novel architectures that map and make use of the approximately 60–70% of proteomics data that currently remains unexplained. Our explorations so far have focused on self-supervision and object-centric learning (e.g. slot attention), though this is by no means set in stone. If successful, this work will yield representations that dramatically improve peptide identification, quantification, and the discovery of novel biology. There will also be ample opportunity to work on problems outside the proteomics domain, if you are interested.
Your responsibilities:
- Lead the research, design, and implementation of foundation models for mass spectrometry data analysis, with a focus on proteomics
- Test and optimize model performance on small- and large-scale training datasets drawn from public spectral repositories and internal Engitix data
- Benchmark against state-of-the-art tools (DIA-NN, Spectronaut, MSFragger-DIA, MaxDIA)
- Develop active-learning and experimental-design strategies that close the loop between model predictions and wet-lab validation
- Publish at top-tier venues (NeurIPS, ICML, ICLR) and contribute to the open scientific community
- Shape the long-term AI/ML research roadmap at Engitix
