Dr Hajk Drost

Principal Investigator/Senior Lecturer

Computational Biology, School of Life Sciences


Contact

Email

[email protected]

Phone

+44 (0)1382 384826

Biography

Dr Hajk-Georg Drost is a Principal Investigator within the Division of Computational Biology. He studied computer science and bioinformatics with a strong focus on statistical learning, machine learning, and predictive modeling, with applications in comparative genomics and evolutionary developmental biology (BSc and MSc in Bioinformatics, and a PhD in Computer Science (2013-2015), at the Institute of Computer Science, Martin-Luther University Halle, Germany). He then joined the laboratory of Jerzy Paszkowski at the Sainsbury Laboratory and Genetics Department at the University of Cambridge (2015-2018), followed by the second lab of Elliot Meyerowitz (first lab at Caltech) as a senior postdoc at the Sainsbury Laboratory in Cambridge (2018-2019). At Cambridge, he was fortunate to be elected Fellow of the Cambridge Philosophical Society and Postdoctoral Affiliate of Trinity College.

Between 2019 and 2024, he established a Computational Biology Group in the Department of Molecular Biology (led by Detlef Weigel) at the Max Planck Institute for Biology Tübingen, Germany. There, he contributed to the growing field of machine learning in biology and the life sciences, as well as to protein biology at tree-of-life scale, which is part of the emerging field of Digital Biology. He moved to the University of Dundee as Senior Lecturer in 2024 and was fortunate to be awarded the Royal Society Wolfson Fellowship. The fellowship supports his team's efforts to harness AI and machine learning to uncover fundamental principles of developmental regulation, with the aim of guiding drug discovery efforts to prevent or mitigate the effects of human developmental diseases.

Research

Digital Biology is rapidly emerging as a transformative field, merging biological sciences with advanced computational technologies. Our discipline is concerned with how to effectively leverage intelligent software, predictive analytics, and high-performance computing to decode complex biological data and thereby enable advances in healthcare and pharmaceuticals.

Intelligent software is adaptive, scalable, and user-friendly. Inspired by recent advancements in deep learning and cloud computing, we develop intelligent open-source software and use it to translate the predictive capacity of artificial intelligence into molecular biology research for healthcare benefits. Our ultimate aim is to drive fundamental research and develop enabling technologies to support the estimated $250B R&D transformation of the potential $10T global healthcare market through Digital Biology and Generative Artificial Intelligence.

We approach this ambitious goal by developing intelligent biological sequence search engines, AI models, and big data mining architectures. These tools are intended to unlock comparative and functional genomics at tree-of-life scale for (generative) drug discovery and disease marker identification through inference of gene regulatory networks from integrative multi-omics datasets.

In detail, my team explored how causal hypotheses can be tested when inferring gene regulatory interactions from multi-omics datasets, in order to understand how biological function is retained over evolutionary time or diversifies with perturbation effects that can lead to (developmental) diseases (e.g. cancer). Here at the University of Dundee, we seek to make long-term investments in the intelligent software architectures and methodological innovation in Generative (Life Science) AI that are required to translate digital biology research and machine learning methodologies into bio-pharmaceutical and medical applications.

Selected Publications

  1. Jaruwatana S. Lotharukpong, Min Zheng, Remy Luthringer, Hajk-Georg Drost*, Susana M. Coelho*. A Transcriptomic Hourglass In Brown Algae. Nature 635, 129-135 (2024)
  2. B Buchfink, K Reuter, HG Drost*. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nature Methods, 18, 366–368 (2021)
  3. M Quint, HG Drost et al. A transcriptomic hourglass in plant embryogenesis. Nature 490 (7418), 98-101 (2012) (journal cover)
  4. HG Drost* et al. myTAI: evolutionary transcriptomics with R. Bioinformatics 34 (9), 1589-1590 (2018)
  5. Benjamin Buchfink, Haim Ashkenazy, Klaus Reuter, John A. Kennedy, Hajk-Georg Drost*. Sensitive clustering of protein sequences at tree-of-life scale using DIAMOND DeepClust. bioRxiv, 2023.01.24.525373 (2023)
  6. Josué Barrera-Redondo*, Jaruwatana Sodai Lotharukpong, Hajk-Georg Drost* and Susana M Coelho*. Uncovering gene-family founder events during major evolutionary transitions in animals, plants and fungi using GenEra. Genome Biology, 24:54 (2023)
  7. Andreas Grigorjew, Artur Gynter, Fernando Dias, Benjamin Buchfink, Hajk-Georg Drost*, and Alexandru I. Tomescu*. Sensitive inference of alignment-safe intervals from biodiverse protein sequence clusters using EMERALD. Genome Biology, 24:168 (2023)
  8. HG Drost*. Philentropy: Information Theory and Distance Quantification with R. Journal of Open Source Software, 3(26), 765 (2018)
  9. HG Drost et al. Evidence for Active Maintenance of Phylotranscriptomic Hourglass Patterns in Animal and Plant Embryogenesis. Molecular Biology and Evolution 32 (5), 1221-1231 (2015)
  10. HG Drost* and J Paszkowski. Biomartr: genomic data retrieval with R. Bioinformatics 33(8): 1216-1217 (2017)

People in my lab

  • Stefan Manolache

Awards

Award: Personal Fellowships / Royal Society Wolfson Fellowship
Year: 2024

Stories