Onuralp Soylemez

I'm a research scientist based in San Francisco. I'm interested in applying generative models to practical problems in biotechnology and drug discovery.

I did my PhD in evolutionary genomics. Previously, I studied operations research and behavioral economics.

[Consulting] I'm available to consult on projects that apply machine learning and generative AI to scientific problems. Feel free to reach out via email.

Email   |   Scholar   |   Threads   |   Twitter   |   Github   |   LinkedIn

Recent applied research

Stability of interpretable circuits across models of varying size

showcase (WIP) | repo | brief report

This repo contains my ongoing experiments to better understand the recent advances in mechanistic interpretability research.

Finetuning open source models under low-resource constraints

NeurIPS LLM Efficiency Challenge, 2023

challenge | repo | models | data

I participated in the NeurIPS LLM Efficiency Challenge and finetuned performant open source models on custom open source datasets.

Applying protein language models to predicting disease causing genetic mutations in medically actionable genes

Onuralp Soylemez, Pablo Cordero

NeurIPS Workshop on Learning Meaningful Representations of Life, 2022

workshop | paper | code

We developed an evaluation framework for protein language models and revealed underappreciated structural features of proteins that structure predictors such as AlphaFold miss.

Fine-tuning large foundational models in cancer biology

Pablo Cordero, Onuralp Soylemez, Darren Zhu

Bio x ML hackathon, 2022

hackathon | code | 5-min summary

We placed 2nd ($3,000 prize) with our project on finetuning large language models for single-cell biology on smaller datasets such as the DepMap cancer dependency data.

Prioritization of drug targets using Bayesian tensor factorization

ICML Workshop on Computational Biology, 2022

workshop | paper | code

Drug targets supported by human genetic evidence have been shown to be more likely to succeed. We used Bayesian tensor factorization to integrate different types of human genetic evidence, ranging from rare genetic diseases to complex disorders.

Accelerating scientific discovery process using large language models

interview | code

Proof of concept for a "ChatGPT for genetics".


Design inspired by Chloe Hsu's website.