Onuralp Soylemez

I'm a research scientist based in San Francisco. I'm interested in applying generative models to practical problems in biotechnology and drug discovery.

I did my PhD in evolutionary genomics. Before that, I studied operations research and behavioral economics.

[Consulting] I'm available for consulting on projects related to the application of machine learning and generative AI to scientific problems. Please feel free to reach out via email.

Recent applied research

	Stability of interpretable circuits across models of varying size
	showcase (WIP) | repo | brief report
	This repo contains my ongoing experiments to better understand recent advances in mechanistic interpretability research.

	Finetuning open source models under low-resource constraint
	NeurIPS LLM Efficiency Challenge, 2023
	challenge | repo | models | data
	I participated in the LLM Efficiency Challenge and finetuned performant open-source models using custom open-source datasets.

	Applying protein language models to predicting disease causing genetic mutations in medically actionable genes
	Onuralp Soylemez, Pablo Cordero
	NeurIPS Workshop on Learning Meaningful Representations of Life, 2022
	workshop | paper | code
	We developed a protein language model evaluation framework and revealed underappreciated structural features of proteins that are missed by other structure predictors like AlphaFold.

	Fine-tuning large foundational models in cancer biology
	Pablo Cordero, Onuralp Soylemez, Darren Zhu
	Bio x ML hackathon, 2022
	hackathon | code | 5-min summary
	We placed 2nd ($3,000 prize) with our project on finetuning large language models for single-cell biology on smaller datasets like DepMap cancer dependency.

	Prioritization of drug targets using Bayesian tensor factorization
	ICML Workshop on Computational Biology, 2022
	workshop | paper | code
	Drug targets supported by human genetics evidence have been shown to have better odds of success. We used Bayesian tensor factorization to integrate different types of human genetics evidence, ranging from rare genetic diseases to complex disorders.

	Accelerating scientific discovery process using large language models
	interview | code
	Proof-of-concept "ChatGPT for genetics".

Design inspired by Chloe Hsu's website.
