Research Intern at LIX
LIX | May 2023–Oct 2023
During my research internship at LIX within the DaSciM team, I'm developing Cell2Text, a foundation model that generates natural-language descriptions of single cells. The project connects single-cell biological data with natural language, with the aim of making that data easier to interpret and query.
My responsibilities include:
- Biological Data Preprocessing: Handling and preparing diverse biological datasets, including single-cell RNA sequencing and gene expression data, for model training.
- Multimodal LLM Development: Designing and building Cell2Text, a multimodal large language model that integrates single-cell data with natural language processing techniques. The model is intended to generate cell descriptions automatically and to help answer complex biological questions.
- Multimodal Fusion Engineering: Exploring and implementing multimodal fusion techniques to extend the Cell2Text framework beyond single-cell analysis to broader biological domains.
Skills:
- Python: scikit-learn, PyTorch
- Fine-tuning: LoRA
- Single-cell RNA Sequencing Analysis
- Natural Language Processing (NLP)
- Machine Learning