- AI Research Scientist, Biological Foundation Models
Description
This AI Research Scientist will lead the design and development of biological foundation models that learn shared representations across Quiver’s large-scale all-optical electrophysiology and multi-omic datasets. The role also involves hands-on development of signal- and image-processing methods for our proprietary optical electrophysiology data. The candidate will both lead and directly implement projects that integrate complementary data streams (e.g., transcriptomics, high-content imaging, and clinical datasets) using deep representation learning, contrastive objectives, transformer architectures, and multi-modal fusion to build predictive models of disease biology and drug mechanism.
The successful applicant will work as part of a small, close-knit team at the hub of Quiver’s scientific programs. This position will actively mentor team members and set technical direction while contributing production-quality code and models to derive actionable insights from a broad range of biomedical data modalities, including the company’s proprietary all-optical electrophysiology data and other omic and imaging datasets. The ideal candidate will be a strong computer scientist with deep expertise in representation learning, foundation model architectures, and multi-modal data alignment; a demonstrated track record of leading teams while remaining deeply hands-on in coding, experimentation, and system development; and a passion for applying technology to healthcare-related problems.
Quiver Bioscience is a technology-driven company established to create transformational medicines for the brain. We combine proprietary single-cell functional assays with other multi-modal measurements to discover new biology and new drug targets. We take advantage of cutting-edge AI/ML to build the world’s most information-rich maps of neuronal function to drive our drug discovery programs.
This position is based in Cambridge, MA, with the expectation of on-site presence 3-4 days per week to support lab integration, team meetings, and collaborative project work. Fully remote applicants will not be considered.
Responsibilities and Duties
Design, lead, and directly implement computational pipelines for feature engineering and data integration in support of scientific data analysis.
Lead collaboration efforts while actively contributing to system architecture and implementation with internal and external teams responsible for production AI systems (e.g., LLM-based tools, agentic workflows). Provide input on system design, identify opportunities for new capabilities, and help define and integrate modeling workflows into user-facing applications.
Develop machine learning models to learn and align shared representations across heterogeneous biological datasets (e.g., time-series electrophysiology data, imaging, and perturbation assays).
Build and maintain scalable systems for embedding generation, storage, and retrieval (e.g., vector-based search, nearest-neighbor lookup) to support downstream analysis and querying. Own and develop rigorous evaluation frameworks and benchmarks for cross-modal embedding quality, including alignment metrics, retrieval precision, and biological plausibility checks against known drug-target and gene-pathway relationships. Partner closely with cross-functional wet-lab teams to design active learning cycles where model predictions suggest and prioritize validation experiments.
Lead by example in the development and deployment of AI/ML systems, ensuring best practices in reproducibility, version control, performance optimization, and documentation within a modern cloud-based environment.
Contribute to the development of data and machine learning pipelines, including implementation, evaluation, optimization, and maintenance of models and associated workflows. Balance team leadership with individual technical contributions, managing relationships and work priorities so as to operate independently and make an impact.
Use excellent interpersonal skills to build consensus, share insights with relevant stakeholders, deliver interpretable data products, and advance both the business and scientific goals of the company.
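For candidates gauging fit with the embedding storage and retrieval work described above: the core idea can be sketched, purely as an illustration and not as Quiver’s actual stack, with a minimal cosine-similarity nearest-neighbor lookup over stored embeddings (all names, shapes, and data here are hypothetical).

```python
import numpy as np

def nearest_neighbors(query: np.ndarray, db: np.ndarray, k: int = 3) -> np.ndarray:
    """Return indices of the k database embeddings most similar to `query`
    by cosine similarity. `db` has shape (n, d); `query` has shape (d,)."""
    # Normalize so that dot products equal cosine similarities.
    q = query / np.linalg.norm(query)
    d = db / np.linalg.norm(db, axis=1, keepdims=True)
    sims = d @ q
    # argsort is ascending; take the last k indices and reverse for descending order.
    return np.argsort(sims)[-k:][::-1]

# Hypothetical usage: 5 random 8-d embeddings; the query is a noisy copy of row 2,
# so row 2 should rank first.
rng = np.random.default_rng(0)
db = rng.normal(size=(5, 8))
query = db[2] + 0.01 * rng.normal(size=8)
top = nearest_neighbors(query, db, k=1)
```

Production systems replace the brute-force matrix product with an approximate nearest-neighbor index, but the evaluation questions (retrieval precision, embedding quality) are the same.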
Requirements
Minimum Qualifications Required
- PhD, or equivalent demonstrable professional experience, in Computer Science, Artificial Intelligence, Neuroscience, Physics, Computer Vision, Electrical Engineering, Mathematics, or a related technical discipline (e.g., engineering, science, or biology with a strong quantitative component).
- Familiarity with production AI systems and workflows, including experience or exposure to LLM-based applications, retrieval-augmented generation (RAG), or agentic systems, and the ability to translate modeling insights into practical system improvements in collaboration with engineering teams.
- Experience with representation learning or embedding-based approaches for high-dimensional or multimodal data (e.g., time-series, imaging, or biological datasets), and familiarity with systems for large-scale data processing or embedding retrieval.
- Demonstrated experience training foundation models or large-scale representation learning systems, including self-supervised pre-training, contrastive learning (e.g., CLIP-style objectives), or multi-modal alignment of heterogeneous datasets.
- Strong proficiency in PyTorch (preferred) or TensorFlow, with hands-on experience implementing transformer architectures, CNNs, variational autoencoders, and/or contrastive learning frameworks for high-dimensional data. Experience with distributed training (multi-GPU/multi-node) preferred.
- Experience with image processing, video processing, signal processing, and/or time series analysis.
- Comfort with multiple programming languages and computational APIs, with an expectation of daily hands-on coding and system building (the role will use Python, C++/CUDA, Git, SQL, and AWS, among others).
- Experience aligning or integrating heterogeneous biological datasets (e.g., transcriptomics, functional screens, imaging, perturbation assays) into shared latent spaces using learned embeddings, manifold alignment, or multi-view representation learning.
- Strong mathematical foundations in linear algebra, probability theory, and optimization. High level of creativity with a passion for applying computational methods to biomedical problems.
- Excellent skills in the areas of verbal/written communication, problem solving, and leadership, with the ability to lead a team while remaining a primary technical contributor (“player-coach” mindset).
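As a reference point for the contrastive-learning qualification above, the CLIP-style symmetric objective can be sketched as follows. This is a minimal NumPy illustration only (production work would use PyTorch); the batch size, embedding dimension, and temperature are arbitrary choices for the example.

```python
import numpy as np

def clip_contrastive_loss(za: np.ndarray, zb: np.ndarray, temp: float = 0.07) -> float:
    """Symmetric InfoNCE loss over a batch of paired embeddings from two
    modalities. za, zb: (batch, dim); matching rows are positive pairs."""
    # L2-normalize each embedding, then build the pairwise similarity matrix.
    za = za / np.linalg.norm(za, axis=1, keepdims=True)
    zb = zb / np.linalg.norm(zb, axis=1, keepdims=True)
    logits = (za @ zb.T) / temp                      # (batch, batch)

    def xent(l: np.ndarray) -> float:
        # Cross-entropy with the diagonal (matching pairs) as targets.
        l = l - l.max(axis=1, keepdims=True)         # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -float(np.mean(np.diag(logp)))

    # Average the two directions (modality A -> B and B -> A).
    return 0.5 * (xent(logits) + xent(logits.T))

# Hypothetical check: perfectly aligned pairs should score a lower loss
# than randomly paired embeddings.
rng = np.random.default_rng(1)
z = rng.normal(size=(4, 16))
aligned = clip_contrastive_loss(z, z)
random_pairs = clip_contrastive_loss(z, rng.normal(size=(4, 16)))
```

The same structure generalizes to aligning, say, electrophysiology embeddings with transcriptomic embeddings of the same samples, which is the multi-modal alignment setting this role describes.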
Additional Qualifications Desired
- Knowledge and experience in biophysics, physiology, or neuroscience, especially electrophysiology, high-content imaging, or computational neuroscience.
- Experience working with single-cell data of any modality.
- Familiarity with biological foundation models (e.g., scGPT, Geneformer, scVI) or related work in learning cell/gene embeddings from large-scale omics data.
- Experience with perturbation biology datasets (chemical or genetic screens), CMAP/L1000, or similar large-scale phenotypic profiling data.
- Publications or preprints in representation learning, multi-modal alignment, or computational biology are a strong plus.
