SandboxAQ is a high-growth company delivering AI solutions that address some of the world's greatest challenges. The company’s Large Quantitative Models (LQMs) power advances in life sciences, financial services, navigation, cybersecurity, and other sectors.
We are a global team that is tech-focused and includes experts in AI, chemistry, cybersecurity, physics, mathematics, medicine, engineering, and other specialties. The company emerged from Alphabet Inc. as an independent, growth capital-backed company in 2022, funded by leading investors and supported by a braintrust of industry leaders.
At SandboxAQ, we’ve cultivated an environment that encourages creativity, collaboration, and impact. By investing deeply in our people, we’re building a thriving, global workforce poised to tackle the world's epic challenges. Join us to advance your career in pursuit of an inspiring mission, in a community of like-minded people who value entrepreneurialism, ownership, and transformative impact.
SandboxAQ’s AI Simulation team develops new drugs and materials using a spectrum of AI and physics-based computational solutions. We are seeking an experienced and innovative Bioinformatics Research Engineer to amplify our ability to make inferences about biology by utilizing multimodal data ranging from various multi-omics sources to physics-based simulations. You will work closely with interdisciplinary teams of computational biologists, machine learning engineers, and software engineers developing and deploying bioinformatics research software and tooling. You will develop cutting-edge solutions at the intersection of machine learning, knowledge graphs, genetic sequencing technology, and more.
What You'll Do
- Drive the development of our computational biology and bioinformatics tooling to enhance various stages in the drug discovery pipeline.
- Identify relevant data sources and take responsibility for data ingestion, QC, and validation.
- Contribute to ongoing research for target ID, biomarker ID, patient stratification, and toxicity prediction.
- Research and implement novel bioinformatics and deep learning algorithms for deciphering human genetic variants, gene regulation, gene expression, and disease pathways by using information from clinical phenotypes and medical records, multi-omics, single-cell, proteomics, and genomic data.
- Present to and interact with anyone who needs to understand your work, including clients, other scientists, and non-technical team members.
- Write patents, journal articles, and whitepapers. Speak at conferences and in pitch meetings.
- Work closely with external partners to understand their current challenges in drug discovery projects, identify relevant data sets and research requirements, and drive research partnerships forward.
- Vastly improve drug discovery and development on a societal scale. Help make better drugs, and help make the tools to do so.
- PhD in a relevant field (bioinformatics, computational biology, computer science, statistics, or similar).
- 1-5 years of relevant experience, including hands-on experience in the private sector working on projects related to one of: bioinformatics, statistical genetics, computational biology, machine learning, or knowledge graphs.
- Experience processing and curating large-scale omics, clinical data, medical records, and scientific literature data.
- Experience with common Python toolkits for scientific computing (e.g., pandas, numpy, scipy), machine learning (e.g., scikit-learn, pytorch), and bioinformatics (e.g., biotite, biopython).
- Experience with secondary and tertiary analysis of sequencing data related to DNA sequencing, RNA-seq, epigenomics, functional genomics, single-cell and spatial omics, and single-cell CRISPR screens.
- Ability to apply advanced machine learning methods to solve complex bioinformatics problems.
- Experience with bioinformatics tools and pipelines (e.g., BWA, GATK, Samtools, Bedtools, Seurat, Scanpy, Nextflow, Apache Airflow).
- Familiarity with running analyses and training models on high-performance computing (GPU) environments for corporate R&D, innovation labs, or academic research.
- An interest in solving scientific problems in chemistry and biology via computational and data-driven methods.
- A drive to cooperate with colleagues to identify problems and communicate technical solutions in an accessible manner.
- A hands-on mentality, comfort with getting deep into the technical weeds of highly complex problems, and a track record of driving projects to completion.
- Familiarity with cloud-based platforms (e.g., Google Cloud Platform, AWS) for data processing and storage.
- Demonstrated ability in analyzing one or more of the following modalities: epigenomics, proteomics (including mass spectrometry), transcriptomics (RNA-seq).
- Knowledge of genomic databases (e.g., 1000 Genomes, UK Biobank, GTEx, TCGA).
- Experience performing statistical analyses and data visualization to extract meaningful biological insights related to: tumor-immune interactions, immunotherapy response, and resistance mechanisms.
- Demonstrated experience in computational immunology (e.g., neoantigen prediction, TCR/BCR sequencing analysis, immunogenicity modeling).
- Experience applying bioinformatics workflows to one or more of: target ID, biomarker ID, clinical development, and patient stratification.
- Excellent publication record.
- Willingness to travel (less than 25% of the time) to conferences, offsites, customers, and internal meetings.
The US base salary range for this full-time position is expected to be $154k - $215k per year. Our salary ranges are determined by role and level. Within the range, individual pay is determined by factors including job-related skills, experience, and relevant education or training. This role may be eligible for annual discretionary bonuses and equity.