The role involves optimizing hardware support within a software stack for AI model deployment, collaborating with teams and hardware vendors to enhance performance and portability across platforms.
About Modular
At Modular, we’re on a mission to revolutionize AI infrastructure by systematically rebuilding the AI software stack from the ground up. Our team, made up of industry leaders and experts, is building cutting-edge, modular infrastructure that simplifies AI development and deployment. By rethinking the complexities of AI systems, we’re empowering everyone to unlock AI’s full potential and tackle some of the world’s most pressing challenges.
If you’re passionate about shaping the future of AI and creating tools that make a real difference in people’s lives, we want you on our team. You can read about our culture and careers to understand how we work and what we value.
About the role:
ML developers today face significant friction in taking trained models into deployment. They work in a highly fragmented space, with incomplete and patchwork solutions that require significant performance tuning and non-generalizable, model-specific enhancements. At Modular, we are building the next-generation AI platform that will radically improve the way developers build and deploy AI models.
As part of our mission to build AI's unified compute layer, we are expanding the Modular software stack to a variety of new and exciting hardware platforms.
We are looking for a motivated engineer to join the Hardware Enablement team at Modular. In this role you will work across the Modular software stack — from Mojo kernels and the graph compiler to MAX model serving — to bring up and optimize support for new accelerator platforms. You'll collaborate closely with internal teams and external hardware partners, and you'll develop deep expertise in novel architectures while contributing to our portability story.
LOCATION: We welcome candidates who are based in and have work authorization in the United Kingdom, Norway, or the United States (Eastern Time Zone). To support growth and collaboration, those in earlier career stages work in a hybrid capacity at our Edinburgh, UK or Boston, MA office (minimum 2 days per week on-site). Onboarding for new hires is conducted in-person.
What you will do:
- Implement and validate support for new hardware architectures across the Modular stack, working under the guidance of senior engineers on the team
- Write and optimize Mojo kernels targeting novel accelerator architectures, with a focus on correctness first and performance iteration
- Contribute to cross-team efforts improving portability infrastructure, tooling, and debugging workflows for new target hardware
- Collaborate with hardware vendor engineers to understand target platforms, build integration tests, and triage platform-specific issues
- Develop working knowledge of new hardware platforms — including ISA documentation, memory hierarchies, and vendor toolchains — and share findings with the team through demos and write-ups
- Participate in company events such as on-sites and hackathons, contributing to a collaborative and open engineering culture
What you bring to the table:
- 5+ years of experience in high-performance computing, compiler engineering, or related domains in industry or research
- Familiarity with how AI operators are implemented at a low level (e.g., experience writing or modifying GPU kernels, custom operators, or working with frameworks like PyTorch at the C++ layer)
- Proficiency in C++ and experience working in complex, multi-component software systems
- Hands-on experience with at least one heterogeneous programming model (CUDA, SYCL, OpenCL, or similar), either as a user or contributor
- Some exposure to non-GPU accelerator architectures (DSPs, NPUs, or other hardware accelerators) is a strong plus
- Curiosity and willingness to learn new hardware platforms quickly, and comfort reading architecture manuals and vendor documentation
- A collaborative, team-oriented attitude and alignment with our culture
Helpful, but not required:
- Experience with GPU DSLs/DSELs such as Triton, CUTLASS, or CuTe
- Familiarity with MLIR or LLVM compiler infrastructure
- Experience working directly with hardware vendor teams or on platform bring-up efforts
- Exposure to model serving or inference optimization workflows
What Modular brings to the table:
- Amazing Team. We are a progressive and agile team with some of the industry’s best engineering and product leaders.
- World-class Benefits. In order to attract the best, we need to offer the best. Premier insurance plans, up to 5% 401k matching, flexible paid time off, and more are available to you! Please note that specific benefit packages may vary based on your location.
- Competitive Compensation. We offer very strong compensation packages, including stock options. We want people to be focused on their best work and believe in tailoring compensation plans to meet the needs of our workforce.
- Team Building Events. We organize regular team onsites and local meetups in Los Altos, CA as well as in other cities. Traveling 2-4 times a year is expected for all roles.
Working at Modular will enable you to grow quickly as you work alongside incredibly motivated and talented people who have high standards, possess a growth mindset, and share a purpose to truly change the world.
The estimated base salary range for this role to be performed in the United States is $198,000 - $242,000 USD.