EviSmart

ML Ops

Posted 8 Days Ago
In-Office
Vancouver, BC
Senior level
The ML Ops role focuses on building and managing infrastructure for deploying AI models in production, ensuring scalability, performance, and reliability. Key responsibilities include optimizing GPU workloads, designing CI/CD pipelines, and implementing observability across the compute layer.
The summary above was generated by AI
About EviSmart

EviSmart is a B2B SaaS platform transforming how dental labs, dentists, and design centers work together. We connect practices across 28+ countries with AI-powered workflow automation, overnight CAD design services, and intelligent case management. We're a team of ~60 across Vancouver, Manila, and Seoul—building the operating system for modern dentistry.

The Opportunity

Our core AI research is done. The 3D segmentation models work. The generative models are built. The frontend is live, the storage layer is operational.

What's missing is the production backbone—the compute layer that turns research-grade models into fast, reliable, scalable services that real users depend on every day.

This role is about building that backbone. You'll architect and operate the infrastructure that powers heavy 3D mesh processing and GPU-intensive workloads for both internal teams and external customers. You'll own the systems that make our AI actually work in production—reproducibly, efficiently, and at scale.

This is a systems-level engineering role, not model research. If you're the person who bridges the gap between "it works in a notebook" and "it works for 10,000 users," we want to talk to you.

What You'll Be Doing

Productionizing Research Models

  • Convert research models into scalable, production-grade inference services
  • Design reproducible, version-controlled Docker environments with clear deployment standards
  • Optimize GPU memory usage and runtime performance for 3D workloads
  • Implement stateless service architectures with model versioning and rollback capabilities

Building Hybrid GPU Infrastructure

  • Architect asynchronous job orchestration across on-prem GPU servers and cloud GPU instances
  • Implement GPU scheduling, isolation, and intelligent workload routing
  • Design for concurrent users with fault tolerance and graceful failure recovery
  • Support hybrid infrastructure with cloud burst capacity for peak demand

Designing a Scalable Inference Platform

  • Build modular pipelines with clean separation between preprocessing, model inference, and postprocessing
  • Create well-defined service boundaries between backend APIs and compute services
  • Establish a platform architecture that allows future models to integrate without major refactoring

Deployment & Automation

  • Build CI/CD pipelines for model updates and infrastructure changes
  • Implement version control, rollback strategies, and release management across environments
  • Automate deployment workflows across hybrid infrastructure

Reliability & Observability

  • Implement logging, monitoring, and system observability across the compute layer
  • Track GPU utilization, job performance, and queue health
  • Design retry mechanisms and automated failure handling
  • Optimize latency and throughput to maintain stability under variable load

What We're Looking For

Required

  • 5+ years in ML Infrastructure, MLOps, or backend systems engineering
  • Proven track record deploying ML models into production environments
  • Strong expertise in Docker and container lifecycle management
  • Hands-on experience managing and optimizing GPU-based workloads
  • Experience designing asynchronous job systems or queue-based architectures
  • Solid Linux systems administration skills
  • Experience integrating ML services with backend APIs (FastAPI, Flask, etc.)

Preferred

  • Kubernetes or other container orchestration platforms
  • Hybrid cloud + on-prem infrastructure deployment
  • CI/CD pipeline design and automation
  • Monitoring and logging tools (Prometheus, Grafana, ELK, etc.)
  • Infrastructure-as-Code (Terraform, Pulumi, etc.)
  • Experience with high-compute 3D ML workloads
  • Model versioning systems and experiment tracking

You'll Thrive Here If You…

  • Have successfully transitioned ML systems from research to commercial deployment
  • Think in terms of platform architecture, not one-off deployments
  • Care about reliability, scalability, and production SLAs
  • Are comfortable debugging across GPU, container, and infrastructure layers
  • Design systems for long-term extensibility—not just what works today

EviSmart™ | Where Dental Work Flows
Vancouver | Manila | Seoul

Apply today @ EviSmart Careers

Top Skills

Docker
ELK
FastAPI
Flask
GPU
Grafana
Kubernetes
Prometheus
Terraform

HQ

EviSmart Vancouver, British Columbia, CAN Office

675 Hastings St W, Vancouver, BC, Canada, V6B 1N2
