Quantiphi

DevOps/Observability Engineer

Reposted 6 Days Ago
Remote
Hiring Remotely in Canada
Senior level
Lead the design and implementation of a unified observability platform, focusing on architecting observability pipelines and implementing tools like Prometheus and Grafana on AWS.
The summary above was generated by AI

While technology is the heart of our business, a global and diverse culture is the heart of our success. We love our people and take pride in offering them a culture built on transparency, diversity, integrity, learning and growth.
If working in an environment that encourages you to innovate and excel, not just in your professional life but in your personal life too, interests you, you would enjoy your career with Quantiphi!

About Quantiphi:

Quantiphi is an award-winning, AI-First global digital engineering company that helps the world’s leading Fortune 1000 organizations transform bold ideas into measurable business impact. We go beyond building innovative AI technologies—we solve the problems that matter most to our clients.

Since our founding in 2013, Quantiphi has built a proven track record of turning complex challenges into meaningful outcomes across industries.

Headquartered in Boston, with more than 4,000 professionals worldwide, we partner with global enterprises to deliver large-scale digital, cloud, and AI-driven transformation. #SolvingWhatMatters.

We are an Elite and Premier partner to Google Cloud, AWS, NVIDIA, Snowflake, and other leading technology platforms, and our work has been recognized across the industry, including:

  • 3 AWS AI/ML Partner of the Year awards
  • 3 NVIDIA Partner of the Year awards
  • 3 Snowflake Partner of the Year awards
  • Rated Leaders by Gartner, Forrester, IDC, ISG, Everest Group and other leading analyst firms

Quantiphi delivers first-in-class AI solutions across Life Sciences, Healthcare, Banking, Financial Services, CPG, Manufacturing, Energy, High-Tech, Telecommunications, and other industries, powered by cutting-edge Generative AI and Agentic AI accelerators.

We are also proud to be certified as a Great Place to Work—reflecting our commitment to our people and our culture.

For more details, visit: Website or LinkedIn Page

Role: DevOps/Observability Engineer

Experience Level: 8+ years

Employment type: Full Time

Location: Remote - USA

What you will do:

We are seeking a highly experienced Senior DevOps/Observability Engineer with over 8 years of experience to lead the design and implementation of our next-generation, unified observability platform. This pivotal role will focus on architecting a sophisticated observability pipeline from the ground up, leveraging a modern, open-source-centric stack on Amazon Web Services (AWS). The ideal candidate will have deep expertise in designing and deploying observability solutions, with a strong emphasis on OpenTelemetry (OTel) and Kubernetes observability. You will be responsible for deploying, configuring, and integrating a suite of tools, including Prometheus, Grafana, and Splunk, to provide comprehensive insights into our complex, distributed systems. This is a hands-on role for a technical leader who is passionate about building scalable, reliable, and efficient monitoring and logging systems.

Basic Qualifications (BQs):

  • Unified Pipeline Architecture: Proven ability to design and implement end-to-end observability pipelines using OpenTelemetry, Prometheus, and Grafana on centralized infrastructure.
  • Cross-Account AWS Observability: Deep expertise in centralizing AWS telemetry, including multi-account CloudTrail organization trails, cross-account CloudWatch metrics/logs, and VPC Flow Logs.
  • Log Aggregation & Routing: Strong experience designing log aggregation strategies, implementing noise reduction/filtering at the collector level, and configuring Splunk HTTP Event Collector (HEC) integrations.
  • Advanced Alerting & Dashboarding: Hands-on experience building comprehensive alerting frameworks using Alertmanager and CloudWatch Alarms, coupled with advanced dashboard engineering in Grafana (using PromQL).
  • Infrastructure as Code (IaC): Advanced proficiency in writing Terraform modules specifically for deploying and managing observability stacks and EC2 infrastructure.

Other Qualifications (OQs):

  • Enterprise Scale Log Management: Demonstrated experience managing, routing, and optimizing log pipelines at massive scale (TB/day).
  • Kubernetes/Container Observability: Experience deploying Prometheus and OTel within Kubernetes (EKS) or containerized (ECS) environments.
  • Cost Optimization: Proven track record of reducing observability spend through strategic metric dropping, log filtering, and efficient storage tiering.

What is in it for you:

  • Join one of the world’s fastest-growing AI-first digital engineering companies and make a real impact at scale.
  • Lead and collaborate with a high-energy team of talented, driven individuals solving complex, meaningful challenges.
  • Work with Fortune 500 companies and disruptive innovators in a research-driven environment with 60+ patents.
  • Stay ahead of the curve by gaining hands-on experience with cutting-edge AI, ML, data, and cloud technologies while continuously upskilling.

If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!
