Loopio

Senior Manager, Software Engineering (Infrastructure)

Reposted 4 Days Ago
In-Office or Remote
Hiring Remotely in Toronto, ON
Senior level
Lead the SRE, Infrastructure, and MLOps teams at Loopio, focusing on system reliability, scalability, and operational excellence while fostering a culture of development.
The summary above was generated by AI
Take your career to new heights with Loopio! 🚀✨

Loopio is looking for a senior engineering leader to own our Site Reliability Engineering (SRE), Infrastructure, and MLOps teams. In this role, you will be the primary architect of the reliability, scalability, and cost efficiency of the systems that power Loopio’s platform.

You’ll lead teams that design, build, and operate our production infrastructure, ensuring our services are resilient, observable, and ready to scale as we integrate advanced AI and agentic workflows. You’ll partner closely with Product Engineering, Security, and Data teams to enable fast, safe delivery while maintaining operational excellence.

Note: This is an existing vacancy on the team

🚀 What You’ll Be Doing

Leadership & Team Development

  • Lead and grow multiple teams across SRE, Cloud Infrastructure, and MLOps.

  • Coach and develop engineering managers and senior individual contributors, fostering a culture of ownership and high craft.

  • Build a "Platform-as-a-Product" mindset, ensuring that infrastructure and ML tooling serve as enablers for the rest of the engineering organization.

  • Partner with Recruiting to attract and retain specialized talent in the cloud, reliability, and machine learning infrastructure space.

Reliability & Operational Excellence

  • Own the operational health of production systems, including availability, latency, and durability.

  • Define and evolve SLIs, SLOs, and error budgets, moving the organization toward data-driven reliability decisions.

  • Lead incident response, driving blameless postmortems and systemic improvements to reduce "toil" and improve on-call sustainability.

  • Support ML-specific reliability, ensuring that model inference pipelines and vector databases meet the same high standards as our core SaaS platform.

Infrastructure & MLOps Strategy

  • Evolve Loopio’s cloud architecture, overseeing capacity planning, disaster recovery, and business continuity.

  • Drive the MLOps roadmap, establishing standards for model deployment, monitoring, and scaling (including LLM orchestration and RAG pipelines).

  • Lead Cloud FinOps, ensuring our infrastructure and AI compute costs are visible, intentional, and optimized.

  • Establish standards for infrastructure automation (IaC), configuration management, and secrets handling.

Security & Cross-Functional Leadership

  • Partner with Security to ensure "secure-by-default" infrastructure and robust backup/recovery strategies.

  • Communicate risks and trade-offs clearly to senior leadership, acting as a calm, trusted voice during high-severity events.

  • Collaborate with Product Engineering to support the delivery of high-impact AI features without sacrificing platform stability.

✨ What You’ll Bring to the Team
  • 8+ years of experience in infrastructure, SRE, or cloud engineering roles, with 3+ years leading specialized engineering teams.

  • Deep Cloud Proficiency: Extensive experience with AWS (preferred) and modern infrastructure-as-code (Terraform).

  • Operational Grit: A proven track record of leading teams through production incidents and complex architectural migrations.

  • MLOps Awareness: Understanding of the unique infrastructure needs for machine learning, such as GPU orchestration, model serving, or data pipeline stability.

  • Systems Scaling & Observability: Proven expertise in managing large-scale containerized environments and leveraging observability stacks to ensure platform health.

  • Strategic Communication: Ability to align technical roadmaps with business objectives and advocate for infrastructure investment.

  • Experience with FinOps or managing significant cloud budgets is a plus.

  • Background in supporting AI agentic workflows or autonomous orchestration systems is a plus.

Where You’ll Work 📍
  • Loopio is a remote-first workplace because we recognize the advantages of working flexibly. We are HQ’d in Canada, with established hub regions around the world where we hire from.

  • Our employees (or Loopers, as we call ourselves!) live and work in 🇨🇦 Canada (British Columbia and Ontario), 🇬🇧 London, and 🇮🇳 India (specifically in Gujarat, Maharashtra, and Bengaluru).

  • The majority of our team is based in ON and BC, which means these employees live and work remotely within a 300 km radius of Toronto (within Ontario) and Vancouver (within BC).

  • We offer flexible co-working locations available to Loopers in ON and BC. Those based in ON have the option of working out of our convenient co-working space located in the heart of Downtown Toronto and a 12-minute walk from Union Station. BC Loopers have the option to work centrally in Vancouver. It is whatever works best for you!

  • You’ll collaborate with your teams virtually across the UK, India, and North America (we’re just a Zoom call and Slack message away!) with core sync hours and focus time for heads-down work 🙇🏾 during the workday.

  • We encourage asynchronous collaboration to effectively work as a global #OneTeam!

Why You’ll ♥️ Working at Loopio
  • Your manager supports your development by providing ongoing feedback and regular 1-on-1s. We leverage Lattice for our 1-on-1s and performance conversations.

  • You will have the opportunity to elevate 🪄 your craft and the opportunity to explore your creativity, with a dedicated professional mastery allowance for more learning support! We encourage experimentation and innovative thinking to drive business impact.

  • We offer a wide range of health and wellness benefits to support your physical and mental well-being, starting day 1️⃣ with Loopio.

  • We’ll set you up to work remotely with a MacBook laptop 🍏, a monthly phone and internet subsidy, and a work-from-home budget to help get your home office all set up.  

  • You’ll be joining a supportive culture that has thoughtfully built out opportunities for connection in a remote-first environment.

  • Participate in 🎤 town halls, AMAs (Ask-Me-Anything sessions), and quarterly celebrations to mark the big wins and milestones as #OneTeam!

  • Our four active Employee Resource Groups offer opportunities for employees to learn and connect year-round. 

  • You’ll be a part of an award-winning workplace 🏆 with an opportunity to make a big impact on the business.

Questioning your qualifications? Read this ‼️ Hi there, we recognize that all too often, potential candidates, particularly members of underrepresented groups, don’t apply for a position simply because they don’t meet every single criterion included in the job description.

Whether or not your experience ✅ checks off all the boxes on a job posting, we still encourage you to apply to ensure that your application receives a review from our team. We understand that a resume can only showcase so much during the applicant stage, so we've created prompts in the application for you to share more about yourself. If you've made a career transition (or a few!), you’re self-taught in a new role, or you have skills/experience you’d like to highlight, we want to hear more about what you could bring to the table.

AI in Recruitment 👩🏻‍💻 At Loopio, we leverage artificial intelligence (AI) technology to enhance our recruitment process. These tools assist with tasks such as resume screening, drafting preliminary job descriptions, generating initial interview questions, transcription, and occasionally sourcing prospective candidates. However, AI is never used to make final hiring decisions; our use of AI serves to support repetitive and administrative tasks in order to streamline our hiring and recruitment workflows. We are committed to the responsible use of AI in our hiring practices, prioritizing both an improved candidate experience and operational efficiency. Our standardized hiring practices remain focused on reducing biases, with all key hiring decisions made solely by our team. We continuously review and refine our hiring practices to align with industry best practices and evolving legal guidelines.

AI Usage in Interviews 🤖 The Loopio interview process is designed to help us better understand your professional experiences, as well as your unique perspective and approach to delivery and collaboration. To ensure a fair and accurate assessment, we expect candidates to refrain from using AI-generated content to produce responses to our interview questions during the live interview, as our goal is to evaluate your own knowledge and problem-solving abilities - not those of an AI assistant. If AI-generated responses are identified during the interview, Loopio reserves the right to disqualify candidates from the interview process.

For specific roles that include a take-home assignment, our AI Interviewee Policy allows the use of AI for initial support; however, the final submission must be a true reflection of your personal insights, skills and decision-making.

Loopio is an equal opportunity employer that is deeply committed to building equitable workplaces that are diverse and inclusive. We actively encourage candidates from all backgrounds and lifestyles to consider us as a future employer. Please contact a member of our Talent Experience team ([email protected]) should you require accommodations at any point during our virtual interview processes.

Top Skills

AWS
Terraform
