Flinks

Senior Reliability Engineer

Posted 8 Days Ago
Canada
Senior level
As a Senior Reliability Engineer, you will ensure systems run reliably by monitoring services, handling production issues, and driving system improvements while mentoring team members and collaborating across departments for effective incident resolution.

Description

About Flinks 🚀

At Flinks, we’re not just building data infrastructure; we’re shaping the future of finance. Our mission is to empower consumers with control over their financial data and unlock its full potential. We equip fintechs and banks with cutting-edge data tools, enabling them to create innovative, client-centric products that are transforming the financial industry.

Flinks is trusted by hundreds of companies and connects over 250 million financial accounts. Our products power digital finance, helping businesses streamline their processes, improve user experiences, and drive the next wave of financial innovation. We are the engine behind the future of digital finance, committed to creating open, consent-based exchanges of financial data.

About the Reliability Team

As a Senior Reliability Engineer, you will play a pivotal role in ensuring that our systems and applications run reliably while scaling rapidly. You’ll handle Site Reliability Engineering (SRE) tasks in a support capacity, drive improvements in system stability, and lead the debugging and resolution of complex production issues.

What You’ll Do

  • Provide live operational support for multiple client software applications, monitoring services and alerts to detect critical failures and restore service rapidly with minimal downtime.
  • Develop and maintain code to resolve production issues quickly, leveraging strong development skills to ensure fast service recovery and long-term system stability.
  • Own and resolve incidents reported by clients and internal stakeholders, adhering to client SLA and internal SLO timelines.
  • Troubleshoot complex incidents, perform thorough root cause analyses, and implement solutions to prevent the recurrence of issues.
  • Utilize a data-driven approach to prepare detailed analyses and reports, presenting findings through charts, layouts, and diagrams.
  • Conduct deep technical analyses of product and feature deficiencies, addressing client pain points based on actual use cases.
  • Develop and enhance monitoring systems to proactively detect issues, implementing robust alert mechanisms to ensure continuous system stability.
  • Provide expert guidance on improving operational system stability and scalability.
  • Lead and execute initiatives that automate processes, improving operational efficiency across LiveOps.
  • Facilitate postmortem meetings following incidents, documenting findings and assigning action items for future prevention.
  • Collaborate with cross-functional teams to ensure rapid resolution of production issues, implementing long-term fixes.
  • Lead and motivate project teams, ensuring tasks are completed on schedule and that high-quality standards are consistently met.
  • Mentor and provide ongoing training to reliability engineers, tracking their progress and ensuring adherence to high standards.
  • Actively contribute to maintaining the highest quality standards as the organization continues to scale.
  • Participate in after-hours on-call support as part of the LiveOps rotation.

Who You Are

  • Operationally focused with expertise in incident management and resolving live production issues
  • Strong debugging and troubleshooting skills, particularly in performance optimization of large-scale applications
  • Proven experience in building and maintaining reliable monitoring and alerting systems in high-demand environments, with a focus on production support
  • 7+ years of experience with .NET Framework (C#), ensuring production system stability
  • Strong knowledge of Kubernetes, Docker, and cloud platforms (GCP preferred)
  • Proficiency with monitoring tools like Prometheus, Grafana, and Kibana
  • Experience with incident ticketing/documentation tools like Freshdesk and Confluence
  • Critical thinker who can identify system weaknesses and find innovative solutions
  • Strong project management skills with a focus on scalability and system stability
  • ITIL Service Management certification (such as ITIL v3 or ITIL v4) or an equivalent certification is highly desired
  • Experience with Power BI, web scraping, or Golang (nice to have)

What’s in it for You?

  • Clear Impact: You'll ensure that millions of users have reliable access to their financial data, directly contributing to the success of Flinks and its customers.
  • Autonomy and Ownership: Senior Engineers at Flinks are empowered to lead major initiatives, drive strategy, and influence the direction of our tech stack.
  • Trailblazing Technology: Be part of a cutting-edge company at the forefront of open banking and financial data management, during a pivotal time of growth and innovation.
  • Professional Growth: You’ll be continuously challenged by working with a passionate, smart team on a variety of technical and business problems.

The Interview Process 🏗

  • People Ops Generalist Interview
  • Team Lead Interview
  • Case Assignment & Presentation
  • Stakeholder Interview
  • Director Interview

Top Skills

.NET Framework
C#
