Arista Channels

Site Reliability Engineer (SRE) - CloudVision

Posted 22 Days Ago
Vancouver, BC
Mid level
As an SRE, you will manage Arista’s global CloudVision service fleet, focusing on building CI/CD processes, enhancing operational automation, overseeing disaster recovery, and ensuring infrastructure security. You'll lead incident response efforts and participate in an on-call team.
The summary above was generated by AI

Company Description

Arista Networks is an industry leader in data-driven, client-to-cloud networking for large data center, campus and routing environments. What sets us apart is our relentless pursuit of innovation. We leverage the latest advancements in cloud computing, artificial intelligence, and software-defined networking to provide our clients with a competitive edge in an increasingly interconnected world. Our solutions are designed to not only meet the current demands of the digital landscape but to also anticipate and adapt to future challenges.

At Arista we value the diversity of thought and perspectives that each employee brings to the table. We believe that fostering an inclusive environment, where individuals from various backgrounds and experiences feel welcome, is essential for driving creativity and innovation.

Our commitment to excellence has earned us several prestigious awards, such as Best Engineering Team, Best Company for Diversity, Compensation, and Work-Life Balance. At Arista, we take pride in our track record of success and strive to maintain the highest standards of quality and performance in everything we do.

Job Description

Who You’ll Work With

SREs at Arista combine strong software and systems engineering with a passion for operating production systems at scale. As an SRE you’ll be part of the team responsible for our global service fleet.

What You’ll Do
As an SRE you’ll be responsible for our global CloudVision service fleet. This includes:

  • Building the CI/CD lifecycle for services, from inception and design to deployment and scaling
  • Improving operational processes through automation
  • Identifying key service indicators to be used in capacity planning
  • Owning disaster recovery and management
  • Driving infrastructure and cloud-based application security design
  • Leading sustainable incident response and blameless postmortems
  • Being an active member of our globally distributed on-call team

Arista’s CloudVision is an enterprise network management and streaming telemetry SaaS offering. CloudVision is deployed on Kubernetes across global regions, using Spinnaker for our CI/CD pipeline. Our tech stack runs on GKE, using HBase/Hadoop as the main distributed database and storage layer, Elasticsearch for powering search, ClickHouse for fast real-time queries of flow data, our own Kafka-based distributed real-time stream processing layer for analytics, and TensorFlow for ML analysis. Our monitoring system is built on top of Prometheus, Grafana, Loki, and other OSS tools.

Qualifications

  • BS/MS degree in Computer Science or a related subject, or relevant experience.
  • 4+ years of software engineering experience.
  • Experience developing or managing deployments of distributed database systems or scale-out applications in a SaaS environment.
  • Must be able to work in the PST time zone.

Compensation Information:

The new hire base pay for this role has a salary range of CAD 95,000 to 145,000. Arista offers different pay ranges based on work location, so that we can offer consistent and competitive pay appropriate to the market. The actual base pay offered will be based on a wide range of factors, including skills, qualifications, relevant experience, and work location. The pay range provided reflects base pay only and in addition certain roles may also be eligible for discretionary Arista bonuses and equity. Employees in Sales roles are eligible to participate in Arista’s Sales Incentive Plan, which pays commissions calculated as a percentage of eligible sales. US-based employees are also entitled to benefits including medical, dental, vision, wellbeing, tax savings and income protection. The recruiting team can share more details during the hiring process specific to the role and location.

Additional Information

All your information will be kept confidential according to EEO guidelines.

Top Skills

ClickHouse
Elasticsearch
GKE
Grafana
Hadoop
HBase
Kafka
Kubernetes
Loki
Prometheus
Spinnaker
TensorFlow
