Monitor global infrastructure using Datadog and SolarWinds, triage and resolve L1 incidents, escalate complex issues, participate in incident response and post-incident reviews, maintain SOPs and ServiceNow tickets, support automation with basic scripting, and provide weekend on-call coverage.
JOB DESCRIPTION
- Monitor Sysco’s global infrastructure and systems using tools such as Datadog, SolarWinds, and other enterprise monitoring platforms.
- Detect, triage, and respond to incidents proactively before customer or business impact.
- Independently resolve:
  - Server performance issues
  - Monitoring agent issues
  - Basic infrastructure and system alerts
- Escalate major incidents, complex infrastructure issues, and application-related incidents to L2/L3 teams in line with SOPs and SLAs.
- Ensure initial response and resolution targets are met for all priority levels.
- Participate in incident bridge calls and coordinate with internal and external stakeholders.
- Perform initial investigations and document findings to support faster resolution.
- Contribute to post-incident reviews and root cause analysis, including analysis via Datadog Watchdog.
- Follow and execute Standard Operating Procedures (SOPs) for known incidents.
- Maintain accurate documentation and ticket updates in ServiceNow.
- Support initiatives to improve First-Time Resolution (FTR) and reduce MTTR.
- Contribute to project-level operational improvements and initiatives tracked in Jira.
- Apply basic scripting or automation knowledge where applicable to support monitoring improvements and operational efficiency.
- Actively participate in knowledge sharing and continuous learning initiatives.
- Standard shift: Monday to Friday, from 10:30 AM to 7:30 PM CST
- Weekend on-call coverage required (one day per weekend, 10:30 AM – 7:30 PM CST; monthly shift rotation defined based on business needs, with prior notification provided by the team manager).
- Bachelor’s degree in Information Technology or equivalent experience.
- 2 years of experience in Operations Engineering, NOC, SRE, or similar roles.
- Strong understanding of:
  - Windows Server and/or UNIX/Linux environments
  - Networking fundamentals (LAN/WAN, TCP/IP, DHCP, firewalls, routing)
- Experience with an enterprise ticketing tool (e.g., ServiceNow, Jira).
- Strong communication skills in English and ability to work under pressure.
- Willingness to provide weekend on-call coverage.
- Excellent communication skills in English (B2+ or higher) and ability to collaborate across functions and geographies.
- Experience with Datadog, SolarWinds, or similar monitoring platforms.
- Exposure to AWS, Azure, or GCP.
- Familiarity with Jira for tracking initiatives and projects.
- ITIL certification or hands-on experience with ITIL practices.
- Basic scripting or automation knowledge (e.g., PowerShell, Bash, Python).
Benefits:
- This is a hybrid position based in Ultra Park II, Lagunilla (Heredia). On-site presence is required only when necessary, such as for meetings, trainings, or collaborative activities, in alignment with the company’s telework agreement, which currently requires employees to work on-site three (3) days per week.
- Private Medical Insurance
- Asociacion Solidarista
- Life Insurance
- Personal Day Off
Note: Only candidates with Costa Rican nationality or valid immigration status will be considered. Applicants residing outside Costa Rica will not be considered, and relocation is not available.



