SAP started in 1972 as a team of five colleagues with a desire to do something new. Together, they changed enterprise software and reinvented how business was done.
Today, as a market leader in enterprise application software, we remain true to our roots. That’s why we engineer solutions to fuel innovation, foster equality and spread opportunity for our employees and customers across borders and cultures.
SAP values the entrepreneurial spirit, fostering creativity and building lasting relationships with our employees. We know that a diverse and inclusive workforce keeps us competitive and provides opportunities for all.
We believe that together we can transform industries, grow economies, lift up societies and sustain our environment. Because it’s the best-run businesses that make the world run better and improve people’s lives.
Global Cloud Services (GCS) is responsible for SAP’s Infrastructure & Technical foundation, including state-of-the-art data centers, public cloud, and associated platforms.
This serves as the underpinning for SAP’s Cloud Solutions, including internal development, training, and demo landscapes.
Service Reliability Engineering (SRE) is a team within the GCS organization. It helps ensure the reliability and availability of SAP cloud services (internal or external) by building and running tools that help to either prevent or isolate an incident.
SREs proactively help automate and optimize processes. The primary goal of the team is to reduce MTTD / MTTR and contribute to technical troubleshooting for major incidents.
The SRE team runs globally in a follow-the-sun model.
We are looking for a Senior Service Reliability Engineer (SRE De-Escalation Infrastructure Expert) focusing on both soft and physical layers of our global operations.
In this role you will have visibility into all tiers of the service, from infrastructure to application lifecycle and code.
You will be a core driver in identifying technical gaps, while defining and driving preventive measures. You will have not only the freedom but the directive to own the situation and, if necessary, control the resources needed to resolve the issue.
Everything you work on is geared to the big picture of SAP’s Cloud Solutions availability. You are semi-autonomous from the other Engineering, Infrastructure and Delivery teams.
This means providing an independent perspective on technical operations across GCS with a focus on availability, performance, and risk through our team-driven solutions.
Do you love to solve infrastructure-related problems and troubleshoot using tools you build? Are you a network, compute, or storage expert?
Do you analyze data / problems in a way that helps automate processes to avoid incidents? Have you developed integrations for monitoring tools?
Do you have an SRE / DevOps mindset with strong infrastructure skills? Would you like to work in an agile environment?
What makes this position unique is the need for a strong technical background, combined with leadership and problem-solving capabilities.
This results in a primary emphasis on reducing incident resolution time for service- and business-impacting events. Your mission will be to proactively own high-severity technical escalations, lead the technical engagement across teams and help restore services as quickly as possible.
Bachelor’s or Master’s degree in Computer Science or a related technical field, or equivalent applied experience
10+ years of professional experience, including 6+ years with networking, compute, storage and other infrastructure platforms, technical analysis (code or infrastructure) and / or software development
Self-starter who acts with a sense of urgency to move issues forward efficiently and effectively
Fast learner, with initiative to learn a new skill on your own by studying resources and practicing independently.
Excellent communication and interpersonal skills, a trustworthy team player.
Calm and composed in critical situations, able to interact with stakeholders and make timely decisions
Rotational weekend and / or holiday coverage with allowance and time compensation in accordance with local policies
Knowledgeable in the following areas, with expertise in at least one: programming language development, networking, SaaS, PaaS, software cloud architecture, CI / CD, large-scale distributed systems
Solid understanding of Enterprise / Service Provider data center architecture (high-density servers, backbone routers / switches, load balancers, SAN / NAS), IT security principles and disaster recovery
Strong familiarity with enterprise-class fault monitoring and performance management tools
Kubernetes, Python (or other scripting language), Public cloud (AWS, GCP, Azure)
Elastic, Logstash, Kibana experience a plus
Industry technical certifications (CCNA, CCNP, OCA, RHCE, etc.) and ITIL-related courseware a plus
Position is based in Monterrey.