We are looking for a Sysadmin / Site Reliability Operator (SRO) to join our Site Operations team in Buenos Aires. You will help the team keep our customers' applications running at peak performance. You will be the first point of contact for external customers worldwide, and you will help identify, analyze, and resolve first-tier technical issues on large-scale production platforms. You will also help modify and improve our monitoring infrastructure, which generates multiple metrics and graphs every minute from a very diverse environment.
Advanced Linux skills and troubleshooting experience in a production environment.
Experience with monitoring, metrics graphing, and alerting services.
Experience tracking problems in ticketing systems.
Strong communication and teamwork skills.
Strong communication skills in English.
Willingness to learn from others and share knowledge with teammates. Ability to quickly teach yourself new concepts and tools, and to actively seek ways to expand your knowledge.
Basic experience managing on-premises and cloud-based infrastructure, in particular AWS.
Basic understanding of scripting in Bash, Python, or similar.
Experience managing web servers.
An understanding of networking concepts such as DNS, routing, load balancers, and firewalls.
Job Duties and Responsibilities:
Follow incident management procedures in production environments.
Understand root cause analysis and incident timeline creation.
Create and maintain documentation on installations, incidents, and procedures.
Analyze and troubleshoot large-scale distributed systems.
Monitor specific metrics for availability, latency, and overall system health.
Develop and implement new IT infrastructure monitoring.
You will be:
Developing your monitoring skills by using complex systems such as Sensu or Zenoss.
Working with AWS cloud services and receiving ongoing training and courses from our AWS specialists and online.
Using and deploying different applications with containerization software such as Docker Engine.
Learning to automate daily tasks using orchestration software such as Puppet, Ansible, or Salt.
What we offer:
Three weeks of onboarding in San Francisco.
Direct contact with clients and the opportunity to share ideas.
Flexible compensation plan: you can adjust the composition of your compensation according to your needs.
Training and certifications.
Trips to events…and more!
Location: Buenos Aires, Argentina
Night shift (fully remote): 11 p.m. to 7 a.m., Tuesday to Saturday