* Work with the team to design for the performance, capacity and high availability of infrastructure and services
* Participate in problem resolution activities; troubleshoot issues across the entire stack – software, database and infrastructure.
* Diagnose and troubleshoot complex distributed systems handling large volumes of data and develop solutions that have a significant impact at scale.
* Participate in building advanced tooling for testing, monitoring, administration and operations of multiple clusters across multiple geographically distributed data centers
* Develop innovative ways to measure, monitor and report on application and infrastructure health
* Improve the performance of microservices and solve scaling and performance issues
* Define and monitor SLIs, SLOs and error budgets
* Drive efficiencies in systems and processes: capacity planning, configuration management, performance tuning, monitoring and root cause analysis.
You may be a good fit if you:
* Are creative when solving problems and continuously seek improvements to processes and solutions
* Are an evangelist of 12-factor applications, GitOps practices, and a DevSecOps culture
* Facilitate knowledge sharing by creating and maintaining comprehensive documentation & diagrams
* Write high quality code to deliver automated solutions across the entire stack.
* Translate a passion for improvement into design & roadmap contributions, despite existing technical challenges
* Partner with the Engineering community to establish metrics, review & sign off on changes and introduce new services and schema changes
* Are a strong team player with a high degree of self-motivation
* Are able to learn new systems and manage additional technical resources to meet project requirements
* Collaborate with development teams on best practices and infrastructure planning activities with a focus on reliability, performance and security
* Have a BS degree in computer science or proven software engineering capability
* Have 3+ years of hands-on experience with cloud computing – including infrastructure, storage, platforms and data management
* Have experience with traditional enterprise data-center technologies, including compute, storage appliances, virtual machines, and networking
* Have experience managing databases: MySQL, MariaDB, SQL Server, or PostgreSQL
* Have experience working with scalable networking technologies such as load balancers and firewalls, and with web standards (REST APIs, web security mechanisms, OWASP Top 10).
* Have experience with container and orchestration technologies such as Docker and Kubernetes
* Have broad integration and management experience with DevOps ecosystems and related deployment/orchestration tools such as Helm, Terraform, GitLab CI/CD, Jenkins and Artifactory
* Knowledge of Java-related frameworks (Spring Boot, Tomcat) and build tools (Maven, Gradle) is a plus
* Have 3+ years of experience with Linux systems, general programming/scripting (Python, Shell, Java, Golang) and automation frameworks.
* Are able to identify the root cause of critical issues and resolve them by looking across multiple layers (storage, OS, network, and application/DB stack)
* Play a part in incident management and emergency response
Location: Argentina, Mexico, Colombia