Site Reliability Engineer

Engineering · Manila, Philippines

Job description

About DT One

At DT One, we count on our Site Reliability Engineers (SREs) to empower our users with a rich feature set, high availability, and extreme performance. As we expand our platform infrastructure and applications, we are seeking a talented Site Reliability Engineer to maintain, improve, and flawlessly operate our environments. We are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a globally distributed team to develop real-world solutions and positive user experiences at every interaction.

Key Responsibilities

  • Run the production environment by monitoring availability and taking a holistic view of system health
  • Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
  • Establish and guarantee service level objectives for platform infrastructure and applications
  • Provide primary operational support and engineering for multiple large distributed software applications, including on-call shifts
  • Build software and systems to manage network infrastructure, platform infrastructure, and applications
  • Improve reliability, quality, security, and time-to-market of our suite of software solutions
  • Partner with development teams to improve services through rigorous testing and release procedures

Professional Experience and Qualifications

  • Bachelor’s degree in computer science or another highly technical scientific discipline
  • Ability to program (structured and OO) in one or more high-level languages, such as Golang, Python, Ruby, or JavaScript
  • Experience with AWS cloud infrastructure management and related services
  • Experience with Infrastructure as Code and Configuration Management concepts and related tools and technologies, such as Terraform and Ansible
  • Hands-on experience with Linux administration, command-line interface, and shell scripting
  • Experience with dynamic resource management frameworks and technologies, such as Kubernetes and Nomad
  • Experience with source code management tools and related workflows
  • Experience with continuous integration and continuous deployment concepts and related tools and technologies, such as Jenkins, GitLab CI, and Bitbucket Pipelines
  • A proactive approach to spotting problems, areas for improvement, and performance bottlenecks
  • Good communication skills in English
  • Previous success in technical engineering
  • Previous experience operating multiple large distributed software applications
  • Previous experience defining and implementing deployment and release standards
  • Experience with administration and performance tuning of data stores such as PostgreSQL, MySQL, Elasticsearch, and Redis
  • Experience with monitoring tools, such as Prometheus, Datadog, and New Relic
  • Experience with VPN configuration and administration
  • Coding experience beyond simple scripts
  • Strong mindset oriented toward Site Reliability principles
  • Sharing and mentoring mindset

Sound like you? Apply now!
