Circonus
Site Reliability Engineer

Engineering · Full-time · Remote

Job description

As a Site Reliability Engineer (SRE) at Circonus, you will be responsible for keeping Circonus SaaS and on-premises customers up and running, and for improving the automation, scalability, and performance of our systems. This is an unparalleled opportunity to grow on a small, collaborative, and friendly team with established leadership in the field of SRE. A successful candidate communicates effectively across multiple departments and with customers, can shift gears at a moment’s notice, and enjoys the challenges of supporting enterprise clients. This is a client-facing role in which presentation skills are important, and it includes participation in a support rotation. This position is 100% remote.

Job Responsibilities

Install, upgrade and manage systems powering customer infrastructure running Circonus software
Troubleshoot availability and performance issues
Diagnose production issues and perform front-line remediation
Communicate with management and customers regarding aberrant system behavior
Influence software and architecture design based on observations of system performance and reliability
Participate in an on-call schedule

Job Requirements

Linux (RHEL, CentOS, Ubuntu)
Experience working with cloud service providers such as AWS, Azure, or GCP
Ansible, Chef or a similar configuration management system
HAProxy, PostgreSQL, Apache or similar technologies
Strong networking knowledge: firewalls, TCP & UDP, DNS, SSL/TLS
Strong understanding of monitoring principles
Familiarity with using REST and REST-like APIs for operations tasks
UNIX troubleshooting skills: tcpdump, strace, bpftrace, etc.
Fluency in one or more version control systems: Git, Subversion or Mercurial

Preferred Experience

7+ years’ experience in the technology industry
Experience with, or senior-level technical knowledge of, monitoring and analytics solutions
Experience with Docker, Kubernetes and containers
Terraform, Chef and Ansible experience
OpenSearch experience
The right person will be highly technical and analytical, much like the company itself
