Site Reliability Engineer

Engineering · Full-time · Sydney, AU

Job description

Who we are and what we do

Audinate leads the world in networked media with our "Dante" technology, which is used extensively in professional audio and video applications, including live events, broadcast, entertainment venues and communication systems.

Dante replaces all audio and video connections with a computer network, effortlessly sending video or hundreds of channels of audio over slender Ethernet cables with perfect digital fidelity. Adopted by hundreds of manufacturers in thousands of products, Dante is the de facto standard for modern AV connectivity.

You’ll find us in the largest companies and institutions, like the Sydney Opera House, NFL Media Headquarters, Microsoft, major universities and even a 900-year-old cathedral featured in Harry Potter.

About the role

We're looking for a full-time Site Reliability Engineer (SRE) who is comfortable applying modern “dev” principles to sysadmin practices across the whole lifecycle, from design and release through production support to ongoing development. We’re a dynamic engineering technology company with plenty of opportunities to take the initiative in a highly technical environment.

The role is multifaceted: around one-third to one-half of your time will involve working with our software, hardware and test engineering teams to help them deliver our awesome products; the balance will be project work, delivering the infrastructure that powers Audinate’s services and improving the flexibility and performance of our production environments.

How we work

We have the flexibility to work from home, but we also collaborate in person every week at our office in Surry Hills, and we work remotely alongside engineering and operations colleagues in the UK, Belgium and the Philippines.

Responsibilities

  • Manage our on-prem services: Linux and Proxmox infrastructure
  • Manage our hyperscale cloud services, specifically AWS
  • Administer our CI/CD system’s implementation, including configuration management, to deliver a more reliable, scalable and secure solution
  • Work with engineering teams to help them use this infrastructure and these services effectively and efficiently
  • Evangelise improvements to build processes with engineering, including Dockerisation and artefact management
  • Monitor resources, and detect and troubleshoot issues
  • Contribute to our corporate and production systems architecture and design
  • Implement, test and deploy new and revised development and production infrastructure/services
  • Create and develop documentation

What we're looking for

  • Hyperscale cloud design/architecture and operations experience, preferably with AWS (certification nice but by no means required – practical skills are far more highly valued)
  • Familiarity with cloud best practices, e.g. the AWS Well-Architected Framework
  • Expertise in Linux systems and higher-layer services (OS through to web and database services)
  • Proven troubleshooting and fault analysis skills
  • Graceful under fire – able to stay calm and focused under pressure
  • Interpersonal skills (working with developers, engineers, business, operations and partners)
  • Ability to work independently and as part of a team
  • Driven, with grit, and a let’s-get-it-done-right attitude
  • Agile: flexible, open to change and striving for continuous improvement

Additional desired skills and experience

  • Provisioning and configuration management: Ansible, Ansible AWX/Tower, Terraform
  • Monitoring and alerting systems: Prometheus, Grafana
  • CI/CD systems: Jenkins or Bitbucket Pipelines