Senior Data Engineer

Engineering · Full-time · Canada · Remote possible

Job description

At Trunk, we're on a mission to empower growing software organizations to deliver high-quality software quickly. We understand the challenges of merge conflicts, poor code quality or consistency, flaky tests, and other distractions that can drain productivity and morale. Our unique approach enables engineering teams to stay focused on designing, implementing, and delivering software, leading to the creation of magical, high-quality projects and happier teams.

Our journey began in 2021, with our founders drawing on their experience at some of the world's largest and fastest-growing tech companies: Uber, Google, YouTube, and Microsoft. In 2022, we reached a significant milestone by securing $25M in Series A funding led by Garry Tan at Initialized Capital (currently President of YC) and Peter Levine at a16z. This growth and recognition are a testament to our potential and the value we bring to the software development landscape.

We know the frustration of trying to deliver code while constantly being interrupted by slow CI, flaky tests, and fragile processes. At Trunk, we’re building the tools to bring the joy back to software development. We’re looking for entrepreneurial people who are passionate about solving these problems.

As a founding member of our Data Engineering team, you’ll leverage your technical expertise to build data pipelines for processing and storing the data generated by our customers' CI/CD and automated tests. You’ll also experiment with integrating AI models to drive analytics and insights for our customers. We're tackling challenging problems and need engineers who can operate well in ambiguity and develop great solutions.

As an engineering team, we thrive on our ability to move quickly and adapt as we learn. Quickly delivering value to customers and getting their feedback is critical to our success. Engineers will be able to work closely with customers to understand the nuances of their use cases. We value empathy, hard work, and collaboration.

Our data stack is constantly evolving, but it is built on the foundations of Python, PostgreSQL, Spark, TimescaleDB, AWS, Kubernetes, and AWS Glue.

What you'll do 🧑‍💻

  • Build fault-tolerant and scalable data pipelines
  • Design efficient data storage, collaborating with product engineers to create fast and reliable data-driven features
  • Debug, profile, and optimize distributed data-intensive applications to improve their latency, accuracy, resource consumption, and throughput
  • Design and build observability of data quality and accuracy
  • Integrate ML models like Llama to analyze data and create features

We're looking for 🔎

  • 5+ years of experience as a software engineer with a strong understanding of key concepts in distributed systems
  • 3+ years of experience in building and deploying data applications, with a track record of regularly shipping new features
  • Fluency in at least two of these languages: Java/Scala/Kotlin, Python, Go, Rust, or C++
  • Good understanding and practical experience with partitioning, replication, map-reduce, indexing, and CAP theorem
  • Experience with distributed storage systems (S3, HDFS, Hive, ClickHouse, Elastic, etc.), distributed processing engines (Spark, etc.), and message queues (Kafka, SQS, etc.)
  • Passion for building large-scale ML applications and improving software engineers' productivity
  • Understanding of key concepts in natural language processing, machine learning, or statistical analysis

  • (Nice to have) Some experience with the machine learning stack (pandas, PyTorch, NumPy, scikit-learn, transformers, etc.)

What we offer 🎁

  • Unlimited PTO
  • Competitive salary and equity
  • Work-life balance
  • Flexibility to be fully or partly remote
  • Up to $200/month stipend for coworking space for remote folks
  • Few meetings, so you can ship fast and focus on building
  • One Medical membership on us!
  • Top-notch medical, dental, vision, short-term disability, long-term disability, and life insurance
  • All insurance is 100% company-paid ($0 premiums) for employees and highly subsidized for dependents
  • FSA, HSA with company contributions, and pre-tax commuter benefits
  • 401(k) plan
  • Paid parental leave (up to 12 weeks)