Senior Data Engineer

Engineering · Full-time · Global

Job description

About Cybersyn

Cybersyn is a new DaaS (data-as-a-service) company backed by Sequoia, Coatue, and Snowflake. Our mission is to make the world's economic data transparent to businesses and entrepreneurs and to enable a new generation of decision makers. We acquire unique data assets (companies, licenses, data rights, consumer dividends) and build derived products on top of them, focusing on measuring what consumers and businesses are spending money on. You can think of Cybersyn as a cross between an investment firm and a technology company focused on data. The reward is great: if we are successful, we will disrupt the traditional market intelligence space, an industry worth hundreds of billions of dollars, and build SimCity for the real world.

We have two businesses: consumer insights and public data. In the public data domain, we have already released many datasets on Snowflake Marketplace that we have cleaned, restructured, and made joinable.

About the role:

Cybersyn is looking for an experienced data engineer to help us refine our data team's technology stack and implement ingestion pipelines for public domain and private data sources. We are looking for someone who is passionate about the Snowflake Data Cloud and, in particular, about optimizing costs and workloads. This is the perfect role for someone who loves to tune databases, thinks about balancing cost against compute, and knows their way around a query plan.

What you will do:

  • Take research and statistical models and pipelines and implement them efficiently in Snowflake. You will need to care about compute efficiency and about building context for what the data actually represents.
  • Tune Snowflake for performance and cost optimization.
  • Provide infrastructure guidance on Snowflake capabilities to accommodate business and technical use cases.
  • Provide production support for data warehouse issues such as data load problems, transformation translation problems, and query optimization.
  • Take end-to-end ownership of your work and enjoy working with different functions across the company.

Who you are:

  • Experience with Snowflake is required.
  • Experience with query optimization is required. You are comfortable in the Snowflake Query Profile. Micro-partitions, clustering keys, query acceleration, and the search optimization service should all be terms that you are familiar with and ready to discuss.
  • Experience with SQL is required.
  • Experience working with multiple (external) datasets, including cleaning, joining, and munging data; experience working with public data sources (e.g., US Census, American Community Survey) is a plus.
  • Experience with dbt and orchestrator systems (Dagster, Prefect, Mage, Kestra, or some equivalent) is highly valued.
  • Experience building and operating data pipelines for real customers in production systems.

What you get out of it:

  • Ability to shape Cybersyn’s initial technology decisions.
  • Access to some of the largest and most interesting economic datasets in the world, including real-time spending, transaction, and clickstream data from both third-party and first-party sources.
    • Much of our data is not available to any other third parties.
    • Our system is built with heterogeneous data sources in mind: we are not working on data from a single product or theme, but on data from governments, payment processing systems (think bank records), mobile devices and apps, and SaaS exhaust (think of the data that B2B SaaS products collect).
  • Fast-moving culture, with lots of responsibility and autonomy from day one.
  • Collaborate with and learn from a very dynamic and motivated team in an in-office work environment.

Cybersyn benefits:

  • Unlimited PTO
  • Comprehensive health insurance (medical, dental, vision) - premiums are 100% covered
  • Monthly wellness stipend
  • 401(k)
  • Paid parental leave
  • FSA and commuter benefits
  • Dinners and Friday lunches are provided