Machine Learning Engineer

Engineering · Full-time · TX, United States

Job description

About Kiddom

Kiddom is a groundbreaking educational platform that promotes student equity and growth by uniting high-quality instructional materials with dynamic digital learning. Through unparalleled curriculum management functionality, Kiddom empowers schools and districts to take ownership of their curriculum, resulting in learning experiences tailored to meet the unique needs and goals of local communities. Kiddom’s high-quality curriculum is layered with robust teacher and leader data insights to drive the continuous improvement of instructional decisions, school/district programming, and professional learning.

You will work closely with other departments, including Product, Engineering, Machine Learning and Analytics, to understand and cater to their data and ML needs. You will also define and document data workflows, data and ML pipelines, and transformation processes for clear understanding and knowledge sharing.

We are looking for someone with excellent communication skills, with the ability to articulate complex technical concepts to non-technical stakeholders. Do you have a strong understanding of PII compliance and best practices in data handling and storage? If you also exhibit strong problem-solving skills, with a knack for optimizing performance and ensuring data integrity and accuracy, we want to chat!

You will...

  • Design, build, and maintain scalable data pipelines to transform raw data into analytics-ready datasets.
  • Ensure optimal performance, reliability, and efficiency of the data pipelines.
  • Integrate machine learning models into data pipelines to enhance analytics capabilities.
  • Collaborate with data scientists to deploy and monitor ML models in production.
  • Ensure the scalability and reliability of ML workflows and infrastructure.
  • Develop and optimize ML models for predictive analytics and data-driven decision-making.
  • Monitor the data infrastructure for performance bottlenecks and implement optimizations as necessary.
  • Collaborate with other engineering teams to ensure seamless data integration with high availability.

What we look for...

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • 8+ years of software engineering experience, including 3+ years as a data engineer.
  • Expertise in using Amazon SageMaker for building, training, and deploying machine learning models.
  • Knowledge of AWS Lambda for serverless execution of code, especially for model inference and lightweight processing tasks.
  • Familiarity with AWS Glue or similar ETL (Extract, Transform, Load) tools.
  • Familiarity with Snowflake, RDS, and Cassandra database services for structured data storage and querying.
  • Proficiency in using Amazon S3 for data storage and retrieval, especially for large datasets used in machine learning.
  • Knowledge of AWS EC2 for scalable computing resources and ECS for containerized application deployment, useful for training and deploying models.
  • Understanding of AWS Identity and Access Management (IAM) for managing permissions and security.
  • Familiarity with Amazon Kinesis for real-time data streaming and processing.
  • Skills in preprocessing and transforming raw data into a format suitable for machine learning using dbt.
  • Experience with CI/CD tools and practices for automating the deployment and monitoring of machine learning models.
  • Knowledge of AWS CloudWatch and AWS CloudTrail for monitoring model performance and logging events.
  • Proficiency in using AWS CloudFormation or Terraform to manage and provision AWS resources programmatically.
  • Strong programming skills in Python.
  • Proficiency in SQL for querying databases and manipulating structured data.
  • Understanding of security best practices in AWS, including data encryption and network security.
  • Knowledge of AWS cost management and optimization strategies to ensure efficient use of resources.
  • Experience in developing and deploying APIs for model inference and interaction with other systems using AWS API Gateway and AWS Lambda.