Samarpan Dutta

Machine Learning Engineer (Natural Language Processing) at Banjo Health

Samarpan Dutta has worked in various roles since 2018. In 2022, they began working as a Machine Learning Engineer (Natural Language Processing) at Banjo Health. Samarpan's primary responsibilities include fine-tuning large language models for a variety of NLP tasks and implementing complete MLOps pipelines for them using AWS SageMaker. Samarpan fine-tuned RoBERTa and T5 transformer models on a QA task using AWS SageMaker and Hugging Face with 10TB of de-identified PHI. Samarpan also implemented active learning with SageMaker Ground Truth labeling, leading to an overall 80% cost savings in data annotation.

From 2020 to 2022, Dutta worked at the University of South Florida Muma College of Business. Samarpan served as a Graduate Teaching Assistant for the courses Statistical Data Mining, Text Analytics, and Big Data Analytics. Samarpan also worked as an NLP Research Assistant, where they fetched 1TB of unstructured text data through an Apollo GraphQL server API and stored it in an S3 bucket in JSON format, leading to a 70% reduction in sequential API calls. Samarpan designed supervised deep neural networks using the TensorFlow Keras Functional API and trained them on labelled sentence vectors produced by a fine-tuned SBERT model, leading to a 23 percentage point improvement in model accuracy. Additionally, they implemented distributed training using the SLURM Workload Manager, TensorFlow, and an MPI library on a cluster of 10 GPU-enabled nodes, which led to a 65% reduction in overall training time.

From 2018 to 2020, Dutta worked at ITC Infotech as a Statistical Analyst. Samarpan conceptualized and coded data preprocessing, cleaning, and transformation steps for a data warehouse with 29 high-dimensional tables and determined the top 10 price-elastic products in various zonal stores. Samarpan also analyzed the effect of existing promotional strategies and recommended a price reduction strategy for revenue maximization, which led to a 10% increase in margin. Additionally, they performed data preprocessing and cleaning, designed a model, and provided actionable recommendations to customize offerings for senior citizens, which resulted in a 12% reduction in churn compared to the previous quarter. Samarpan also developed a probit model to flag potentially delinquent customers based on their credit and income history and existing repayment record.

Samarpan Dutta's education history includes a Bachelor of Technology in Computer Science & Engineering from Maulana Abul Kalam Azad University of Technology, West Bengal (formerly WBUT), from 2014 to 2018, and a Master of Science in Business Analytics and Information Systems from the University of South Florida Muma College of Business from 2020 to 2022. Additionally, Samarpan has obtained several certifications from LinkedIn, SAS, and Coursera, including Blockchain: Beyond the Basics, SAS Programming 1: Essentials, Introduction to Financial Accounting, Linear Regression and Modeling, Inferential Statistics, and Introduction to Probability and Data.