Karan Thacker

Hello and welcome! I'm a passionate technologist, dedicated to bridging the realms of Machine Learning, Data Engineering, and Software Development. My work experience and portfolio showcase a range of projects that reflect my love for programming.

Work Experience

Apple, Inc

Cupertino, CA | 01/2022 - 08/2022 

  • Built Machine Learning models by researching, implementing, and evaluating various published approaches for image processing.

    • Trained a Deep Learning model for image segmentation and object classification in PyTorch, using transfer learning from pre-trained Convolutional Neural Networks (CNNs), thereby automating image editing tasks for Apple Media Products

    • Pruned and compressed Machine Learning models, reducing inference time by 50% and lowering cloud computing resource usage.

    • Containerized trained models with Docker and deployed them on AWS (EC2, ECR, Lambda, S3) behind a RESTful API, making the models available for integration as a microservice

    • Gathered business requirements for the automation/testing task at hand and carried out a preliminary analysis of the best-suited Robotic Process Automation (RPA) tools available on the market for macOS and iOS devices

Infosys

Pune, India | 09/2017 - 01/2019 

  • Contributed as a software developer on an agile team, taking part in the full Software Development Lifecycle (SDLC) of a product: planning, analysis, development, testing, and maintenance

    • Performed data migration and remediation of manufacturing data in PostgreSQL and implemented QA automation and data analysis in R, leading to robust, optimized, and fault-tolerant data models for the client

    • Created a churn prediction POC on customer demographic data using ensemble and boosted decision trees in Python with Scikit-learn, reaching 86% accuracy on test data and helping the client plan targeted marketing campaigns

Radiance Technologies

Ahmedabad | 05/2020 - 08/2020

  • Automated selection and ranking of biomedical/clinical documents for medicinal research using Natural Language Processing models (spaCy, scispaCy) that specialize in scientific text, leading to a 90%+ reduction in man-hours spent on screening documents

    • Created a data pre-processing and training pipeline on cloud ML infrastructure (AWS SageMaker), including tokenization and vectorization of text data

    • Deployed the ML inference pipeline on the cloud as a containerized microservice application, with Flask as the web framework

Indian Space Research Organization (ISRO)

Pune, India | 01/2022 - 08/2022 

  • Built a traffic load simulator in Java, using the UDP protocol, to test satellite communication programs and monitor the performance and robustness of the system.

Skill Set

    • Python

    • MATLAB

    • R programming

    • C++

    • HTML

    • CSS

    • JavaScript

    • SQL

    • Docker

    • Kubernetes

    • Terraform

    • React.js

    • Node.js

    • Flask

    • Spark

    • MySQL

    • MongoDB

    • PostgreSQL

    • S3

    • EC2

    • Lambda

    • RDS

    • EKS

    • EMR

    • SageMaker

    • CloudFormation

    • Tableau

    • Power BI

    • Python (Matplotlib, Seaborn)

    • NumPy

    • Scikit-learn

    • TensorFlow

    • Keras

    • PyTorch

    • Pandas

    • Matplotlib

    • NLTK

    • OpenCV

    • PySpark

    • Jupyter Notebooks

    • Git

    • Visual Studio Code

    • Google Colab

    • Apache Airflow

    • Hugging Face

Data Engineering and Analysis

Machine Learning

Web Development

Let’s Code Together