Skills

Data Exploration

pandas, numpy, dplyr (R), plotly

Data Cleaning and Preprocessing

pandas, scikit-learn, tidyr (R), OpenRefine

Data Visualization

matplotlib, seaborn, ggplot2 (R), Tableau, Power BI

Statistical Analysis

scipy, statsmodels, R (base)

Machine Learning

scikit-learn, tensorflow, keras, xgboost, LightGBM, caret (R)

Deep Learning

tensorflow, keras, PyTorch

Natural Language Processing

nltk, spaCy, gensim, transformers (Hugging Face)

Database and Data Management

SQL, MySQL, MongoDB, MS SQL Server Management Studio

Cloud Computing and Distributed Computing

AWS, Azure, Google Cloud Platform (GCP), Snowflake, SynapseML

Common Software Tools

Jupyter Notebooks, RStudio, VS Code, Git, Docker, GitHub

Professional/Internship Experience

Lead Data Science Intern

Spotlist Inc. (New Jersey, USA)
July 2023 – March 2024

  • Built a data pipeline with pandas and Transformers that automated data cleaning and preprocessing, cutting data collection time by 30% and freeing time for deeper analysis.
  • Proposed and implemented data-driven engagement strategies, boosting user engagement by 15% and reducing churn by 10%.
  • Applied machine learning models to predict customer behavior, increasing ROI on targeted marketing campaigns by 30%.
  • Created a comprehensive data dictionary for the Spotlist app, standardizing over 30 attributes, reducing data cleaning time by 15%, and improving analysis accuracy.
  • Created interactive dashboards using Plotly Dash and Bokeh to visualize user demographics, helping identify key user segments and increase app traction.

Junior Analyst (Data & Operations)

Dhakad Events and Entertainment (Mumbai, India)
August 2019 – July 2021

  • Led the development of data collection systems using SQL, Python, and Excel, improving data accuracy by 40% and reducing time-to-insight by 25% in support of marketing strategy.
  • Conducted statistical analyses in SPSS and built ML models in Python (pandas, scikit-learn) to evaluate celebrity metrics, enhancing marketing strategies by 30% through targeted content.
  • Developed predictive models using R and Python, increasing event attendance by 20% and profitability by 15% through targeted and optimized event features.
  • Used Tableau, Excel, and Access for data visualization and manipulation, reducing operational costs by 10% and improving departmental efficiency through new procedures.

Certifications

Google Data Analytics

Business Intelligence

Relational Database Design

Algorithms and Data Structures
