Data Science Intern @ Claritas

Responsibilities

Duration: Apr 2023 - Feb 2024
Location: San Diego, California, United States · Hybrid

  • Developed a scalable PySpark pipeline, filtering over 1 million IP addresses across diverse data sources, increasing data processing efficiency by 30%
  • Automated the extraction of campaign data from Parquet files and calculated daily data quality metrics, reducing manual effort by 20%
  • Conducted multidimensional analysis of campaign data across tables and dates, identifying trends and detecting anomalies to improve data-driven decision-making