Data Science Intern @ Claritas
Duration: Apr 2023 - Feb 2024
Location: San Diego, California, United States · Hybrid
Responsibilities
- Developed a scalable PySpark pipeline that filtered over 1 million IP addresses across diverse data sources, increasing data processing efficiency by 30%
- Automated the extraction of campaign data from Parquet files and calculated daily data quality metrics, reducing manual effort by 20%
- Conducted multidimensional analysis of campaign data across tables and dates, identifying trends and detecting anomalies to improve data-driven decision-making