Sofia Castellanos

Sofia is a Python data engineer with 7 years building ingestion and transformation systems for media and adtech. She spent three years at Spotify on the personalization-data team, where she shipped a streaming-to-batch reconciliation pipeline that processes around 90 billion playback events per day, and two years before that at The New York Times on the subscriber-analytics platform. She focuses her writing on production pandas patterns (chunked reads, categorical memory tricks, Arrow interop), Airflow 2.x task groups, and the kinds of dbt + Python hybrid pipelines that show up once your warehouse bill stops being cute. She also maintains pyspark-helpers, a small library for column-name munging she keeps porting between jobs. Sofia is based in Madrid, originally from Bogota, and a relentless defender of type hints in notebook code.

Articles by Sofia Castellanos

Feature Engineering in Python: A Practical Guide with Pandas and Scikit-learn (2026)
Guides

Learn how to build effective features for ML models with Pandas 3.0 and Scikit-learn. A practical guide with code for encoding, scaling, TargetEncoder, ColumnTransformer, and feature selection.

Sofia Castellanos 14 min read