About Soufiane
French: Native or bilingual
English: Fluent
Spanish: Basic
Arabic: Native or bilingual
Experience
- Realytics, Data Engineer, Telecommunications, November 2023 - August 2025 (1 year and 9 months), Paris, France
Contributed to modernizing and scaling Realytics' analytical pipelines for the BEE product, which measures the impact of TV campaigns on real-time web sessions.
- End-to-end design of Big Data pipelines, from ingestion to delivery (PySpark, Trino, Hive), including Hadoop administration aspects (HDFS management, Spark job monitoring, Spark job optimization).
- Direct contribution to the migration to Airflow on Kubernetes with Helm: implementation of dynamic triggers, Spark worker configuration, DAG supervision.
- Implementation of automated recovery and restart mechanisms in case of incidents (fine-grained error management).
- Daily RUN support: monitoring Airflow executions, analyzing Spark logs, detecting and resolving production anomalies (partition corruption, SparkSQL errors, S3 connectivity loss).
- Regular interactions with Backend, Product, Frontend, and Data Analyst teams to adapt workflows to their constraints and synchronize deployments.
- Continuous deployment via Jenkins and ArgoCD, writing Ansible playbooks to standardize initialization and testing tasks.
- Advanced use of Linux (CLI, Cron, memory management, system logs) to analyze abnormal behaviors.
- Proactive approach to technical choices and Spark optimization (partitioning, shuffle tuning, broadcast join).
Results:
- Reduction of processing times by approximately 40%, with compute costs halved.
- Improved reliability of processes: 95% success rate for critical DAGs.
- Strong autonomy in resolving production incidents and contribution to internal documentation.
Technical Environment: PySpark, Trino, Hive, Spark SQL, HDFS, S3, Airflow, Helm, Jenkins, ArgoCD, Docker, Kubernetes, Ansible, Linux, Git, Grafana, Jira.
- ZELROS, Data Engineer, Tech, October 2022 - October 2023 (1 year), Paris, France
- Implementation of an analytical pipeline on GCP to support customer recommendations in the insurance sector.
- Deployment of a complete pipeline in production: ingestion from Cloud Storage, processing, and populating BigQuery tables.
- Performance optimization through BigQuery partitioning, ensuring response times suitable for a real-time engine.
- Production technical support: troubleshooting cloud permission issues, scheduling errors, and incoming data anomalies.
- Collaboration with Product and Backend teams to ensure functional consistency of exposed data.
- Implementation of unit tests (Pytest), an alerting system, and participation in functional testing phases.
- Contribution to CI/CD maintenance (GitHub Actions, dependency management via Poetry, code quality control with Ruff).
Results:
- Stable production pipeline with an SLA < 30 min.
- Zero critical errors after implementing automated tests.
Technical Environment: GCP, BigQuery, Cloud Storage, Airflow, Python, GitHub Actions, Ruff, Poetry, Unix.
- Apneal, Apnea Data Engineer, Health and Wellness, May 2022 - September 2022 (4 months), Paris, France
Participation in the development of a data pipeline for a sleep apnea screening device:
- Preparing data from SQLite databases and polysomnography files.
- Orchestrating S3 ingestion/export flows.
- Processing physiological signals.
- Industrializing modules via a documented Python package (Sphinx) deployed on AWS (S3, EC2, SageMaker).
Education
- Master in Data Science, Université Paris Dauphine, 2022