About Soufiane
French: Native or bilingual
English: Fluent
Experience
- Société Générale
  Data Engineer
  BANKING AND INSURANCE
  September 2023 - Today (2 years and 9 months)
  La Défense, France
  As part of this long-term assignment, I act as a Data Engineer in a large-scale production environment.
  Key achievements:
  → Optimization of the data infrastructure, generating several hundred thousand euros in savings by redesigning underperforming Spark pipelines and significantly reducing processing times and compute costs.
  → Design and development of scalable data pipelines (ETL/ELT) on Palantir Foundry, ensuring reliability and maintainability in production.
  → Optimization of Big Data workflows with Apache Spark and Python: shorter execution times, improved memory management, and better data partitioning.
  → Implementation of automated data quality control processes, reducing manual interventions and production errors.
  → Close collaboration with Data Science teams to provide reliable, documented, and reusable datasets, accelerating the production deployment of ML models.
  → Contribution to the overall data architecture with a focus on scalability, performance, and infrastructure cost reduction.
  Technical environment: Palantir Foundry, Apache Spark, Python, SQL, ETL/ELT, Big Data, MLOps, Data Quality, Pipeline Orchestration.
- Société Générale
  Data Scientist
  BANKING AND INSURANCE
  March 2023 - September 2023 (6 months)
  Paris, France
  As part of my end-of-studies internship (Master's in Data Science), I worked on a synthetic data generation project, a topic at the intersection of data science, data privacy, and dataset quality.
  Key achievements:
  → Design and development of a synthetic data generation engine that reproduces the statistical distributions of real data while preserving the confidentiality and security of sensitive data.
  → Application of advanced Machine Learning and statistical modeling techniques to generate realistic artificial datasets, usable for testing and analysis phases without exposing personal data.
  → Implementation of an evaluation framework for synthetic data quality: similarity measurement, statistical fidelity metrics, and usability tests to ensure the reliability of generated datasets.
  → Integration of the generation pipeline into the company's internal testing environment, with complete technical documentation to facilitate adoption by teams.
  Impact: Reduced reliance on real data for testing phases, accelerated development cycles, and enhanced compliance with GDPR / Privacy by Design requirements.
  Technical environment: Python, Airflow, Jenkins, Docker, Git, Machine Learning, Statistical Modeling, Synthetic Data Generation, Data Privacy, Data Quality, GDPR, Pandas, Scikit-learn.
Education
- Double Degree MSIAM Data Science, ENSIMAG, 2023
- Engineer, ENSIMAG - Grenoble INP, 2023
Certifications
- Certified Palantir Foundry Data Engineer Professional, Palantir, 2025
- Databricks Certified Data Engineer Associate, Databricks, 2026