Soufiane Lemrabet

Data Engineer & AI Palantir Foundry

€600/day
Paris, FR
3-7 years

Average response time: 1 hour

About Soufiane

I am a data engineer with solid experience in production environments, and I help companies build reliable, scalable, AI-ready data architectures.
My data pipeline optimizations have cut production costs by several hundred thousand euros. This is what I bring to each assignment: a rigorous technical approach focused on performance and real business impact.
I work across the entire data chain: ETL/ELT pipeline design, orchestration, modeling, and data quality assurance. I have worked on complex, large-scale data projects, particularly on enterprise platforms like Palantir Foundry.

What sets me apart: with a Master's in Data Science, I understand the needs of ML and AI teams from the outset. I build pipelines designed for real-world use (feature stores, training data, monitoring) that data scientists do not have to rework.

Key Skills:
- Data Engineering: Python, Spark, SQL, ETL/ELT, orchestration
- Platforms: Palantir Foundry, Databricks
- Cloud: GCP, AWS
- MLOps: Data CI/CD, model deployment, monitoring
- AI: LLM integration, RAG pipelines, data for AI agents

Languages

  • French: Native or bilingual
  • English: Fluent

Can work on-site
Paris (up to 50km)

Experience

  • Société Générale
    Data Engineer
    BANKING AND INSURANCE
    September 2023 - Present (2 years and 9 months)
    La Défense, France
    In this long-term assignment, I work as a Data Engineer in a large-scale production environment.
    Key achievements:
    → Optimization of the data infrastructure: redesign of underperforming Spark pipelines, significantly reducing processing times and compute costs and saving several hundred thousand euros.
    → Design and development of scalable data pipelines (ETL/ELT) on Palantir Foundry, ensuring reliability and maintainability in production.
    → Optimization of Big Data workflows in Apache Spark and Python: shorter execution times, improved memory management, and better data partitioning.
    → Implementation of automated data quality control processes, reducing manual interventions and production errors.
    → Close collaboration with Data Science teams to provide reliable, documented, and reusable datasets, accelerating the production deployment of ML models.
    → Contribution to the overall data architecture with a focus on scalability, performance, and infrastructure cost reduction.
    Technical environment: Palantir Foundry, Apache Spark, Python, SQL, ETL/ELT, Big Data, MLOps, Data Quality, Pipeline Orchestration.
    Palantir Foundry, Spark, Python, SQL, MLOps
  • Société Générale
    Data Scientist
    BANKING AND INSURANCE
    March 2023 - September 2023 (6 months)
    Paris, France
    As part of my final-year internship (Master's in Data Science), I worked on a synthetic data generation project, a topic at the intersection of data science, data privacy, and dataset quality.
    Key achievements:
    → Design and development of a synthetic data generation engine capable of reproducing the statistical distributions of real data while ensuring the confidentiality and security of sensitive data.
    → Application of advanced Machine Learning and statistical modeling techniques to generate realistic artificial datasets, usable for testing and analysis phases without exposing personal data.
    → Implementation of an evaluation framework for synthetic data quality: similarity measurement, statistical fidelity metrics, and usability tests to ensure the reliability of generated datasets.
    → Integration of the generation pipeline into the company's internal testing environment, with complete technical documentation facilitating adoption by teams.
    Impact: Reduced reliance on real data for testing phases, accelerated development cycles, and enhanced compliance with GDPR / Privacy by Design requirements.
    Technical environment: Python, Airflow, Jenkins, Docker, Git, Machine Learning, Statistical Modeling, Synthetic Data Generation, Data Privacy, Data Quality, GDPR, Pandas, Scikit-learn.
    Python, Airflow, Machine Learning, Artificial Intelligence (AI), Jenkins

Education

  • Double Degree MSIAM Data Science
    ENSIMAG
    2023
  • Engineering degree
    ENSIMAG - Grenoble INP
    2023
