Cheick Sanogo

Senior Data Engineer | Databricks | Spark

€678/day
1 project
Paris, FR
8-15 years

Average response time: A few days

About Cheick

I hold a PhD in Computer Science and have over 10 years of experience in data analysis and processing, with proven expertise in designing and implementing big data solutions.
I specialize in the design, modeling, optimization, and orchestration of data pipelines, as well as the deployment of machine learning, deep learning, and LLM models.

My core technologies include Spark, SQL, Python, Scala, AWS, Azure, Databricks, Hadoop, Hive, HBase, Airflow, and MLflow.

I am mainly involved in:
- Design and implementation of the Medallion architecture (Bronze/Silver/Gold)
- Data ingestion from multiple sources, in streaming or batch mode, into the Bronze layer
- Design and deployment of data pipelines (ETL):
  - Modeling and normalization of data in the Silver/Gold layers
  - Implementation of data quality controls
  - Deployment of automated, optimized flows to feed analysis systems and BI tools
- Orchestration and scheduling of ETLs

Languages

  • French: Native or bilingual
  • English: Fluent

Can work on-site
Paris (up to 50km)

Experience

  • FRAMATOME
    Senior Data Engineer
    April 2025 - Present (1 year and 2 months)
    Courbevoie, France
    Senior Data Engineer responsible for ingesting, processing, and extracting value from project planning data from sources such as Primavera P6, Jira, and MS Project:
    - Design and implementation of data ingestion pipelines in a Medallion architecture on Databricks
    - Data modeling using star or snowflake schemas
    - Data normalization following the normal forms 1NF, 2NF, and 3NF
    - Development of KPI calculations for monitoring project performance (progress, costs, deadlines) and financial indicators
    - Implementation of data quality controls
    - Optimization of Spark processing performance
    - Deployment and maintenance of ETL pipelines in a production environment
    - Conversion of Power BI data transformations written in the M language (Power Query) into PySpark scripts for optimized execution on Databricks
    - Connection of the Gold layer of the Medallion architecture to Power BI, feeding dashboards smoothly and securely for real-time visualization of project KPIs and metrics
    Azure Databricks, Azure DevOps, Azure Data Factory, PySpark
  • ENGIE SOLUTIONS
    Tech Lead Data Engineer
    ENERGY AND UTILITIES
    January 2023 - December 2024 (2 years)
    Bagneux, France
    Senior Data Engineer responsible for implementing electricity and gas consumption data ingestion flows:
    - Design and implementation of data processing pipelines
    - Processing and ingestion of different file formats (XML, JSON, CSV, Parquet, etc.)
    - Implementation of streaming processes for real-time data ingestion
    - Implementation of orchestration for data processing pipelines
    - Optimization of Spark processing performance: partition management, Spark configuration tuning, parallelization, caching, etc.
    - Migration of Oracle data flows to Databricks
    - Database management and optimization.
    - Production deployment of ETLs
    Databricks, Airflow, Python, Spark, Scala
  • Natixis
    Senior Data Scientist/Engineer Consultant
    BANKING AND INSURANCE
    October 2020 - October 2022 (2 years and 1 month)
    75013 Paris, France
    Data Engineer/Scientist responsible for implementing data solutions for compliance models that detect fraud, money laundering, and terrorist financing:
    - Implementation of data pipelines for data extraction, transformation, and loading (ETL)
    - Implementation of models for suspicious transaction detection,
    - Implementation of models matching clients against politically exposed persons and individuals on sanctions/embargo lists
    - Segmentation of countries by their associated risk of money laundering and terrorist financing
    - Implementation of orchestration for data processing pipelines
    - Optimization of Spark processing performance: partition management, Spark configuration tuning, parallelization, caching, etc.
    Python, PySpark, Hadoop

Education

  • Ph.D. in Mathematics / Computer Science
    Université Pierre et Marie Curie
    2017
  • Master's Degree in Probability and Random Models
    Université Pierre et Marie Curie
    2012
