Ibrahima Diao

Data Engineer - GCP - Python

€665/day
Paris, FR
8-15 years

Average response time: 1 hour

About Ibrahima

✅ Data Quality Monitoring: Implement effective monitoring via Datadog to ensure data quality.
✅ Pipeline Industrialization: Leverage Airflow and dbt to design robust and maintainable workflows.
✅ Framework Development: Create solid and scalable foundations for your Python projects.
✅ Microservice Deployment: Use Cloud Functions and Cloud Run for fast, scalable, and cost-effective deployments.
✅ Skill Development: Train your Data Engineer teams with 1:1 or group sessions tailored to their needs.
✅ DataOps Automation: Standardize your CI/CD processes with Git and GitLab CI/CD for reliable continuous delivery.
✅ System Reliability: Develop and improve your Site Reliability Engineering (SRE) practices on GCP.
✅ Cloud Data Platform: Build a performant and scalable infrastructure with BigQuery and Terraform.
✅ Data Regression Testing: Ensure pipeline stability with automated tests guaranteeing data integrity and consistency.
✅ FinOps Optimization: Master your Cloud costs by integrating effective FinOps strategies.

Languages

  • French
    Native or bilingual

Can work on-site
Paris (up to 50km), Lyon (up to 50km)

Experience

  • SFR
    Data Engineer
    TELECOMMUNICATIONS
    June 2023 - June 2025 (2 years)
    Paris, France
    Migration of the on-premise business intelligence (BI) stack to GCP:

    Design of PoCs on GCP with dbt and Spark: Technical proof-of-concept development to validate architectural choices, including the creation of dbt models and Python scripts using Spark for data transformation.

    Development of SQL transformation modules on Cloud Functions: Creation of two modules deployed on Cloud Functions enabling the execution of SQL transformations on BigQuery.

    Dynamic generation of Airflow DAGs via a custom GUI: Design and development of a module integrated into a user interface allowing automated generation of Airflow DAGs from YAML files, facilitating workflow creation and maintenance.

    Automated data quality control on GCP: Implementation of a module deployed in Cloud Functions to verify data quality.

    Streaming data integration via Kafka: Development of Kafka modules to consume messages from a topic, structure them, and automatically deposit them into Google Cloud Storage.

    Orchestration of GCP services with Apache Airflow: Design and management of DAGs orchestrating all data processing on GCP (Cloud Storage, BigQuery, Cloud Functions).

    Active participation in data migration to GCP: Contribution to the recovery and integration of historical datasets from the on-premise infrastructure into the GCP ecosystem, ensuring data integrity and quality.

    Execution of regression tests: Participation in the validation strategy for data processing.

    Daily project coordination and monitoring: Leading daily meetings, managing tickets via tracking tools, follow-ups, and coordination between Data and DevOps teams.
    Google Cloud, Airflow, Apache Kafka, SQL, Agile Method
  • FeeZeen
    Data Engineer
    SOFTWARE PUBLISHING
    October 2022 - January 2023 (3 months)
    Paris, France
    Design and implementation of an architecture on GCP, and API development:
    - Configuration and structuring of a Data Lake for storing products from marketplaces
    - Implementation of an ETL for eco-product extraction with Python and Airflow
    - Design and provisioning of a Cloud SQL database (PostgreSQL)
    - Development of the business APIs needed to exchange data with our web application
    - Setup of DEV and PROD environments
    - Deployment of APIs with Cloud Run
    - Code versioning with Git
    Python, Google Cloud, PostgreSQL, Airflow, API, Flask, ETL
  • Casino
    Data Engineer
    RETAIL (LARGE RETAILERS)
    May 2022 - September 2022 (5 months)
    Paris, France
    Project: an application allowing Casino group store managers to place quick orders for in-store products from their central purchasing department. In a team of 3 data engineers, I was responsible for setting up a Spark ETL pipeline and a Machine Learning pipeline:
    - Development of an ETL with PySpark on a Dataproc Spark cluster, reading data from a BigQuery table and saving it to a Google Cloud Storage bucket
    - Spark job optimization
    - Populating the application's database tables
    - Implementation and maintenance of Kafka code
    - Use of Cloud Build and Airflow for automation and creation of triggers for model training
    - Implementation of unit tests, regression tests, and integration tests
    - Code versioning with Git
    Python, Kubernetes, Airflow, GCP, REST API, MLOps, MLflow

Education

  • Master of Science (MS), Scientific Computing
    University of Strasbourg
    2016
    Scientific computing and mathematics of information - Probability and statistics - Machine learning - Signal and image processing - Programming (Java, Python, C++, MATLAB) - MySQL databases - Partial differential equations - Algebra - Analysis (Mathematics)
