Description

✅ Data Quality Monitoring: Implement effective monitoring via Datadog to ensure data quality.

✅ Pipeline Industrialization: Leverage Airflow and dbt to design robust and maintainable workflows.

✅ Framework Development: Create solid and scalable foundations for your Python projects.

✅ Microservice Deployment: Use Cloud Functions and Cloud Run for fast, scalable, and cost-effective deployments.

✅ Skill Development: Train your Data Engineer teams with 1:1 or group sessions tailored to their needs.

✅ DataOps Automation: Standardize your CI/CD processes with Git and GitLab CI/CD for reliable continuous delivery.

✅ System Reliability: Develop and improve your Site Reliability Engineering (SRE) practices on GCP.

✅ Cloud Data Platform: Build a performant and scalable infrastructure with BigQuery and Terraform.

✅ Data Regression Testing: Ensure pipeline stability with automated tests guaranteeing data integrity and consistency.

✅ FinOps Optimization: Control your cloud costs by integrating effective FinOps strategies.

Languages

French
Native or bilingual

Workplace preferences

Can work on-site

Paris (up to 50km), Lyon (up to 50km)

SFR
Data Engineer
TELECOMMUNICATIONS
June 2023 - June 2025 (2 years)
Paris, France
Integration of the on-premise decision-support (BI) system into GCP:

Design of PoCs on GCP with dbt and Spark: Technical proof-of-concept development to validate architectural choices, including the creation of dbt models and Python scripts using Spark for data transformation.

Development of SQL transformation modules on Cloud Functions: Creation of two modules deployed on Cloud Functions enabling the execution of SQL transformations on BigQuery.

Dynamic generation of Airflow DAGs via a custom GUI: Design and development of a module integrated into a user interface allowing automated generation of Airflow DAGs from YAML files, facilitating workflow creation and maintenance.

Automated data quality control on GCP: Implementation of a module deployed on Cloud Functions to verify data quality.

Streaming data integration via Kafka: Development of Kafka modules to consume messages from a topic, structure them, and automatically deposit them into Google Cloud Storage.

Orchestration of GCP services with Apache Airflow: Design and management of DAGs orchestrating all data processing on GCP (Cloud Storage, BigQuery, Cloud Functions).

Active participation in data migration to GCP: Contribution to the recovery and integration of historical datasets from the on-premise infrastructure into the GCP ecosystem, ensuring data integrity and quality.

Execution of regression tests: Participation in the strategy for validating data processing.

Daily project coordination and monitoring: Leading daily meetings, managing tickets via tracking tools, follow-ups, and coordination between Data and DevOps teams.

Google Cloud, Airflow, Apache Kafka, SQL, Agile Method
FeeZeen
Data Engineer
SOFTWARE PUBLISHING
October 2022 - January 2023 (3 months)
Paris, France
Design and implementation of an architecture on GCP, and API development:
- Configuration and structuring of a Data Lake for storing products from marketplaces
- Implementation of an ETL for eco-product extraction with Python and Airflow
- Design and provisioning of a Cloud SQL database (PostgreSQL)
- Development of the business APIs needed to exchange data with the web application
- Setup of DEV and PROD environments
- Deployment of APIs with Cloud Run
- Code versioning with Git

Python, Google Cloud, PostgreSQL, Airflow, API, Flask, ETL
Casino
Data Engineer
RETAIL (LARGE RETAILERS)
May 2022 - September 2022 (5 months)
Paris, France
Worked on an application that lets Casino group store managers quickly order in-store products from their central purchasing department. In a team of three data engineers, I was responsible for setting up a Spark ETL pipeline and a machine learning pipeline:
- Development of a PySpark ETL on a Dataproc Spark cluster that reads data from a BigQuery table and writes it to a Google Cloud Storage bucket
- Spark job optimization
- Populating the application's database tables
- Implementation and maintenance of Kafka code
- Use of Cloud Build and Airflow to automate workflows and create triggers for model training
- Implementation of unit tests, regression tests, and integration tests
- Code versioning with Git

Python, Kubernetes, Airflow, GCP, REST API, MLOps, MLflow

Master of Science (MS), Scientific Computing
University of Strasbourg
2016
Scientific computing and mathematics of information: probability and statistics, machine learning, signal and image processing, programming (Java, Python, C++, MATLAB), MySQL databases, partial differential equations, algebra, analysis (mathematics)

Ibrahima Diao

Data Engineer - GCP - Python
