Freelancer profile translated to English.

Description

Data Engineer specializing in Databricks, Spark, and Azure with 5 years of experience building robust and performant data pipelines for critical environments. I design and industrialize Lakehouse architectures (Bronze / Silver / Gold) on Azure, with a focus on performance, code quality, and observability.

What I bring to your projects:
▸ Ingestion & pipelines: building complex ETL/ELT pipelines (Databricks, Azure Data Factory, Airflow, dbt), multi-source ingestion (API, SFTP, files, streaming).
▸ Spark optimization: execution plan analysis, shuffle elimination, partitioning, Z-ordering. I reduced a critical job from 6h to 1h at Société Générale.
▸ Data architecture: Delta Lake modeling, Unity Catalog governance, end-to-end orchestration.
▸ Industrialization: modular and testable code (Pytest, Cucumber), CI/CD (GitLab, Jenkins, Azure DevOps), complete Dev→Prod migration.

My environments: Apache Spark, PySpark, Scala, Python, SQL, Delta Lake, Databricks, Azure Data Lake Gen2, Snowflake, Airflow 2.x, ADF, dbt, Oracle, Hive

Sectors: Banking (Société Générale) · Telecoms (Canal+) · Asset Management (Carmignac) · Energy (Engie) · Sports Betting (Betclic)

Databricks certifications:
✦ Data Engineer Professional (2026)
✦ Data Engineer Associate (2024)
✦ Apache Spark 3.0 Developer (2024)

EPITA engineering graduate, specialization in Artificial Intelligence. I work in French and English.

Languages

French
Native or bilingual
English
Fluent

Workplace preferences

Can work on-site

Paris (up to 50km)

Experience

CANAL+ TELECOM
Data Engineer / DevOps / Software Engineer / Architect
February 2025 - Today (1 year and 4 months)
Architecture for SFTP collection and historization of multi-operator invoices (OI & subcontractors), multi-system correlation (Praxedo, Interop, OI refactoring), and certification of billing anomalies, with Oracle as the target database.
Key achievements
▸ Complete Airflow industrialization: design and development of SFTP synchronization DAGs with tree structure preservation, exclusion management, error recovery, and structured logging.
▸ Modular ingestion architecture: BaseProcessing pattern (load_data / clean_data / insert_data) deployed on 6+ flows (PXO, GUDI, MTHD, YANA/KOUROU, SRR).
▸ Dev → Prod migration: complete project packaging (config, connections, Airflow variables), implementation of unit tests and data validations.
▸ Creation of Oracle tables (DDL) and historization schemas.
▸ End-to-end orchestration: SFTP → data server → Oracle.
Stack: Airflow 2.x, Python (Pandas), Oracle, Linux, Git/GitLab
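The BaseProcessing pattern mentioned above (load_data / clean_data / insert_data) can be pictured roughly as follows. This is a minimal sketch, not the production code: only the three method names come from the profile, while the `run` method, the `InMemoryInvoiceFlow` class, and the sample records are hypothetical and exist purely for illustration. No SFTP or Oracle access is involved.

```python
from abc import ABC, abstractmethod
from typing import Any


class BaseProcessing(ABC):
    """Template for one ingestion flow: load, clean, then insert."""

    def run(self) -> int:
        # Fixed sequence shared by every flow; subclasses fill in the steps.
        raw = self.load_data()
        cleaned = self.clean_data(raw)
        return self.insert_data(cleaned)

    @abstractmethod
    def load_data(self) -> list[dict[str, Any]]:
        """Read raw records from the flow's source."""

    @abstractmethod
    def clean_data(self, rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
        """Normalize and filter the raw records."""

    @abstractmethod
    def insert_data(self, rows: list[dict[str, Any]]) -> int:
        """Write the cleaned records and return how many were written."""


class InMemoryInvoiceFlow(BaseProcessing):
    """Hypothetical flow: reads from a list and writes to a list."""

    def __init__(self, source_rows: list[dict[str, Any]]) -> None:
        self.source_rows = source_rows
        self.target: list[dict[str, Any]] = []  # stands in for a database table

    def load_data(self) -> list[dict[str, Any]]:
        return list(self.source_rows)

    def clean_data(self, rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
        # Drop rows without an invoice id and convert the amount to float.
        return [
            {"invoice_id": r["invoice_id"].strip(), "amount": float(r["amount"])}
            for r in rows
            if r.get("invoice_id", "").strip()
        ]

    def insert_data(self, rows: list[dict[str, Any]]) -> int:
        self.target.extend(rows)
        return len(rows)


flow = InMemoryInvoiceFlow([
    {"invoice_id": " INV-1 ", "amount": "10.50"},
    {"invoice_id": "", "amount": "3"},
])
inserted = flow.run()
```

Keeping the sequence in the base class means each new flow only has to supply its three steps, which is one plausible reason the same pattern could be reused across several flows.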
SOCIÉTÉ GÉNÉRALE
Data Engineer
October 2023 - November 2024 (1 year and 1 month)
Near real-time firewall flow mapping pipeline (1.3 Cloud LUCID, Hive).
Key achievements
▸ Major Spark optimization: reduction of critical job execution time from 6 hours to 1 hour (÷6) by analyzing execution plans, repartitioning, and eliminating shuffles.
▸ Enrichment pipeline: collection of partner APIs, daily population of Hive repositories, multi-source joins for raw log mapping.
▸ Observability tooling: partition diagnostic utility function (volume per partition) to accelerate production debugging.
▸ JSON → Hive transformation via Spark jobs, orchestrated daily by Control-M.
▸ Monitoring of job progress via Yarn.
Stack: Control-M, Yarn, Hive, Jenkins, Scala, Spark, HQL, GitHub
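The partition diagnostic mentioned above (volume per partition) can be illustrated with a small pure-Python analogue. This is a sketch under stated assumptions, not the utility used in production: the function names, the CRC32-based hash partitioning, and the sample keys are all hypothetical. In Spark itself, comparable counts can be obtained by grouping a DataFrame on `spark_partition_id()`.

```python
import zlib
from collections import Counter
from typing import Iterable


def partition_volumes(keys: Iterable[str], num_partitions: int) -> Counter:
    """Count how many records land in each partition under hash partitioning."""
    counts = Counter({p: 0 for p in range(num_partitions)})
    for key in keys:
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        counts[zlib.crc32(key.encode("utf-8")) % num_partitions] += 1
    return counts


def skew_ratio(counts: Counter) -> float:
    """Largest partition divided by the mean partition size (1.0 is perfectly even)."""
    mean = sum(counts.values()) / len(counts)
    return max(counts.values()) / mean if mean else 0.0


# Hypothetical workload: one hot key dominates the data set.
keys = ["fw-a"] * 90 + [f"fw-{i}" for i in range(10)]
counts = partition_volumes(keys, num_partitions=4)
ratio = skew_ratio(counts)
```

A ratio well above 1.0 points to a hot key, which is the kind of signal that usually motivates repartitioning or salting the join key.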
CARMIGNAC
Data Engineer
March 2022 - October 2023 (1 year and 7 months)
▸ Shared Scala/Spark library: co-development of a library via IntelliJ to generalize redundant processing, shared within the Databricks ecosystem.
▸ Event-driven ADF pipeline: automatic triggering on Excel file upload, multi-tab parsing, schema validation by Spark job (bad/valid routing), metadata enrichment, and storage in Delta tables.
▸ Cross-source join: merging Morningstar (investment funds) and Vendome (financial assets), writing to PostgreSQL for BI feeding.
▸ Implementation of Cucumber tests (BDD) for development validation.
▸ ADF pipeline construction (Linked Services, Datasets, Triggers, Alerts).
▸ Production deployment of the BI Digitalization stream.
Stack: Azure Data Factory, Blob Storage, Databricks, Cucumber, Scala, Spark, SQL, IntelliJ
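The schema validation step with bad/valid routing described above can be sketched in plain Python. The real implementation ran as a Spark job; the version below is an assumption-laden illustration in which the schema, the column names, and the function names are hypothetical.

```python
from typing import Any

# Hypothetical expected schema: column name -> required Python type.
EXPECTED_SCHEMA: dict[str, type] = {"fund_id": str, "nav": float, "as_of": str}


def schema_errors(row: dict[str, Any], schema: dict[str, type]) -> list[str]:
    """Return the list of schema violations for one row (empty if valid)."""
    errors = []
    for column, expected in schema.items():
        if column not in row:
            errors.append(f"missing column: {column}")
        elif not isinstance(row[column], expected):
            actual = type(row[column]).__name__
            errors.append(f"{column}: expected {expected.__name__}, got {actual}")
    extra = sorted(set(row) - set(schema))
    if extra:
        errors.append(f"unexpected columns: {', '.join(extra)}")
    return errors


def route_rows(
    rows: list[dict[str, Any]], schema: dict[str, type]
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Split rows into (valid, bad); bad rows keep their error messages."""
    valid, bad = [], []
    for row in rows:
        errors = schema_errors(row, schema)
        if errors:
            bad.append({"row": row, "errors": errors})
        else:
            valid.append(row)
    return valid, bad


rows = [
    {"fund_id": "F1", "nav": 101.2, "as_of": "2023-01-31"},
    {"fund_id": "F2", "nav": "n/a", "as_of": "2023-01-31"},
    {"fund_id": "F3", "as_of": "2023-01-31"},
]
valid, bad = route_rows(rows, EXPECTED_SCHEMA)
```

Storing the error messages alongside each rejected row keeps the bad route useful for later correction rather than acting as a silent discard.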
Backend refactoring of the Agathe application (AI predictive maintenance): migration to FastAPI to improve maintainability and performance. Real-time IoT sensors on industrial equipment.


Education

Engineering degree, Computer Science
EPITA: Engineering School in Computer Science
2021
Industrial Engineering
Chulalongkorn University

Certifications

Databricks Certified Associate Developer for Apache Spark 3.0
Databricks
2024
https://credentials.databricks.com/f4580356-92fd-405c-b173-e5713a078fc0#gs.4ijo9m
Databricks Spark
Databricks Certified Data Engineer Associate
Databricks
2024
https://credentials.databricks.com/8643a29f-2245-4d14-a4a7-c7dfa68b24e2#gs.hcq9rw#acc.jPpXrUw1
Data Engineer Microsoft Azure


Data Engineer

Fayssal B.

Data Engineer · Databricks · Azure Lakehouse Spark
