Habib Herbi

Freelance Big Data & Cloud Engineer

€550/day
Paris, FR
8-15 years

Average response time: 1 hour

About Habib

Big Data and Cloud Engineer with 6 years of experience, holding a Big Data Analysis certificate (Spark with Scala/Python) and an Azure Databricks certificate, with cloud experience on Azure and AWS.
6 years of experience in Spark development, data ingestion and transformation, ETL pipelines, streaming, databases, and data warehouses.
  • French: Native or bilingual
  • English: Native or bilingual

Can work on-site
Paris (up to 50km)

Experience

  • Groupement Les Mousquetaires - Stime
    Data Engineer / Azure Cloud
    AGRICULTURE
    February 2024 - Today (2 years and 4 months)
    Paris, France
    Azure / Spark Data Engineer: developing ETLs with Spark using Databricks' Delta Lake technology, and using Databricks SQL Analytics for data warehousing.
    Tools: Databricks, Azure Data Factory (ADF), Synapse, ADLS, Azure DevOps (Repos, Pipelines), CI/CD, Spark, Spark SQL, DataFrames, Scala, Python, Delta Lake, Lake House, Scrum, PowerBI

    Skills: Databricks SQL Analytics - Delta Lake - Spark 3 - Microsoft Azure

    - Setting up a Maven solution
    - Use of prototype classes (OOP) for maximum code optimization
    - Code modularization
    - Work on several cross-functional projects
    - Proposing architectures and guiding the choice of technologies adapted to needs
    - Development of a generative AI solution on Databricks using GPT-4
    - Identifying, collecting, exploring, understanding, and integrating the data needed to solve the problems at hand
    - Development of the Spark/Scala solution in IntelliJ with Maven, unit testing, git copilot
    - Development on Azure Databricks + Data Factory
    - Maintenance and provisioning of the Maven solution (dependencies, build, plugins, etc.)
    - CI/CD with Azure Repos, DevOps, and Pipelines
    - SonarQube integration with Azure DevOps
    - Migrating to Unity Catalog
    - Azure cost optimization (FinOps)
  • AXA Direct Assurance
    Big Data / Cloud Engineer
    March 2022 - January 2023 (10 months)
    Migrated a legacy SQL Server ETL to the Azure cloud, using Databricks' Delta Lake technology to benefit from the Lake House architecture. The team was also responsible for production deployment, continuous integration, and DevOps process management. The mission consisted of:
    • Propose architectures and guide the choice of technologies adapted to the needs of different Data projects: Data Model, ETL pipeline
    • Collaborate with business experts, POs, and clients to understand business and operational problems
    • Identify, collect, explore, understand, and integrate the data needed to solve these problems
    • Development, monitoring, and scheduling of Azure Data Factory pipelines
    • Development of unit tests with ScalaTest
    • Development on Azure Databricks + Data Factory
    • CI/CD with Azure Repos, DevOps, and Pipelines
    • Monitoring of Prod and PreProd, with investigation in case of bugs. Quality testing by comparing results with the SQL Server source.
    • Debugging on Databricks
    • Participate in Retrospectives to improve team performance
    • Participate, with the team, in the development of the platform on Azure and in defining good development practices
    • Caching and persisting, Z-ordering, data skipping, build and deploy pipelines, Azure cost optimization (FinOps), Delta Live Tables, Unity Catalog
    Tools: Spark, Spark SQL, DataFrames, Scala, Python, ADLS, Azure Data Factory, Azure Databricks, Azure DevOps, Azure Pipelines, Azure Repos, Delta Lake, Delta Live Tables, Unity Catalog, Lake House, SQL Server, IntelliJ IDE, Maven, sbt, CI/CD, FinOps, Scrum, PowerBI
  • Societe Generale
    Big Data Engineer
    September 2019 - February 2022 (2 years and 6 months)
    Paris Area, France
    PySpark Data Engineer handling the full big data stack: data ingestion, data transformation, data warehousing, and data analytics, as well as data streaming, data visualization, Spark and high-performance job optimization, and management of demanding data workloads.
    Tools: Spark (Scala, PySpark), Python, Hadoop, HDFS, Hive, SparkML, SQL, NoSQL, Kafka, Oozie, REST API, Structured Streaming, Delta Lake, Databricks, Azure Databricks, Cloud, AWS, S3, Machine Learning, cluster configuration, Kibana, Grafana, Tableau, Git, Ansible, Swagger, Scrum Master, project management

Education

  • Master's degree
    Sorbonne Université
    2018
    ISSI Master, offered by UPMC University, Paris. This Master's program provides knowledge and experience in image/audio processing for smart systems, including deep learning and machine learning. The labs supporting the program are:
    - L'Institut des Systèmes Intelligents et de Robotique (ISIR, UPMC, INSERM, CNRS)
    - L'Institut de La vision (IDV, CNRS, INSERM, UPMC)
    - UMR sciences et technologie de la musique et du son (STMS, IRCAM, UPMC)
  • Master's degree
    Paris-Sud University (Paris XI)
    2017
    Master 1 E3A at Paris-Sud, Paris. This Master's program prepares students for more advanced studies in electronics, computer science, image processing, and machine learning.

Certifications

  • Azure Databricks
    Databricks
    2021
  • Databricks Delta Lake
    Databricks
    2021
