Firas Ben Younes

Cloud Data Engineer GCP BigQuery Airflow Spark

€700/day
Courbevoie, FR
3-7 years

Average response time: 1 hour

About Firas

I am a **Google Cloud Professional Certified Data Engineer**, passionate about transforming data into strategic assets. With over 5 years of experience, I excel in designing and optimizing robust and scalable data architectures.

What I offer:

  • Certified Expertise: As a Google Cloud Certified Professional Data Engineer, I master GCP tools and services, including BigQuery, Dataflow, Dataproc, and Cloud Composer (Airflow).
  • Leadership in Design and Optimization: As a Data Engineer at EDF, I led the strategic migration of data infrastructures to GCP, optimizing performance and reducing costs. I designed medallion data pipeline architectures, enabling efficient data management across different layers (Bronze, Silver, Gold). For example, with very large data sources, I improved pipeline speed by over 90% and reduced costs by the same ratio using Dataproc in batch mode.
  • Strategic Documentation: I write detailed technical architecture documents and develop data processing strategies, ensuring optimal long-term management. My approach guarantees clarity and traceability of technical decisions.
  • Tailored Support: Whether for migration to GCP, setting up new environments, or developing custom solutions, I support you throughout the process. At SEALK, I created a framework for data pipeline management, facilitating the integration of new technologies and optimizing data ingestion and transformation.
I am determined to transform your data challenges into strategic opportunities through a professional and certified approach. Together, let's make your data a true asset for your business!
  • French

    Native or bilingual

  • English

    Native or bilingual

  • Arabic

    Native or bilingual

Remote only
Primarily works remotely

Experience

  • EDF
    Data Engineer
    ENERGY AND UTILITIES
    September 2023 - June 2025 (1 year and 9 months)
    92800 Puteaux, France
    Project Context
    As a Data Engineer at EDF, I led the migration of data infrastructures to Google Cloud Platform (GCP). This project aimed to reduce costs, optimize performance, and decrease the execution time of complex data pipelines, while managing large data volumes.

    Responsibilities and Achievements
    Architecture Design
    I designed architectures tailored to business needs, considering data sources and types. Service choices were made to ensure a robust and scalable solution.

    Architecture Examples
    Medallion Data Pipeline:
    Bronze Layer: Raw storage.
    Silver Layer: Transformation into structured format.
    Gold Layer: Data ready for analysis.
    Batch Processing: Use of Dataproc to execute PySpark jobs on large data volumes.
    Migration to GCP: Evaluation of sources, setup of the GCP environment, and migration orchestrated by Cloud Composer.
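The Bronze, Silver, and Gold layers described above can be illustrated with a minimal sketch in plain Python. This is not the EDF implementation, which ran PySpark jobs on Dataproc; the record fields (`site`, `kwh`) and the `meter_feed` source name are hypothetical and chosen only for illustration.

```python
import json
from collections import defaultdict


def to_bronze(raw_lines):
    """Bronze: store every raw record untouched, tagged with its origin."""
    # "meter_feed" is a hypothetical source name.
    return [{"raw": line, "source": "meter_feed"} for line in raw_lines]


def to_silver(bronze_rows):
    """Silver: parse and type the records, skipping any that are malformed."""
    silver = []
    for row in bronze_rows:
        try:
            rec = json.loads(row["raw"])
            silver.append({"site": str(rec["site"]), "kwh": float(rec["kwh"])})
        except (ValueError, KeyError, TypeError):
            continue  # malformed records remain available in Bronze only
    return silver


def to_gold(silver_rows):
    """Gold: aggregate to an analysis-ready total per site."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["site"]] += row["kwh"]
    return dict(totals)


raw = [
    '{"site": "A", "kwh": 1.5}',
    "not json",
    '{"site": "A", "kwh": 2.5}',
    '{"site": "B", "kwh": 4}',
]
bronze = to_bronze(raw)
silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Keeping the malformed line in Bronze while excluding it from Silver reflects the usual reason for layering: raw data stays replayable if the parsing rules change later.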
    Documentation and Monitoring
    Architectural Decisions
    Architecture Decision Records (ADR) were created to document critical choices, such as adopting GCP for its scalability and using Dataproc for complex data processing.

    Infrastructure and Tools
    I used Terraform to manage infrastructure, including the configuration of GCS buckets and BigQuery. Setup was coordinated with several teams to deploy necessary environments.

    Impact and Collaboration
    The medallion architecture improved data access for analytical teams. GCP training sessions were organized to enhance the team's autonomy.

    Team and Methodology
    The project involved collaboration with Data Architects, Data Engineers, and DevOps, using the SAFe methodology to foster agility. Technologies used include Terraform, Cloud Composer, PySpark, and BigQuery.
    Skills: BigQuery, Airflow, Terraform, Google Cloud Platform, PySpark
  • Sealk
    Data Engineer
    PRIVATE EQUITY
    June 2022 - July 2023 (1 year and 1 month)
    Paris, France
    Project Context:
    As a Data Engineer, I designed and created a framework for data pipeline management, aiming to optimize ingestion and transformation while ensuring smooth integration with various environments.

    Responsibilities and Achievements:
    Framework Creation:
    Developed a framework based on hexagonal architecture, allowing the isolation of application logic from external tools, facilitating testing and technological evolution.
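As a rough illustration of how a hexagonal (ports and adapters) design isolates application logic from external tools, the sketch below defines two ports and in-memory adapters. All names here (`SourcePort`, `SinkPort`, `run_pipeline`, the `name` field) are hypothetical and do not come from the Sealk framework.

```python
from typing import Iterable, Protocol


class SourcePort(Protocol):
    """Port: anything the core logic can read rows from."""

    def read(self) -> Iterable[dict]: ...


class SinkPort(Protocol):
    """Port: anything the core logic can write rows to."""

    def write(self, rows: list) -> None: ...


class InMemorySource:
    """Adapter used in tests; a GCS or MongoDB adapter would expose the same method."""

    def __init__(self, rows):
        self._rows = list(rows)

    def read(self):
        return iter(self._rows)


class InMemorySink:
    """Adapter used in tests; a BigQuery adapter would expose the same method."""

    def __init__(self):
        self.rows = []

    def write(self, rows):
        self.rows.extend(rows)


def run_pipeline(source: SourcePort, sink: SinkPort) -> int:
    """Core logic: depends only on the ports, never on a concrete tool."""
    cleaned = [
        {**row, "name": row["name"].strip().title()}
        for row in source.read()
        if row.get("name")  # skip rows with a missing or empty name
    ]
    sink.write(cleaned)
    return len(cleaned)


sink = InMemorySink()
count = run_pipeline(
    InMemorySource([{"name": "  acme corp "}, {"name": ""}, {"id": 1}]),
    sink,
)
print(count, sink.rows)
```

Because `run_pipeline` only sees the two ports, it can be tested without any cloud access, and swapping a storage backend means writing a new adapter rather than touching the core.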
    Data Management:
    Managed various file types (text, XML, CSV, JSON) from sources like LinkedIn and Creditsafe, and used databases such as MongoDB and Oracle DB.
    Pipeline Setup:
    Established synchronization chains between data sources and Google Cloud Storage (GCS), and prepared pipelines for data transformation and model evaluation, sequencing pipeline steps using graph-theory logic.
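One common way to apply graph-theory logic to pipeline sequencing is to model steps as a directed acyclic graph and run them in topological order. The sketch below uses Python's standard `graphlib` module; the step names are hypothetical, and the source does not state which algorithm the framework actually used.

```python
from graphlib import CycleError, TopologicalSorter

# Each step maps to the set of steps it depends on (hypothetical names).
dependencies = {
    "sync_to_gcs": set(),
    "transform": {"sync_to_gcs"},
    "evaluate_model": {"transform"},
    "publish_results": {"transform", "evaluate_model"},
}


def execution_order(deps):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


def has_cycle(deps):
    """Report whether the dependency graph is invalid (contains a cycle)."""
    try:
        execution_order(deps)
    except CycleError:
        return True
    return False


order = execution_order(dependencies)
print(order)
```

Validating the graph before execution means a circular dependency is rejected up front instead of stalling a run halfway through.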
    Using Apache Beam on Dataflow:
    Implemented Apache Beam for real-time and batch data processing, creating robust and scalable pipelines.
    Resource Optimization:
    Clustering & partitioning on BigQuery tables.
    Staging BigQuery tables.
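To make the partitioning and clustering idea concrete, here is a small sketch that builds a BigQuery `CREATE TABLE` statement with `PARTITION BY` and `CLUSTER BY` clauses. The helper name, table name, and columns are hypothetical; the function only produces the SQL text and does not call BigQuery.

```python
def partitioned_table_ddl(table, columns, partition_column, cluster_columns):
    """Build a CREATE TABLE statement partitioned by day and clustered.

    `columns` is a list of (name, type) pairs; `partition_column` is assumed
    to be a TIMESTAMP column so that DATE(...) is a valid partition expression.
    """
    if not 1 <= len(cluster_columns) <= 4:
        # BigQuery accepts at most four clustering columns.
        raise ValueError("expected between 1 and 4 clustering columns")
    cols = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE TABLE `{table}` (\n  {cols}\n)\n"
        f"PARTITION BY DATE({partition_column})\n"
        f"CLUSTER BY {', '.join(cluster_columns)}"
    )


ddl = partitioned_table_ddl(
    "analytics.company_events",  # hypothetical dataset.table
    [("event_ts", "TIMESTAMP"), ("company_id", "STRING"), ("payload", "STRING")],
    partition_column="event_ts",
    cluster_columns=["company_id"],
)
print(ddl)
```

Partitioning lets queries that filter on the date scan only the matching partitions, and clustering co-locates rows with the same `company_id`, which is how these two settings typically reduce scanned bytes and cost.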
    Training and Support:
    Trained Data Engineers on the framework and provided customer support to ensure successful adoption of solutions.
    Impact:
    This architecture improved continuous integration and data management, with positive feedback from clients. A junior Data Engineer was able to generate complex pipelines quickly, demonstrating the framework's effectiveness.
    Team Collaboration:
    Worked in Scrum with a team of Data Architects and Data Engineers, supported by Google.
    Technologies Used:
    Orchestration: Cloud Composer
    Processing: Apache Beam, Dataflow
    Storage: BigQuery, Google Cloud Storage
    Languages: Python
    Databases: Oracle, PostgreSQL, MongoDB
    Skills: Apache Beam, Google Cloud Platform, PySpark, Airflow, Cloud Architecture
  • Agence des Monts
    Data Engineer
    January 2021 - May 2022 (1 year and 4 months)
    Tunisia
    Project: AI-Powered SEO Optimized Article Generator (GPT Model)
    As a Data Engineer, I developed an advanced system for generating SEO-optimized content, aiming to produce relevant articles and increase website traffic.

    Responsibilities:
    Web Crawling
    Data Processing: Developed processing chains in Google Cloud Storage (GCS).
    Data Pipelines: Designed pipelines for data transformation and evaluation.
    APIs: Developed APIs for fine-tuning deep learning models on Google Compute Engine.
    Data Warehouse: Designed a Data Warehouse on BigQuery for data analysis.
    Dashboards: Visualization of SEO performance.
    Impact:
    Traffic Increase: Optimized content leading to a significant increase in traffic.
    SEO Improvement: Better visibility of generated content.
    Client Satisfaction: Positive feedback on article quality.
    Project: Plagiarism Detection System
    Promoted to Tech Lead, I supervised the development of a plagiarism detection system, reducing costs by 90% compared to SaaS solutions.

    Responsibilities:
    Needs Analysis: Feasibility study to define technical specifications.
    Algorithm Development: Text search using Natural Language Processing (NLP) techniques.
    Transfer to GCS and BigQuery with Python and PySpark.
    Web Crawling via residential proxies
    API Development with Socket.IO for real-time communication.
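The profile does not say which NLP techniques the plagiarism detector relied on. As one simple illustration of text-overlap search, the sketch below compares two passages using word shingles and Jaccard similarity; the function names and the shingle size `k=3` are assumptions made for this example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in a text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b, k=3):
    """Jaccard similarity of two texts' shingle sets, between 0.0 and 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0  # both texts are too short to compare
    return len(sa & sb) / len(sa | sb)


original = "The quick brown fox jumps over the lazy dog"
suspect = "A quick brown fox jumps over a lazy cat"
score = jaccard(original, suspect)
print(round(score, 3))
```

A production system would normally add normalization, indexing of crawled pages, and a tuned threshold; this only shows the core overlap measure.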
    Impact:
    Cost Reduction: Lowered plagiarism-detection costs while maintaining quality.
    Quality Improvement: Effective identification of plagiarism cases.
    Client Satisfaction: Clients satisfied with the flexibility and efficiency of the solutions.
    Technologies Used:
    GCP, Google Cloud Scheduler, Apache Spark, PySpark, BigQuery, Python, Scala, Flask, GitLab, SonarQube, Nginx, Socket.IO.

Education

  • Google Cloud Professional Data Engineer certification
    GCP
    2024
  • Engineer's degree, Data science
    Ecole Supérieure Privée d'Ingénierie et de Technologies - ESPRIT
    2020
