About Firas
- Certified Expertise: As a Google Cloud Certified Professional Data Engineer, I am proficient with GCP tools and services, including BigQuery, Dataflow, Dataproc, and Cloud Composer (Airflow).
- Leadership in Design and Optimization: As a Data Engineer at EDF, I led the strategic migration of data infrastructures to GCP, optimizing performance and reducing costs. I designed medallion data pipeline architectures, enabling efficient data management across the Bronze, Silver, and Gold layers. For example, on very large data sources, I improved pipeline speed by over 90% and reduced costs by the same ratio using Dataproc in batch mode.
- Strategic Documentation: I write detailed technical architecture documents and develop data processing strategies, ensuring optimal long-term management. My approach guarantees clarity and traceability of technical decisions.
- Tailored Support: Whether for migration to GCP, setting up new environments, or developing custom solutions, I support you throughout the process. At SEALK, I created a framework for data pipeline management, facilitating the integration of new technologies and optimizing data ingestion and transformation.
French: Native or bilingual
English: Native or bilingual
Arabic: Native or bilingual
Experience
- EDF, Data Engineer
  Energy and Utilities
  September 2023 - June 2025 (1 year and 9 months)
  92800 Puteaux, France

  Project Context
  As a Data Engineer at EDF, I led the migration of data infrastructures to Google Cloud Platform (GCP). The project aimed to reduce costs, optimize performance, and decrease the execution time of complex data pipelines while managing large data volumes.

  Responsibilities and Achievements

  Architecture Design
  I designed architectures tailored to business needs, taking data sources and types into account. Service choices were made to ensure a robust and scalable solution.

  Architecture Examples
  - Medallion Data Pipeline:
    - Bronze Layer: Raw storage.
    - Silver Layer: Transformation into structured format.
    - Gold Layer: Data ready for analysis.
  - Batch Processing: Use of Dataproc to execute PySpark jobs on large data volumes.
  - Migration to GCP: Evaluation of sources, setup of the GCP environment, and migration orchestrated by Cloud Composer.

  Documentation and Monitoring

  Architectural Decisions
  Architecture Decision Records (ADRs) were created to document critical choices, such as adopting GCP for its scalability and using Dataproc for complex data processing.

  Infrastructure and Tools
  I used Terraform to manage infrastructure, including the configuration of GCS buckets and BigQuery. Setup was coordinated with several teams to deploy the necessary environments.

  Impact and Collaboration
  The medallion architecture improved data access for analytical teams. GCP training sessions were organized to enhance the team's autonomy.

  Team and Methodology
  The project involved collaboration with Data Architects, Data Engineers, and DevOps, using the SAFe methodology to foster agility. Technologies used include Terraform, Cloud Composer, PySpark, and BigQuery.
- Sealk, Data Engineer
  Private Equity
  June 2022 - July 2023 (1 year and 1 month)
  Paris, France

  Project Context:
  As a Data Engineer, I designed and created a framework for data pipeline management, aiming to optimize ingestion and transformation while ensuring smooth integration with various environments.

  Responsibilities and Achievements:
  - Framework Creation: Developed a framework based on hexagonal architecture, isolating application logic from external tools to facilitate testing and technological evolution.
  - Data Management: Managed various file types (text, XML, CSV, JSON) from sources like LinkedIn and Creditsafe, and used databases such as MongoDB and Oracle DB.
  - Pipeline Setup: Established synchronization chains between data sources and Google Cloud Storage (GCS), and prepared pipelines for data transformation and model evaluation, sequencing pipeline steps using graph theory logic.
  - Using Apache Beam on Dataflow: Implemented Apache Beam for real-time and batch data processing, creating robust and scalable pipelines.
  - Resource Optimization:
    - Clustering and partitioning on BigQuery tables.
    - Staging BigQuery tables.
  - Training and Support: Trained Data Engineers on the framework and provided customer support to ensure successful adoption of solutions.

  Impact:
  This architecture improved continuous integration and data management, with positive feedback from clients. A junior Data Engineer was able to generate complex pipelines quickly, demonstrating the framework's effectiveness.

  Team Collaboration:
  Worked in Scrum with a team of Data Architects and Data Engineers, supported by Google.

  Technologies Used:
  - Orchestration: Cloud Composer
  - Processing: Apache Beam, Dataflow
  - Storage: BigQuery, Google Cloud Storage
  - Languages: Python
  - Databases: Oracle, PostgreSQL, MongoDB
- Agence des Monts, Data Engineer
  January 2021 - May 2022 (1 year and 4 months)
  Tunisia

  Project: AI-Powered SEO-Optimized Article Generator (GPT Model)
  As a Data Engineer, I developed an advanced system for generating SEO-optimized content, aiming to produce relevant articles and increase website traffic.

  Responsibilities:
  - Web Crawling
  - Data Processing: Developed processing chains in Google Cloud Storage (GCS).
  - Data Pipelines: Designed pipelines for data transformation and evaluation.
  - APIs: Developed APIs for fine-tuning deep learning models on Google Compute Engine.
  - Data Warehouse: Designed a Data Warehouse on BigQuery for data analysis.
  - Dashboards: Visualization of SEO performance.

  Impact:
  - Traffic Increase: Optimized content leading to a significant increase in traffic.
  - SEO Improvement: Better visibility of generated content.
  - Client Satisfaction: Positive feedback on article quality.

  Project: Plagiarism Detection System
  Promoted to Tech Lead, I supervised the development of a plagiarism detection system, reducing costs by 90% compared to SaaS solutions.

  Responsibilities:
  - Needs Analysis: Feasibility study to define technical specifications.
  - Algorithm Development: Text search using Natural Language Processing (NLP) techniques.
  - Data transfer to GCS and BigQuery with Python and PySpark.
  - Web crawling via residential proxies.
  - API development with Socket.IO for real-time communication.

  Impact:
  - Cost Reduction: Lowered plagiarism-related costs while maintaining quality.
  - Quality Improvement: Effective identification of plagiarism cases.
  - Client Satisfaction: Clients satisfied with the flexibility and efficiency of the solutions.

  Technologies Used:
  GCP, Google Cloud Scheduler, Apache Spark, PySpark, BigQuery, Python, Scala, Flask, GitLab, SonarQube, Nginx, Socket.IO.
Education
- Google Cloud Professional Data Engineer certification, GCP, 2024
- Engineer's degree, Data science, Ecole Supérieure Privée d'Ingénierie et de Technologies - ESPRIT, 2020
Certifications
- Google Cloud Professional Data Engineer certification, Google, 2024