Abdelmajid Boutjim

Data Architect | Data & ML Engineer

€600/day
Paris, FR
8-15 years

Average response time: 1 hour

About Abdelmajid

Expert in data engineering, passionate about mathematics, programming, and innovative technologies. With proven experience in designing and deploying large-scale data pipelines and distributed systems, I excel at creating robust, performant, and scalable architectures to harness the full potential of data and address strategic challenges. Driven by a constant pursuit of performance, code quality, and business impact, I combine technical expertise with rigor to design reliable and optimized solutions.

Areas of Expertise: Advanced data engineering, cloud computing, large-scale graph analysis, and implementation of optimization and operations research algorithms.

Technical Skills: Python, Scala, SQL, Apache Spark, Databricks, AWS

Languages

  • French: Native or bilingual
  • English: Fluent

Can work on-site
Paris (up to 50km)

Experience

  • ENGIE
    Data & ML Engineer
    ENERGY AND UTILITIES
    January 2024 - Present (2 years and 5 months)
    Paris, France
    Design and development of a robust data processing framework for ENGIE clients on the Databricks platform.

    • Development of reusable data processing libraries in Python and PySpark, enabling scalable data ingestion and transformation.
    • Refactoring and optimization of PySpark jobs on Databricks, significantly reducing execution times for distributed workloads.
    • Implementation of CI/CD pipelines to automate the deployment of Databricks jobs via GitLab, ensuring fast, reliable, and traceable updates.
    • Design and orchestration of data pipelines for large-scale processing and analysis of gas and electricity consumption data.
    • Design and development of a forecasting engine to anticipate customer consumption patterns from historical data.
    • Contribution to the design of ENGIE's data lake architecture, ensuring the maintainability and reusability of data pipelines.
    Technical Environment: Databricks, Python, PySpark, Airflow, GitLab.

    Skills: Python, Spark, PySpark, Code Optimization
  • SACEM
    DATA ARCHITECT
    FILM AND AV
    December 2021 - November 2023 (1 year and 11 months)
    Paris, France
    Design and deployment of a cloud data platform on AWS for processing data streams from major music platforms (Spotify, YouTube, Deezer, iTunes), optimizing business analysis efficiency and decision-making.
    • Design and implementation of the complete data processing infrastructure architecture on AWS, using S3, Glue, EMR, Lambda, and Elasticsearch.
    • Development of reusable Python libraries to interact with AWS services, promoting standardization of ingestion and transformation processes.
    • Automation and scheduling of data ingestion flows for collecting and processing information from multiple streaming platforms, ensuring reliable and continuously updated datasets.
    • Migration of IBM DataStage workflows (financial data processing) to AWS Glue.
    • Implementation of analytical pipelines on AWS EMR for large-scale analysis of user behavior, listening patterns, and usage statistics.
    • Indexing and exposing data in Elasticsearch so that front-end teams can power visualization applications and dynamic dashboards.
    Technical Environment: Python, PySpark, AWS (S3, Lambda, SNS, SQS, Glue, EMR, Step Functions, API Gateway, Elasticsearch).

    Skills: Spark, Python, Amazon Web Services, Big Data, Code Optimization
  • Caisse des Dépôts et Consignations
    Software & DATA Engineer
    PUBLIC SECTOR
    November 2018 - November 2021 (3 years)
    Arcueil, Paris, France
    Design and deployment of the centralized data platform for the Caisse des Dépôts Group (CDC), based on the Cloudera distribution to meet the data storage and processing needs of all subsidiaries. Implementation of a scalable Data Lake, supporting both batch and real-time processing, with the goal of industrializing ingestion flows, ensuring GDPR compliance, and providing reliable data for business teams.
    • Design of the data ingestion and processing architecture on the Cloudera environment.
    • Automation of HDFS directory and Hive table structure configuration via Shell scripts, reducing environment deployment time.
    • Provision of work tools for Data Engineers, including JupyterLab notebooks and ready-to-use Hive/HDFS/HBase environments.
    • Implementation of a Kafka-based streaming pipeline for real-time data ingestion.
    • Development of an application for managing and processing application logs using the ELK stack (Elasticsearch, Logstash, Kibana), facilitating continuous monitoring and analysis.
    • Development of a generic RDBMS ingestion solution with Python and Apache Sqoop for relational data integration.
    • Building ETL pipelines for large-scale data processing with PySpark, ensuring robustness and scalability.
    • Data modeling and schema denormalization to support high-performance OLAP workloads on Hive, improving query speed on large data volumes.
    • Implementation and deployment of GDPR-compliant solutions, including encryption, anonymization, and deletion of sensitive data.
    Technical Environment: Python, Cloudera (HDFS, Yarn, Hue, Hive, HBase, Phoenix, Kafka), ELK, Jenkins, GitLab.

    Skills: Sqoop, PySpark, Hive, Big Data, Hadoop

Education

  • Master in Computer Science and Operations Research
    Ecole Polytechnique de Paris (l'X)
    2018
  • State Engineering Diploma in Computer Science
    ENSIAS
    2016

Skill set (23)

Categories