Aimen Sijoumi

Big Data Engineer, UI-UX Designer, Full Stack Dev

€722/day
Paris, FR
3-7 years

Average response time: 1 hour

About Aimen

I am a Data Engineer who has carried out several Big Data missions.
I have worked for La Banque Postale, Économie d'Énergie, and MyMoneyBank.
With a multidisciplinary profile, I adapt easily and produce quality deliverables.
  • French: native or bilingual
  • English: fluent

Can work on-site
Paris (up to 50km)

Experience

  • MyMoneyBank
    Data Engineer
    BANKING AND INSURANCE
    January 2020 - Today (6 years and 4 months)
    Courbevoie, France
    MyMoneyBank faced the shutdown of its credit management software (FIBOS) and therefore had to develop all the replacement components in-house (in collaboration with Sopra for the Cassiopae software). This project was named GROM, for Grand Raid d'Outre Mer.

    I spent 2 years on the GROM project. I was asked to contribute at every level to the development of Spark jobs for accounting processing (about fifty of them). These jobs read Parquet files from a DataLake on EMR (Elastic MapReduce). To schedule all the processing, I took part in developing workflows (DAGs) on Airflow.
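
    A workflow DAG orders tasks so that each one runs only after the tasks it depends on. The following is a minimal sketch of that idea using Python's standard `graphlib` module rather than Airflow itself; the task names are hypothetical and do not come from the GROM project.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each key maps a task to the set of tasks it depends on.
dependencies = {
    "spark_accounting_job": {"check_parquet_available"},
    "java_export_batch": {"spark_accounting_job"},
    "notify_accounting": {"java_export_batch"},
}


def execution_order(deps):
    """Return one valid run order in which every task follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

    A scheduler such as Airflow adds retries, calendars, and monitoring on top of this ordering, which the sketch leaves out.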

    In addition, outside the GROM project, some specific processing needs (which did not go through the DataLake) led to the implementation of Java (Spring) batches.

    In more detail, I mainly worked on:
    - The creation of more than sixty Scala Spark jobs that retrieve data from AWS S3 via AWS EMR, then filter, format, and aggregate it, and finally save it to a database (AWS RDS);
    - The creation (with the business team) of a non-payment calculation algorithm, which was subsequently made available to the entire Finance team;
    - The implementation of a database management process that versions SQL scripts with Flyway;
    - The creation of about ten Airflow DAGs (Python) to schedule Spark jobs and Java (Spring) batches, and contributions to more than thirty Airflow DAGs maintained by the Accounting team;
    - The execution of about ten Java Spring batches that retrieve data from a database and generate files that can be integrated into the accounting interpreter.
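
    The filter, format, aggregate, and save sequence of the first item above can be illustrated without Spark. This is a minimal sketch in plain Python, not the actual jobs: the record fields (`account`, `amount`, `status`) and the function name are hypothetical, and in-memory lists stand in for the Parquet source and the database table.

```python
from collections import defaultdict
from decimal import Decimal


def aggregate_by_account(records):
    """Filter, format, and aggregate raw records (hypothetical schema)."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for rec in records:
        # Filter: keep only validated entries.
        if rec["status"] != "VALIDATED":
            continue
        # Format: normalise the account key and parse the amount exactly.
        account = rec["account"].strip().upper()
        amount = Decimal(rec["amount"])
        # Aggregate: sum the amounts per account.
        totals[account] += amount
    # "Save": return sorted rows, ready to be written to a table.
    return sorted(totals.items())


rows = aggregate_by_account([
    {"account": " acc1 ", "amount": "100.25", "status": "VALIDATED"},
    {"account": "ACC1", "amount": "50.25", "status": "VALIDATED"},
    {"account": "acc2", "amount": "20", "status": "VALIDATED"},
    {"account": "acc2", "amount": "999", "status": "REJECTED"},
])
print(rows)
```

    Amounts are parsed as `Decimal` rather than `float` because accounting totals usually need exact decimal arithmetic; that choice is an assumption of the sketch.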

    Malt limits the number of characters...
    Scala, Python, GitLab, Apache Hadoop, Apache Kafka, Apache Spark, Apache Airflow, Docker, Kibana, Amazon EMR, Amazon RDS, Spring Boot, SQL, HashiCorp Vault
  • La Banque Postale
    Data Engineer
    November 2018 - July 2019 (8 months)
    Ivry-sur-Seine, France
    La Banque Postale wanted to launch the "Vision 360" project to gain a complete overview of all its clients. It therefore recruited data engineers to work on feeding a DataLake.

    In more detail, I mainly worked on:
    - Implementation of NiFi workflows: Apache NiFi is a dataflow tool that automates tasks and sequences them as needed. In my case, the need was to retrieve text files, validate them, transform them, and then ingest them into a DataLake (here HDFS);
    - Implementation of an internal ingestion engine: because Apache NiFi reached its limits on large volumes, I initiated the development of an internal ingestion engine (in Spark) able to read different file sources, validate them, transform them, and load them into HDFS;
    - Implementation of HQL scripts and Spark jobs to transform data stored on HDFS and ingested into Hive;
    - Resolution of production anomalies and data cleaning.
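
    The validate, transform, and load steps named in the first two items can be sketched in a few lines. This is a plain Python illustration under stated assumptions, not the NiFi flow or the Spark engine: the semicolon-delimited layout, the column names, and the `ingest` function are hypothetical, and a list stands in for HDFS.

```python
import csv
import io

EXPECTED_COLUMNS = ["client_id", "name", "city"]  # hypothetical schema


def ingest(text, sink):
    """Validate a delimited text file, transform its rows, and load them into sink.

    Returns the number of rejected rows.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    # Validate: reject the whole file if the header is not the expected one.
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rejected = 0
    for row in reader:
        # Validate: rows with too many or too few fields are rejected.
        if None in row or None in row.values():
            rejected += 1
            continue
        # Validate: a row without a client identifier cannot be ingested.
        if not row["client_id"].strip():
            rejected += 1
            continue
        # Transform: trim whitespace and normalise the city name.
        clean = {key: value.strip() for key, value in row.items()}
        clean["city"] = clean["city"].upper()
        # Load: append to the sink (a list here, standing in for HDFS).
        sink.append(clean)
    return rejected


sample = "client_id;name;city\n1; Alice ;paris\n;Bob;lyon\n2;Chloe;ivry-sur-seine\n3;Dan\n"
loaded = []
rejected_count = ingest(sample, loaded)
print(rejected_count, loaded)
```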
    Apache NiFi, Python, Apache Spark, GitLab, Apache Hadoop, SQL, Scala
  • Économie d'Énergie SAS
    Data Engineer
    September 2018 - October 2018 (1 month)
    Économie d'Énergie is a company that allows French people to carry out insulation work for a symbolic €1 (with government aid).
    With many clients, and therefore many documents, the company wanted to categorize all of its documents by building predictive models, in order to target new clients. Documents of all types (forms, invoices, technical notices, etc.) were transmitted as scans or images.

    Mission:
    With 6 machines available, the objective was to classify 700,000 documents, each between 500 KB and 5 MB in size.
    The mission was divided into 2 parts: retrieving text from files (data engineering) and classifying files based on this text (data science).

    I worked on the first part: extracting text from documents.
    The first step was to write a Python program that takes a file as input and extracts its text using OCR (Optical Character Recognition). Processing a single file took between 30 seconds and 5 minutes, so the work had to be parallelized.
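
    A back-of-the-envelope estimate shows why parallelization was needed. The sketch below uses only the figures given above (700,000 documents, 30 seconds to 5 minutes per file, 6 machines) and assumes, for illustration only, one worker per machine, a perfectly even split, and no overhead.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def total_days(documents, seconds_per_document, workers=1):
    """Total processing time in days, assuming an ideal even split across workers."""
    return documents * seconds_per_document / workers / SECONDS_PER_DAY


DOCUMENTS = 700_000
for label, seconds in (("fastest case", 30), ("slowest case", 300)):
    sequential = total_days(DOCUMENTS, seconds)
    parallel = total_days(DOCUMENTS, seconds, workers=6)
    print(f"{label}: {sequential:.0f} days sequential, {parallel:.0f} days on 6 workers")
```

    Under these assumptions, a single process would need roughly 243 to 2,431 days, against roughly 41 to 405 days when the load is spread over six workers.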

    To parallelize the processing across the 6 machines, I set up a Kafka broker to publish messages, each carrying the location of a file to process. Docker containers started on the 6 machines listened to the Kafka topic and processed the files. The extracted text files were made available on an NFS share so that the Data Scientist could retrieve them and continue with the second part.
    Apache Kafka, Docker, Python, Ansible, GitLab
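
    The setup described here is a work-queue pattern: a producer publishes file locations and several consumers compete for them. The following is a minimal sketch using only Python's standard library, where a `queue.Queue` stands in for the Kafka topic and threads stand in for the Docker containers. The `extract_text` placeholder, the file paths, and the function names are hypothetical; with Kafka itself, the same sharing is typically obtained by placing the consumers in one consumer group.

```python
import queue
import threading


def extract_text(path):
    """Placeholder for the OCR step (hypothetical); returns fake text."""
    return f"text extracted from {path}"


def worker(topic, results, lock):
    """Consume file locations until a None sentinel arrives."""
    while True:
        path = topic.get()
        if path is None:
            break
        text = extract_text(path)
        with lock:  # protect the shared results dictionary
            results[path] = text


def run(paths, n_workers=6):
    """Publish every path to the queue and process them with n_workers consumers."""
    topic = queue.Queue()  # stands in for the Kafka topic
    results = {}           # stands in for the text files written to NFS
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(topic, results, lock))
        for _ in range(n_workers)
    ]
    for thread in threads:
        thread.start()
    for path in paths:
        topic.put(path)
    for _ in threads:
        topic.put(None)  # one stop signal per worker
    for thread in threads:
        thread.join()
    return results


results = run([f"/scans/doc_{i}.png" for i in range(20)])
print(len(results))
```

    Each file location is taken by exactly one worker, so slow files (up to 5 minutes) do not block the others.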

Education

  • Computer Engineering
    ENSIIE - National School of Computer Science for Industry and Business
    2018
    Engineering degree program, Software Engineering specialization
  • Master 2 (M2) - DataScale
    Université Paris-Saclay
    2018
    Data Management in a Digital World (DataScale)
