About Aimen
French
Native or bilingual
English
Fluent
Experience
- MyMoneyBank
  Data Engineer
  Banking and Insurance
  January 2020 - Today (6 years and 4 months)
  Courbevoie, France

  MyMoneyBank had to face the shutdown of its credit management software (FIBOS) and consequently had to develop all the replacement components internally (in collaboration with Sopra for the Cassiopae software). This project was named GROM, for Grand Raid d'Outre Mer.

  I worked on the GROM project for 2 years. I was asked to get involved at all levels to enable the development of Spark Jobs for accounting processing (there are about fifty). These were sourced from a DataLake on EMR (Elastic MapReduce) where Parquet files were available. To schedule all the processing, I participated in the development of workflows (DAGs) on Airflow.

  Outside the GROM project, some specific processing needs (which did not go through the DataLake) led to the implementation of batches written in Java (Spring).

  In more detail, I mainly worked on:
  - The creation of more than sixty Scala Spark Jobs that retrieve data from AWS S3 via AWS EMR, then filter, format, and aggregate it, and finally save it in a database (AWS RDS);
  - The creation (with the business team) of a non-payment calculation algorithm, subsequently made available to the entire Finance team;
  - The implementation of a database management process that keeps a history of SQL scripts via Flyway;
  - The creation of about ten Airflow DAGs (Python) for scheduling Spark Jobs and Java (Spring) Batches, contributing to the more than thirty Airflow DAGs maintained by the Accounting team;
  - The execution of about ten Java Spring Batches that retrieve data from a database to generate files that can be integrated into the accounting interpreter;

  Malt limits the number of characters...
- La Banque Postale
  Data Engineer
  November 2018 - July 2019 (8 months)
  Ivry-sur-Seine, France

  La Banque Postale wanted to launch the "Vision 360" project to get a complete overview of all its clients. Its objective was therefore to recruit data engineers to work on feeding a DataLake.

  In more detail, I mainly worked on:
  - Implementation of NiFi workflows: Apache NiFi is a task orchestrator that automates tasks in a sequence tailored to each need. In my case, the need was to retrieve text files, validate them, transform them, and then ingest them into a DataLake (here HDFS);
  - Implementation of an internal ingestion engine: since Apache NiFi has its limits on large volumes, I initiated the development of an internal ingestion engine (in Spark) that reads different file sources, validates them, transforms them, and loads them into HDFS;
  - Implementation of HQL scripts and Spark jobs for transforming data stored on HDFS and ingested into Hive;
  - Resolution of production anomalies and data cleaning.
- Économie d'Énergie SAS
  Data Engineer
  September 2018 - October 2018 (1 month)

  Économie d'Énergie is a company that allows French people to carry out insulation work for a symbolic €1 (with government aid).

  Having several clients, and therefore several documents, the company's goal was to categorize all of these documents by creating predictive models to target new clients. Documents of all types (forms, invoices, technical notices, etc.) were transmitted as scans or images.

  Mission:
  With 6 machines available, the objective was to classify 700,000 documents weighing between 500 KB and 5 MB each.

  The mission was divided into 2 parts: retrieving text from the files (data engineering) and classifying the files based on this text (data science). I worked on the first part: extracting text from documents.

  The first step was to create a Python program that took a file as input and extracted its text: this is called OCR (Optical Character Recognition). The processing time for a file varied from 30 seconds to 5 minutes, so the work had to be parallelized.

  To parallelize the processing across the 6 machines, I set up a Kafka broker to publish messages, each carrying the location of a file whose text had to be extracted. Docker containers started on the 6 machines listened to the Kafka topic and processed the files. The resulting text files were made available on an NFS share so that the Data Scientist could retrieve them and continue with the second part.
Education
- Computer Engineering
  ENSIIE - National School of Computer Science for Industry and Business
  2018
  Engineering degree program, Software Engineering specialization
- Master 2 (M2) - DataScale
  Université Paris-Saclay
  2018
  Data Management in a Digital World (DataScale)