Freelancer profile translated to English.

Description

🔧 Enthusiastic Data Engineer | Mastering Batch & Streaming Data Processing Solutions 🔧

As a passionate data engineer, I strive to transform raw data into powerful insights that guide strategic decisions. With expertise in batch and streaming data processing, I build scalable and efficient data pipelines to meet real-time and historical data needs.

From designing robust ETL workflows to managing large-scale data architectures, I take a hands-on approach to modern data engineering challenges. I work with technologies such as Apache Spark, Kafka, and cloud platforms to deliver high-performance data solutions. My goal is to enable organizations to leverage their data for maximum impact, whether through real-time analytics or large-scale batch processing.

Let's connect and explore how to transform your data into a strategic asset!

#DataEngineering #BigData #Streaming #BatchProcessing #CloudComputing #ETL #DataPipelines #RealTimeAnalytics

Industry field of expertise

Languages

French: Native or bilingual
English: Native or bilingual

Workplace preferences

Can work on-site

Paris (up to 50km)

Contentsquare
Senior Data Engineer
E-COMMERCE
October 2021 - Present (4 years and 7 months)
Paris, France
Designed a quota system: presented the options, trade-offs, technologies, and costs, and estimated the time required to reach production.
Developed a service to receive, validate, and process quota requests using Scala and Akka HTTP.
Developed a streaming service to read and aggregate credit deduction messages from Kafka, and update credits in the PostgreSQL database.
Created a monitoring system for quota services using Prometheus, Grafana, and Alert Manager.
Benchmarked services by injecting artificial traffic to estimate the necessary resources for proper functioning (CPU, memory, number of instances…).
Deployed services to the cloud (Kubernetes, AWS, and Azure) using Jenkins, Terraform, and Ansible.
Designed a scraping system composed of 3 services: extractor, scraper, and provider, capable of managing over 10k resources per second.
Developed an extractor service to extract URLs from payloads (protobuf) using Scala, Akka Streams, and Kafka.
Developed a scraping service to download resources and store them in a cloud storage system (AWS, Azure).
Managed resource retention using lifecycle rules on S3 and Azure Blob Storage.
Optimized scraping by implementing revisit, caching, and rate limiting strategies.
Implemented a provider service to retrieve resources from cloud storage.
Estimated the cost of the scraping service (storage, cloud operations, Kubernetes…).
Created a metadata validation system for session replays by aggregating a massive amount of data (100k msg/s) and implementing business rules using Flink.
Technologies: Scala, Go, Kafka, Akka, AWS, Azure, Kubernetes, ClickHouse, Aerospike, Prometheus, Grafana, Jenkins, Terraform
Société Générale
Data Engineer
September 2019 - October 2021 (2 years and 1 month)
Developed batch jobs in Spark/Scala to create regulatory and financial reports to meet the needs of the recovery and resilience plan. Inputs included files in HDFS, Hive tables, REST APIs, and Teradata databases.
Automated and orchestrated the processing workflow using Oozie.
Executed data analysis queries on Hive.
Created a CI/CD pipeline using Jenkins, Ansible, and Nexus.
Migrated to a new Big Data platform (Cloudera).
Created regulatory reports and dashboards on Power BI.
Managed the connection with Hive using Presto.
Implemented a REST API to insert and read regulatory report configurations using Scala, Akka HTTP, and Postgres.
Deployed the service using OpenShift.

Technologies: Scala, Spark, HDFS, Sqoop, Hive, Oozie, Hue, Jenkins, Ansible, Power BI, Presto, Akka HTTP, Postgres, OpenShift
Kayrros
Data Engineer
ENERGY AND UTILITIES
January 2019 - September 2019 (8 months)
Paris, France
Developed, automated, and optimized the performance of data pipelines for analyzing satellite images to monitor oil and gas production using PySpark.
Scraped information on oil and gas production and stored it in Elasticsearch.
Productionized image processing and machine learning models.
Deployed services on a Kubernetes cluster with Rancher.
Worked with SQL and NoSQL databases.
Created monitoring dashboards with Kibana.

Technologies: Python, Spark, Pandas, HDFS, Airflow, Elasticsearch, Kibana, Docker, Rancher

Education

Computer Science Engineering Degree (Double Degree)
Télécom SudParis
2019
Telecommunications Engineering Degree
Ecole supérieure des communications de Tunis
2017

Data Engineer

Houssem T.

Senior Data Engineer