Houssem Bziouch

Data Engineer

€700/day
Paris, FR
8-15 years

Average response time: 1 hour

About Houssem

Senior Data Engineer | High-performance data platform architect

I design robust, scalable, production-grade data platforms that turn massive volumes of raw data into analytics products with high business value.

Expertise in:
• Medallion architectures (Bronze → Silver → Gold)
• Spark / PySpark at scale
• Cloud data platforms (AWS, Databricks, Kafka)
• Data quality frameworks, validation, monitoring and observability

My approach: a high engineering standard, with clean code, automation, performance tuning and production readiness. I commit to every project as an owner, with close attention to detail and quality.

🎯 My mission: build reliable, scalable and future-proof data systems that business teams can use.

  • English: Native or bilingual
  • French: Native or bilingual
  • Arabic: Native or bilingual
  • German: Conversational

Can work on-site
Paris (up to 50km)

Experience

  • Quantum Signals,
    Senior Data Engineer
    March 2025 - Present (1 year and 3 months)
    California, USA
    • Architected a production-grade Bronze → Silver → Gold platform for high-frequency market data (Databento futures & equities), producing research- and trading-ready datasets from raw ticks.
    • Designed a manifest-driven incremental engine (per symbol/day) guaranteeing idempotency, restart safety and deterministic outputs across replays, backfills and partial-day scenarios.
    • Led the migration from Databricks to self-hosted Spark (Hetzner), improving cost control and throughput through shuffle tuning, S3A committer optimization and Parquet layout strategies.
    • Implemented a strict data correctness framework (DuckDB + automated validation): historical parity checks, numeric drift detection and Silver/Gold coverage reconciliation.
    • Solved critical market-data integrity issues: sentinel normalization (9223372036854775807), price scaling (1e5) and timestamp semantics (nanoseconds → UTC and NY trading sessions).
    • Built CI quality gates (GitHub Actions) enforcing schema stability, metric correctness and end-to-end pipeline reliability.
    • Owned architecture, release lifecycle and reliability standards in close collaboration with research and trading teams.
    • Tech: PySpark, DuckDB, Databricks, AWS S3, Parquet, Linux, Bash, GitHub Actions, JSON-driven specs.
    Skills: Python, SQL, A/B testing, Cloud, AWS, ETL processes
  • BNP Paribas,
    Data Engineer
    November 2022 - March 2025 (2 years and 4 months)
    Paris, France
    • AML & Supply Chain (QUANTEXA): led Spark pipelines for AML compliance and delivered a daily reporting system surfacing country-level AML KPIs.
    • KYC Integration (BNP DataHub): implemented end-to-end ETL workflows to ingest, monitor and supervise transaction feeds; secured outputs stored in IBM S3.
    • GCARS Decommissioning: migrated legacy Python/Pandas processes to Spark + IBM S3, improving scalability and operational reliability.
    • Phonetic Search (BNP Switzerland): built NLP pipelines using stemming, lemmatization and phonetic hashing to support entity matching analytics.
    • ETL Engineering: designed robust transformations from CSV and private cloud sources into refined datasets and KPIs, orchestrated with Airflow and productionized with CI/CD.
    • Tech: Apache Spark, Apache Airflow, Docker, SQL/NoSQL, Git, Autosys, Jenkins.
  • Bpifrance,
    Data Engineer
    April 2022 - November 2022 (7 months)
    Paris, France
    • Financial Monitoring (CDC): built a detection platform consolidating multi-institution datasets to identify irregular transaction patterns across EU/US accounts.
    • Engineered and optimized Spark-based AWS Glue ETL jobs ingesting heterogeneous sources into raw S3 data lakes.
    • Ran daily data quality investigations in Athena; partnered with BAs/PMs via Jira to deliver prioritized features.
    • Delivered internal data products via APIs (Flask, FastAPI, API Gateway) with automated deployments using CodeDeploy.
    • Tech: AWS Glue, Spark, S3, Athena, MongoDB, Flask/FastAPI, API Gateway, CodeDeploy, Jira.

Education

  • Engineering Degree in Computer Science
    École Polytechnique de Sousse
    2016
