About Hassina
French: Native or bilingual
English: Fluent
Experience
- BNP Paribas, Data Engineer - Data Quality
  Banking and Insurance, September 2025 - December 2025 (3 months), Montreuil, France
  **Scoping & business requirements:** Workshops with Product Owners and Data Scientists to define virtual assistant service requirements: Q&A business rules, data quality criteria, and IT constraints (banking security, DMZR, IBM COS, Elasticsearch).
  **Architecture & Design (DDD):** Design of a Domain / Application / Infrastructure architecture. Modeling of key entities (Document, Chunk, Embedding, IndexRecord) and implementation of a modular, scalable, and maintainable pipeline.
  **Ingestion & Data Quality (ETL):** Development of a complete ingestion pipeline from IBM COS: automatic format detection (CSV/JSON), robust parsing, normalization, quality controls, and data lifecycle (raw → parsed → enriched → indexed → dead_letter).
  **Data Quality & Reliability:** Definition and implementation of Data Quality rules (completeness, consistency, uniqueness, conformity). Anomaly detection (missing data, duplicates, errors), error management, and processing traceability.
  **Data Security & Access:** Development of a secure Python connector for IBM COS (DMZR) with dynamic credential retrieval via Vault and secure logging.
  **Structuring & Embeddings:** Implementation of a chunking strategy adapted to the banking context (semantic consistency, controlled sizes). Embedding generation with batch management, retries, and structured logs.
  **Elasticsearch Industrialization:** Creation and management of indexes, optimized mappings (custom analyzers, nested fields, multi-fields). Bulk indexing with partial error management and atomic alias switching without downtime.
  **Documentation & Agility:** Writing technical documentation on Confluence. Working in Agile Scrum methodology, managing technical user stories, and tracking via Jira.
- Letxbe, Data Engineer – Data Quality & Governance - Cloud AWS
  Software Publishing, December 2023 - August 2025 (1 year and 8 months), Paris, France
  **Data Scoping & Requirements:** Gathering requirements from business and technical stakeholders with a strong focus on data quality, reliability, and governance: business rules, security requirements, IT constraints, costs, and cloud service choices.
  **Data Quality by Design:** Definition and implementation of data quality rules (completeness, consistency, uniqueness, schema conformity). Integration of automated quality controls in ingestion and indexing pipelines to detect anomalies (missing data, inconsistencies, partial errors).
  **Data Platform & Infrastructure:** Deployment and industrialization of OpenSearch on AWS via Terraform: secured clusters (IAM, TLS/KMS), CloudWatch logging, multi-AZ private subnets, and VPC Endpoints ensuring data integrity and confidentiality.
  **Reliable & Scalable Pipelines:** Design of Python indexing and search pipelines with systematic data validation: dynamic mappings, custom analyzers, nested fields, and consistency checks before exposure. Query optimization and low-latency API exposure.
  **Data Migration & Reliability:** Migration from ArangoDB to OpenSearch: extraction, cleaning, transformation, and post-migration quality checks to ensure data completeness and conformity.
  **Monitoring & Governance:** Proactive monitoring of data quality and freshness (alerts on errors, volumes, shards, snapshots). Securing flows via AWS Transfer Family (SFTP), SQS → Lambda → API automation, and FinOps tracking for sustainable data governance.
- Stellantis, Data Engineer – Data Quality & Pipeline Industrialization (GCP | Autonomous Vehicles)
  Automobile, September 2021 - December 2023 (2 years and 2 months), Paris, France
  **Data Scoping & Requirements:** Collaboration with Data, ML, and Vehicle Engineering teams to define data quality requirements for road test data: sensor stream reliability, temporal consistency, analytical and ML usability, volume and performance constraints.
  **Ingestion & Data Pipelines (GCP):** Implementation of automated pipelines for collecting, synchronizing, and transferring sensor data (video, audio, LIDAR, CAN logs) to Google Cloud Storage, orchestrated by Apache Airflow and triggered upon raw file reception.
  **Data Processing & Data Quality:** Development of distributed processing with Dataflow to ensure data quality: cleaning (audio filtering, redundant frame removal), multi-sensor timestamp normalization, completeness and consistency checks, enrichment with metadata (vehicle ID, GPS, weather conditions).
  **Reliability & Quality Controls:** Implementation of Data Quality rules on incoming and transformed data: automatic detection of corrupted, incomplete, or inconsistent data, isolation of non-compliant streams, and securing datasets used for analysis and ML.
  **Storage & Structuring:** Structuring data in BigQuery (partitioned tables, controlled schemas), with monitoring of freshness, volumes, and traceability of flows from source to final datasets.
  **Orchestration & Monitoring:** Complete pipeline orchestration with Airflow, integrating quality controls at each key stage, job monitoring, failure management, and automatic recovery to ensure processing continuity.
  **ML Datasets & Deployment:** Preparation of reliable datasets for model training on Vertex AI, then deployment of validated models on embedded platforms (NVIDIA Jetson), using Docker, RTMaps, and ROS2 to ensure reproducibility and robustness.
Education
- Master 2, Distributed Systems and Data Science Technologies, Créteil, 2020
Certifications
- ROS, Orsys, 2023
- Hands-on Machine Learning with NVIDIA and AWS, Coursera, 2023