About Toni
Arabic: Native or bilingual
English: Fluent
French: Fluent
Experience
- CORUM
  Data Engineer
  BANKING AND INSURANCE
  January 2026 - Today (5 months)
  Paris, France
  - Design and development of an end-to-end Azure serverless platform for ingesting, processing, and exposing market data from the Bloomberg Data License API, covering investment, valuation, and portfolio tracking needs.
  - Development of Azure Functions in Python to automate Bloomberg flows (DataRequest, HistoryRequest), with OAuth2 / JWT HS256 authentication, asynchronous polling, retry policy, and exponential back-off for long-running requests.
  - Optimization of large financial data retrieval (CSV, CSV.gz, ZIP) with streaming read, Python parsing, schema normalization, and quality controls: detection of missing values, forward-fill on business days, and complete traceability of REAL / FORWARD_FILLED / FALLBACK statuses.
  - Automation of quantitative processing on financial historical data: calculation of returns, NAV, valuation, and temporal aggregations in Python, producing datasets directly usable by Finance, Risk, and Investment teams.
  - Incremental ingestion of financial data: after each Bloomberg execution, data is uploaded incrementally to Azure SQL via Azure Data Factory (ADF) pipelines, with delta management and flow orchestration between environments.
  - Daily feeding of an SFTP: a dedicated ADF pipeline consumes data stored in Azure SQL and generates a file daily, automatically deposited on the target SFTP, ensuring reliable and scheduled delivery to consuming systems.
  - Storage and exposure of data in Azure Cosmos DB and Azure SQL, with collection modeling, SQL queries for interrogation and aggregation, and development of stored procedures to encapsulate critical business processes.
  - Containerization of Azure Functions with Docker and multi-environment deployment (dev / preprod / prod) via YAML Azure DevOps CI/CD pipelines.
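To illustrate the forward-fill quality control mentioned in the CORUM entry, here is a minimal sketch in plain Python. It is not the project's implementation. The function names, the Monday-to-Friday calendar with no holidays, and the reading of FALLBACK as "no earlier value exists, so a caller-supplied default is used" are all assumptions made for the example.

```python
from datetime import date, timedelta
from typing import Dict, Iterator, List, Optional, Tuple


def business_days(start: date, end: date) -> Iterator[date]:
    """Yield Monday-to-Friday dates from start to end inclusive.

    Assumption: no holiday calendar is applied.
    """
    day = start
    while day <= end:
        if day.weekday() < 5:  # 0 = Monday ... 4 = Friday
            yield day
        day += timedelta(days=1)


def fill_with_status(
    observed: Dict[date, float],
    start: date,
    end: date,
    fallback: Optional[float] = None,
) -> List[Tuple[date, Optional[float], str]]:
    """Return one (date, value, status) row per business day.

    REAL            -> a value was observed for that day
    FORWARD_FILLED  -> the last observed value was carried forward
    FALLBACK        -> no earlier value exists (hypothetical meaning),
                       so the caller-supplied default is used
    """
    rows = []
    last_value = None
    for day in business_days(start, end):
        if observed.get(day) is not None:
            last_value = observed[day]
            rows.append((day, last_value, "REAL"))
        elif last_value is not None:
            rows.append((day, last_value, "FORWARD_FILLED"))
        else:
            rows.append((day, fallback, "FALLBACK"))
    return rows


# Example: prices observed on Tuesday and Thursday only.
prices = {date(2026, 1, 6): 101.5, date(2026, 1, 8): 102.0}
filled = fill_with_status(prices, date(2026, 1, 5), date(2026, 1, 12), fallback=100.0)
for row in filled:
    print(row)
```

Keeping the status next to each value lets downstream consumers tell observed prices from carried-forward ones without re-deriving the gaps.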
- cnas
  Data Engineer
  TRAVEL AND TOURISM
  June 2025 - December 2025 (5 months)
  Guyancourt, France
  - Analysis, redesign, and security of Azure integration flows for the Voyagiste project following the migration of SharePoint sources to SFTP (FileZilla), within an Azure Cloud environment.
  - Design and development of Azure Data Factory (ADF) pipelines, including Data Flows for ingestion, transformation, and automatic orchestration of multi-format CSV and TXT files.
  - Centralization of data in Azure Data Lake Storage Gen2 (ADLS) through the implementation of a standardized landing zone, ensuring schema consistency.
  - Implementation of data quality rules (cleaning, typing, normalization, consistency checks) directly within ADF Mapping Data Flows to ensure the reliability of the Azure Data Lake.
  - Advanced management of ingestion errors (inconsistent schemas, corrupted files, missing data) via logging, alerting, and exception handling mechanisms in Azure Data Factory.
  - Support and maintenance of historical Talend flows, correction of incident tickets, and impact analysis in coordination with the RUN team.
  - Support for the technical transition from Talend to Azure Data Factory, ensuring service continuity and gradual scaling of Azure processes.
  - Contribution to the High-Level Design (HLD/HLDF) of the Azure integration architecture, in collaboration with the Data Architect, integrating principles of Cloud scalability, maintainability, and evolvability.
- Personal project
  Data Engineer - LLM
  TELECOMMUNICATIONS
  May 2025 - September 2025 (4 months)
  Paris, France
  - Collection, ingestion, and preparation of textual data from Goodreads and Project Gutenberg (titles, authors, genres, summaries, ratings) via structured Python pipelines, with HTML cleaning, field normalization, UTF-8 encoding, and advanced corpus structuring to ensure the quality of data used by LLMs.
  - Generation of semantic embeddings via OpenAI text-embedding-ada-002 for vector representation of book meaning, tone, and style, combined with large-scale indexing using FAISS for high-performance semantic search across thousands of documents.
  - Design and implementation of a RAG (Retrieval-Augmented Generation) architecture with LangChain RetrievalQA, enabling LLMs to answer natural language queries contextually, accurately, and reliably, by leveraging structured knowledge bases.
  - Implementation of a semantic and business reranking system, combining embeddings, metadata (SQL: ratings, popularity, genres), and user context to improve the relevance, diversity, and personalization of generated responses.
  - Optimization of the LLM pipeline: adaptive chunking, dynamic context adjustment, fine-tuning of similarity thresholds, and prompt versioning to balance response quality and scalability.
  - Development of an interactive GenAI application with Streamlit, offering personalized recommendations, intelligent conversational exploration of the catalog, and a natural language querying interface.
  - Implementation of rigorous LLMOps practices: prompt versioning, query logging, continuous evaluation of response quality through relevance metrics, performance monitoring, and iterative improvement of production models.
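The semantic and business reranking described in the personal project can be pictured with a minimal sketch, assuming a simple weighted blend of cosine similarity and a normalized rating. The function names, the `alpha` weight, the 5-point rating scale, and the toy two-dimensional vectors are all hypothetical. The profile does not state the actual scoring formula, and real embeddings would come from the embedding model rather than being written by hand.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(
    query_vec: Sequence[float],
    candidates: List[Dict],
    alpha: float = 0.7,
    max_rating: float = 5.0,
) -> List[str]:
    """Order candidate titles by a blend of semantic and business signals.

    score = alpha * cosine_similarity + (1 - alpha) * (rating / max_rating)
    The blend itself is an assumption made for illustration.
    """
    def score(candidate: Dict) -> float:
        similarity = cosine(query_vec, candidate["embedding"])
        rating_part = candidate["rating"] / max_rating
        return alpha * similarity + (1 - alpha) * rating_part

    ranked = sorted(candidates, key=score, reverse=True)
    return [candidate["title"] for candidate in ranked]


# Toy catalog with hand-written 2-D "embeddings".
books = [
    {"title": "A", "embedding": [1.0, 0.0], "rating": 2.0},
    {"title": "B", "embedding": [0.8, 0.6], "rating": 5.0},
    {"title": "C", "embedding": [0.0, 1.0], "rating": 5.0},
]
print(rerank([1.0, 0.0], books))             # blended ranking
print(rerank([1.0, 0.0], books, alpha=1.0))  # similarity only
```

With `alpha=1.0` the ordering reduces to pure semantic similarity. Lowering it lets well-rated titles move ahead of slightly closer but poorly rated ones.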
Education
- Analysis, Data Management, and Innovation
  Université Gustave Eiffel
  2022
  - Data ingestion and transformation (ETL / ELT)
  - Design of batch data pipelines
  - Distributed processing with Spark / Databricks
  - Analytical modeling (facts, dimensions)
  - SQL querying and transformations
  - Data Engineer
  - Hadoop
  - Power BI
  - Scrum
  - Azure Data Engineering
  - Databricks
  - Palantir
  - Python
  - SQL
Certifications
- Data Warehousing with Microsoft Azure Synapse Analytics
  Coursera
  2023
- Data Engineering with MS Azure Synapse Apache Spark Pools
  Coursera
  2023