About Michaël
Summary
- E-commerce fraud detection (XGBoost, feature engineering, MLOps, AWS, Kubernetes)
- Deployment of state-of-the-art NLP models for detecting secrets in source code files (Transformers, PyTorch, FastAPI, ONNX Runtime, Kubernetes on AWS EKS)
- Development of a complete Terraform module for an unstructured data processing pipeline that transforms PDF, PPTX, and DOCX files into vectors in Pinecone (LLMs, multi-modal models / VLMs, OCR, Terraform, AWS, Weights & Biases Weave)
- Training and fine-tuning of classic ML models and LLMs / VLMs
- Agents
- LLM FinOps: cost/performance evaluation (OpenAI, Bedrock, Benchmarks…)
- Deployment of AI services: microservices, APIs, batch jobs
- Implementation of MLOps best practices: reproducibility, versioning (DVC), CI/CD, Docker, experiment tracking, scalability
- Software Engineering: PEP8, clean, documented, and modular code, error handling, monitoring
- Multi-cloud: AWS, Kubernetes
- IaC: Terraform
- Orchestration with Dagster, Airflow
- Development of ETL / ELT pipelines (PySpark, Snowpark)
- Snowflake, Hadoop, PostgreSQL
Languages
- French: Native or bilingual
- English: Native or bilingual
- Spanish: Conversational
Experience
- Sanofi
  MLOps Engineer - GenAI Platform team - Sanofi Accelerator
  HEALTH AND WELLNESS
  December 2024 - Today (1 year and 6 months)
  Paris, France
  - Implementation of an Unstructured Data Pipeline (UDP) with Terraform, AWS Textract, and multimodal LLMs (Claude 3.7, GPT-4o…) to process PDF, DOCX, PPTX files...
  - Structuring the UDP into modular steps: document parsing (OCR + VLM), metadata extraction (LLMs like Amazon Nova Lite), chunking, then vectorization to Pinecone for RAG use cases.
  - Packaging the UDP as a Terraform module, enabling specific team deployments without data governance friction.
  - Design of hybrid OCR + VLM solutions for complex document parsing, optimized for accuracy, cost, and throughput.
  - Development of an internal benchmark with DVC & Weave to compare open-source libraries and VLMs on structured data extraction.
  - Containerization of Lambda functions to support certain libraries (python-docx, Weave) while respecting AWS constraints.
  - Deployment of Weights & Biases Weave to several teams to standardize LLM monitoring and evaluation (LLM-as-a-Judge, leaderboards, tutorials, support).
- GitGuardian
  Machine Learning Engineer
  TECH
  October 2023 - Today (2 years and 8 months)
  Paris, France
  - Creation of the MLOps stack with GitLab CI, SkyPilot, DVC, ONNX Runtime, BentoML, Helm, ArgoCD, and Dagster.
  - Fine-tuning and integration of NLP models into the GitGuardian secret detection engine, reducing false positives by a factor of 5.
  - Development of a PoC for automatic secret leak remediation using AST parsers and the OpenAI API, planned for production in Q1 2025.
- Ubisoft International
  Machine Learning Engineer
  VIDEO GAMES AND ANIMATION
  February 2021 - October 2023 (2 years and 8 months)
  Paris, France
  - Optimization of models (XGBoost, FP-Growth, feature engineering, semi-supervised learning) for fraud detection in Ubisoft's e-commerce, leading to savings of 5% of Ubisoft's sales, i.e. €4M annually.
  - Sharing MLOps best practices with Data teams at Ubisoft (DVC, ClearML, K8s training jobs, AWS).
  - Collaboration with Data Engineers on integrating new data sources for our models: Spark, Hadoop, Airflow.
Education
- Engineer - Data Science
  ISAE-Supaero
  2020
  Data Science specialization; 2 gap-year internships; semester in Singapore