Bakary S.

AI/ML Engineer & MLOps Specialist

€600/day
Choisy-le-Roi, FR
3-7 years

Average response time: 1 hour

About Bakary

Senior LLMOps & MLOps Engineer (5+ years) – Specialized in RAG and Generative AI systems in production (AWS, GCP)

I design and deploy AI systems in production (RAG, LLMs, ML) for critical use cases, with a focus on performance, scalability, and cost optimization.

Core Expertise:
• Production-ready RAG architecture (hybrid search, reranking, vector DB: Qdrant, Pinecone)
• LLM Industrialization (vLLM, GPU deployment, scalable APIs)
• End-to-end MLOps (CI/CD, Prefect/Airflow orchestration, monitoring & drift)

Impact:
• Improved accuracy of semantic search systems
• Reduced LLM inference latency and costs
• Deployed robust, automated ML pipelines

Stack:
LLMOps: LangChain, LangGraph, vLLM, Hugging Face
Vector DB: Qdrant, Pinecone, FAISS
Cloud: AWS (SageMaker, S3, Lambda), GCP (Vertex AI)
Data/ML: PyTorch, XGBoost, PySpark

Available for projects:
• RAG / Generative AI
• MLOps / ML Industrialization
• LLM production system optimization

Languages

  • French: native or bilingual
  • English: fluent

Can work on-site
Choisy-le-Roi (up to 50km)

Experience

  • Base Claude Bernard
    Lead Data & AI
    October 2025 - Today (8 months)
    Île-de-France, France
    • Design and deployment of a medical RAG system in production (880k+ documents), improving response accuracy and ensuring traceability (sourced citations)

    • Implementation of a multi-step retrieval pipeline:
    - Query validation via LLM (medical filtering)
    - Multi-query expansion (semantic coverage)
    - Hybrid search (Qdrant: dense BGE-M3 + sparse Splade + fusion)
    - Reranking via cross-encoder for high clinical accuracy
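
The profile does not say which fusion method combines the dense and sparse result lists. As an illustration only, the sketch below assumes reciprocal rank fusion (RRF), a common choice for hybrid search. The function name, the constant `k=60`, and the document ids are hypothetical and not taken from the project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Assumption: RRF is used here purely as an example of a fusion step;
    the actual project may rely on a different method.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it contains.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a dense retriever and a sparse retriever.
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, and the fused list would then be passed to the cross-encoder reranker.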

    • LLM Industrialization:
    - vLLM deployment (OpenAI-like API) on GPU (RunAI)
    - Asynchronous processing (Celery/Redis) → latency reduction

    • Implementation of a production LLMOps stack:
    - GitLab CI/CD, Docker containerization, monitoring
    - Reliability, reproducibility, and cost optimization

    • Development of incremental data pipelines (Prefect):
    - Multi-source ingestion (ANSM, HAS…)
    - Intelligent versioning (hash) → recalculation reduction
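
One way hash-based versioning can cut down on recalculation is sketched below, assuming each document's content is hashed with SHA-256 and only documents whose hash changed are reprocessed. The function names and the in-memory `known_hashes` store are hypothetical; a real pipeline would likely persist hashes in a database.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return a stable SHA-256 fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def select_changed(documents: dict, known_hashes: dict) -> list:
    """Return ids of new or modified documents and update the hash store.

    Assumption: `known_hashes` maps document id to the hash seen on the
    previous run; unchanged documents are skipped entirely.
    """
    changed = []
    for doc_id, text in documents.items():
        digest = content_hash(text)
        if known_hashes.get(doc_id) != digest:
            changed.append(doc_id)
            known_hashes[doc_id] = digest
    return changed


# Hypothetical usage across three successive ingestion runs.
store = {}
docs = {"notice_1": "first version", "notice_2": "stable text"}
first_run = select_changed(docs, store)    # both documents are new
second_run = select_changed(docs, store)   # nothing changed
docs["notice_1"] = "second version"
third_run = select_changed(docs, store)    # only the edited document
```

Only the ids returned by `select_changed` would need to be re-embedded and re-indexed on each run.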

    Stack: Qdrant, vLLM, LangChain, Prefect, FastAPI, Docker, GitLab CI/CD, PostgreSQL
  • Lisi
    Data & MLOps Engineer
    CONSULTING AND AUDITS
    November 2022 - Today (3 years and 7 months)
    Paris, France
    • Development of end-to-end MLOps pipelines on AWS SageMaker:
    - GitLab CI/CD, automated deployment, model registry
    - Drift monitoring with alerts → robustness improvement

    • Design of scalable data pipelines:
    - ETL (AWS Glue, PySpark), orchestration (Airflow)
    - S3 Data Lake + analytics (Athena)

    • Development of APIs and access security:
    - AWS Lambda + API Gateway
    - User management via Cognito

    • Implementation of an industrial RAG system:
    - Semantic search on technical documentation
    - Observability (LangSmith): latency, costs, hallucinations

    Stack: AWS (SageMaker, Glue, Airflow, Lambda, S3), LangChain, OpenSearch, PySpark
    Skills: Cloud AWS, MLOps / Machine Learning Engineering, AWS SageMaker, GenAI, RAG
  • TradeIn
    Data Scientist
    April 2021 - October 2022 (1 year and 6 months)
    Paris, France
    Stack: AWS (S3, Textract, QuickSight, SageMaker, Lambda, Athena), Airflow, FastAPI, XGBoost, PostgreSQL, PySpark

Education

  • Master of Science
    École Polytechnique
    2020
    Master 2, Data Science
  • Master 2 (M2), Multimedia Networking
    Télécom ParisTech
    2019
