Sidi C.

Senior Data Engineer & AI Engineer | RAG | LLM

€750/day
Paris, FR
8-15 years

Average response time: 1 hour

Freelancer profile translated to English.

About Sidi

Looking for an expert to design and deploy an end-to-end enterprise RAG architecture, build robust LLM pipelines for your GenAI projects, or integrate generative AI into your business workflows?
With nearly 10 years of experience in demanding environments (Sanofi, BNP Paribas CIB, Société Générale), I work on your Generative AI and Data Engineering projects, from scoping to production.
🚀 What I bring in practice:

  • Currently at Sanofi: Design and development of the complete RAG architecture for a pharmaceutical GenAI platform (parsing, chunking, embedding, vector store, retrieval, LLM generation). Azure OpenAI, AWS Bedrock, Pinecone, Weave/W&B observability.
  • Generative AI & RAG: Deployment of a document-assistance RAG platform at BNP Paribas that cut search time by 90%. LangChain, LangGraph, Vector DB, FastAPI.
  • Data Engineering: End-to-end AML & Fraud Detection pipelines processing 50M+ transactions/day. Spark, Kafka, Kubernetes, AWS.
  • Technical Leadership: Building and managing a team of 7+ people (Data Engineers, DevOps, BAs). Code reviews, architectural decisions, mentoring.
  • ML in production: Fraud detection model (GBM/H2O) deployed in production at Société Générale.

📦 Technical Stack: Python, LangChain, LangGraph, Azure OpenAI, AWS Bedrock, Pinecone, FAISS, S3 Vectors, Scala, Spark, Kafka, FastAPI, Kubernetes, Docker, Terraform, Databricks, Weave/W&B, Snowflake.
🎯 Sectors: Pharma, Banking, Finance, Insurance | Available | Île-de-France & Remote

Languages

  • English: native or bilingual
  • French: native or bilingual

Can work on-site
Paris (up to 50km), Lyon (up to 50km), Lille (up to 50km), Nanterre (up to 50km), Bordeaux (up to 50km)

Experience

  • Sanofi Accelerator
    Sanofi - Data & AI Engineer
    PHARMACEUTICALS INDUSTRY
    April 2026 - Today (2 months)
    Paris, France
    Context:
    GenAI platform for automated generation of regulatory documents (Clinical Trial Documents) in the pharmaceutical industry. Critical production environment with strict traceability, security, and compliance requirements.
    Achievements:

    → Design and development of the end-to-end RAG architecture: document parsing, chunking, embedding, vector store (Pinecone, S3 Vectors), retrieval, and LLM generation
    → Integration of LLMs in production: Azure OpenAI (GPT-4o), AWS Bedrock (Claude)
    → Observability architecture for LLM pipelines with Weave/W&B: step-by-step tracing for Data Science teams
    → Performance optimization: replaced FAISS with pre-computed S3 Vectors, reducing costs by ~70%
    → Refactoring of the backend architecture towards DDD-light: resolved 12 audit findings
    → Writing of technical specifications (16-section design doc) aligning Data Science, Data Engineering, and Backend
    → Multi-environment configuration (dev/test/prod) with Pinecone and EventBridge

    Technical Stack:
    Python 3.12 · FastAPI · AWS (Lambda, Step Functions, ECS, S3, Bedrock) · Azure OpenAI · LangChain · Pinecone · Weave/W&B · Terraform · Docker · GitHub Actions · Snowflake · NestJS · React · TypeScript
    Skills: LangChain · Retrieval-Augmented Generation (RAG) · Python · AWS · Artificial intelligence
  • BNP Paribas CIB
    Senior Data & AI Engineer
    BANKING AND INSURANCE
    May 2022 - February 2026 (3 years and 9 months)
    Pantin, France
    Involvement in Data Engineering and Generative AI projects for the IT Trade Finance team, focusing on AML (Anti-Money Laundering) and Fraud Detection.

    📊 Data Project — AML & Fraud Detection Pipelines
    Development of end-to-end pipelines processing millions of transactions: ETL, transformation, scoring, and alert generation.
    → Spark Optimization (advanced tuning, data skew management)
    → Quantexa Integration for relational graphs and contextual alert enrichment
    → Private cloud deployment with Kubernetes, Skaffold, Kustomize

    👥 Establishment and Structuring of a New Data Engineering Team
    Leadership in building a data team from scratch with 7+ members: defining needs, recruitment, onboarding, and skill development.
    → Creation and scaling of an offshore team in India (4 Data Engineers, 1 DevOps, 1 BA, 1 PO)
    → Implementation of development standards, architectural patterns, and best practices
    → Daily technical supervision: code reviews, architectural decisions, mentoring

    🤖 Generative AI Project — Document Assistance RAG Platform
    Design and deployment of a conversational platform for natural language querying of all project documentation (Confluence, Jira, Elasticsearch, emails).
    → 90% reduction in information retrieval time for teams
    → Multi-source vectorization pipeline, vector database, LLM orchestration via LangChain with prompt engineering and optimized retrieval strategies
    → Python/FastAPI backend API, Kubernetes deployment

    Stack: Python, LangChain, LangGraph, FastAPI, Elasticsearch, Vector DB, Scala, Spark, Kafka, Kubernetes, AWS, S3, Quantexa, ELK, RAG
    Skills: Spark · RAG · LangChain · Scala · Python
  • Bedrock streaming
    Senior Data Engineer
    PRESS AND MEDIA
    January 2022 - May 2022 (4 months)
    Lyon, France
    Freelance mission within the A/B Testing team, on the M6+, RTL+ Hungary, and Videoland streaming platforms.

    📊 Multi-platform Data Pipelines
    Design and development of real-time and batch pipelines for experimentation and analytics across multiple international streaming platforms.
    → Ingestion of high volumes of user events via AWS Glue, EMR, and Athena
    → Scalable workflows with Spark and Databricks to ensure the reliability of experimentation metrics
    → Infrastructure automation via Terraform and CI/CD pipelines (Jenkins, GitHub Actions)

    Stack: AWS (Glue, EMR, Athena), Terraform, Python, Scala, Spark, Databricks, Airflow, Docker, Jenkins, GitHub Actions, Iceberg, dbt
    Skills: Terraform · Spark · Python · AWS · Databricks

Recommendations

Olivier Kana, Santiago Mosquera, and one other person have recommended Sidi.

Education

  • Computer Science Master's degree
    Sorbonne Université (formerly Université Pierre et Marie Curie)
    2018
