
Julien Cardi

Freelance AI | LLM & RAG Expert | Python · FastAPI

€500/day
Paris, FR
3-7 years

Average response time: 1 hour


About Julien

90% of AI PoCs die before production.
Causes: overly complex architecture, **explosive inference costs (OpenAI/Gemini)**, latency, and zero monitoring.

As an AI & Backend Engineer, I don't build disposable prototypes. I rebuild slow, costly, or unstable LLM pipelines so they scale in production.

I help startups and scale-ups that are stuck on the technical execution and infrastructure of their AI projects.

► MY APPROACH:
Simplification. I remove unnecessary layers (e.g., replacing a complex RAG + Redis setup with an asynchronous Kafka pipeline + solid prompt engineering). I implement intelligent routing to cut your costs by 3x and true LLMOps for complete observability.

► WHAT I IMPLEMENT (Stack & Expertise):
  • **Backend & Infra**: Python, FastAPI, Pydantic v2, asyncio, Kafka, S3, PostgreSQL, Redis.
  • **LLM Engineering**: LangGraph, Function Calling, Schema-first JSON, Prompt Engineering, Multi-model routing (Gemini 2.5, OpenAI).
  • **LLMOps & Monitoring**: Token/cost tracking, Rate-limiting, DLQ, Prometheus, Grafana.
  • **Deployment**: Docker, Kubernetes (K8s), AWS (EKS, Bedrock), vLLM, RunPod, CI/CD.

► TANGIBLE RESULTS:
  • **Extraction Pipeline** (Scale-up SOLV): Replaced an unstable system with a minimalist Kafka architecture. 50k+ docs processed, 99.2% success, API costs ÷3.
  • **AI Constraint Clustering**: Designed a **scalable hybrid algorithm** where DBSCAN/K-Means failed semantically.
  • **Automation (Venio AI)**: Agent platform **delivered to production**, with agent tools generated from OpenAPI specs.

💡 "**Senior-level execution from day one** — cross-functional deliveries across backend, DevOps, and the entire AI stack." — Luca F. (CTO, Venio AI)

► TERMS:
• Full remote (EU Timezone) | 3-6 month missions | Immediate availability.

Shall we look under the hood of your LLM infra? Contact me.
  • French

    Native or bilingual

  • English

    Fluent

  • Spanish

    Fluent

Can work on-site
Paris (up to 50km)

Experience

  • SOLV
    Production LLM Engineer
    October 2025 - Today (8 months)
    Brussels, Belgium
    Belgian scale-up in stakeholder analytics & risk management for complex infrastructure projects.

    LLM Document Extraction Pipeline:
    Complete reconstruction of an unstable extraction pipeline (Redis + embeddings + RAG + premium models, crashing at 10+ docs) using a minimalist asynchronous Kafka system in Python/FastAPI.
    → 50,000+ documents processed, 99.2% success, cost reduced by 3x
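
The shape of such a pipeline can be sketched in a few lines. This is only an illustration under loud assumptions: an in-process `asyncio.Queue` stands in for the Kafka topic, `extract` is a hypothetical placeholder for the LLM call, and the dead-letter queue is a plain list. None of these names come from the actual SOLV system.

```python
import asyncio

async def extract(doc: str) -> dict:
    """Hypothetical stand-in for the LLM extraction call."""
    if "corrupt" in doc:
        raise ValueError("unparseable document")
    return {"doc": doc, "n_chars": len(doc)}

async def worker(inbox: asyncio.Queue, results: list, dlq: list, max_attempts: int = 3) -> None:
    """Consume documents forever; park a document in the DLQ after max_attempts failures."""
    while True:
        doc = await inbox.get()
        try:
            for attempt in range(1, max_attempts + 1):
                try:
                    results.append(await extract(doc))
                    break
                except ValueError as exc:
                    if attempt == max_attempts:
                        dlq.append({"doc": doc, "error": str(exc)})
        finally:
            inbox.task_done()  # always acknowledge so join() can return

async def run_pipeline(docs: list[str], n_workers: int = 4) -> tuple[list, list]:
    inbox: asyncio.Queue = asyncio.Queue(maxsize=100)  # bounded queue gives backpressure
    results, dlq = [], []
    workers = [asyncio.create_task(worker(inbox, results, dlq)) for _ in range(n_workers)]
    for doc in docs:
        await inbox.put(doc)
    await inbox.join()  # wait until every document is processed or dead-lettered
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results, dlq

results, dlq = asyncio.run(run_pipeline(["report A", "corrupt scan", "report B"]))
```

The point of the design is that one bad document ends up in the dead-letter list instead of crashing the batch, while the bounded queue keeps memory flat regardless of input size.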

    Constrained Clustering Algorithm:
    Design and implementation of a hybrid algorithm: feature extraction via LLM (orientation, entities, nature) injected as penalties into the distance matrix before hierarchical clustering. Solved the limitations of two previous attempts (DBSCAN, HDBSCAN+K-Means).
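
A minimal sketch of the idea, not the production algorithm: the penalty value, the mismatch rule, and the naive average-linkage loop below are all assumptions chosen for readability, and the feature dictionaries are hypothetical examples of what an LLM might extract.

```python
from itertools import combinations

def penalized_distances(base, features, penalty=0.5):
    """Add a fixed penalty to base[i][j] for every feature on which items i and j disagree."""
    dist = [row[:] for row in base]
    for i, j in combinations(range(len(base)), 2):
        mismatches = sum(1 for key in features[i] if features[i][key] != features[j].get(key))
        dist[i][j] = dist[j][i] = base[i][j] + penalty * mismatches
    return dist

def average_linkage(dist, threshold):
    """Naive agglomerative clustering: merge the closest pair until none is within threshold."""
    clusters = [[i] for i in range(len(dist))]

    def gap(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: gap(clusters[p[0]], clusters[p[1]]))
        if gap(clusters[a], clusters[b]) > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe: combinations guarantees a < b
    return sorted(sorted(c) for c in clusters)

# Item 2 is semantically close to items 0 and 1 but has the opposite orientation.
base = [
    [0.0, 0.1, 0.15, 0.9],
    [0.1, 0.0, 0.15, 0.9],
    [0.15, 0.15, 0.0, 0.9],
    [0.9, 0.9, 0.9, 0.0],
]
features = [{"orientation": "for"}, {"orientation": "for"},
            {"orientation": "against"}, {"orientation": "against"}]

plain = average_linkage(base, threshold=0.3)                                      # [[0, 1, 2], [3]]
constrained = average_linkage(penalized_distances(base, features), threshold=0.3)  # [[0, 1], [2], [3]]
```

On raw embedding distance alone, item 2 is merged with items 0 and 1; once the orientation mismatch is added as a penalty, it stays separate.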

    Multi-model Routing & LLMOps:
    Intelligent routing Gemini Flash ↔ Gemini 2.5 Pro (OpenAI fallback), selection based on complexity/cost. Production Prometheus/Grafana dashboards (p95 latency, costs, extraction density), rate-limiting, exp-backoff retries, DLQ.
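
As a rough illustration of complexity-based routing with retries and a fallback provider: the `"flash"`, `"pro"`, and `"fallback"` keys, the `ProviderError` class, and the length/table heuristic are hypothetical, and the clients are plain callables rather than real SDK objects.

```python
import random
import time

class ProviderError(Exception):
    """Raised by a model client on a transient failure (hypothetical)."""

def pick_model(doc: str, long_doc_chars: int = 4000) -> str:
    """Toy complexity heuristic: long or table-heavy documents go to the larger model."""
    is_complex = len(doc) > long_doc_chars or doc.count("|") > 20
    return "pro" if is_complex else "flash"

def call_with_backoff(call, doc, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Retry a client with exponential backoff plus jitter; re-raise after the last attempt."""
    for attempt in range(max_attempts):
        try:
            return call(doc)
        except ProviderError:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))

def route(doc, clients, fallback="fallback", **retry_kwargs):
    """Try the model picked by complexity; if its retries are exhausted, use the fallback."""
    primary = pick_model(doc)
    try:
        return primary, call_with_backoff(clients[primary], doc, **retry_kwargs)
    except ProviderError:
        return fallback, call_with_backoff(clients[fallback], doc, **retry_kwargs)
```

`sleep` is injectable so the retry schedule can be tested without waiting. In a fuller version, a document that also fails on the fallback would be pushed to the DLQ instead of raising.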
    LLM Python LLMOps RAG Langchain
  • Venio AI
    AI Engineer
    February 2025 - September 2025 (7 months)
    Reggio Emilia, Italy
    Startup specializing in AI agent automation for non-tech companies.

    Conversational Agent Platform:
    Built a platform for LLM agents in Python/FastAPI: the system understands user needs in natural language, generates a suitable agent, and exposes a ready-to-use API endpoint. Automatic generation of agent tools from OpenAPI specs.
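
Generating tools from an OpenAPI spec could look roughly like the sketch below. The output shape (`name`, `description`, JSON Schema `parameters`) is a generic assumption modeled on common function-calling formats, not the platform's actual schema, and the sketch ignores `$ref` resolution and request bodies.

```python
def tools_from_openapi(spec: dict) -> list[dict]:
    """Turn each OpenAPI operation into a function-calling style tool description."""
    http_methods = {"get", "post", "put", "patch", "delete"}
    tools = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in http_methods:
                continue  # skip non-operation keys such as path-level "parameters"
            properties, required = {}, []
            for param in op.get("parameters", []):
                properties[param["name"]] = param.get("schema", {"type": "string"})
                if param.get("required"):
                    required.append(param["name"])
            tools.append({
                # Fall back to a name derived from method and path when operationId is missing.
                "name": op.get("operationId") or f"{method}_{path.strip('/').replace('/', '_')}",
                "description": op.get("summary", ""),
                "parameters": {"type": "object", "properties": properties, "required": required},
            })
    return tools

# Hypothetical one-endpoint spec used only for illustration.
spec = {
    "paths": {
        "/invoices/{id}": {
            "get": {
                "operationId": "getInvoice",
                "summary": "Fetch one invoice",
                "parameters": [
                    {"name": "id", "in": "path", "required": True, "schema": {"type": "string"}}
                ],
            }
        }
    }
}
tools = tools_from_openapi(spec)
```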

    Benchmarking & Deployment:
    Benchmarking suite (accuracy, cost, latency) to compare LLM models and prompts before production deployment. Automated Docker/Kubernetes deployments via GitLab CI/CD.
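
A benchmark of this kind can be reduced to a small harness. In this sketch the candidates are plain callables returning `(answer, tokens_used)`, scoring is case-insensitive exact match, and cost is a flat price per 1,000 tokens; all three are simplifying assumptions rather than details of the actual suite.

```python
import time
from statistics import mean

def benchmark(candidates, dataset, price_per_1k_tokens):
    """Compare candidates (name -> callable(prompt) -> (answer, tokens_used)) on a labeled dataset."""
    report = {}
    for name, run in candidates.items():
        hits, latencies, tokens = 0, [], 0
        for prompt, expected in dataset:
            start = time.perf_counter()
            answer, used = run(prompt)
            latencies.append(time.perf_counter() - start)
            hits += answer.strip().lower() == expected.strip().lower()
            tokens += used
        report[name] = {
            "accuracy": hits / len(dataset),
            "mean_latency_s": mean(latencies),
            "cost": tokens / 1000 * price_per_1k_tokens[name],
        }
    return report
```

Running every model and prompt variant through the same loop makes the accuracy, latency, and cost trade-off visible in one table before anything is deployed.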
    FastAPI Docker Python LLM AI Agent
  • ONECLICKHIRED
    Founder
    January 2025 - September 2025 (8 months)
    AI SaaS: CV parsing + automated personalized outreach. Full stack built solo: React/TS, Fastify, PostgreSQL, Redis/BullMQ, Stripe.

    Multi-provider LLM integration (Gemini + OpenAI), reliable asynchronous jobs. 150 sign-ups.
    LLM PostgreSQL Artificial Intelligence



Education

  • Engineer, AI
    EPITA
    2025
  • MP
    CPGE N.D. de Sion
    2022

Skill set

Categories