You're seeing this page as if you were . The main menu is still yours, though. Exit from immersion
Mohammed EhiriME

Mohammed Ehiri

AI & MLOps Engineer

€500/day
Rabat, MA
3-7 years

Average response time: 1 hour

Freelancer profile translated to English.
Back to original language

About Mohammed

🚀Are your GenAI PoCs struggling to go live?

High latency, hallucinations, sensitive data?

I'm Mohammed, an AI & LLMOps Engineer. I help CTOs and CIOs transform their LLM experiments into robust, scalable, production-ready architectures.

💡My Value Proposition:
Unlike simple API integrators, I master end-to-end industrialization:
  • Sovereignty & Open-Source:Deploying local LLMs (Mistral, Llama 3, Qwen) on your own GPUs viavLLMto ensure total data privacy.
  • Measurable Reliability:Strict evaluation of response quality viaRAGASto mathematically prove the absence of hallucinations.
  • Large Account Expertise:I've designed GenAI architectures deployed for industry giants (Airbus, Renault, Stellantis, Mercedes).

🛠️Projects and deliverables I can assist you with:
  • Advanced RAG (Retrieval-Augmented Generation):Dynamic chunking, vector databases (ChromaDB, pgvector), hybrid search (BM25 + Semantic), and Reranking for absolute precision.
  • Agentic AI (LangGraph):Creation of autonomous AI workflows capable of reasoning and using tools (APIs, DBs).
  • Multimodal AI (OCR + VLM):Extracting data from complex PDFs and charts via Tesseract coupled with visual models (Qwen3-VL, GPT-4o).
  • LLMOps:Inference optimization (< 5s latency), containerization (Docker), and Cloud (Azure, GCP) or On-Premise deployment.
  • Real-time Voice AI:Low-latency transcription/translation via Faster Whisper and WebSockets.

🌍 Based in Morocco, I work on your timezone (CET) with bilingual communication (Native FR/Fluent EN) perfectly adapted to B2B requirements.

📩Ready to industrialize your AI?
Let's discuss your architecture during an initial 15-min chat!
  • French

    Native or bilingual

  • English

    Fluent

Remote only
Primarily works remotely

Experience

  • ALTEN Maroc
    AI & MLOps Engineer
    February 2025 - Today (1 year and 4 months)
    Rabat, Morocco
    • Designed the flagshipGenAIsolution for the ALTEN group to query large-scale knowledge bases (data), reducing manual retrieval time to under 5 seconds through a RAG approach with dynamic chunking (page level for PDF/PPTX/Word; line level for XLSX), FAISS vector store, and ensemble search (BM25 keywords + semantic similarity).
    • Developed a hybrid OCR + VLM pipeline using Tesseract OCR for text extraction—used as prompt context with images for the multimodal LLM Qwen3 analyzing images/charts/scanned documents; leveraged the Qwen3 family for embedding generation and reranking maximizing multi-format accuracy.
    • Orchestrated vLLM with Docker for on-premise GPU serving of open-source LLMs, plus Azure OpenAI/GCP API integration for hybrid cloud/on-prem deployments optimizing cost, privacy, and scalability.
    • Deployed at 10+ major clients **(Stellantis, Airbus, Renault, Mercedes)**, enabling production-scale GenAI for industrial and internal use.
    • Evolved a RAG system into the ALTEN Group Bot, an autonomous group-wide agentic platform; built LangGraphagenticlayer for multi-step reasoning and tool usage with CIO and Mistral AI. Leveraged Mistral Large, Azure AI Search, and PostgreSQL + pgvector for hybrid search; deployed on Azure Container Apps to automate global workflows.

    Tech Stack: Python, LLM, LangChain, LangGraph, Tesseract OCR, HuggingFace, Postgres/PGvector, FAISS, Azure AI Search, vLLM, Docker, Git.
    Artificial Intelligence LLMOps Retrieval-Augmented Generation (RAG) Generative AI Python
  • Freelance
    AI & Software Engineer
    September 2024 - January 2025 (4 months)
    • Designed a document processing pipeline architecture based onRAGusing Spring AI and the OpenAI API (GPT-4o and text-embedding-3-small) with PostgreSQL/pgvector as the vector database: PDF documents were chunked using a Part-of-Speech (POS) based strategy and then indexed for semantic search, reducing manual analysis time by 80%.
    • Evaluated the performance of theRAGsystem using RAGAS metrics (faithfulness, context precision, answer relevance), ensuring high retrieval quality and response consistency.

    Tech Stack: Java, Spring Boot, Spring AI, PostgreSQL (pgvector), React.js, OpenAI API, RAGAS.
    Artificial Intelligence Retrieval-Augmented Generation (RAG) Data Science AI Agent LLMOps
  • ALTEN Maroc
    AI & Cloud Data Engineer Intern
    AUTOMOBILE
    April 2024 - September 2024 (5 months)
    Rabat, Morocco
    • Fine-tuned a Llama3-8B LLM with PEFT (LoRA) on business data to create a 'Text-to-PySpark' assistant, reducing script development time by 75%.
    • Optimized large-scale ETL/ELT pipelines on Databricks (PySpark) through partitioning and caching strategies, improving data processing performance.
    • Designed fault-tolerant Airflow workflows with automatic retries and alerting, reducing ingestion errors.

    Tech Stack: Python, PySpark, LLM, Hugging Face, Azure Databricks, Apache Airflow, PEFT, SQL, Power BI.
    Artificial Intelligence Data Analysis Data Science Python LLM Fine-tuning

Recommendations

Be the first to recommend Mohammed

Help this freelancer shine by sharing your experience working together.

These freelancer profiles also match your criteria

AgathaA

Agatha Frydrych

Backend Java Software Engineer

4.7

(3)

2

BaptisteB

Baptiste Duhen

Fullstack developer

4.6

(4)

5

AmedA

Amed Hamou

Senior Lead Developer

4

(2)

7

AudreyA

Audrey Champion

Web developer

4.3

(3)

4

Skill set

Categories