Freelancer profile translated to English.

Description

🚀 Are your GenAI PoCs struggling to go live?

High latency, hallucinations, sensitive data?

I'm Mohammed, an AI & LLMOps Engineer. I help CTOs and CIOs transform their LLM experiments into robust, scalable, production-ready architectures.

💡 My Value Proposition:

Unlike simple API integrators, I handle industrialization end to end:

Sovereignty & Open-Source: Deploying local LLMs (Mistral, Llama 3, Qwen) on your own GPUs via vLLM so that your data never leaves your infrastructure.
Measurable Reliability: Rigorous evaluation of response quality via RAGAS to quantify faithfulness and keep hallucinations measurably under control.
Large Account Expertise: I've designed GenAI architectures deployed for industry giants (Airbus, Renault, Stellantis, Mercedes).

🛠️ Projects and deliverables I can assist you with:

Advanced RAG (Retrieval-Augmented Generation): Dynamic chunking, vector databases (ChromaDB, pgvector), hybrid search (BM25 + semantic), and reranking for high retrieval precision.
Agentic AI (LangGraph): Creation of autonomous AI workflows capable of reasoning and using tools (APIs, DBs).
Multimodal AI (OCR + VLM): Extracting data from complex PDFs and charts via Tesseract coupled with vision-language models (Qwen3-VL, GPT-4o).
LLMOps: Inference optimization (< 5 s latency), containerization (Docker), and Cloud (Azure, GCP) or On-Premise deployment.
Real-time Voice AI: Low-latency transcription/translation via Faster Whisper and WebSockets.

🌍 Based in Morocco, I work in your time zone (CET) and communicate in both French (native) and English (fluent), well suited to B2B requirements.

📩 Ready to industrialize your AI?

Let's discuss your architecture during an initial 15-min chat!

Industry field of expertise

Languages

French
Native or bilingual
English
Fluent

Workplace preferences

Remote only

Primarily works remotely

ALTEN Maroc
AI & MLOps Engineer
February 2025 - Today (1 year and 6 months)
Rabat, Morocco
Designed the flagship GenAI solution for the ALTEN group to query large-scale knowledge bases (data), reducing manual retrieval time to under 5 seconds through a RAG approach with dynamic chunking (page level for PDF/PPTX/Word; line level for XLSX), FAISS vector store, and ensemble search (BM25 keywords + semantic similarity).
Developed a hybrid OCR + VLM pipeline: Tesseract OCR extracts the text, which is passed as prompt context alongside the images to the multimodal LLM Qwen3 for analyzing images, charts, and scanned documents; leveraged the Qwen3 family for embedding generation and reranking to maximize multi-format accuracy.
Orchestrated vLLM with Docker for on-premise GPU serving of open-source LLMs, plus Azure OpenAI/GCP API integration for hybrid cloud/on-prem deployments optimizing cost, privacy, and scalability.
Deployed at 10+ major clients (Stellantis, Airbus, Renault, Mercedes), enabling production-scale GenAI for industrial and internal use.
Evolved a RAG system into the ALTEN Group Bot, an autonomous group-wide agentic platform; built a LangGraph agentic layer for multi-step reasoning and tool usage with the CIO and Mistral AI. Leveraged Mistral Large, Azure AI Search, and PostgreSQL + pgvector for hybrid search; deployed on Azure Container Apps to automate global workflows.
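
Illustration of the ensemble search mentioned above: a minimal sketch of one common way to merge a BM25 keyword ranking with a semantic-similarity ranking, weighted reciprocal rank fusion. It is not necessarily the fusion used in the projects described here. The function name, the document ids, and the constant k=60 are assumptions chosen for the example, and the two input rankings are taken as already computed by the keyword and vector retrievers.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, weights=None, k=60):
    """Merge several ranked lists of document ids into a single ranking.

    rankings: list of lists, each ordered from best to worst match.
    weights:  optional per-retriever weights (defaults to equal weights).
    k:        smoothing constant; 60 is a commonly used default (assumption).
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # A document earns more credit the higher it ranks in each list.
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties are broken by id to stay deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical retriever outputs for one query.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]      # keyword (BM25) order
semantic_ranking = ["doc_b", "doc_d", "doc_a"]  # embedding-similarity order

fused = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])
print(fused)
```

Documents that appear near the top of both lists (here doc_b and doc_a) rise above documents found by only one retriever, which is the practical benefit of combining keyword and semantic search before reranking.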

Tech Stack: Python, LLM, LangChain, LangGraph, Tesseract OCR, Hugging Face, PostgreSQL/pgvector, FAISS, Azure AI Search, vLLM, Docker, Git.
Artificial Intelligence, LLMOps, Retrieval-Augmented Generation (RAG), Generative AI, Python
Freelance
AI & Software Engineer
September 2024 - January 2025 (4 months)
Designed a RAG-based document processing pipeline using Spring AI and the OpenAI API (GPT-4o and text-embedding-3-small) with PostgreSQL/pgvector as the vector database: PDF documents were chunked using a part-of-speech (POS) based strategy and then indexed for semantic search, reducing manual analysis time by 80%.
Evaluated the performance of the RAG system using RAGAS metrics (faithfulness, context precision, answer relevance) to verify retrieval quality and response consistency.
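
For readers unfamiliar with the metrics named above, here is a minimal sketch of the arithmetic behind faithfulness and context precision. It does not call the RAGAS library. In RAGAS the per-claim and per-chunk verdicts come from an LLM judge, whereas here they are hypothetical hand-written booleans, and the function names are assumptions chosen for the example.

```python
def faithfulness(claim_supported):
    """Fraction of the answer's claims that the retrieved context supports.

    claim_supported: list of booleans, one verdict per claim in the answer.
    """
    if not claim_supported:
        return 0.0
    return sum(claim_supported) / len(claim_supported)


def context_precision(chunk_relevant):
    """Average of precision@k taken at every rank k that holds a relevant chunk.

    chunk_relevant: list of booleans in retrieval order (best rank first).
    """
    hits = 0
    precision_sum = 0.0
    for k, relevant in enumerate(chunk_relevant, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / k  # precision@k at this relevant rank
    return precision_sum / hits if hits else 0.0


# Hypothetical verdicts for a single question/answer pair.
claims = [True, True, False, True]  # 3 of 4 answer claims are supported
chunks = [True, False, True]        # relevant chunks sit at ranks 1 and 3

print(faithfulness(claims))
print(context_precision(chunks))
```

A faithfulness score below 1.0 flags claims that are not grounded in the retrieved context, and a low context precision indicates that relevant chunks are being ranked below irrelevant ones.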

Tech Stack: Java, Spring Boot, Spring AI, PostgreSQL (pgvector), React.js, OpenAI API, RAGAS.
Artificial Intelligence, Retrieval-Augmented Generation (RAG), Data Science, AI Agent, LLMOps
ALTEN Maroc
AI & Cloud Data Engineer Intern
AUTOMOTIVE
April 2024 - September 2024 (5 months)
Rabat, Morocco
Fine-tuned a Llama 3 8B LLM with PEFT (LoRA) on business data to create a 'Text-to-PySpark' assistant, reducing script development time by 75%.
Optimized large-scale ETL/ELT pipelines on Databricks (PySpark) through partitioning and caching strategies, improving data processing performance.
Designed fault-tolerant Airflow workflows with automatic retries and alerting, reducing ingestion errors.

Tech Stack: Python, PySpark, LLM, Hugging Face, Azure Databricks, Apache Airflow, PEFT, SQL, Power BI.
Artificial Intelligence, Data Analysis, Data Science, Python, LLM Fine-tuning