About Michael
- Inference Optimization: vLLM, llama.cpp, TensorRT-LLM, Quantization (GGUF, AWQ, GPTQ, int8), Speculative Decoding (EAGLE), Model Pruning.
- Frameworks & Tools: PyTorch, Hugging Face (Transformers, PEFT, TRL), LangChain, LangGraph, LlamaIndex.
- Agentic AI: Development of autonomous agents, Function Calling, MCP, Multi-Agent Systems.
- RAG: Building Retrieval-Augmented Generation Pipelines, Vector Databases (Pinecone, ChromaDB), Embeddings.
- Core: C/C++, Embedded Linux (Yocto/Buildroot), RTOS.
- Robotics: Inverse Kinematics, Sensor Fusion (IMU), Control Engineering, ROS/ROS2, dlib.
- Communication: CAN Bus (J1939, CANopen), SPI, I2C, MQTT, TCP/IP.
- Programming Languages: Python & C (Expert), C++, TypeScript/JavaScript.
- Infrastructure: Docker, Kubernetes (K8s), AWS (EC2, S3, Lambda), NVIDIA GPU Containers.
- CI/CD: GitLab CI, GitHub Actions, CMake, Make.
- Web/Backend: FastAPI, Flask, Next.js, Supabase, PostgreSQL, GraphQL.
- Requirements analysis, Mentoring.
- Languages: German (Native), English (Business Fluent).
Experience
- Internal R&D: LLM Inference Optimization & Fine-Tuning
  December 2025 - January 2026 (1 month)
  - Goal: Evaluation and implementation of SOTA techniques for accelerating LLM inference on hardware-constrained systems.
  - Performance: Application of int8 quantization (via `llmcompressor`) to Qwen models. Increase in throughput by 50% (>5000 tokens/s) with consistent accuracy (GSM8K).
  - Advanced AI: Investigation of Speculative Decoding (training an EAGLE draft model) and execution of Fine-Tuning (SFT & LoRA).
  - Tech Stack: Python, vLLM, Hugging Face (PEFT, TRL), Kubernetes, Docker, NVIDIA Dynamo
- Proof of Concept (PoC): Deployment of a Local LLM (Edge AI)
  December 2025 - January 2026 (1 month)
  - Task: Replacing a cloud solution with a local LLM (Privacy & Latency).
  - Solution: Custom build of `llama.cpp` with CPU-specific optimizations. Benchmarking of GGUF quantizations.
  - Integration: Connection to Open WebUI via API as a drop-in replacement.
  - Tech Stack: Linux, Docker, CMake, Open WebUI, Python, llama.cpp
- Proof of Concept (PoC): Automation of Customer Orders
  December 2025 - January 2026 (1 month)
  - Task: Mapping and automation of an order process.
  - Solution: Modeling in BPMN (Camunda) and automation using multiple Python workers (inventory check, invoice, delivery).
  - Tech Stack: Linux, Camunda 7, Python, PostgreSQL, Docker
Education
- M.Sc. Robotics, Cognition, Intelligence
  Technical University of Munich, 2018
  Combines interdisciplinary knowledge from robotics, artificial intelligence, machine learning, and cognitive systems. The aim is to understand and develop intelligent, autonomous systems.