Freelancer profile translated to English.

Description

I take AI initiatives at mid-sized companies into production across the entire lifecycle: from strategy and business case through build to live operation, all from a single source and without hand-offs.

My background is both business and implementation: 15+ years of product and tech responsibility "0→1 and beyond", scaled tech orgs from 0 to 30+ FTE, budget and turnaround ownership (> €3 million), board reporting and due diligence – consistently in regulated and finance-related environments (ISO 27001, EU-GMP, IFS, ISO 22716, GDPR). I think in business value, not in demos: every AI function must improve a measurable process.

My technical focus is on document-heavy processes. Specifically: OCR and Key Information Extraction (KIE) – structured, schema-validated data from unstructured emails, PDFs, and specs. This includes production-ready RAG pipelines (extraction, classification, normalization, semantic matching) and agent systems with tool-calling that independently trigger business processes – always with governance: logging, prompt versioning, audit export, and mandatory human approval.

As a full-stack engineer, I work end-to-end in TypeScript (Next.js/React, HonoJS), Python (FastAPI), and Kotlin/Java (Spring, DDD/Event-Sourcing) – from UI to database. I work both in the cloud (Vertex AI / Azure) and on sovereign on-premise infrastructure: my own bare-metal GPU cluster allows local inference for sensitive documents – GDPR-compliant, without data leaving the network. The stack includes vLLM, Qwen3, pgvector/Milvus, Document AI, and PaddleOCR-VL.

I build AI products myself with AI assistance – using Claude Code, proprietary tooling, and established processes (multi-agent orchestration with review gates) that teams can adopt.

In short: strategy, build, and operation from a single source – with a focus on AI that actually runs.

Industry field of expertise

Languages

German
Native or bilingual
English
Native or bilingual

Workplace preferences

Can work on-site

Berlin (up to 50km)

BOLT-BYTE
AI-powered Procurement Platform in a Regulated Environment
PHARMACEUTICALS INDUSTRY
August 2025 - June 2026 (10 months)
Full-cycle initiative for a German pharmaceutical & cosmetics group – from strategy to build to live operation. Four legal entities, three regulatory worlds (GMP Pharma, IFS Food/HACCP, ISO 22716). In production since March 2026.

Starting situation: four fragmented ERP/legacy systems with no cross-entity view of procurement, suppliers, and prices; purchasing via Excel workarounds, no audit trail. Guiding principle "No Data – No AI" – master data cleanup as a prerequisite for every AI function. High manual effort for supplier inquiries and bid qualification.

Scope of work (end-to-end):
Discovery on the floor, Business Case & Build-vs-Buy, Requirements Spec across ~12 modules
PoC for hypothesis validation, followed by full build & live operation

Product Features:
RFQ lifecycle (DRAFT → ACTIVE → AWARDED → CLOSED), What-if & Award workflow
Suppliers/Offers: central master, line-item extraction, automatic matching
Dashboards with role-specific KPIs; Compliance: multi-tenancy with RBAC, audit trail
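
The RFQ lifecycle above can be read as a small state machine. The following is a minimal sketch, not the platform's actual code: only the four status names come from the profile, while the forward-only transition table, the `Rfq` and `AuditEntry` shapes, and the `transition` helper are assumptions made for illustration.

```typescript
// Hypothetical sketch: status names are from the profile, everything else is assumed.
type RfqStatus = "DRAFT" | "ACTIVE" | "AWARDED" | "CLOSED";

// Assumed forward-only transitions; a real system may allow more (e.g. cancellation).
const ALLOWED: Record<RfqStatus, readonly RfqStatus[]> = {
  DRAFT: ["ACTIVE"],
  ACTIVE: ["AWARDED"],
  AWARDED: ["CLOSED"],
  CLOSED: [],
};

interface AuditEntry {
  from: RfqStatus;
  to: RfqStatus;
  actor: string;
  at: string; // ISO timestamp
}

interface Rfq {
  id: string;
  status: RfqStatus;
  audit: AuditEntry[];
}

// Returns a new Rfq with the status changed and an audit entry appended.
// Throws if the transition is not in the ALLOWED table.
function transition(rfq: Rfq, to: RfqStatus, actor: string): Rfq {
  if (ALLOWED[rfq.status].indexOf(to) === -1) {
    throw new Error(`Illegal transition ${rfq.status} -> ${to}`);
  }
  const entry: AuditEntry = { from: rfq.status, to, actor, at: new Date().toISOString() };
  return { ...rfq, status: to, audit: [...rfq.audit, entry] };
}
```

Keeping the transition table as data rather than scattered `if` statements makes every status change pass through one function, which is also a natural place to write the audit trail.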

AI Features (15+ Production Services):
Extraction of RFQ, offer & condition data with requirement normalization
Semantic material matching with confidence ranking, email/document classification
Document AI layout parsing: PDF → Markdown
RAG enrichment via PubChem, CosIng, Wikidata, ECICS, DSLD, USDA; matching via pgvector
AI Governance: LLM call logging, prompt versioning, audit export – every AI output remains a suggestion with mandatory approval
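
The governance rule in the last item (every AI output is a logged suggestion that needs approval) can be sketched as a type-level gate. This is an illustrative sketch under stated assumptions: `LlmCallLog`, `Suggestion`, `recordSuggestion`, `approve`, `commit`, and `exportAudit` are hypothetical names, and the in-memory array stands in for whatever persistent log the real system uses.

```typescript
// Hypothetical sketch: all names and shapes are assumptions for illustration.
interface LlmCallLog {
  promptId: string;
  promptVersion: number;
  model: string;
  input: string;
  output: string;
  loggedAt: string; // ISO timestamp
}

type Suggestion<T> =
  | { state: "PENDING"; value: T; log: LlmCallLog }
  | { state: "APPROVED"; value: T; log: LlmCallLog; approvedBy: string }
  | { state: "REJECTED"; value: T; log: LlmCallLog; rejectedBy: string };

const callLog: LlmCallLog[] = []; // in-memory stand-in for a persistent log table

// Every model output is logged with its prompt version and starts as PENDING.
function recordSuggestion<T>(value: T, log: Omit<LlmCallLog, "loggedAt">): Suggestion<T> {
  const entry: LlmCallLog = { ...log, loggedAt: new Date().toISOString() };
  callLog.push(entry);
  return { state: "PENDING", value, log: entry };
}

function approve<T>(s: Suggestion<T>, reviewer: string): Suggestion<T> {
  if (s.state !== "PENDING") throw new Error(`Cannot approve a ${s.state} suggestion`);
  return { state: "APPROVED", value: s.value, log: s.log, approvedBy: reviewer };
}

// Downstream code can only obtain the value once a human has approved it.
function commit<T>(s: Suggestion<T>): T {
  if (s.state !== "APPROVED") throw new Error("Human approval is required before commit");
  return s.value;
}

// Audit export as JSON Lines: one logged call per line.
function exportAudit(): string {
  return callLog.map((e) => JSON.stringify(e)).join("\n");
}
```

The point of the design is that the approval step cannot be skipped by accident: `commit` is the only way to unwrap a value, and it refuses anything that is not `APPROVED`.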

Tech Stack: Next.js 16 (React 19) · TanStack · shadcn/ui · Tailwind v4 · HonoJS · TypeScript · Zod · PostgreSQL (Supabase) · Prisma v6 · ZenStack (RLS, Multi-Tenant) · Vertex AI / Gemini 2.5 Flash · Vercel AI SDK v6 · pgvector · Document AI · Vercel Workflow Kit · QStash · Upstash Redis · Microsoft Graph & Gmail API · pnpm/Turbo-Monorepo · Vitest · Mandatory reviews

Delivery consistently AI-assisted with Claude Code: multi-agent orchestration with human review and steering.
LLM · Generative AI · AI Strategy · OCR · AI Agent
BOLT-BYTE
Sovereign AI Infrastructure – On-Premise GenAI/RAG
CONSULTING AND AUDITS
May 2025 - April 2026 (11 months)
Setup of a completely self-hosted on-premise AI infrastructure – end-to-end from market analysis, sourcing, hardware assembly, and cluster setup to operation in a Berlin data center (Speedbone).

The conflict: sending specifications, contracts, and regulatory documents to external cloud models means a loss of control and, often, of GDPR compliance. Solution: a private, data protection-compliant environment where RAG and document extraction use cases are prototyped and operated productively with local LLMs. Sensitive documents never leave the network.

Cluster (Bare-Metal, HPE/ASUS):

~328 CPU cores, ~1.6 TB RAM
up to 960 GB GPU VRAM (NVIDIA RTX PRO 6000 Blackwell), GPU sharing via CUDA MPS & MIG
50+ TB NVMe + encrypted S3

Tasks (end-to-end):
Sourcing of all components by price/performance & AI suitability
Hardware assembly (CPU, RAM, GPUs, NVMe, Networking)
OS/Driver/GPU Setup: Ubuntu 24.04 LTS, NVIDIA/CUDA stack
Kubernetes & Networking: microk8s HA, 100-GbE & InfiniBand with RDMA
Operation: Colocation, Observability, encrypted backups

MLOps / Platform:
GitOps via Flux & Kustomize, MAAS for bare-metal provisioning
CI pipelines with Tekton + Kaniko, reproducible builds & deployments
cert-manager, ingress-nginx, HAProxy for secure operation

RAG Stack (local, GPU-accelerated):
LLM Inference: vLLM with Qwen3 (35B, FP8), 131k context, Speculative Decoding
Embeddings & Reranking: Qwen3-Embedding-8B + Qwen3-Reranker-8B
Vector DB: Milvus (GPU and CPU variant) · OCR: PaddleOCR-VL (PDF → structured Markdown)
Storage: Mayastor via NVMe-oF/RDMA (AES-XTS), MinIO-S3 with TCG-OPAL
Observability: Prometheus, Grafana, Loki, Tempo
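
The embedding-plus-reranker pairing in the stack above usually implies two-stage retrieval: a cheap vector recall followed by a more expensive rescoring of the few candidates that survive. The sketch below illustrates that pattern only. It is not the cluster's actual code: `Chunk`, `recall`, and `rerank` are hypothetical names, the vectors are toy values, and the `scorer` callback stands in for a reranker model call. In a real deployment the recall step would be delegated to the vector database rather than computed in application code.

```typescript
// Hypothetical sketch: vectors and the rerank scorer are stand-ins for real model outputs.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

interface Scored {
  chunk: Chunk;
  score: number;
}

// Cosine similarity; returns 0 for zero-length vectors instead of NaN.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Dimension mismatch");
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Stage 1: vector recall over all chunks, keeping the k most similar.
function recall(query: number[], chunks: Chunk[], k: number): Scored[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosine(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Stage 2: rescore only the recalled candidates with a more expensive scorer.
function rerank(candidates: Scored[], scorer: (text: string) => number, topN: number): Scored[] {
  return candidates
    .map(({ chunk }) => ({ chunk, score: scorer(chunk.text) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topN);
}
```

Splitting the work this way keeps the expensive scorer off the full corpus: it only ever sees the `k` candidates that stage 1 lets through.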

Result: Complete RAG pipeline from a single source, real benchmarks before any cloud investment, end-to-end encryption, and GitOps reproducibility. Principle: as much local as possible, as much cloud as necessary.
RAG · Sovereign AI · LLMOps · Kubernetes · CUDA
BOLT-BYTE
AI Chat Assistant (Consulting)
CONSULTING AND AUDITS
August 2025 - September 2025 (1 month)
• Dialog-driven website assistant: consultation, lead qualification, and appointment booking in a single conversation, without switching channels.
• CRM automation via tool calling (HubSpot), calendar integration (Google), HR integration (Personio); GDPR-compliant consent flow.
• Stack: Next.js 15 / React 19, Vercel AI SDK, Vertex AI (Gemini 2.5) & OpenAI (GPT-4o), Streaming + Tool Calling.