release: bump version to 0.0.5.4

feat: make the recording silence tolerance configurable in the app settings

New +/- control in SettingsScreen → Spracheingabe (voice input) → "Stille-Toleranz" (silence tolerance), range 1.0-8.0 s, default 2.8 s. The value is stored in AsyncStorage (aria_vad_silence_sec). audio.ts reads it when a recording starts and uses it as the VAD auto-stop threshold.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
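
A minimal sketch of how the stored value might be turned into the VAD auto-stop threshold, assuming the setting is persisted as a decimal string. Only the key `aria_vad_silence_sec`, the 1.0-8.0 s range and the 2.8 s default come from this commit message; the function names, the `KeyValueStore` shape and the fall-back-to-default behavior are illustrative assumptions, not the actual code in audio.ts.

```typescript
// Illustrative sketch, not the actual audio.ts implementation.
// Taken from the commit message: storage key, range 1.0-8.0 s, default 2.8 s.
const VAD_SILENCE_KEY = "aria_vad_silence_sec";
const VAD_SILENCE_MIN_SEC = 1.0;
const VAD_SILENCE_MAX_SEC = 8.0;
const VAD_SILENCE_DEFAULT_SEC = 2.8;

// Assumed minimal storage shape; AsyncStorage.getItem matches it structurally.
interface KeyValueStore {
  getItem(key: string): Promise<string | null>;
}

// Parse the stored string, fall back to the default on missing or
// non-numeric input, and clamp the result to the allowed range.
function parseSilenceSec(raw: string | null): number {
  if (raw === null) return VAD_SILENCE_DEFAULT_SEC;
  const value = Number.parseFloat(raw);
  if (!Number.isFinite(value)) return VAD_SILENCE_DEFAULT_SEC;
  return Math.min(VAD_SILENCE_MAX_SEC, Math.max(VAD_SILENCE_MIN_SEC, value));
}

// Hypothetical helper: called once when a recording starts. Returns the
// silence duration in milliseconds after which the VAD stops the recording.
async function loadSilenceThresholdMs(store: KeyValueStore): Promise<number> {
  const raw = await store.getItem(VAD_SILENCE_KEY);
  return Math.round(parseSilenceSec(raw) * 1000);
}
```

Clamping on read as well as in the settings UI means a stale or hand-edited value cannot push the threshold outside the documented range.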
2026-04-24 14:45:17 +02:00 · 2026-04-24 14:44:17 +02:00 · 2026-04-24 14:41:59 +02:00 · 2026-04-24 14:40:58 +02:00 · 2026-04-24 14:34:11 +02:00 · 2026-04-24 14:16:08 +02:00
87 changed files with 6931 additions and 14159 deletions
@@ -1,20 +1,50 @@
-# ARIA Environment Configuration
-# Copy to .env and fill in values
+# ════════════════════════════════════════════════
+#  ARIA: environment variables
+#  Copy to .env and fill in the values
+# ════════════════════════════════════════════════

-# Auth token for ARIA Core (generate a long random string)
-# openssl rand -hex 32
+# ── ARIA Auth Token ──────────────────────────────
+# Authentication for the OpenClaw gateway (aria-core).
+# Used by Diagnostic, Bridge and the app to log in to the gateway.
+# Every service that talks to aria-core needs this token.
+# Generate: openssl rand -hex 32
 ARIA_AUTH_TOKEN=change-me-to-a-long-random-string

-# RVS: rendezvous server (Bridge + app connect through it)
+# ── RVS: Rendezvous-Server ──────────────────────
+# The RVS is a WebSocket relay hosted in the data center.
+# App, Bridge, Diagnostic and the XTTS bridge connect through it.
+# All of them must use the same host, port and token.
+
+# Hostname of the RVS server (e.g. rvs.example.de or mobil.hacker-net.de)
 RVS_HOST=rvs.example.de
+
+# Port the RVS listens on (must match rvs/docker-compose.yml)
 RVS_PORT=443
+
+# Use TLS (wss://)? true = encrypted, false = unencrypted (ws://)
 RVS_TLS=true
+
 # On a TLS error, automatically fall back to ws:// (without TLS)?
-# true = fallback allowed, false = connect with TLS only
+# Useful when no TLS certificate is available (e.g. during development)
 RVS_TLS_FALLBACK=true
+
+# Pairing token: everyone with the same token ends up in the same RVS room.
+# Generated by generate-token.sh and written here automatically.
+# The Android app receives the token via QR code during pairing.
+# IMPORTANT: must be identical on the ARIA VM, the gaming PC (xtts/.env) and the app!
+# Generate: ./generate-token.sh (writes the token into this file automatically)
 RVS_TOKEN=

-# Gitea (for release.sh; the password is prompted for interactively)
+# ── Gitea: release management ────────────────────
+# Used by release.sh to publish APKs on Gitea.
+# The password is prompted for interactively at release time (not stored in .env!).
 GITEA_URL=https://git.hacker-net.de
 GITEA_REPO=Hacker-Software/ARIA-AGENT
 GITEA_USER=duffyduck
+
+# ── Auto-update: copy the APK to the RVS server ──
+# SSH target for scp: release.sh copies the APK there.
+# The RVS server then serves it to the app over WebSocket.
+# Format: user@host (e.g. root@aria-rvs or root@rvs.example.de)
+# Leave empty to skip auto-update and copy the APK to the RVS manually.
+RVS_UPDATE_HOST=
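
The interplay of `RVS_TLS` and `RVS_TLS_FALLBACK` in the `.env.example` hunk above can be sketched as a list of connection attempts. This is a hedged illustration only: the function name, the `RvsEnv` type, the bare URL shape (no path or query) and the assumption that the fallback reuses the same host and port are not taken from the repository.

```typescript
// Illustrative sketch: how a client might derive its connection attempts
// from the RVS_* variables. The URL shape is an assumption.
interface RvsEnv {
  RVS_HOST: string;
  RVS_PORT: string;
  RVS_TLS: string;          // "true" or "false"
  RVS_TLS_FALLBACK: string; // "true" or "false"
}

// Returns the WebSocket URLs to try, in order.
function rvsCandidateUrls(env: RvsEnv): string[] {
  const authority = `${env.RVS_HOST}:${env.RVS_PORT}`;
  if (env.RVS_TLS !== "true") {
    // TLS disabled: a single unencrypted attempt.
    return [`ws://${authority}`];
  }
  const urls = [`wss://${authority}`];
  if (env.RVS_TLS_FALLBACK === "true") {
    // Assumed to be tried only after the wss:// attempt failed.
    urls.push(`ws://${authority}`);
  }
  return urls;
}

// With the defaults from .env.example:
const exampleUrls = rvsCandidateUrls({
  RVS_HOST: "rvs.example.de",
  RVS_PORT: "443",
  RVS_TLS: "true",
  RVS_TLS_FALLBACK: "true",
}); // ["wss://rvs.example.de:443", "ws://rvs.example.de:443"]
```

Note that a fallback to `ws://` sends the pairing token unencrypted, so `RVS_TLS_FALLBACK=false` is the safer choice outside development.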
@@ -29,9 +29,14 @@ yarn-error.log*
 android/build/
 android/.gradle/
 android/app/build/
+android/android/.gradle/
+android/android/app/build/
+android/android/local.properties
 android/local.properties
+android/package-lock.json
 *.apk
 *.aab
+rvs/updates/*.apk

 # ── Tauri / Desktop Build ───────────────────────
 desktop/src-tauri/target/
@@ -29,11 +29,18 @@ ARIA hat zwei Rollen:
 ┌─────────────────────────────────────────────────────────┐
 │              RVS — Rendezvous-Server                     │
 │         Node.js WebSocket Relay (Docker, Rechenzentrum)  │
-│         Reiner Relay — kennt keine Tokens, leitet durch  │
+│         Relay + Auto-Update (APK-Verteilung)             │
 │                  rvs/docker-compose.yml                  │
-└───────────────────────┬─────────────────────────────────┘
-                        │ WebSocket Tunnel
-                        ▼
+└───────────┬───────────────────────────┬─────────────────┘
+            │ WebSocket Tunnel          │ WebSocket Tunnel
+            ▼                           ▼
+┌───────────────────────────┐
+│  Gaming PC (optional)     │
+│  RTX 3060, Docker+WSL2    │
+│  XTTS v2 (natural voices, │
+│  voice cloning)           │
+│  xtts/docker-compose.yml  │
+└───────────────────────────┘
 ┌─────────────────────────────────────────────────────────┐
 │     ARIA-VM (Proxmox, Debian 13) — ARIAs Wohnung        │
 │     Basissystem + Docker. Rest richtet ARIA selbst ein.  │
@@ -50,8 +57,8 @@ ARIA hat zwei Rollen:
 │  │             Liest BOOTSTRAP.md + AGENT.md         │    │
 │  │                                                   │    │
 │  │  [bridge]   ARIA Voice Bridge Container           │    │
-│  │             Whisper STT · Piper TTS · Wake-Word   │    │
-│  │             Ramona (weiblich) + Thorsten (tief)    │    │
+│  │             Whisper STT · Wake-Word                │    │
+│  │             TTS remote via XTTS v2 auf Gaming-PC  │    │
 │  │             Bruecke: App <> RVS <> Bridge <> ARIA │    │
 │  │                                                   │    │
 │  │  [diagnostic] Selbstcheck-UI + Einstellungen      │    │
@@ -66,13 +73,14 @@ ARIA hat zwei Rollen:
 └─────────────────────────────────────────────────────────┘
 ```

-**Three separate deployments:**
+**Four separate deployments:**

 | What | Where | How |
 |------|-------|-----|
 | RVS | Data center | `cd rvs && docker compose up -d` |
 | ARIA Core | Debian 13 VM | `docker compose up -d && ./aria-setup.sh` |
-| Android app | Stefan's phone | Install the APK, scan the QR code |
+| XTTS v2 (optional) | Gaming PC (GPU) | `cd xtts && docker compose up -d` |
+| Android app | Stefan's phone | Install the APK (auto-update via RVS) |

 ---

@@ -95,16 +103,31 @@ cd ~/ARIA-AGENT
 cp .env.example .env
 ```

-Edit the `.env` file:
+Edit the `.env` file (see `.env.example` for details):
 ```bash
+# Gateway auth: every service that talks to aria-core needs this token
+# Diagnostic, Bridge and the app use it for the WebSocket handshake
 ARIA_AUTH_TOKEN=        # openssl rand -hex 32
+
+# RVS connection: hostname + port of your rendezvous server
 RVS_HOST=               # e.g. rvs.hackersoft.de
 RVS_PORT=443
 RVS_TLS=true
 RVS_TLS_FALLBACK=true
-RVS_TOKEN=              # set automatically by generate-token.sh
+
+# Pairing token: puts app, Bridge, Diagnostic and XTTS in the same RVS room
+# MUST be identical on all devices (ARIA VM, gaming PC, app)
+# Generated and written automatically by generate-token.sh
+RVS_TOKEN=              # ./generate-token.sh
+
+# Optional: SSH host of the RVS server for auto-update (e.g. root@aria-rvs)
+RVS_UPDATE_HOST=
 ```

+**Two tokens, two purposes:**
+- **ARIA_AUTH_TOKEN**: authenticates against the OpenClaw gateway (aria-core). Whoever holds this token can give ARIA commands.
+- **RVS_TOKEN**: pairing token for the rendezvous server. All devices with the same token end up in the same "room" and can communicate. The app receives this token via QR code.
+
 ### 2. Log in to the Claude CLI (proxy auth)

 The proxy container uses your Claude Max subscription. The credentials must
@@ -120,21 +143,16 @@ claude login
 **Important:** The `~/.claude/` folder (not `~/.config/claude/`!) is mounted into the
 proxy as a volume. The credentials survive container restarts.

-### 3. Download the voices
-
-```bash
-./get-voices.sh
-# Downloads Ramona + Thorsten (Piper TTS) to aria-data/voices/
-# About 100 MB, takes a few minutes
-```
-
-### 4. Configure the Voice Bridge
+### 3. Configure the Voice Bridge

 ```bash
 cp aria-data/config/aria.env.example aria-data/config/aria.env
-# Adjust if needed (Whisper model, language, voice paths)
+# Adjust if needed (Whisper model, language, wake word)
 ```

+TTS runs exclusively via XTTS v2 on the gaming PC; see the
+"XTTS v2" section further below.
+
 ### 5. Generate the RVS token & start the containers

 ```bash
@@ -230,7 +248,6 @@ Danach werden per `sed` vier Patches angewendet:
 - Security rules (no ClawHub, fend off prompt injection)
 - Tool permissions (all Claude Code tools: WebFetch, Bash, etc.)
 - SSH access to aria-wohnung (VM)
-- Voice selection (Ramona vs Thorsten)
 - Memory system

 ### openclaw.json (via aria-setup.sh)
@@ -271,15 +288,19 @@ Die Bridge verbindet die Android App mit ARIA und bietet lokale Sprachverarbeitu

 **Message flow:**
 ```
-App → RVS → Bridge → aria-core
-aria-core → Bridge → RVS → App
-                   → speaker (TTS)
+Text:   App → RVS → Bridge → chat.send → aria-core
+Audio:  App → RVS → Bridge → FFmpeg → Whisper STT → chat.send → aria-core
+File:   App → RVS → Bridge → /shared/uploads/ → chat.send (with path) → aria-core
+
+aria-core → response → Gateway → Diagnostic → RVS → App
+                               → Bridge → XTTS (PCM stream) → RVS → App AudioTrack
 ```

 ### Features

 - **STT**: faster-whisper (local, offline, 16 kHz mono)
-- **TTS**: Piper (Ramona + Thorsten, offline)
+- **TTS**: XTTS v2 (remote on the gaming PC, GPU, voice cloning), streamed as PCM chunks
+- **Text cleanup**: the `<voice>...</voice>` tag is preferred; Markdown, code, units and URLs are rewritten to be TTS-friendly
 - **Wake word**: openwakeword (local microphone on the VM)
 - **App audio**: Base64 audio from the app → FFmpeg → Whisper STT → text to aria-core
 - **Modes**: Normal, Do not disturb, Whisper, Hangar, Gaming
@@ -294,13 +315,6 @@ aria-core → Bridge → RVS → App
 | Hangar | `"ARIA, ich arbeite"` | Important messages only |
 | Gaming | `"ARIA, Gaming-Modus"` | Only answers direct questions |

-### Voices
-
-| Voice | Model | When |
-|--------|--------|------|
-| **Ramona** (female) | `de_DE-ramona-low` | Everyday use, answers, conversations |
-| **Thorsten** (male, deep) | `de_DE-thorsten-high` | Epic moments, alarms |
-
 ---

 ## Diagnostic: self-check UI and settings
@@ -310,13 +324,19 @@ Erreichbar unter `http://<VM-IP>:3001`. Teilt das Netzwerk mit aria-core.
 ### Features

 - **Status cards**: Gateway (handshake), RVS (TLS fallback), Proxy (auth)
-- **Chat test**: send messages directly to ARIA (via the gateway or via RVS)
-- **Session management**: list, switch, create and delete sessions
+- **Chat test**: send messages directly to ARIA (via the gateway or via RVS), full-screen mode
+- **"ARIA denkt..." ("ARIA is thinking...") indicator**: shows live what ARIA is currently doing (thinking, tool, writing)
+- **Cancel button**: stops running requests + doctor --fix
+- **Session management**: list, switch, create and delete sessions, export as Markdown (⬇ button)
 - **Chat history**: shown on load and on session switch (read-only, from JSONL)
+- **TTS diagnostics tab**: test voices, check status, show errors
+- **Settings**: TTS enabled toggle, XTTS voice (cloned), operating modes, Whisper model (tiny…large-v3, hot reload)
+- **XTTS voice cloning**: upload audio samples, create your own voice
 - **Claude login**: browser terminal for logging in to the proxy
 - **Core terminal**: shell in aria-core (openclaw CLI)
-- **Container logs**: real-time logs of all containers (filtered by tab)
+- **Container logs**: real-time logs of all containers (filtered by tab + pipeline)
 - **SSH terminal**: direct SSH access to aria-wohnung
+- **Watchdog**: detects stuck runs (2 min warning → 5 min doctor --fix → 8 min container restart)

 ### Session management

@@ -334,10 +354,19 @@ API-Endpoint fuer andere Services: `GET http://localhost:3001/api/session`

 - Text chat with ARIA
 - **Voice recording**: push-to-talk (hold) or tap-to-talk (tap, auto-stop on silence)
+- **Conversation mode** (ear button): after every ARIA reply, recording starts automatically, like a natural back-and-forth conversation with no buttons to press
 - **VAD (Voice Activity Detection)**: detects 1.8 s of silence and stops automatically
-- **Wake word**: toggle button enables continuous microphone monitoring
-- **TTS playback**: ARIA answers through the speaker (Ramona/Thorsten)
-- File and camera upload
+- **Speech gate**: the recording is discarded if no speech is detected (no noise is sent to Whisper)
+- **STT (speech-to-text)**: audio is recorded as 16 kHz mono and transcribed by Whisper in the Bridge; the transcribed text appears in the chat
+- **"ARIA denkt..." ("ARIA is thinking...") indicator**: shows the live status from the core (thinking, tool, writing) + cancel button
+- **TTS playback**: ARIA answers through the speaker; XTTS v2 PCM streaming goes directly into AudioTrack, no wait gaps
+- **Play button**: every ARIA message can be read aloud again
+- **Chat search**: the magnifier in the status bar filters messages live
+- **Multiple attachments**: collect images + files, add text, then send them together
+- **Paste support**: paste images from the clipboard (Diagnostic)
+- **Attachments**: the Bridge stores them in the shared volume, ARIA can access them, re-download via RVS
+- **Settings**: TTS enabled, XTTS voice, storage location, auto-download, GPS
+- **Auto-update**: checks for a new version at startup and via button; download + installation via RVS (FileProvider)
 - GPS position (optional)
 - QR code scanner for token pairing

@@ -361,15 +390,90 @@ cd android
 # The APK is located under android/app/build/outputs/apk/release/
 ```

-### Audio pipeline
+### Publishing a release on Gitea
+
+```bash
+./release.sh 1.2.0
+```
+
+The script does everything in one step:
+1. Sets the version numbers (package.json, build.gradle, SettingsScreen)
+2. Prompts for the Gitea password (it is not stored anywhere)
+3. Builds the release APK
+4. Git commit + tag + push
+5. Creates the Gitea release and uploads the APK
+6. Copies the APK to the RVS server (auto-update, optional)
+
+Required in `.env`:
+```bash
+GITEA_URL=https://gitea.hackersoft.de
+GITEA_REPO=stefan/aria-agent
+GITEA_USER=stefan
+RVS_UPDATE_HOST=root@aria-rvs    # Optional: for auto-update
+```
+
+### Docker cleanup
+
+The Bridge image pulls in large ML dependencies (faster-whisper, ctranslate2, onnxruntime,
+openwakeword), so the Docker build cache grows with every rebuild. If
+the VM runs out of disk space:
+
+```bash
+./cleanup.sh           # safe: build cache + unused images
+./cleanup.sh --full    # aggressive: additionally unused volumes (asks for confirmation)
+```
+
+### Auto-update
+
+At startup the app checks whether a newer version is available on the RVS.
+The update flow:
+1. `./release.sh 0.0.3.0` → the APK is copied to the RVS (via scp)
+2. Alternatively: `git pull` on the RVS server → APK in `rvs/updates/`
+3. The app sends `update_check` with its current version
+4. The RVS compares → sends `update_available`
+5. The app shows a dialog → download over WebSocket → installation
+
+### Audio pipeline (voice input)

 ```
 App (microphone) → AAC/MP4 recording → Base64 → RVS → Bridge
 Bridge: FFmpeg (16 kHz PCM) → Whisper STT → text → aria-core
-aria-core → response → Bridge → Piper TTS (WAV) → Base64 → RVS → App
-App: Base64 → WAV → speaker
+Bridge: STT result → RVS → App (the placeholder is replaced by the transcribed text)
+aria-core → response → Bridge → XTTS (gaming PC) → PCM stream → RVS → App
+App: AudioTrack MODE_STREAM (seamless), cached as WAV per message
 ```

+### File pipeline (images & attachments)
+
+```
+App (camera/file manager) → Base64 → RVS → Bridge
+Bridge: stores in /shared/uploads/ (shared volume, visible to aria-core)
+Bridge: chat.send → "Stefan hat ein Bild geschickt: foto.jpg — liegt unter /shared/uploads/..."
+ARIA: can open and analyze the file via the Bash/Read tool
+```
+
+**Supported formats:** images (JPG, PNG), documents (PDF, DOCX, TXT), arbitrary files.
+Images are displayed inline in the app, other files as icon + file name.
+
+**Re-download:** If the local cache in the app is cleared (Settings → Attachment storage → Clear cache),
+missing attachments are automatically reloaded from the server via RVS. The storage location
+can be configured in the app settings.
+
+> **Storage tip:** By default the Docker volume `aria-shared` lives on ARIA's VM disk.
+> With many uploads this can strain the VM's storage (all containers run there too).
+> Recommendation: mount the volume on a network file system (CephFS, NFS, GlusterFS):
+> ```yaml
+> # docker-compose.yml
+> volumes:
+>   aria-shared:
+>     driver: local
+>     driver_opts:
+>       type: nfs
+>       o: addr=nas.local,rw
+>       device: ":/exports/aria-uploads"
+> ```
+> This keeps ARIA's VM disk clean, and the uploads live on dedicated storage.
+
 ---

 ## Data directory: aria-data/
@@ -384,10 +488,6 @@ aria-data/
 │
 ├── skills/                         ← ARIA's skills (self-written!)
 │
-├── voices/                         ← Piper TTS voices (offline)
-│   ├── de_DE-ramona-low.onnx
-│   └── de_DE-thorsten-high.onnx
-│
 ├── config/
 │   ├── BOOTSTRAP.md                ← System prompt (identity, rules, tools)
 │   ├── AGENT.md                    ← Personality & working principles
@@ -396,6 +496,11 @@ aria-data/
 │   ├── aria.env                    ← Voice Bridge config
 │   └── diag-state/                 ← Diagnostic persistent state
 │
+│   (in the shared volume /shared/config/):
+│   ├── voice_config.json           ← TTS settings (voice, speed, engine)
+│   ├── highlight_triggers.json     ← Highlight trigger words
+│   └── chat_backup.jsonl           ← Message backup (on the fly)
+│
 └── ssh/                            ← SSH keys for VM access
    ├── id_ed25519                  ← Private key (generated by aria-setup.sh)
    ├── id_ed25519.pub              ← Public key (must be in the VM's authorized_keys!)
@@ -411,7 +516,7 @@ tar -czf aria-backup-$(date +%Y%m%d).tar.gz aria-data/

 ## RVS — Rendezvous-Server

-Runs in the data center. Pure relay: knows no tokens, stores nothing.
+Runs in the data center. WebSocket relay + auto-update server.
 Whoever connects with the same token ends up in the same room.

 ```bash
@@ -419,10 +524,90 @@ cd rvs
 docker compose up -d
 ```

+**Features:**
+- WebSocket relay (all message types: chat, audio, file, config, xtts, update, etc.)
+- Auto-update: APK distribution to apps over WebSocket
+- Heartbeat + cleanup of dead connections
+
+**Providing an APK for auto-update:**
+```bash
+# Place the APK in updates/ (manually or via release.sh)
+cp ARIA-v0.0.3.0.apk ~/ARIA-AGENT/rvs/updates/
+# The RVS detects the version from the file name
+```
+
 **Multi-instance:** several ARIA VMs can share the same RVS, each with its own token.

 ---

+## XTTS v2 — GPU TTS Server (optional)
+
+Runs on a separate machine with an NVIDIA GPU (e.g. a gaming PC with an RTX 3060).
+It connects to the ARIA infrastructure through the RVS, so no VPN is needed and it works
+across different networks.
+
+### Architecture
+
+```
+Gaming PC (Windows, RTX 3060, Docker Desktop + WSL2)
+├── aria-xtts        XTTS v2 GPU server (port 8020 internally)
+└── aria-xtts-bridge RVS relay (receives requests, sends audio)
+    └── Both share the ./voices/ volume for voice cloning
+
+         ↕ RVS (data center, WebSocket relay)
+
+ARIA-VM
+└── aria-bridge: tts_engine="xtts" → xtts_request via RVS → waits for xtts_response
+```
+
+### Requirements
+
+- Docker Desktop with WSL2 (Windows) or Docker with the NVIDIA runtime (Linux)
+- NVIDIA Container Toolkit
+- GPU with at least 4 GB VRAM (6 GB+ recommended)
+- **Same RVS_TOKEN as on the ARIA VM!**
+
+### Setup
+
+```bash
+cd xtts
+cp .env.example .env
+# Fill .env with the RVS connection data (same token as the ARIA VM!)
+docker compose up -d
+# The first start downloads a ~2 GB model (cached afterwards)
+```
+
+**Important:** The XTTS server listens internally on port **8020** (not 8000).
+The model is cached in the `xtts-models` volume and only has to be downloaded once.
+
+### Features
+
+- **Natural voices**: noticeably better quality than previous-generation TTS
+- **Voice cloning**: your own voice from a 6-10 s audio sample (~2 s latency on an RTX 3060)
+- **Streaming**: PCM chunks every ~170 ms → the app plays seamlessly without waiting
+- **16 languages**: German, English, French, etc.
+
+### TTS config
+
+In Diagnostic under Einstellungen → Sprachausgabe (Settings → Speech output):
+- **TTS aktiv** (TTS enabled): global on/off
+- **XTTS Stimme** (XTTS voice): default or cloned (Maia, etc.)
+
+> XTTS is the only engine: if the gaming PC is offline, ARIA stays silent.
+> Chat replies still arrive (just without audio).
+
+### Cloning a voice
+
+1. "Stimme klonen" (clone voice) → upload audio files (WAV/MP3, 1-10 files, at least 6-10 s in total)
+2. Enter a name → "Stimme erstellen" (create voice)
+3. Click "Laden" (load) → the new voice appears in the selection
+4. Select the voice → the config is saved automatically
+
+> **Tip:** For best results use a clean recording, a single voice, no background noise,
+> and 10-30 seconds of total length. Several short files are concatenated.
+
+---
+
 ## Docker Volumes

 | Volume | Path in container | Purpose |
@@ -433,6 +618,8 @@ docker compose up -d
 | `./aria-data/ssh` (bind) | `/root/.ssh`, `/home/node/.ssh` | SSH Keys |
 | `./aria-data/brain` (bind) | `/home/node/.openclaw/workspace/memory` | Memory |
 | `./aria-data/skills` (bind) | `/home/node/.openclaw/workspace/skills` | Skills |
+| `aria-shared` | `/shared` (Core + Bridge + Proxy + Diag) | File exchange, config, uploads |
+| `./aria-data/config/diag-state` (bind) | `/data` (Diagnostic) | Persistent state (active session) |

 ---

@@ -487,8 +674,15 @@ docker exec aria-core ssh aria-wohnung hostname
  As a result ARIA is slower than the direct Claude CLI. The timeout is set to 900 s (15 min).
 - **No streaming to the app**: the app only shows the finished answer, no streamed tokens.
 - **Wake word only on the VM**: the Bridge listens for "ARIA" via the VM's local microphone.
-  The app uses energy-based detection (phase 1).
+  The app uses energy-based detection (phase 1). An on-device "ARIA" keyword (Porcupine) is phase 2.
 - **Audio format**: the app records AAC/MP4, the Bridge converts it to 16 kHz PCM via FFmpeg.
+- **RVS zombie connections**: WebSocket connections occasionally die without an error message.
+  The Bridge runs a ping check (5 s); Diagnostic uses a fresh connection per request.
+- **Limited image analysis**: images are stored in `/shared/uploads/`. ARIA can
+  open them via the Bash/Read tool, but Claude Vision (direct image analysis) is not yet possible through the
+  proxy path (`claude --print`). ARIA sees the file path, not the image.
+- **File size**: large files (>5 MB) can exceed WebSocket limits.
+  Images are compressed in the app to at most 1920x1920 px at 80% quality.

 ---

@@ -504,8 +698,28 @@ docker exec aria-core ssh aria-wohnung hostname
 - [x] Android app (chat + voice + uploads)
 - [x] Tool permissions (all tools enabled)
 - [x] SSH access to the VM (aria-wohnung)
-- [x] Diagnostic web UI
+- [x] Diagnostic web UI + settings
 - [x] Session management + chat history
+- [x] Voice settings (Ramona/Thorsten, speed, highlight triggers), since replaced by XTTS v2 voice cloning
+- [x] Piper removed completely; XTTS v2 (gaming PC) is the only TTS
+- [x] Streaming TTS: PCM chunks directly into AudioTrack, seamless playback
+- [x] Sentence-by-sentence TTS for long texts
+- [x] File/image upload with shared volume
+- [x] Watchdog (stuck-run detection + auto-fix + container restart)
+- [x] Auto-update system (APK via RVS)
+- [x] Chat search, play button, cancel button
+- [x] XTTS v2 integration (GPU, voice cloning, remote via RVS)
+- [x] Conversation mode (ear button, automatic recording after ARIA's reply)
+- [x] Multiple attachments + text before sending + paste support
+- [x] Markdown cleanup for TTS
+- [x] Auto-update with FileProvider + update-check button
+- [x] Inverted FlatList (reliable scroll-to-bottom)
+- [x] Speech gate (VAD discards recordings without detected speech)
+- [x] Session persistence across container restarts (sessionFromFile + atomic write)
+- [x] Session export as Markdown file (download button per session)
+- [x] "ARIA denkt..." indicator + cancel button in the app (via Bridge → RVS)
+- [x] Whisper model selectable in Diagnostic (tiny…large-v3, hot reload)
+- [x] App recording explicitly 16 kHz mono (optimal for Whisper, no resampling)

 ### Phase 2: ARIA becomes productive

@@ -513,7 +727,8 @@ docker exec aria-core ssh aria-wohnung hostname
 - [ ] Gitea integration
 - [ ] Set up the VM (desktop, browser, tools)
 - [ ] Heartbeat (periodic self-checks)
-- [ ] Local LLM as gatekeeper ("Wächter"; triage before the Claude call)
+- [ ] Local LLM as gatekeeper ("Waechter"; triage before the Claude call)
+- [ ] Auto-compacting / memory management

 ### Phase 3: Extensions

@@ -521,3 +736,4 @@ docker exec aria-core ssh aria-wohnung hostname
 - [ ] Desktop Client (Tauri)
 - [ ] bKVM Remote IT-Support
 - [ ] Porcupine wake word (on-device "ARIA" in the app)
+- [ ] Claude Vision directly (image analysis without the file-path detour)
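
The auto-update flow described in the README changes above (the RVS reads the version from the APK file name and compares it with the version the app sends in `update_check`) can be sketched as follows. This is a hedged illustration: the function names are hypothetical, and the file-name pattern is inferred only from the `ARIA-v0.0.3.0.apk` example; the actual RVS implementation may differ.

```typescript
// Illustrative sketch of the version check; not the actual RVS code.
// Assumption: APK files are named like "ARIA-v0.0.3.0.apk".
function versionFromApkName(fileName: string): string | null {
  const match = /v(\d+(?:\.\d+)*)\.apk$/.exec(fileName);
  return match?.[1] ?? null;
}

// Compares dotted numeric versions segment by segment.
// Returns -1 if a < b, 1 if a > b, 0 if equal; missing segments count as 0.
function compareVersions(a: string, b: string): number {
  const partsA = a.split(".").map(Number);
  const partsB = b.split(".").map(Number);
  const length = Math.max(partsA.length, partsB.length);
  for (let i = 0; i < length; i++) {
    const diff = (partsA[i] ?? 0) - (partsB[i] ?? 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

// True when the APK on the server is newer than the installed version.
function isUpdateAvailable(installed: string, apkFileName: string): boolean {
  const available = versionFromApkName(apkFileName);
  return available !== null && compareVersions(available, installed) > 0;
}
```

Comparing segments numerically rather than as strings matters once a segment reaches two digits: `0.0.10.0` must rank above `0.0.9.0`.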
@@ -1,245 +0,0 @@
-package org.gradle.accessors.dm;
-
-import org.gradle.api.NonNullApi;
-import org.gradle.api.artifacts.MinimalExternalModuleDependency;
-import org.gradle.plugin.use.PluginDependency;
-import org.gradle.api.artifacts.ExternalModuleDependencyBundle;
-import org.gradle.api.artifacts.MutableVersionConstraint;
-import org.gradle.api.provider.Provider;
-import org.gradle.api.model.ObjectFactory;
-import org.gradle.api.provider.ProviderFactory;
-import org.gradle.api.internal.catalog.AbstractExternalDependencyFactory;
-import org.gradle.api.internal.catalog.DefaultVersionCatalog;
-import java.util.Map;
-import org.gradle.api.internal.attributes.ImmutableAttributesFactory;
-import org.gradle.api.internal.artifacts.dsl.CapabilityNotationParser;
-import javax.inject.Inject;
-
-/**
- * A catalog of dependencies accessible via the `libs` extension.
- */
-@NonNullApi
-public class LibrariesForLibs extends AbstractExternalDependencyFactory {
-
-    private final AbstractExternalDependencyFactory owner = this;
-    private final AndroidLibraryAccessors laccForAndroidLibraryAccessors = new AndroidLibraryAccessors(owner);
-    private final KotlinLibraryAccessors laccForKotlinLibraryAccessors = new KotlinLibraryAccessors(owner);
-    private final VersionAccessors vaccForVersionAccessors = new VersionAccessors(providers, config);
-    private final BundleAccessors baccForBundleAccessors = new BundleAccessors(objects, providers, config, attributesFactory, capabilityNotationParser);
-    private final PluginAccessors paccForPluginAccessors = new PluginAccessors(providers, config);
-
-    @Inject
-    public LibrariesForLibs(DefaultVersionCatalog config, ProviderFactory providers, ObjectFactory objects, ImmutableAttributesFactory attributesFactory, CapabilityNotationParser capabilityNotationParser) {
-        super(config, providers, objects, attributesFactory, capabilityNotationParser);
-    }
-
-        /**
-         * Creates a dependency provider for gson (com.google.code.gson:gson)
-         * This dependency was declared in catalog libs.versions.toml
-         */
-        public Provider<MinimalExternalModuleDependency> getGson() {
-            return create("gson");
-    }
-
-        /**
-         * Creates a dependency provider for guava (com.google.guava:guava)
-         * This dependency was declared in catalog libs.versions.toml
-         */
-        public Provider<MinimalExternalModuleDependency> getGuava() {
-            return create("guava");
-    }
-
-        /**
-         * Creates a dependency provider for javapoet (com.squareup:javapoet)
-         * This dependency was declared in catalog libs.versions.toml
-         */
-        public Provider<MinimalExternalModuleDependency> getJavapoet() {
-            return create("javapoet");
-    }
-
-        /**
-         * Creates a dependency provider for junit (junit:junit)
-         * This dependency was declared in catalog libs.versions.toml
-         */
-        public Provider<MinimalExternalModuleDependency> getJunit() {
-            return create("junit");
-    }
-
-    /**
-     * Returns the group of libraries at android
-     */
-    public AndroidLibraryAccessors getAndroid() {
-        return laccForAndroidLibraryAccessors;
-    }
-
-    /**
-     * Returns the group of libraries at kotlin
-     */
-    public KotlinLibraryAccessors getKotlin() {
-        return laccForKotlinLibraryAccessors;
-    }
-
-    /**
-     * Returns the group of versions at versions
-     */
-    public VersionAccessors getVersions() {
-        return vaccForVersionAccessors;
-    }
-
-    /**
-     * Returns the group of bundles at bundles
-     */
-    public BundleAccessors getBundles() {
-        return baccForBundleAccessors;
-    }
-
-    /**
-     * Returns the group of plugins at plugins
-     */
-    public PluginAccessors getPlugins() {
-        return paccForPluginAccessors;
-    }
-
-    public static class AndroidLibraryAccessors extends SubDependencyFactory {
-        private final AndroidGradleLibraryAccessors laccForAndroidGradleLibraryAccessors = new AndroidGradleLibraryAccessors(owner);
-
-        public AndroidLibraryAccessors(AbstractExternalDependencyFactory owner) { super(owner); }
-
-        /**
-         * Returns the group of libraries at android.gradle
-         */
-        public AndroidGradleLibraryAccessors getGradle() {
-            return laccForAndroidGradleLibraryAccessors;
-        }
-
-    }
-
-    public static class AndroidGradleLibraryAccessors extends SubDependencyFactory {
-
-        public AndroidGradleLibraryAccessors(AbstractExternalDependencyFactory owner) { super(owner); }
-
-            /**
-             * Creates a dependency provider for plugin (com.android.tools.build:gradle)
-             * This dependency was declared in catalog libs.versions.toml
-             */
-            public Provider<MinimalExternalModuleDependency> getPlugin() {
-                return create("android.gradle.plugin");
-        }
-
-    }
-
-    public static class KotlinLibraryAccessors extends SubDependencyFactory {
-        private final KotlinGradleLibraryAccessors laccForKotlinGradleLibraryAccessors = new KotlinGradleLibraryAccessors(owner);
-
-        public KotlinLibraryAccessors(AbstractExternalDependencyFactory owner) { super(owner); }
-
-        /**
-         * Returns the group of libraries at kotlin.gradle
-         */
-        public KotlinGradleLibraryAccessors getGradle() {
-            return laccForKotlinGradleLibraryAccessors;
-        }
-
-    }
-
-    public static class KotlinGradleLibraryAccessors extends SubDependencyFactory {
-
-        public KotlinGradleLibraryAccessors(AbstractExternalDependencyFactory owner) { super(owner); }
-
-            /**
-             * Creates a dependency provider for plugin (org.jetbrains.kotlin:kotlin-gradle-plugin)
-             * This dependency was declared in catalog libs.versions.toml
-             */
-            public Provider<MinimalExternalModuleDependency> getPlugin() {
-                return create("kotlin.gradle.plugin");
-        }
-
-    }
-
-    public static class VersionAccessors extends VersionFactory  {
-
-        public VersionAccessors(ProviderFactory providers, DefaultVersionCatalog config) { super(providers, config); }
-
-            /**
-             * Returns the version associated to this alias: agp (8.1.1)
-             * If the version is a rich version and that its not expressible as a
-             * single version string, then an empty string is returned.
-             * This version was declared in catalog libs.versions.toml
-             */
-            public Provider<String> getAgp() { return getVersion("agp"); }
-
-            /**
-             * Returns the version associated to this alias: gson (2.8.9)
-             * If the version is a rich version and that its not expressible as a
-             * single version string, then an empty string is returned.
-             * This version was declared in catalog libs.versions.toml
-             */
-            public Provider<String> getGson() { return getVersion("gson"); }
-
-            /**
-             * Returns the version associated to this alias: guava (31.0.1-jre)
-             * If the version is a rich version and that its not expressible as a
-             * single version string, then an empty string is returned.
-             * This version was declared in catalog libs.versions.toml
-             */
-            public Provider<String> getGuava() { return getVersion("guava"); }
-
-            /**
-             * Returns the version associated to this alias: javapoet (1.13.0)
-             * If the version is a rich version and that its not expressible as a
-             * single version string, then an empty string is returned.
-             * This version was declared in catalog libs.versions.toml
-             */
-            public Provider<String> getJavapoet() { return getVersion("javapoet"); }
-
-            /**
-             * Returns the version associated to this alias: junit (4.13.2)
-             * If the version is a rich version and that its not expressible as a
-             * single version string, then an empty string is returned.
-             * This version was declared in catalog libs.versions.toml
-             */
-            public Provider<String> getJunit() { return getVersion("junit"); }
-
-            /**
-             * Returns the version associated to this alias: kotlin (1.8.0)
-             * If the version is a rich version and that its not expressible as a
-             * single version string, then an empty string is returned.
-             * This version was declared in catalog libs.versions.toml
-             */
-            public Provider<String> getKotlin() { return getVersion("kotlin"); }
-
-    }
-
-    public static class BundleAccessors extends BundleFactory {
-
-        public BundleAccessors(ObjectFactory objects, ProviderFactory providers, DefaultVersionCatalog config, ImmutableAttributesFactory attributesFactory, CapabilityNotationParser capabilityNotationParser) { super(objects, providers, config, attributesFactory, capabilityNotationParser); }
-
-    }
-
-    public static class PluginAccessors extends PluginFactory {
-        private final KotlinPluginAccessors paccForKotlinPluginAccessors = new KotlinPluginAccessors(providers, config);
-
-        public PluginAccessors(ProviderFactory providers, DefaultVersionCatalog config) { super(providers, config); }
-
-        /**
-         * Returns the group of plugins at plugins.kotlin
-         */
-        public KotlinPluginAccessors getKotlin() {
-            return paccForKotlinPluginAccessors;
-        }
-
-    }
-
-    public static class KotlinPluginAccessors extends PluginFactory {
-
-        public KotlinPluginAccessors(ProviderFactory providers, DefaultVersionCatalog config) { super(providers, config); }
-
-            /**
-             * Creates a plugin provider for kotlin.jvm to the plugin id 'org.jetbrains.kotlin.jvm'
-             * This plugin was declared in catalog libs.versions.toml
-             */
-            public Provider<PluginDependency> getJvm() { return createPlugin("kotlin.jvm"); }
-
-    }
-
-}
@@ -1,298 +0,0 @@
-package org.gradle.accessors.dm;
-
-import org.gradle.api.NonNullApi;
-import org.gradle.api.artifacts.MinimalExternalModuleDependency;
-import org.gradle.plugin.use.PluginDependency;
-import org.gradle.api.artifacts.ExternalModuleDependencyBundle;
-import org.gradle.api.artifacts.MutableVersionConstraint;
-import org.gradle.api.provider.Provider;
-import org.gradle.api.model.ObjectFactory;
-import org.gradle.api.provider.ProviderFactory;
-import org.gradle.api.internal.catalog.AbstractExternalDependencyFactory;
-import org.gradle.api.internal.catalog.DefaultVersionCatalog;
-import java.util.Map;
-import org.gradle.api.internal.attributes.ImmutableAttributesFactory;
-import org.gradle.api.internal.artifacts.dsl.CapabilityNotationParser;
-import javax.inject.Inject;
-
-/**
- * A catalog of dependencies accessible via the `libs` extension.
- */
-@NonNullApi
-public class LibrariesForLibsInPluginsBlock extends AbstractExternalDependencyFactory {
-
-    private final AbstractExternalDependencyFactory owner = this;
-    private final AndroidLibraryAccessors laccForAndroidLibraryAccessors = new AndroidLibraryAccessors(owner);
-    private final KotlinLibraryAccessors laccForKotlinLibraryAccessors = new KotlinLibraryAccessors(owner);
-    private final VersionAccessors vaccForVersionAccessors = new VersionAccessors(providers, config);
-    private final BundleAccessors baccForBundleAccessors = new BundleAccessors(objects, providers, config, attributesFactory, capabilityNotationParser);
-    private final PluginAccessors paccForPluginAccessors = new PluginAccessors(providers, config);
-
-    @Inject
-    public LibrariesForLibsInPluginsBlock(DefaultVersionCatalog config, ProviderFactory providers, ObjectFactory objects, ImmutableAttributesFactory attributesFactory, CapabilityNotationParser capabilityNotationParser) {
-        super(config, providers, objects, attributesFactory, capabilityNotationParser);
-    }
-
-        /**
-         * Creates a dependency provider for gson (com.google.code.gson:gson)
-         * This dependency was declared in catalog libs.versions.toml
-     * @deprecated Will be removed in Gradle 9.0.
-         */
-    @Deprecated
-        public Provider<MinimalExternalModuleDependency> getGson() {
-        org.gradle.internal.deprecation.DeprecationLogger.deprecateBehaviour("Accessing libraries or bundles from version catalogs in the plugins block.").withAdvice("Only use versions or plugins from catalogs in the plugins block.").willBeRemovedInGradle9().withUpgradeGuideSection(8, "kotlin_dsl_deprecated_catalogs_plugins_block").nagUser();
-            return create("gson");
-    }
-
-        /**
-         * Creates a dependency provider for guava (com.google.guava:guava)
-         * This dependency was declared in catalog libs.versions.toml
-     * @deprecated Will be removed in Gradle 9.0.
-         */
-    @Deprecated
-        public Provider<MinimalExternalModuleDependency> getGuava() {
-        org.gradle.internal.deprecation.DeprecationLogger.deprecateBehaviour("Accessing libraries or bundles from version catalogs in the plugins block.").withAdvice("Only use versions or plugins from catalogs in the plugins block.").willBeRemovedInGradle9().withUpgradeGuideSection(8, "kotlin_dsl_deprecated_catalogs_plugins_block").nagUser();
-            return create("guava");
-    }
-
-        /**
-         * Creates a dependency provider for javapoet (com.squareup:javapoet)
-         * This dependency was declared in catalog libs.versions.toml
-     * @deprecated Will be removed in Gradle 9.0.
-         */
-    @Deprecated
-        public Provider<MinimalExternalModuleDependency> getJavapoet() {
-        org.gradle.internal.deprecation.DeprecationLogger.deprecateBehaviour("Accessing libraries or bundles from version catalogs in the plugins block.").withAdvice("Only use versions or plugins from catalogs in the plugins block.").willBeRemovedInGradle9().withUpgradeGuideSection(8, "kotlin_dsl_deprecated_catalogs_plugins_block").nagUser();
-            return create("javapoet");
-    }
-
-        /**
-         * Creates a dependency provider for junit (junit:junit)
-         * This dependency was declared in catalog libs.versions.toml
-     * @deprecated Will be removed in Gradle 9.0.
-         */
-    @Deprecated
-        public Provider<MinimalExternalModuleDependency> getJunit() {
-        org.gradle.internal.deprecation.DeprecationLogger.deprecateBehaviour("Accessing libraries or bundles from version catalogs in the plugins block.").withAdvice("Only use versions or plugins from catalogs in the plugins block.").willBeRemovedInGradle9().withUpgradeGuideSection(8, "kotlin_dsl_deprecated_catalogs_plugins_block").nagUser();
-            return create("junit");
-    }
-
-    /**
-     * Returns the group of libraries at android
-     * @deprecated Will be removed in Gradle 9.0.
-     */
-    @Deprecated
-    public AndroidLibraryAccessors getAndroid() {
-        org.gradle.internal.deprecation.DeprecationLogger.deprecateBehaviour("Accessing libraries or bundles from version catalogs in the plugins block.").withAdvice("Only use versions or plugins from catalogs in the plugins block.").willBeRemovedInGradle9().withUpgradeGuideSection(8, "kotlin_dsl_deprecated_catalogs_plugins_block").nagUser();
-        return laccForAndroidLibraryAccessors;
-    }
-
-    /**
-     * Returns the group of libraries at kotlin
-     * @deprecated Will be removed in Gradle 9.0.
-     */
-    @Deprecated
-    public KotlinLibraryAccessors getKotlin() {
-        org.gradle.internal.deprecation.DeprecationLogger.deprecateBehaviour("Accessing libraries or bundles from version catalogs in the plugins block.").withAdvice("Only use versions or plugins from catalogs in the plugins block.").willBeRemovedInGradle9().withUpgradeGuideSection(8, "kotlin_dsl_deprecated_catalogs_plugins_block").nagUser();
-        return laccForKotlinLibraryAccessors;
-    }
-
-    /**
-     * Returns the group of versions at versions
-     */
-    public VersionAccessors getVersions() {
-        return vaccForVersionAccessors;
-    }
-
-    /**
-     * Returns the group of bundles at bundles
-     * @deprecated Will be removed in Gradle 9.0.
-     */
-    @Deprecated
-    public BundleAccessors getBundles() {
-        org.gradle.internal.deprecation.DeprecationLogger.deprecateBehaviour("Accessing libraries or bundles from version catalogs in the plugins block.").withAdvice("Only use versions or plugins from catalogs in the plugins block.").willBeRemovedInGradle9().withUpgradeGuideSection(8, "kotlin_dsl_deprecated_catalogs_plugins_block").nagUser();
-        return baccForBundleAccessors;
-    }
-
-    /**
-     * Returns the group of plugins at plugins
-     */
-    public PluginAccessors getPlugins() {
-        return paccForPluginAccessors;
-    }
-
-    /**
-     * @deprecated Will be removed in Gradle 9.0.
-     */
-    @Deprecated
-    public static class AndroidLibraryAccessors extends SubDependencyFactory {
-        private final AndroidGradleLibraryAccessors laccForAndroidGradleLibraryAccessors = new AndroidGradleLibraryAccessors(owner);
-
-        public AndroidLibraryAccessors(AbstractExternalDependencyFactory owner) { super(owner); }
-
-        /**
-         * Returns the group of libraries at android.gradle
-         * @deprecated Will be removed in Gradle 9.0.
-         */
-        @Deprecated
-        public AndroidGradleLibraryAccessors getGradle() {
-            org.gradle.internal.deprecation.DeprecationLogger.deprecateBehaviour("Accessing libraries or bundles from version catalogs in the plugins block.").withAdvice("Only use versions or plugins from catalogs in the plugins block.").willBeRemovedInGradle9().withUpgradeGuideSection(8, "kotlin_dsl_deprecated_catalogs_plugins_block").nagUser();
-            return laccForAndroidGradleLibraryAccessors;
-        }
-
-    }
-
-    /**
-     * @deprecated Will be removed in Gradle 9.0.
-     */
-    @Deprecated
-    public static class AndroidGradleLibraryAccessors extends SubDependencyFactory {
-
-        public AndroidGradleLibraryAccessors(AbstractExternalDependencyFactory owner) { super(owner); }
-
-            /**
-             * Creates a dependency provider for plugin (com.android.tools.build:gradle)
-             * This dependency was declared in catalog libs.versions.toml
-         * @deprecated Will be removed in Gradle 9.0.
-             */
-        @Deprecated
-            public Provider<MinimalExternalModuleDependency> getPlugin() {
-            org.gradle.internal.deprecation.DeprecationLogger.deprecateBehaviour("Accessing libraries or bundles from version catalogs in the plugins block.").withAdvice("Only use versions or plugins from catalogs in the plugins block.").willBeRemovedInGradle9().withUpgradeGuideSection(8, "kotlin_dsl_deprecated_catalogs_plugins_block").nagUser();
-                return create("android.gradle.plugin");
-        }
-
-    }
-
-    /**
-     * @deprecated Will be removed in Gradle 9.0.
-     */
-    @Deprecated
-    public static class KotlinLibraryAccessors extends SubDependencyFactory {
-        private final KotlinGradleLibraryAccessors laccForKotlinGradleLibraryAccessors = new KotlinGradleLibraryAccessors(owner);
-
-        public KotlinLibraryAccessors(AbstractExternalDependencyFactory owner) { super(owner); }
-
-        /**
-         * Returns the group of libraries at kotlin.gradle
-         * @deprecated Will be removed in Gradle 9.0.
-         */
-        @Deprecated
-        public KotlinGradleLibraryAccessors getGradle() {
-            org.gradle.internal.deprecation.DeprecationLogger.deprecateBehaviour("Accessing libraries or bundles from version catalogs in the plugins block.").withAdvice("Only use versions or plugins from catalogs in the plugins block.").willBeRemovedInGradle9().withUpgradeGuideSection(8, "kotlin_dsl_deprecated_catalogs_plugins_block").nagUser();
-            return laccForKotlinGradleLibraryAccessors;
-        }
-
-    }
-
-    /**
-     * @deprecated Will be removed in Gradle 9.0.
-     */
-    @Deprecated
-    public static class KotlinGradleLibraryAccessors extends SubDependencyFactory {
-
-        public KotlinGradleLibraryAccessors(AbstractExternalDependencyFactory owner) { super(owner); }
-
-            /**
-             * Creates a dependency provider for plugin (org.jetbrains.kotlin:kotlin-gradle-plugin)
-             * This dependency was declared in catalog libs.versions.toml
-         * @deprecated Will be removed in Gradle 9.0.
-             */
-        @Deprecated
-            public Provider<MinimalExternalModuleDependency> getPlugin() {
-            org.gradle.internal.deprecation.DeprecationLogger.deprecateBehaviour("Accessing libraries or bundles from version catalogs in the plugins block.").withAdvice("Only use versions or plugins from catalogs in the plugins block.").willBeRemovedInGradle9().withUpgradeGuideSection(8, "kotlin_dsl_deprecated_catalogs_plugins_block").nagUser();
-                return create("kotlin.gradle.plugin");
-        }
-
-    }
-
-    public static class VersionAccessors extends VersionFactory  {
-
-        public VersionAccessors(ProviderFactory providers, DefaultVersionCatalog config) { super(providers, config); }
-
-            /**
-             * Returns the version associated to this alias: agp (8.1.1)
-             * If the version is a rich version and that its not expressible as a
-             * single version string, then an empty string is returned.
-             * This version was declared in catalog libs.versions.toml
-             */
-            public Provider<String> getAgp() { return getVersion("agp"); }
-
-            /**
-             * Returns the version associated to this alias: gson (2.8.9)
-             * If the version is a rich version and that its not expressible as a
-             * single version string, then an empty string is returned.
-             * This version was declared in catalog libs.versions.toml
-             */
-            public Provider<String> getGson() { return getVersion("gson"); }
-
-            /**
-             * Returns the version associated to this alias: guava (31.0.1-jre)
-             * If the version is a rich version and that its not expressible as a
-             * single version string, then an empty string is returned.
-             * This version was declared in catalog libs.versions.toml
-             */
-            public Provider<String> getGuava() { return getVersion("guava"); }
-
-            /**
-             * Returns the version associated to this alias: javapoet (1.13.0)
-             * If the version is a rich version and that its not expressible as a
-             * single version string, then an empty string is returned.
-             * This version was declared in catalog libs.versions.toml
-             */
-            public Provider<String> getJavapoet() { return getVersion("javapoet"); }
-
-            /**
-             * Returns the version associated to this alias: junit (4.13.2)
-             * If the version is a rich version and that its not expressible as a
-             * single version string, then an empty string is returned.
-             * This version was declared in catalog libs.versions.toml
-             */
-            public Provider<String> getJunit() { return getVersion("junit"); }
-
-            /**
-             * Returns the version associated to this alias: kotlin (1.8.0)
-             * If the version is a rich version and that its not expressible as a
-             * single version string, then an empty string is returned.
-             * This version was declared in catalog libs.versions.toml
-             */
-            public Provider<String> getKotlin() { return getVersion("kotlin"); }
-
-    }
-
-    /**
-     * @deprecated Will be removed in Gradle 9.0.
-     */
-    @Deprecated
-    public static class BundleAccessors extends BundleFactory {
-
-        public BundleAccessors(ObjectFactory objects, ProviderFactory providers, DefaultVersionCatalog config, ImmutableAttributesFactory attributesFactory, CapabilityNotationParser capabilityNotationParser) { super(objects, providers, config, attributesFactory, capabilityNotationParser); }
-
-    }
-
-    public static class PluginAccessors extends PluginFactory {
-        private final KotlinPluginAccessors paccForKotlinPluginAccessors = new KotlinPluginAccessors(providers, config);
-
-        public PluginAccessors(ProviderFactory providers, DefaultVersionCatalog config) { super(providers, config); }
-
-        /**
-         * Returns the group of plugins at plugins.kotlin
-         */
-        public KotlinPluginAccessors getKotlin() {
-            return paccForKotlinPluginAccessors;
-        }
-
-    }
-
-    public static class KotlinPluginAccessors extends PluginFactory {
-
-        public KotlinPluginAccessors(ProviderFactory providers, DefaultVersionCatalog config) { super(providers, config); }
-
-            /**
-             * Creates a plugin provider for kotlin.jvm to the plugin id 'org.jetbrains.kotlin.jvm'
-             * This plugin was declared in catalog libs.versions.toml
-             */
-            public Provider<PluginDependency> getJvm() { return createPlugin("kotlin.jvm"); }
-
-    }
-
-}
@@ -1,2 +0,0 @@
-#Sun Mar 29 11:32:18 CEST 2026
-gradle.version=8.3
@@ -79,8 +79,8 @@ android {
        applicationId "com.ariacockpit"
        minSdkVersion rootProject.ext.minSdkVersion
        targetSdkVersion rootProject.ext.targetSdkVersion
-        versionCode 1
-        versionName "1.0"
+        versionCode 504
+        versionName "0.0.5.4"
        // Fallback for libraries with product flavors
        missingDimensionStrategy 'react-native-camera', 'general'
    }
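Reviewer sketch: the diff does not say how `versionCode 504` is derived from `versionName "0.0.5.4"`. One scheme that happens to be consistent with this single pair gives each of the four parts two decimal digits. The function name `versionCodeFrom` and the scheme itself are assumptions for illustration, not something the build files confirm.

```typescript
// Hypothetical scheme (not confirmed by the diff): four dot-separated parts,
// each in the range 0..99, packed as two decimal digits per part.
function versionCodeFrom(versionName: string): number {
  // Accept exactly four numeric parts of one or two digits each.
  if (!/^\d{1,2}(\.\d{1,2}){3}$/.test(versionName)) {
    throw new Error(`unsupported version name: ${versionName}`);
  }
  return versionName
    .split(".")
    .map((part) => Number(part))
    .reduce((code, part) => code * 100 + part, 0);
}
```

Under this assumed scheme, `versionCodeFrom("0.0.5.4")` evaluates to 504, matching the value in the hunk above.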
@@ -1,97 +0,0 @@
-
-package com.facebook.react;
-
-import android.app.Application;
-import android.content.Context;
-import android.content.res.Resources;
-
-import com.facebook.react.ReactPackage;
-import com.facebook.react.shell.MainPackageConfig;
-import com.facebook.react.shell.MainReactPackage;
-import java.util.Arrays;
-import java.util.ArrayList;
-
-// react-native-screens
-import com.swmansion.rnscreens.RNScreensPackage;
-// react-native-safe-area-context
-import com.th3rdwave.safeareacontext.SafeAreaContextPackage;
-// react-native-document-picker
-import com.reactnativedocumentpicker.RNDocumentPickerPackage;
-// react-native-sound
-import com.zmxv.RNSound.RNSoundPackage;
-// @react-native-community/geolocation
-import com.reactnativecommunity.geolocation.GeolocationPackage;
-// react-native-image-picker
-import com.imagepicker.ImagePickerPackage;
-// react-native-permissions
-import com.zoontek.rnpermissions.RNPermissionsPackage;
-// react-native-camera-kit
-import com.rncamerakit.RNCameraKitPackage;
-// @react-native-async-storage/async-storage
-import com.reactnativecommunity.asyncstorage.AsyncStoragePackage;
-// react-native-fs
-import com.rnfs.RNFSPackage;
-// react-native-audio-recorder-player
-import com.dooboolab.audiorecorderplayer.RNAudioRecorderPlayerPackage;
-// react-native-live-audio-stream
-import com.imxiqi.rnliveaudiostream.RNLiveAudioStreamPackage;
-
-public class PackageList {
-  private Application application;
-  private ReactNativeHost reactNativeHost;
-  private MainPackageConfig mConfig;
-
-  public PackageList(ReactNativeHost reactNativeHost) {
-    this(reactNativeHost, null);
-  }
-
-  public PackageList(Application application) {
-    this(application, null);
-  }
-
-  public PackageList(ReactNativeHost reactNativeHost, MainPackageConfig config) {
-    this.reactNativeHost = reactNativeHost;
-    mConfig = config;
-  }
-
-  public PackageList(Application application, MainPackageConfig config) {
-    this.reactNativeHost = null;
-    this.application = application;
-    mConfig = config;
-  }
-
-  private ReactNativeHost getReactNativeHost() {
-    return this.reactNativeHost;
-  }
-
-  private Resources getResources() {
-    return this.getApplication().getResources();
-  }
-
-  private Application getApplication() {
-    if (this.reactNativeHost == null) return this.application;
-    return this.reactNativeHost.getApplication();
-  }
-
-  private Context getApplicationContext() {
-    return this.getApplication().getApplicationContext();
-  }
-
-  public ArrayList<ReactPackage> getPackages() {
-    return new ArrayList<>(Arrays.<ReactPackage>asList(
-      new MainReactPackage(mConfig),
-      new RNScreensPackage(),
-      new SafeAreaContextPackage(),
-      new RNDocumentPickerPackage(),
-      new RNSoundPackage(),
-      new GeolocationPackage(),
-      new ImagePickerPackage(),
-      new RNPermissionsPackage(),
-      new RNCameraKitPackage(),
-      new AsyncStoragePackage(),
-      new RNFSPackage(),
-      new RNAudioRecorderPlayerPackage(),
-      new RNLiveAudioStreamPackage()
-    ));
-  }
-}
@@ -1,16 +0,0 @@
-/**
- * Automatically generated file. DO NOT MODIFY
- */
-package com.ariacockpit;
-
-public final class BuildConfig {
-  public static final boolean DEBUG = false;
-  public static final String APPLICATION_ID = "com.ariacockpit";
-  public static final String BUILD_TYPE = "release";
-  public static final int VERSION_CODE = 1;
-  public static final String VERSION_NAME = "1.0";
-  // Field from default config.
-  public static final boolean IS_HERMES_ENABLED = true;
-  // Field from default config.
-  public static final boolean IS_NEW_ARCHITECTURE_ENABLED = false;
-}
@@ -1,24 +0,0 @@
-{
-  "schemaVersion": "1.1.0",
-  "buildSystem": "Gradle",
-  "buildSystemVersion": "8.3",
-  "buildPlugin": "org.jetbrains.kotlin.gradle.plugin.KotlinAndroidPluginWrapper",
-  "buildPluginVersion": "1.8.0",
-  "projectSettings": {
-    "isHmppEnabled": true,
-    "isCompatibilityMetadataVariantEnabled": false,
-    "isKPMEnabled": false
-  },
-  "projectTargets": [
-    {
-      "target": "org.jetbrains.kotlin.gradle.plugin.mpp.KotlinAndroidTarget",
-      "platformType": "androidJvm",
-      "extras": {
-        "android": {
-          "sourceCompatibility": "17",
-          "targetCompatibility": "17"
-        }
-      }
-    }
-  ]
-}
@@ -3,6 +3,7 @@
    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.CAMERA" />
    <uses-permission android:name="android.permission.RECORD_AUDIO" />
+    <uses-permission android:name="android.permission.REQUEST_INSTALL_PACKAGES" />

    <application
      android:name=".MainApplication"
@@ -24,5 +25,15 @@
            <category android:name="android.intent.category.LAUNCHER" />
        </intent-filter>
      </activity>
+
+      <provider
+        android:name="androidx.core.content.FileProvider"
+        android:authorities="${applicationId}.fileprovider"
+        android:exported="false"
+        android:grantUriPermissions="true">
+        <meta-data
+          android:name="android.support.FILE_PROVIDER_PATHS"
+          android:resource="@xml/file_paths" />
+      </provider>
    </application>
 </manifest>
@@ -0,0 +1,44 @@
+package com.ariacockpit
+
+import android.content.Intent
+import android.net.Uri
+import android.os.Build
+import androidx.core.content.FileProvider
+import com.facebook.react.bridge.ReactApplicationContext
+import com.facebook.react.bridge.ReactContextBaseJavaModule
+import com.facebook.react.bridge.ReactMethod
+import com.facebook.react.bridge.Promise
+import java.io.File
+
+class ApkInstallerModule(reactContext: ReactApplicationContext) : ReactContextBaseJavaModule(reactContext) {
+    override fun getName() = "ApkInstaller"
+
+    @ReactMethod
+    fun install(filePath: String, promise: Promise) {
+        try {
+            val file = File(filePath)
+            if (!file.exists()) {
+                promise.reject("FILE_NOT_FOUND", "APK not found: $filePath")
+                return
+            }
+
+            val context = reactApplicationContext
+            val uri: Uri = if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.N) {
+                FileProvider.getUriForFile(context, "${context.packageName}.fileprovider", file)
+            } else {
+                Uri.fromFile(file)
+            }
+
+            val intent = Intent(Intent.ACTION_VIEW).apply {
+                setDataAndType(uri, "application/vnd.android.package-archive")
+                addFlags(Intent.FLAG_ACTIVITY_NEW_TASK)
+                addFlags(Intent.FLAG_GRANT_READ_URI_PERMISSION)
+            }
+
+            context.startActivity(intent)
+            promise.resolve(true)
+        } catch (e: Exception) {
+            promise.reject("INSTALL_ERROR", e.message, e)
+        }
+    }
+}
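Reviewer sketch of how the JS side might call this module. Only the method `install(filePath)` and the rejection codes `FILE_NOT_FOUND` and `INSTALL_ERROR` come from the Kotlin code above; `ApkInstallerNative`, `InstallResult`, and `installApk` are hypothetical names. In the app the object would presumably be `NativeModules.ApkInstaller`; it is injected here so the sketch runs without React Native.

```typescript
// Hypothetical typing of the native module; only install() is taken from the diff.
interface ApkInstallerNative {
  install(filePath: string): Promise<boolean>;
}

type InstallResult =
  | { ok: true }
  | { ok: false; code: string; message: string };

// Calls the native installer and converts a rejection into a plain result object.
async function installApk(
  native: ApkInstallerNative,
  filePath: string,
): Promise<InstallResult> {
  try {
    await native.install(filePath);
    return { ok: true };
  } catch (err) {
    // React Native exposes the first argument of promise.reject() as `code`.
    const e = err as { code?: string; message?: string };
    return {
      ok: false,
      code: e.code ?? "UNKNOWN",
      message: e.message ?? String(err),
    };
  }
}
```

Note that a resolved promise only means the install intent was started, not that the user completed the installation.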
@@ -0,0 +1,16 @@
+package com.ariacockpit
+
+import com.facebook.react.ReactPackage
+import com.facebook.react.bridge.NativeModule
+import com.facebook.react.bridge.ReactApplicationContext
+import com.facebook.react.uimanager.ViewManager
+
+class ApkInstallerPackage : ReactPackage {
+    override fun createNativeModules(reactContext: ReactApplicationContext): List<NativeModule> {
+        return listOf(ApkInstallerModule(reactContext))
+    }
+
+    override fun createViewManagers(reactContext: ReactApplicationContext): List<ViewManager<*, *>> {
+        return emptyList()
+    }
+}
@@ -0,0 +1,99 @@
+package com.ariacockpit
+
+import android.content.Context
+import android.media.AudioAttributes
+import android.media.AudioFocusRequest
+import android.media.AudioManager
+import android.os.Build
+import com.facebook.react.bridge.Promise
+import com.facebook.react.bridge.ReactApplicationContext
+import com.facebook.react.bridge.ReactContextBaseJavaModule
+import com.facebook.react.bridge.ReactMethod
+
+/**
+ * Controls audio focus for ducking or muting other apps.
+ *
+ * - requestDuck()      → other apps are paused while ARIA speaks TTS (see method doc)
+ * - requestExclusive() → other apps are paused (microphone recording)
+ * - release()          → gives up focus so other apps may resume
+ */
+class AudioFocusModule(reactContext: ReactApplicationContext) : ReactContextBaseJavaModule(reactContext) {
+    override fun getName() = "AudioFocus"
+
+    private var currentRequest: AudioFocusRequest? = null
+
+    private fun audioManager(): AudioManager? =
+        reactApplicationContext.getSystemService(Context.AUDIO_SERVICE) as? AudioManager
+
+    private fun requestFocus(durationHint: Int, usage: Int, promise: Promise) {
+        val am = audioManager()
+        if (am == null) {
+            promise.reject("NO_AUDIO_MANAGER", "AudioManager not available")
+            return
+        }
+
+        release()
+
+        val result: Int = if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.O) {
+            val attrs = AudioAttributes.Builder()
+                .setUsage(usage)
+                .setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
+                .build()
+            val req = AudioFocusRequest.Builder(durationHint)
+                .setAudioAttributes(attrs)
+                .setOnAudioFocusChangeListener { /* no callback needed */ }
+                .build()
+            currentRequest = req
+            am.requestAudioFocus(req)
+        } else {
+            @Suppress("DEPRECATION")
+            am.requestAudioFocus(null, AudioManager.STREAM_MUSIC, durationHint)
+        }
+
+        promise.resolve(result == AudioManager.AUDIOFOCUS_REQUEST_GRANTED)
+    }
+
+    /** Pauses other apps while TTS is speaking.
+     *
+     *  TRANSIENT (instead of TRANSIENT_MAY_DUCK): Spotify/YouTube pause completely
+     *  instead of merely getting quieter. This also avoids the "volume creeps back
+     *  up" problem with MAY_DUCK, where the system lifted the duck effect after a
+     *  short time even though we still held the focus.
+     */
+    @ReactMethod
+    fun requestDuck(promise: Promise) {
+        requestFocus(
+            AudioManager.AUDIOFOCUS_GAIN_TRANSIENT,
+            AudioAttributes.USAGE_ASSISTANT,
+            promise,
+        )
+    }
+
+    /** Pauses other apps (microphone recording / conversation). */
+    @ReactMethod
+    fun requestExclusive(promise: Promise) {
+        requestFocus(
+            AudioManager.AUDIOFOCUS_GAIN_TRANSIENT_EXCLUSIVE,
+            AudioAttributes.USAGE_VOICE_COMMUNICATION,
+            promise,
+        )
+    }
+
+    /** Releases focus; other apps may return to full volume. */
+    @ReactMethod
+    fun release(promise: Promise) {
+        release()
+        promise.resolve(true)
+    }
+
+    private fun release() {
+        val am = audioManager() ?: return
+        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.O) {
+            currentRequest?.let { am.abandonAudioFocusRequest(it) }
+        } else {
+            @Suppress("DEPRECATION")
+            am.abandonAudioFocus(null)
+        }
+        currentRequest = null
+    }
+}
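Reviewer sketch of a JS-side wrapper that pairs every focus request with a release. The three method names mirror the `@ReactMethod` functions above; `AudioFocusNative`, `FocusMode`, and `withAudioFocus` are hypothetical names, and the module is injected (in the app it would presumably be `NativeModules.AudioFocus`) so the sketch runs without React Native.

```typescript
// Hypothetical typing of the native module, mirroring the three @ReactMethod functions.
interface AudioFocusNative {
  requestDuck(): Promise<boolean>;
  requestExclusive(): Promise<boolean>;
  release(): Promise<boolean>;
}

type FocusMode = "duck" | "exclusive";

// Requests focus, runs the task, and always releases focus afterwards,
// even when the task throws.
async function withAudioFocus<T>(
  native: AudioFocusNative,
  mode: FocusMode,
  task: (granted: boolean) => Promise<T>,
): Promise<T> {
  const granted =
    mode === "duck" ? await native.requestDuck() : await native.requestExclusive();
  try {
    // The task receives `granted` so it can decide how to proceed without focus.
    return await task(granted);
  } finally {
    await native.release();
  }
}
```

Releasing in `finally` keeps other apps from staying paused if TTS playback or recording fails midway.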
@@ -0,0 +1,16 @@
+package com.ariacockpit
+
+import com.facebook.react.ReactPackage
+import com.facebook.react.bridge.NativeModule
+import com.facebook.react.bridge.ReactApplicationContext
+import com.facebook.react.uimanager.ViewManager
+
+class AudioFocusPackage : ReactPackage {
+    override fun createNativeModules(reactContext: ReactApplicationContext): List<NativeModule> {
+        return listOf(AudioFocusModule(reactContext))
+    }
+
+    override fun createViewManagers(reactContext: ReactApplicationContext): List<ViewManager<*, *>> {
+        return emptyList()
+    }
+}
@@ -18,8 +18,9 @@ class MainApplication : Application(), ReactApplication {
      object : DefaultReactNativeHost(this) {
        override fun getPackages(): List<ReactPackage> =
            PackageList(this).packages.apply {
-              // Packages that cannot be autolinked yet can be added manually here, for example:
-              // add(MyReactNativePackage())
+              add(ApkInstallerPackage())
+              add(AudioFocusPackage())
+              add(PcmStreamPlayerPackage())
            }

        override fun getJSMainModuleName(): String = "index"
@@ -0,0 +1,252 @@
+package com.ariacockpit
+
+import android.media.AudioAttributes
+import android.media.AudioFormat
+import android.media.AudioManager
+import android.media.AudioTrack
+import android.util.Base64
+import android.util.Log
+import com.facebook.react.bridge.Promise
+import com.facebook.react.bridge.ReactApplicationContext
+import com.facebook.react.bridge.ReactContextBaseJavaModule
+import com.facebook.react.bridge.ReactMethod
+import java.util.concurrent.LinkedBlockingQueue
+
+/**
+ * Streams PCM s16le audio directly via AudioTrack MODE_STREAM with pre-roll.
+ *
+ * Pre-roll: the AudioTrack is created and fed right away, but play() is only
+ * called once the configured pre-roll duration of audio is in the buffer. This
+ * gives the stream time to build up a reserve, so the buffer does not run dry
+ * even when XTTS renders with RTF>1 (slower than real time).
+ *
+ * Flow:
+ *   JS: start(sampleRate, channels, prerollSeconds) → opens the AudioTrack (no play() yet)
+ *   JS: writeChunk(base64)           → decodes, queues, the writer thread writes
+ *   Writer: starts playback as soon as the pre-roll is reached
+ *   JS: end()                         → waits until the queue is empty, then closes
+ *   JS: stop()                        → hard stop (cancel)
+ */
+class PcmStreamPlayerModule(reactContext: ReactApplicationContext) : ReactContextBaseJavaModule(reactContext) {
+    companion object {
+        private const val TAG = "PcmStreamPlayer"
+        // Fallback when JS does not pass a usable value.
+        private const val DEFAULT_PREROLL_SECONDS = 3.5
+        private const val MIN_PREROLL_SECONDS = 0.5
+        private const val MAX_PREROLL_SECONDS = 10.0
+        // Silence at the start of the stream so the AudioTrack ramps up cleanly and
+        // the first samples are not cut off (XTTS warm-up + play() latency).
+        private const val LEADING_SILENCE_SECONDS = 0.2
+    }
+
+    override fun getName() = "PcmStreamPlayer"
+
+    private var track: AudioTrack? = null
+    private val queue = LinkedBlockingQueue<ByteArray>()
+    private var writerThread: Thread? = null
+    @Volatile private var writerShouldStop = false
+    @Volatile private var endRequested = false
+    @Volatile private var prerollBytes: Int = 0
+    @Volatile private var playbackStarted = false
+    @Volatile private var bytesBuffered: Long = 0
+    @Volatile private var streamBytesPerFrame: Int = 2 // mono s16le default
+
+    // ── Lifecycle ──
+
+    @ReactMethod
+    fun start(sampleRate: Int, channels: Int, prerollSeconds: Double, promise: Promise) {
+        try {
+            // Stop the previous session, if any
+            stopInternal()
+
+            val prerollSec = prerollSeconds
+                .let { if (it.isFinite() && it > 0) it else DEFAULT_PREROLL_SECONDS } // fallback before clamping
+                .coerceIn(MIN_PREROLL_SECONDS, MAX_PREROLL_SECONDS)
+
+            val channelConfig = if (channels == 2) AudioFormat.CHANNEL_OUT_STEREO else AudioFormat.CHANNEL_OUT_MONO
+            val encoding = AudioFormat.ENCODING_PCM_16BIT
+            val minBuf = AudioTrack.getMinBufferSize(sampleRate, channelConfig, encoding)
+            val bytesPerSecond = sampleRate * channels * 2 // 16-bit = 2 bytes
+            // The buffer must hold at least the pre-roll plus some headroom.
+            val prerollTarget = (bytesPerSecond * prerollSec).toInt()
+            val bufferSize = (minBuf * 32).coerceAtLeast(prerollTarget * 2)
+            prerollBytes = prerollTarget
+            bytesBuffered = 0
+            playbackStarted = false
+            streamBytesPerFrame = channels * 2 // s16 = 2 bytes per sample
+
+            val newTrack = AudioTrack.Builder()
+                .setAudioAttributes(
+                    AudioAttributes.Builder()
+                        .setUsage(AudioAttributes.USAGE_ASSISTANT)
+                        .setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
+                        .build(),
+                )
+                .setAudioFormat(
+                    AudioFormat.Builder()
+                        .setSampleRate(sampleRate)
+                        .setChannelMask(channelConfig)
+                        .setEncoding(encoding)
+                        .build(),
+                )
+                .setBufferSizeInBytes(bufferSize)
+                .setTransferMode(AudioTrack.MODE_STREAM)
+                .build()
+
+            // Create the AudioTrack; play() is only called once the pre-roll is reached.
+            track = newTrack
+            queue.clear()
+            writerShouldStop = false
+            endRequested = false
+
+            writerThread = Thread({
+                val t = track ?: return@Thread
+                try {
+                    // Write leading silence into the buffer to give the AudioTrack time to ramp up.
+                    val silenceBytes = ((sampleRate * channels * 2) * LEADING_SILENCE_SECONDS).toInt() and 0x7FFFFFFE
+                    if (silenceBytes > 0) {
+                        val silence = ByteArray(silenceBytes)
+                        var silOff = 0
+                        while (silOff < silence.size && !writerShouldStop) {
+                            val w = t.write(silence, silOff, silence.size - silOff)
+                            if (w <= 0) break
+                            silOff += w
+                        }
+                        bytesBuffered += silOff // count only the bytes actually written
+                    }
+                    while (!writerShouldStop) {
+                        val data = queue.poll(50, java.util.concurrent.TimeUnit.MILLISECONDS) ?: run {
+                            if (endRequested) {
+                                // If the stream ends before the pre-roll is reached (short text): play anyway
+                                if (!playbackStarted) {
+                                    try { t.play() } catch (_: Exception) {}
+                                    playbackStarted = true
+                                }
+                                return@Thread
+                            }
+                            null
+                        } ?: continue
+
+                        // Pre-roll check: call play() only once enough audio is buffered
+                        if (!playbackStarted && bytesBuffered + data.size >= prerollBytes) {
+                            try {
+                                t.play()
+                                playbackStarted = true
+                                Log.i(TAG, "Playback gestartet nach Pre-Roll ${bytesBuffered + data.size} Bytes")
+                            } catch (e: Exception) {
+                                Log.w(TAG, "play() failed: ${e.message}")
+                            }
+                        }
+
+                        var offset = 0
+                        while (offset < data.size && !writerShouldStop) {
+                            val written = t.write(data, offset, data.size - offset)
+                            if (written <= 0) break
+                            offset += written
+                        }
+                        bytesBuffered += offset // count only the bytes actually written
+                    }
+                } catch (e: Exception) {
+                    Log.w(TAG, "Writer-Thread Fehler: ${e.message}")
+                } finally {
+                    // Wait until all written samples have actually been played;
+                    // otherwise t.release() cuts off the last few seconds.
+                    try {
+                        val totalFrames = (bytesBuffered / streamBytesPerFrame).toInt()
+                        var lastPos = -1
+                        var stalledCount = 0
+                        while (!writerShouldStop) {
+                            val pos = t.playbackHeadPosition
+                            if (pos >= totalFrames) break
+                            // Safety: if the position has not advanced for ~2 s (40 × 50 ms), assume the AudioTrack is stuck
+                            if (pos == lastPos) {
+                                stalledCount++
+                                if (stalledCount > 40) {
+                                    Log.w(TAG, "playback stalled at $pos/$totalFrames — give up")
+                                    break
+                                }
+                            } else {
+                                stalledCount = 0
+                                lastPos = pos
+                            }
+                            Thread.sleep(50)
+                        }
+                        Log.i(TAG, "Playback fertig: frames=$totalFrames pos=${t.playbackHeadPosition}")
+                    } catch (_: Exception) {}
+                    try { t.stop() } catch (_: Exception) {}
+                    try { t.release() } catch (_: Exception) {}
+                }
+            }, "PcmStreamWriter").apply { start() }
+
+            Log.i(TAG, "Stream gestartet: ${sampleRate}Hz ch=$channels buf=${bufferSize}B preroll=${prerollBytes}B (${prerollSec}s)")
+            promise.resolve(true)
+        } catch (e: Exception) {
+            Log.e(TAG, "start fehlgeschlagen", e)
+            promise.reject("START_FAILED", e.message, e)
+        }
+    }
+
+    @ReactMethod
+    fun writeChunk(base64Pcm: String, promise: Promise) {
+        try {
+            if (base64Pcm.isEmpty()) {
+                promise.resolve(true)
+                return
+            }
+            val bytes = Base64.decode(base64Pcm, Base64.DEFAULT)
+            queue.put(bytes)
+            promise.resolve(true)
+        } catch (e: Exception) {
+            promise.reject("WRITE_FAILED", e.message, e)
+        }
+    }
+
+    /** Signals that no more chunks will follow. The writer plays out the rest, then stops.
+     *  The promise only resolves once the writer thread has finished (or the
+     *  join timeout in end() expires). This matters because the caller should
+     *  release the AudioFocus only AFTER the last sample has played; otherwise
+     *  Spotify turns its volume back up while the buffered audio is still playing.
+     */
+    @ReactMethod
+    fun end(promise: Promise) {
+        endRequested = true
+        val t = writerThread
+        if (t == null || !t.isAlive) {
+            promise.resolve(true)
+            return
+        }
+        // Wait for the writer in the background so the JS bridge thread is not blocked
+        Thread({
+            try {
+                t.join(15_000) // hard cap in case the writer hangs
+            } catch (_: InterruptedException) {}
+            promise.resolve(true)
+        }, "PcmStreamEndWaiter").start()
+    }
+
+    /** Hard stop (cancel): discards the queue. */
+    @ReactMethod
+    fun stop(promise: Promise) {
+        stopInternal()
+        promise.resolve(true)
+    }
+
+    private fun stopInternal() {
+        writerShouldStop = true
+        endRequested = true
+        queue.clear()
+        writerThread?.interrupt()
+        writerThread = null
+        val t = track
+        if (t != null) {
+            try { t.stop() } catch (_: Exception) {}
+            try { t.release() } catch (_: Exception) {}
+        }
+        track = null
+    }
+
+    override fun onCatalystInstanceDestroy() {
+        stopInternal()
+        super.onCatalystInstanceDestroy()
+    }
+}
@@ -0,0 +1,16 @@
+package com.ariacockpit
+
+import com.facebook.react.ReactPackage
+import com.facebook.react.bridge.NativeModule
+import com.facebook.react.bridge.ReactApplicationContext
+import com.facebook.react.uimanager.ViewManager
+
+class PcmStreamPlayerPackage : ReactPackage {
+    override fun createNativeModules(reactContext: ReactApplicationContext): List<NativeModule> {
+        return listOf(PcmStreamPlayerModule(reactContext))
+    }
+
+    override fun createViewManagers(reactContext: ReactApplicationContext): List<ViewManager<*, *>> {
+        return emptyList()
+    }
+}
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="utf-8"?>
+<paths>
+    <cache-path name="cache" path="." />
+</paths>
@@ -1,6 +1,6 @@
 {
  "name": "aria-cockpit",
-  "version": "0.1.0",
+  "version": "0.0.5.4",
  "private": true,
  "scripts": {
    "android": "react-native run-android",
@@ -24,8 +24,7 @@
    "react-native-camera-kit": "^13.0.0",
    "@react-native-async-storage/async-storage": "^1.21.0",
    "react-native-fs": "^2.20.0",
-    "react-native-audio-recorder-player": "^3.6.7",
-    "react-native-live-audio-stream": "^1.1.1"
+    "react-native-audio-recorder-player": "^3.6.7"
  },
  "devDependencies": {
    "typescript": "^5.3.3",
@@ -17,6 +17,7 @@ import {
 import DocumentPicker, {
  DocumentPickerResponse,
 } from 'react-native-document-picker';
+import RNFS from 'react-native-fs';

 // --- Typen ---

@@ -74,15 +75,17 @@ const FileUpload: React.FC<FileUploadProps> = ({ onFileSelected, onCancel }) =>

    setLoading(true);
    try {
-      // In Produktion: Datei lesen und zu Base64 konvertieren
-      // const base64 = await RNFS.readFile(selectedFile.fileCopyUri || selectedFile.uri, 'base64');
-      const base64Placeholder = '';
+      // Read the file and convert it to Base64
+      const filePath = selectedFile.fileCopyUri || selectedFile.uri;
+      // Strip the URI scheme for RNFS (file:// → absolute path)
+      const cleanPath = filePath.replace('file://', '');
+      const base64 = await RNFS.readFile(cleanPath, 'base64');

      const fileData: FileData = {
        name: selectedFile.name || 'unbenannt',
        type: selectedFile.type || 'application/octet-stream',
        size: selectedFile.size || 0,
-        base64: base64Placeholder,
+        base64,
        uri: selectedFile.uri,
      };

@@ -0,0 +1,362 @@
+/**
+ * VoiceCloneModal: record your own voice and upload it to XTTS.
+ *
+ * Flow:
+ *   - Modal shows a text to read aloud (>30 s reading time) plus a record button
+ *   - While recording: max 30 s, progress bar, elapsed-time display
+ *   - After stopping: ask for a name, then send it as voice_upload via RVS
+ *   - XTTS bridge stores /voices/<name>.wav and replies with xtts_voice_saved
+ */
+
+import React, { useCallback, useEffect, useRef, useState } from 'react';
+import {
+  Modal,
+  View,
+  Text,
+  TouchableOpacity,
+  StyleSheet,
+  Alert,
+  ScrollView,
+  ActivityIndicator,
+  TextInput,
+} from 'react-native';
+import audioService from '../services/audio';
+import rvs from '../services/rvs';
+
+interface Props {
+  visible: boolean;
+  onClose: () => void;
+}
+
+const SAMPLE_TEXT = `Das ist meine eigene Stimme fuer ARIA. Ich lese jetzt einen laengeren Absatz laut vor, damit das Voice-Cloning eine gute Grundlage hat. Guten Tag, ich heisse Stefan und baue gerade mit grosser Begeisterung an meinem persoenlichen KI-Assistenten. Wir automatisieren Infrastruktur, managen Sessions und spielen mit Sprachsynthese. Die letzten Jahre habe ich viel gelernt, vor allem dass Geduld genauso wichtig ist wie Neugier. Hoert sich das jetzt an wie ich selbst? Wenn alles klappt, spricht ARIA bald mit dieser Stimme.`;
+
+const MAX_DURATION_MS = 30000;
+const TARGET_DURATION_MS = 15000;
+
+const VoiceCloneModal: React.FC<Props> = ({ visible, onClose }) => {
+  const [recording, setRecording] = useState(false);
+  const [durationMs, setDurationMs] = useState(0);
+  const [voiceName, setVoiceName] = useState('');
+  const [processing, setProcessing] = useState(false);
+  const [recordingPath, setRecordingPath] = useState('');
+  const timerRef = useRef<ReturnType<typeof setInterval> | null>(null);
+  const startTimeRef = useRef<number>(0);
+
+  // Reset state whenever the modal is hidden
+  useEffect(() => {
+    if (!visible) {
+      setRecording(false);
+      setDurationMs(0);
+      setVoiceName('');
+      setProcessing(false);
+      setRecordingPath('');
+      if (timerRef.current) clearInterval(timerRef.current);
+    }
+  }, [visible]);
+
+  // Cleanup on unmount
+  useEffect(() => {
+    return () => {
+      if (timerRef.current) clearInterval(timerRef.current);
+      if (recording) audioService.stopRecording().catch(() => {});
+    };
+  }, [recording]);
+
+  const startRecording = useCallback(async () => {
+    // Start a fresh recording
+    setDurationMs(0);
+    setRecordingPath('');
+    const ok = await audioService.startRecording(false);
+    if (!ok) {
+      Alert.alert('Fehler', 'Aufnahme konnte nicht gestartet werden (Mikrofon-Berechtigung?)');
+      return;
+    }
+    setRecording(true);
+    startTimeRef.current = Date.now();
+    timerRef.current = setInterval(async () => {
+      const elapsed = Date.now() - startTimeRef.current;
+      setDurationMs(elapsed);
+      if (elapsed >= MAX_DURATION_MS) {
+        await stopRecording();
+      }
+    }, 100);
+  }, []);
+
+  const stopRecording = useCallback(async () => {
+    // The timer ref doubles as the "recording active" flag. Unlike the
+    // `recording` state it is never stale inside the interval callback,
+    // so the automatic stop at MAX_DURATION_MS works as well.
+    if (!timerRef.current) return;
+    clearInterval(timerRef.current);
+    timerRef.current = null;
+    const result = await audioService.stopRecording();
+    setRecording(false);
+    if (!result) {
+      Alert.alert('Keine Sprache erkannt', 'Versuch es bitte nochmal — sprich bis der Timer mindestens 10 Sekunden anzeigt.');
+      setDurationMs(0);
+      return;
+    }
+    // stopRecording() has already deleted the temp file; the base64 payload
+    // needed for the upload is available directly on the result.
+    setRecordingPath(result.base64);
+  }, []);
+
+  const uploadVoice = useCallback(async () => {
+    const name = voiceName.trim();
+    if (!name) {
+      Alert.alert('Name fehlt', 'Bitte gib der Stimme einen Namen (nur Buchstaben, Zahlen, _ und -).');
+      return;
+    }
+    if (!/^[a-zA-Z0-9_-]+$/.test(name)) {
+      Alert.alert('Ungueltiger Name', 'Nur Buchstaben, Zahlen, _ und - erlaubt.');
+      return;
+    }
+    if (!recordingPath) {
+      Alert.alert('Keine Aufnahme', 'Bitte zuerst aufnehmen.');
+      return;
+    }
+    setProcessing(true);
+    try {
+      // voice_upload expects samples as an array of base64 entries (format copied from Diagnostic)
+      rvs.send('voice_upload' as any, {
+        name,
+        samples: [{ base64: recordingPath }],
+      });
+      Alert.alert('Hochgeladen', `Stimme "${name}" wird vom XTTS-Server verarbeitet. Nach ein paar Sekunden in der Liste verfuegbar.`);
+      onClose();
+    } catch (err: any) {
+      Alert.alert('Fehler', err.message);
+    } finally {
+      setProcessing(false);
+    }
+  }, [voiceName, recordingPath, onClose]);
+
+  const progress = Math.min(durationMs / MAX_DURATION_MS, 1);
+  const sec = Math.floor(durationMs / 1000);
+  const enoughRecorded = durationMs >= TARGET_DURATION_MS;
+
+  return (
+    <Modal visible={visible} animationType="slide" onRequestClose={onClose}>
+      <View style={styles.container}>
+        <View style={styles.header}>
+          <Text style={styles.title}>Eigene Stimme aufnehmen</Text>
+          <TouchableOpacity onPress={onClose}>
+            <Text style={styles.closeX}>{'\u2715'}</Text>
+          </TouchableOpacity>
+        </View>
+
+        <ScrollView style={styles.content} contentContainerStyle={{padding: 16}}>
+          <Text style={styles.hint}>
+            Lies den Text laut und deutlich vor. Maximal 30 Sekunden. Je mehr du sprichst
+            (ziel: bis zum Ende des Textes, ca. 20-30s), desto besser wird die geklonte
+            Stimme.
+          </Text>
+
+          <View style={styles.sampleTextBox}>
+            <Text style={styles.sampleText}>{SAMPLE_TEXT}</Text>
+          </View>
+
+          {/* Timer + progress */}
+          <View style={{marginTop: 20, alignItems: 'center'}}>
+            <Text style={[styles.timer, recording && styles.timerActive]}>
+              {sec.toString().padStart(2, '0')} / 30 s
+            </Text>
+            <View style={styles.progressBar}>
+              <View style={[styles.progressFill, {width: `${progress * 100}%`, backgroundColor: recording ? '#FF3B30' : '#0096FF'}]} />
+            </View>
+          </View>
+
+          {/* Record button */}
+          {!recordingPath && (
+            <TouchableOpacity
+              style={[styles.recordBtn, recording && styles.recordBtnActive]}
+              onPress={recording ? stopRecording : startRecording}
+            >
+              <Text style={styles.recordIcon}>{recording ? '\u25A0' : '\u25CF'}</Text>
+              <Text style={styles.recordLabel}>{recording ? 'Stop' : 'Aufnahme starten'}</Text>
+            </TouchableOpacity>
+          )}
+
+          {/* After recording: name + upload */}
+          {recordingPath && (
+            <View style={{marginTop: 20}}>
+              <Text style={styles.hint}>
+                Aufnahme ({sec}s) fertig. Vergib einen Namen und lade hoch.
+              </Text>
+              <TextInput
+                style={styles.nameInput}
+                value={voiceName}
+                onChangeText={setVoiceName}
+                placeholder="z.B. stefan"
+                placeholderTextColor="#555570"
+                autoCapitalize="none"
+                autoCorrect={false}
+              />
+              <View style={{flexDirection: 'row', gap: 8, marginTop: 12}}>
+                <TouchableOpacity
+                  style={[styles.secondaryBtn, {flex: 1}]}
+                  onPress={() => { setRecordingPath(''); setDurationMs(0); }}
+                >
+                  <Text style={styles.secondaryBtnText}>Nochmal aufnehmen</Text>
+                </TouchableOpacity>
+                <TouchableOpacity
+                  style={[styles.primaryBtn, {flex: 1}]}
+                  onPress={uploadVoice}
+                  disabled={processing}
+                >
+                  {processing
+                    ? <ActivityIndicator color="#fff" />
+                    : <Text style={styles.primaryBtnText}>Hochladen</Text>
+                  }
+                </TouchableOpacity>
+              </View>
+            </View>
+          )}
+
+          {recording && !enoughRecorded && (
+            <Text style={[styles.hint, {marginTop: 12, color: '#FFD60A', textAlign: 'center'}]}>
+              Bitte weiter lesen — mindestens 15 Sekunden
+            </Text>
+          )}
+
+          {recording && enoughRecorded && (
+            <Text style={[styles.hint, {marginTop: 12, color: '#34C759', textAlign: 'center'}]}>
+              Genug Audio fuer eine gute Clonung. Du kannst stoppen.
+            </Text>
+          )}
+        </ScrollView>
+      </View>
+    </Modal>
+  );
+};
+
+const styles = StyleSheet.create({
+  container: {
+    flex: 1,
+    backgroundColor: '#0D0D1A',
+  },
+  header: {
+    flexDirection: 'row',
+    alignItems: 'center',
+    justifyContent: 'space-between',
+    paddingHorizontal: 16,
+    paddingTop: 48,
+    paddingBottom: 16,
+    borderBottomWidth: 1,
+    borderBottomColor: '#1E1E2E',
+  },
+  title: {
+    color: '#FFFFFF',
+    fontSize: 18,
+    fontWeight: '700',
+  },
+  closeX: {
+    color: '#8888AA',
+    fontSize: 24,
+    paddingHorizontal: 8,
+  },
+  content: {
+    flex: 1,
+  },
+  hint: {
+    color: '#8888AA',
+    fontSize: 13,
+    lineHeight: 20,
+  },
+  sampleTextBox: {
+    marginTop: 12,
+    padding: 14,
+    backgroundColor: '#12122A',
+    borderRadius: 10,
+    borderWidth: 1,
+    borderColor: '#1E1E2E',
+  },
+  sampleText: {
+    color: '#E0E0F0',
+    fontSize: 15,
+    lineHeight: 24,
+  },
+  timer: {
+    color: '#666680',
+    fontSize: 42,
+    fontWeight: '700',
+    fontVariant: ['tabular-nums'],
+  },
+  timerActive: {
+    color: '#FF3B30',
+  },
+  progressBar: {
+    marginTop: 8,
+    width: '100%',
+    height: 8,
+    backgroundColor: '#1E1E2E',
+    borderRadius: 4,
+    overflow: 'hidden',
+  },
+  progressFill: {
+    height: '100%',
+  },
+  recordBtn: {
+    marginTop: 24,
+    flexDirection: 'row',
+    alignItems: 'center',
+    justifyContent: 'center',
+    gap: 12,
+    backgroundColor: '#1E1E2E',
+    borderRadius: 12,
+    padding: 18,
+    borderWidth: 2,
+    borderColor: '#34C759',
+  },
+  recordBtnActive: {
+    borderColor: '#FF3B30',
+    backgroundColor: 'rgba(255,59,48,0.15)',
+  },
+  recordIcon: {
+    color: '#FF3B30',
+    fontSize: 24,
+    fontWeight: '700',
+  },
+  recordLabel: {
+    color: '#FFFFFF',
+    fontSize: 17,
+    fontWeight: '600',
+  },
+  nameInput: {
+    marginTop: 10,
+    backgroundColor: '#1E1E2E',
+    borderRadius: 8,
+    paddingHorizontal: 14,
+    paddingVertical: 12,
+    color: '#FFFFFF',
+    fontSize: 15,
+    borderWidth: 1,
+    borderColor: '#2A2A3E',
+  },
+  primaryBtn: {
+    backgroundColor: '#0096FF',
+    borderRadius: 10,
+    padding: 14,
+    alignItems: 'center',
+  },
+  primaryBtnText: {
+    color: '#FFFFFF',
+    fontSize: 15,
+    fontWeight: '700',
+  },
+  secondaryBtn: {
+    backgroundColor: '#1E1E2E',
+    borderRadius: 10,
+    padding: 14,
+    alignItems: 'center',
+    borderWidth: 1,
+    borderColor: '#2A2A3E',
+  },
+  secondaryBtnText: {
+    color: '#8888AA',
+    fontSize: 14,
+    fontWeight: '600',
+  },
+});
+
+export default VoiceCloneModal;
@@ -15,10 +15,29 @@ import {
  StyleSheet,
  Alert,
  Platform,
+  ToastAndroid,
+  ActivityIndicator,
 } from 'react-native';
+import AsyncStorage from '@react-native-async-storage/async-storage';
+import RNFS from 'react-native-fs';
+import DocumentPicker from 'react-native-document-picker';
 import rvs, { ConnectionState, RVSMessage, ConnectionConfig, ConnectionLogEntry } from '../services/rvs';
+import {
+  TTS_PREROLL_DEFAULT_SEC,
+  TTS_PREROLL_MIN_SEC,
+  TTS_PREROLL_MAX_SEC,
+  TTS_PREROLL_STORAGE_KEY,
+  VAD_SILENCE_DEFAULT_SEC,
+  VAD_SILENCE_MIN_SEC,
+  VAD_SILENCE_MAX_SEC,
+  VAD_SILENCE_STORAGE_KEY,
+} from '../services/audio';
 import ModeSelector from '../components/ModeSelector';
 import QRScanner from '../components/QRScanner';
+import VoiceCloneModal from '../components/VoiceCloneModal';
+
+const STORAGE_PATH_KEY = 'aria_attachment_storage_path';
+const DEFAULT_STORAGE_PATH = `${RNFS.DocumentDirectoryPath}/chat_attachments`;

 // --- Typen ---

@@ -62,6 +81,18 @@ const SettingsScreen: React.FC = () => {
  const [logs, setLogs] = useState<LogEntry[]>([]);
  const [events, setEvents] = useState<EventEntry[]>([]);
  const [connLog, setConnLog] = useState<ConnectionLogEntry[]>(rvs.getConnectionLog());
+  const [storagePath, setStoragePath] = useState(DEFAULT_STORAGE_PATH);
+  const [autoDownload, setAutoDownload] = useState(true);
+  const [storageSize, setStorageSize] = useState('...');
+  const [ttsEnabled, setTtsEnabled] = useState(true);
+  const [ttsPrerollSec, setTtsPrerollSec] = useState<number>(TTS_PREROLL_DEFAULT_SEC);
+  const [vadSilenceSec, setVadSilenceSec] = useState<number>(VAD_SILENCE_DEFAULT_SEC);
+  const [editingPath, setEditingPath] = useState(false);
+  const [xttsVoice, setXttsVoice] = useState('');
+  const [loadingVoice, setLoadingVoice] = useState<string | null>(null);
+  const [availableVoices, setAvailableVoices] = useState<Array<{name: string, size: number}>>([]);
+  const [voiceCloneVisible, setVoiceCloneVisible] = useState(false);
+  const [tempPath, setTempPath] = useState('');

  let logIdCounter = 0;

@@ -73,8 +104,131 @@ const SettingsScreen: React.FC = () => {
      setManualPort(String(config.port));
      setManualToken(config.token);
    }
+    // Load the persisted settings (storage path, auto-download, TTS, VAD, voice)
+    AsyncStorage.getItem(STORAGE_PATH_KEY).then(saved => {
+      if (saved) setStoragePath(saved);
+    });
+    AsyncStorage.getItem('aria_auto_download').then(saved => {
+      if (saved !== null) setAutoDownload(saved === 'true');
+    });
+    AsyncStorage.getItem('aria_tts_enabled').then(saved => {
+      if (saved !== null) setTtsEnabled(saved === 'true');
+    });
+    AsyncStorage.getItem(TTS_PREROLL_STORAGE_KEY).then(saved => {
+      if (saved != null) {
+        const n = parseFloat(saved);
+        if (isFinite(n) && n >= TTS_PREROLL_MIN_SEC && n <= TTS_PREROLL_MAX_SEC) {
+          setTtsPrerollSec(n);
+        }
+      }
+    });
+    AsyncStorage.getItem(VAD_SILENCE_STORAGE_KEY).then(saved => {
+      if (saved != null) {
+        const n = parseFloat(saved);
+        if (isFinite(n) && n >= VAD_SILENCE_MIN_SEC && n <= VAD_SILENCE_MAX_SEC) {
+          setVadSilenceSec(n);
+        }
+      }
+    });
+    AsyncStorage.getItem('aria_xtts_voice').then(saved => {
+      if (saved) setXttsVoice(saved);
+    });
+    // Request the voice list from the XTTS server (via RVS)
+    rvs.send('xtts_list_voices' as any, {});
  }, []);

+  // Compute the storage size
+  useEffect(() => {
+    const calcSize = async () => {
+      try {
+        const exists = await RNFS.exists(storagePath);
+        if (!exists) { setStorageSize('0 KB'); return; }
+        const items = await RNFS.readDir(storagePath);
+        const totalBytes = items.reduce((sum, f) => sum + (f.size || 0), 0);
+        if (totalBytes > 1024 * 1024) {
+          setStorageSize(`${(totalBytes / 1024 / 1024).toFixed(1)} MB (${items.length} Dateien)`);
+        } else {
+          setStorageSize(`${Math.round(totalBytes / 1024)} KB (${items.length} Dateien)`);
+        }
+      } catch { setStorageSize('nicht verfuegbar'); }
+    };
+    calcSize();
+  }, [storagePath]);
+
+  const saveStoragePath = useCallback(async (newPath: string) => {
+    const clean = newPath.trim();
+    if (!clean) return;
+    await AsyncStorage.setItem(STORAGE_PATH_KEY, clean);
+    setStoragePath(clean);
+    setEditingPath(false);
+    Alert.alert('Gespeichert', `Neuer Speicherort:\n${clean}\n\nWird ab der naechsten Nachricht verwendet.`);
+  }, []);
+
+  const showPathPicker = useCallback(() => {
+    Alert.alert(
+      'Speicherort waehlen',
+      'Wo sollen Anhaenge gespeichert werden?',
+      [
+        {
+          text: 'Ordner auswaehlen...',
+          onPress: async () => {
+            try {
+              const result = await DocumentPicker.pickDirectory();
+              if (result?.uri) {
+                // Decode the SAF URI (content://com.android.externalstorage...)
+                const decoded = decodeURIComponent(result.uri);
+                // Try to extract a human-readable path
+                const match = decoded.match(/primary(?::|%3A)(.+)/);
+                const readablePath = match
+                  ? `/storage/emulated/0/${match[1].replace(/%2F|%3A/g, '/')}`
+                  : decoded;
+                saveStoragePath(readablePath);
+              }
+            } catch (e: any) {
+              if (!DocumentPicker.isCancel(e)) {
+                Alert.alert('Fehler', 'Ordnerauswahl fehlgeschlagen');
+              }
+            }
+          },
+        },
+        {
+          text: 'App-intern (Standard)',
+          onPress: () => saveStoragePath(DEFAULT_STORAGE_PATH),
+        },
+        {
+          text: 'Pfad manuell eingeben',
+          onPress: () => { setTempPath(storagePath); setEditingPath(true); },
+        },
+        { text: 'Abbrechen', style: 'cancel' as const },
+      ],
+    );
+  }, [storagePath]);
+
+  const clearStorageCache = useCallback(async () => {
+    Alert.alert(
+      'Cache loeschen',
+      `Alle lokalen Anhaenge in\n${storagePath}\nloeschen?\n\nDateien koennen ueber RVS erneut heruntergeladen werden.`,
+      [
+        { text: 'Abbrechen', style: 'cancel' },
+        {
+          text: 'Loeschen',
+          style: 'destructive',
+          onPress: async () => {
+            try {
+              const exists = await RNFS.exists(storagePath);
+              if (exists) await RNFS.unlink(storagePath);
+              await RNFS.mkdir(storagePath);
+              setStorageSize('0 KB (0 Dateien)');
+              Alert.alert('Erledigt', 'Cache geleert. Anhaenge werden bei Bedarf neu geladen.');
+            } catch (e: any) {
+              Alert.alert('Fehler', e.message);
+            }
+          },
+        },
+      ],
+    );
+  }, [storagePath]);
+
  // RVS-Nachrichten und Verbindungslog abonnieren
  useEffect(() => {
    const unsubState = rvs.onStateChange(setConnectionState);
@@ -111,6 +265,47 @@ const SettingsScreen: React.FC = () => {
        const mode = message.payload.mode as string;
        if (mode) setCurrentMode(mode);
      }
+
+      // XTTS voice list
+      if (message.type === ('xtts_voices_list' as any)) {
+        const voices = ((message.payload as any).voices || []) as Array<{name: string, size: number}>;
+        setAvailableVoices(voices);
+      }
+
+      // Voice was saved → select it and reload the list
+      if (message.type === ('xtts_voice_saved' as any)) {
+        const name = (message.payload as any).name as string;
+        if (name) {
+          setXttsVoice(name);
+          AsyncStorage.setItem('aria_xtts_voice', name);
+        }
+        rvs.send('xtts_list_voices' as any, {});
+      }
+
+      // Voice change from Diagnostic → reset the local app voice to the new default.
+      // Also mark it as loading so the user can tell when it has finished loading.
+      if (message.type === ('config' as any)) {
+        const newVoice = ((message.payload as any).xttsVoice as string) ?? '';
+        setXttsVoice(newVoice);
+        AsyncStorage.setItem('aria_xtts_voice', newVoice);
+        if (newVoice) {
+          setLoadingVoice(newVoice);
+        }
+      }
+
+      // XTTS bridge reports: voice has finished loading
+      if (message.type === ('voice_ready' as any)) {
+        const v = ((message.payload as any).voice as string) ?? '';
+        const err = (message.payload as any).error as string | undefined;
+        const ms = (message.payload as any).loadMs as number | undefined;
+        setLoadingVoice(null);
+        if (err) {
+          ToastAndroid.show(`Stimme "${v}" konnte nicht geladen werden: ${err}`, ToastAndroid.LONG);
+        } else {
+          const suffix = ms ? ` (${(ms / 1000).toFixed(1)}s)` : '';
+          ToastAndroid.show(`Stimme "${v || 'Standard'}" bereit${suffix}`, ToastAndroid.SHORT);
+        }
+      }
    });

    return () => {
@@ -174,6 +369,43 @@ const SettingsScreen: React.FC = () => {
    // In Produktion: Wert in AsyncStorage persistieren
  }, []);

+  // --- XTTS Voice ---
+
+  const selectVoice = useCallback((voiceName: string) => {
+    setXttsVoice(voiceName);
+    AsyncStorage.setItem('aria_xtts_voice', voiceName);
+    // Preload only for custom voices; "Standard" needs no loading step
+    if (voiceName) {
+      setLoadingVoice(voiceName);
+      rvs.send('voice_preload' as any, { voice: voiceName, source: 'app' });
+    } else {
+      setLoadingVoice(null);
+    }
+  }, []);
+
+  const deleteVoice = useCallback((name: string) => {
+    Alert.alert(
+      'Stimme loeschen',
+      `Stimme "${name}" vom Server endgueltig loeschen?\nAlle Apps verlieren sie.`,
+      [
+        { text: 'Abbrechen', style: 'cancel' },
+        {
+          text: 'Loeschen',
+          style: 'destructive',
+          onPress: () => {
+            rvs.send('xtts_delete_voice' as any, { name });
+            if (xttsVoice === name) {
+              setXttsVoice('');
+              AsyncStorage.setItem('aria_xtts_voice', '');
+            }
+            // Reload the list after a short delay (the XTTS bridge sends a new list anyway)
+            setTimeout(() => rvs.send('xtts_list_voices' as any, {}), 500);
+          },
+        },
+      ],
+    );
+  }, [xttsVoice]);
+
  // --- Modus aendern ---

  const handleModeChange = useCallback((modeId: string) => {
@@ -207,6 +439,10 @@ const SettingsScreen: React.FC = () => {
      onScan={handleQRScan}
      onClose={() => setScannerVisible(false)}
    />
+    <VoiceCloneModal
+      visible={voiceCloneVisible}
+      onClose={() => setVoiceCloneVisible(false)}
+    />
    <ScrollView style={styles.container} contentContainerStyle={styles.content}>

      {/* === Verbindung === */}
@@ -332,6 +568,239 @@ const SettingsScreen: React.FC = () => {
        </View>
      </View>

+      {/* === Voice input (device-local) === */}
+      <Text style={styles.sectionTitle}>Spracheingabe</Text>
+      <View style={styles.card}>
+        <Text style={styles.toggleLabel}>Stille-Toleranz</Text>
+        <Text style={styles.toggleHint}>
+          Wie lange du eine Sprechpause machen darfst, bevor die Aufnahme
+          automatisch beendet und gesendet wird. Hoeher = mehr Zeit zum
+          Nachdenken; niedriger = schnelleres Senden.
+          Default: {VAD_SILENCE_DEFAULT_SEC.toFixed(1)}s.
+        </Text>
+        <View style={styles.prerollRow}>
+          <TouchableOpacity
+            style={styles.prerollButton}
+            onPress={() => {
+              const next = Math.max(VAD_SILENCE_MIN_SEC, Math.round((vadSilenceSec - 0.5) * 10) / 10);
+              setVadSilenceSec(next);
+              AsyncStorage.setItem(VAD_SILENCE_STORAGE_KEY, String(next));
+            }}
+            disabled={vadSilenceSec <= VAD_SILENCE_MIN_SEC}
+          >
+            <Text style={styles.prerollButtonText}>−0.5</Text>
+          </TouchableOpacity>
+          <Text style={styles.prerollValue}>{vadSilenceSec.toFixed(1)} s</Text>
+          <TouchableOpacity
+            style={styles.prerollButton}
+            onPress={() => {
+              const next = Math.min(VAD_SILENCE_MAX_SEC, Math.round((vadSilenceSec + 0.5) * 10) / 10);
+              setVadSilenceSec(next);
+              AsyncStorage.setItem(VAD_SILENCE_STORAGE_KEY, String(next));
+            }}
+            disabled={vadSilenceSec >= VAD_SILENCE_MAX_SEC}
+          >
+            <Text style={styles.prerollButtonText}>+0.5</Text>
+          </TouchableOpacity>
+        </View>
+      </View>
+
+      {/* === Voice output (device-local) === */}
+      <Text style={styles.sectionTitle}>Sprachausgabe</Text>
+      <View style={styles.card}>
+        <View style={styles.toggleRow}>
+          <View style={styles.toggleInfo}>
+            <Text style={styles.toggleLabel}>Sprachausgabe auf diesem Geraet</Text>
+            <Text style={styles.toggleHint}>
+              Nur lokal — andere Geraete sind unabhaengig.
+              Wenn aus, erscheint im Chat auch kein Mund-Button.
+            </Text>
+          </View>
+          <Switch
+            value={ttsEnabled}
+            onValueChange={(val) => {
+              setTtsEnabled(val);
+              AsyncStorage.setItem('aria_tts_enabled', String(val));
+            }}
+            trackColor={{ false: '#2A2A3E', true: '#0096FF' }}
+            thumbColor={ttsEnabled ? '#FFFFFF' : '#666680'}
+          />
+        </View>
+
+        {ttsEnabled && (
+          <View style={{marginTop: 20}}>
+            <Text style={styles.toggleLabel}>Puffer vor Wiedergabestart</Text>
+            <Text style={styles.toggleHint}>
+              Wie viel Audio gesammelt wird bevor die Wiedergabe startet.
+              Hoeher = robuster gegen Render-Pausen, aber mehr Startverzoegerung.
+              Default: {TTS_PREROLL_DEFAULT_SEC.toFixed(1)}s.
+            </Text>
+            <View style={styles.prerollRow}>
+              <TouchableOpacity
+                style={styles.prerollButton}
+                onPress={() => {
+                  const next = Math.max(TTS_PREROLL_MIN_SEC, Math.round((ttsPrerollSec - 0.5) * 10) / 10);
+                  setTtsPrerollSec(next);
+                  AsyncStorage.setItem(TTS_PREROLL_STORAGE_KEY, String(next));
+                }}
+                disabled={ttsPrerollSec <= TTS_PREROLL_MIN_SEC}
+              >
+                <Text style={styles.prerollButtonText}>−0.5</Text>
+              </TouchableOpacity>
+              <Text style={styles.prerollValue}>{ttsPrerollSec.toFixed(1)} s</Text>
+              <TouchableOpacity
+                style={styles.prerollButton}
+                onPress={() => {
+                  const next = Math.min(TTS_PREROLL_MAX_SEC, Math.round((ttsPrerollSec + 0.5) * 10) / 10);
+                  setTtsPrerollSec(next);
+                  AsyncStorage.setItem(TTS_PREROLL_STORAGE_KEY, String(next));
+                }}
+                disabled={ttsPrerollSec >= TTS_PREROLL_MAX_SEC}
+              >
+                <Text style={styles.prerollButtonText}>+0.5</Text>
+              </TouchableOpacity>
+            </View>
+          </View>
+        )}
+
+        {ttsEnabled && (
+          <View style={{marginTop: 20}}>
+            <Text style={styles.toggleLabel}>Stimme (geraetelokal)</Text>
+            <Text style={styles.toggleHint}>
+              Eigene Wahl fuer dieses Geraet. Ohne Auswahl gilt der Diagnostic-Default.
+            </Text>
+
+            {/* Default option */}
+            <TouchableOpacity
+              style={[styles.voiceRow, xttsVoice === '' && styles.voiceRowActive]}
+              onPress={() => selectVoice('')}
+            >
+              <Text style={[styles.voiceRowName, xttsVoice === '' && styles.voiceRowNameActive]}>
+                Standard (Diagnostic-Default)
+              </Text>
+              {xttsVoice === '' && <Text style={styles.voiceRowCheck}>{'\u2713'}</Text>}
+            </TouchableOpacity>
+
+            {availableVoices.length === 0 ? (
+              <Text style={[styles.toggleHint, {marginTop: 8, textAlign: 'center'}]}>
+                Keine eigenen Stimmen auf dem XTTS-Server.
+              </Text>
+            ) : (
+              availableVoices.map(v => (
+                <View key={v.name} style={[styles.voiceRow, xttsVoice === v.name && styles.voiceRowActive]}>
+                  <TouchableOpacity
+                    style={{flex: 1}}
+                    onPress={() => selectVoice(v.name)}
+                  >
+                    <Text style={[styles.voiceRowName, xttsVoice === v.name && styles.voiceRowNameActive]}>
+                      {v.name}
+                    </Text>
+                    <Text style={styles.voiceRowMeta}>{(v.size / 1024).toFixed(0)} KB</Text>
+                  </TouchableOpacity>
+                  {loadingVoice === v.name && (
+                    <ActivityIndicator size="small" color="#0096FF" style={{marginRight: 8}} />
+                  )}
+                  {xttsVoice === v.name && loadingVoice !== v.name && <Text style={styles.voiceRowCheck}>{'\u2713'}</Text>}
+                  <TouchableOpacity onPress={() => deleteVoice(v.name)} style={styles.voiceRowDelete}>
+                    <Text style={styles.voiceRowDeleteIcon}>X</Text>
+                  </TouchableOpacity>
+                </View>
+              ))
+            )}
+
+            <View style={{flexDirection: 'row', gap: 8, marginTop: 12}}>
+              <TouchableOpacity
+                style={[styles.connectButton, {flex: 1}]}
+                onPress={() => setVoiceCloneVisible(true)}
+              >
+                <Text style={styles.connectButtonText}>{'\uD83C\uDFA4'} Eigene Stimme aufnehmen</Text>
+              </TouchableOpacity>
+              <TouchableOpacity
+                style={[styles.clearButton, {flex: 0.4, marginTop: 0}]}
+                onPress={() => rvs.send('xtts_list_voices' as any, {})}
+              >
+                <Text style={styles.clearButtonText}>Aktualisieren</Text>
+              </TouchableOpacity>
+            </View>
+          </View>
+        )}
+      </View>
+
+      {/* === Storage === */}
+      <Text style={styles.sectionTitle}>Anhang-Speicher</Text>
+      <View style={styles.card}>
+        <View style={styles.toggleRow}>
+          <View style={styles.toggleInfo}>
+            <Text style={styles.toggleLabel}>Auto-Download</Text>
+            <Text style={styles.toggleHint}>
+              Fehlende Anhaenge beim App-Start automatisch vom Server laden
+            </Text>
+          </View>
+          <Switch
+            value={autoDownload}
+            onValueChange={(val) => {
+              setAutoDownload(val);
+              AsyncStorage.setItem('aria_auto_download', String(val));
+            }}
+            trackColor={{ false: '#2A2A3E', true: '#0096FF' }}
+            thumbColor={autoDownload ? '#FFFFFF' : '#666680'}
+          />
+        </View>
+
+        <View style={{height: 16}} />
+        <Text style={styles.toggleLabel}>Lokaler Speicherort</Text>
+        <Text style={styles.toggleHint}>
+          Hier werden Bilder und Dateien aus dem Chat gespeichert.
+          {autoDownload ? ' Fehlende Dateien werden automatisch nachgeladen.' : ' Fehlende Dateien koennen per Tippen geladen werden.'}
+        </Text>
+
+        {editingPath ? (
+          <View style={{marginTop: 10}}>
+            <TextInput
+              style={styles.input}
+              value={tempPath}
+              onChangeText={setTempPath}
+              placeholder="z.B. /storage/emulated/0/ARIA/attachments"
+              placeholderTextColor="#555570"
+              autoCapitalize="none"
+            />
+            <View style={{flexDirection: 'row', gap: 8}}>
+              <TouchableOpacity
+                style={[styles.connectButton, {flex: 1}]}
+                onPress={() => saveStoragePath(tempPath)}
+              >
+                <Text style={styles.connectButtonText}>Speichern</Text>
+              </TouchableOpacity>
+              <TouchableOpacity
+                style={[styles.clearButton, {flex: 1, marginTop: 0}]}
+                onPress={() => setEditingPath(false)}
+              >
+                <Text style={styles.clearButtonText}>Abbrechen</Text>
+              </TouchableOpacity>
+            </View>
+          </View>
+        ) : (
+          <View style={{marginTop: 10}}>
+            <Text style={styles.storagePathText} numberOfLines={2}>{storagePath}</Text>
+            <Text style={styles.storageSizeText}>{storageSize}</Text>
+            <View style={{flexDirection: 'row', gap: 8, marginTop: 8}}>
+              <TouchableOpacity
+                style={[styles.clearButton, {flex: 1, marginTop: 0}]}
+                onPress={showPathPicker}
+              >
+                <Text style={styles.clearButtonText}>Pfad aendern</Text>
+              </TouchableOpacity>
+              <TouchableOpacity
+                style={[styles.clearButton, {flex: 1, marginTop: 0, backgroundColor: 'rgba(255,59,48,0.15)'}]}
+                onPress={clearStorageCache}
+              >
+                <Text style={[styles.clearButtonText, {color: '#FF3B30'}]}>Cache leeren</Text>
+              </TouchableOpacity>
+            </View>
+          </View>
+        )}
+      </View>
+
      {/* === Logs === */}
      <Text style={styles.sectionTitle}>Protokoll</Text>
      <View style={styles.card}>
@@ -416,11 +885,21 @@ const SettingsScreen: React.FC = () => {
      <Text style={styles.sectionTitle}>{'\u00DC'}ber</Text>
      <View style={styles.card}>
        <Text style={styles.aboutTitle}>ARIA Cockpit</Text>
-        <Text style={styles.aboutVersion}>Version 0.1.0 (Alpha)</Text>
+        <Text style={styles.aboutVersion}>Version {require('../../package.json').version}</Text>
        <Text style={styles.aboutInfo}>
          Stefans Kommandozentrale f{'\u00FC'}r ARIA.{'\n'}
          Gebaut mit React Native + TypeScript.
        </Text>
+        <TouchableOpacity
+          style={[styles.connectButton, {marginTop: 12}]}
+          onPress={() => {
+            const updateService = require('../services/updater').default;
+            updateService.checkForUpdate();
+            Alert.alert('Update-Check', 'Pruefe auf neue Version...');
+          }}
+        >
+          <Text style={styles.connectButtonText}>Auf Updates pr{'\u00FC'}fen</Text>
+        </TouchableOpacity>
      </View>

      {/* Platz am Ende */}
@@ -559,6 +1038,99 @@ const styles = StyleSheet.create({
    marginTop: 2,
  },

+  // XTTS Voice List
+  voiceRow: {
+    flexDirection: 'row',
+    alignItems: 'center',
+    backgroundColor: '#1E1E2E',
+    borderRadius: 8,
+    padding: 10,
+    marginTop: 6,
+    borderWidth: 1,
+    borderColor: 'transparent',
+  },
+  voiceRowActive: {
+    borderColor: '#0096FF',
+    backgroundColor: '#0D1A2E',
+  },
+  voiceRowName: {
+    color: '#CCCCDD',
+    fontSize: 14,
+    fontWeight: '500',
+  },
+  voiceRowNameActive: {
+    color: '#FFFFFF',
+  },
+  voiceRowMeta: {
+    color: '#666680',
+    fontSize: 11,
+    marginTop: 2,
+  },
+  voiceRowCheck: {
+    color: '#34C759',
+    fontSize: 16,
+    fontWeight: '700',
+    marginHorizontal: 6,
+  },
+  voiceRowDelete: {
+    width: 28,
+    height: 28,
+    borderRadius: 14,
+    backgroundColor: 'rgba(255,59,48,0.2)',
+    alignItems: 'center',
+    justifyContent: 'center',
+    marginLeft: 4,
+  },
+  voiceRowDeleteIcon: {
+    color: '#FF3B30',
+    fontSize: 12,
+    fontWeight: '700',
+  },
+
+  // Voices
+  voiceBtn: {
+    flex: 1,
+    padding: 12,
+    borderRadius: 10,
+    backgroundColor: '#1E1E2E',
+    alignItems: 'center',
+    borderWidth: 2,
+    borderColor: 'transparent',
+  },
+  voiceBtnActive: {
+    borderColor: '#0096FF',
+    backgroundColor: '#0D1A2E',
+  },
+  voiceBtnIcon: {
+    fontSize: 28,
+    marginBottom: 4,
+  },
+  voiceBtnText: {
+    color: '#8888AA',
+    fontSize: 14,
+    fontWeight: '600',
+  },
+  voiceBtnTextActive: {
+    color: '#FFFFFF',
+  },
+  voiceBtnHint: {
+    color: '#555570',
+    fontSize: 11,
+    marginTop: 2,
+  },
+
+  // Storage
+  storagePathText: {
+    color: '#0096FF',
+    fontSize: 12,
+    fontFamily: Platform.OS === 'ios' ? 'Menlo' : 'monospace',
+  },
+  storageSizeText: {
+    color: '#8888AA',
+    fontSize: 12,
+    marginTop: 4,
+  },
+
  // Logs
  tabRow: {
    flexDirection: 'row',
@@ -685,6 +1257,34 @@ const styles = StyleSheet.create({
  bottomSpacer: {
    height: 40,
  },
+
+  prerollRow: {
+    flexDirection: 'row',
+    alignItems: 'center',
+    justifyContent: 'center',
+    marginTop: 12,
+    gap: 16,
+  },
+  prerollButton: {
+    backgroundColor: '#2A2A3E',
+    paddingHorizontal: 18,
+    paddingVertical: 10,
+    borderRadius: 8,
+    minWidth: 72,
+    alignItems: 'center',
+  },
+  prerollButtonText: {
+    color: '#FFFFFF',
+    fontSize: 16,
+    fontWeight: '600',
+  },
+  prerollValue: {
+    color: '#FFFFFF',
+    fontSize: 20,
+    fontWeight: '700',
+    minWidth: 80,
+    textAlign: 'center',
+  },
 });

 export default SettingsScreen;
@@ -6,9 +6,10 @@
 * Nutzt react-native-audio-recorder-player fuer Aufnahme.
 */

-import { Platform, PermissionsAndroid } from 'react-native';
+import { Platform, PermissionsAndroid, NativeModules } from 'react-native';
 import Sound from 'react-native-sound';
 import RNFS from 'react-native-fs';
+import AsyncStorage from '@react-native-async-storage/async-storage';
 import AudioRecorderPlayer, {
  AudioEncoderAndroidType,
  AudioSourceAndroidType,
@@ -16,6 +17,38 @@ import AudioRecorderPlayer, {
  OutputFormatAndroidType,
 } from 'react-native-audio-recorder-player';

+// Base64 encoder for binary strings (header bytes → Base64)
+const B64_CHARS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
+function btoaSafe(bin: string): string {
+  let out = '';
+  const len = bin.length;
+  for (let i = 0; i < len; i += 3) {
+    const b1 = bin.charCodeAt(i) & 0xff;
+    const b2 = i + 1 < len ? bin.charCodeAt(i + 1) & 0xff : 0;
+    const b3 = i + 2 < len ? bin.charCodeAt(i + 2) & 0xff : 0;
+    out += B64_CHARS[b1 >> 2];
+    out += B64_CHARS[((b1 & 0x03) << 4) | (b2 >> 4)];
+    out += i + 1 < len ? B64_CHARS[((b2 & 0x0f) << 2) | (b3 >> 6)] : '=';
+    out += i + 2 < len ? B64_CHARS[b3 & 0x3f] : '=';
+  }
+  return out;
+}
+
+// Native modules for audio focus (ducking/muting other apps) and PCM stream playback
+const { AudioFocus, PcmStreamPlayer } = NativeModules as {
+  AudioFocus?: {
+    requestDuck: () => Promise<boolean>;
+    requestExclusive: () => Promise<boolean>;
+    release: () => Promise<boolean>;
+  };
+  PcmStreamPlayer?: {
+    start: (sampleRate: number, channels: number, prerollSeconds: number) => Promise<boolean>;
+    writeChunk: (base64Pcm: string) => Promise<boolean>;
+    end: () => Promise<boolean>;
+    stop: () => Promise<boolean>;
+  };
+};
+
 // --- Typen ---

 export interface RecordingResult {
@@ -41,7 +74,52 @@ const AUDIO_ENCODING = 'audio/wav';

 // VAD (Voice Activity Detection) — Stille-Erkennung
 const VAD_SILENCE_THRESHOLD_DB = -45;  // dB unter dem als "Stille" gilt
-const VAD_SILENCE_DURATION_MS = 1800;  // ms Stille bevor Auto-Stop
+const VAD_SPEECH_THRESHOLD_DB = -28;   // dB level above which input counts as "speech" (speech gate); higher = less ambient noise gets through
+const VAD_SPEECH_MIN_MS = 500;         // ms of speech before the recording counts; longer = coughs and knocks no longer trigger it
+
+// VAD silence (in seconds): how long a pause in speech is tolerated before
+// the recording is stopped automatically. Adjustable in the app settings.
+export const VAD_SILENCE_DEFAULT_SEC = 2.8;
+export const VAD_SILENCE_MIN_SEC = 1.0;
+export const VAD_SILENCE_MAX_SEC = 8.0;
+export const VAD_SILENCE_STORAGE_KEY = 'aria_vad_silence_sec';
+
+async function loadVadSilenceMs(): Promise<number> {
+  try {
+    const raw = await AsyncStorage.getItem(VAD_SILENCE_STORAGE_KEY);
+    if (raw != null) {
+      const n = parseFloat(raw);
+      if (isFinite(n) && n >= VAD_SILENCE_MIN_SEC && n <= VAD_SILENCE_MAX_SEC) {
+        return Math.round(n * 1000);
+      }
+    }
+  } catch {}
+  return Math.round(VAD_SILENCE_DEFAULT_SEC * 1000);
+}
+
+// Maximum duration of a recording (emergency brake against runaway loops). Raised
+// to 2 minutes so that longer explanations also get through.
+const MAX_RECORDING_MS = 120000;
+
+// Pre-roll: how much audio is buffered in the AudioTrack before play() starts.
+// Adjustable via Diagnostic/Settings (key: aria_tts_preroll_sec).
+export const TTS_PREROLL_DEFAULT_SEC = 3.5;
+export const TTS_PREROLL_MIN_SEC = 1.0;
+export const TTS_PREROLL_MAX_SEC = 6.0;
+export const TTS_PREROLL_STORAGE_KEY = 'aria_tts_preroll_sec';
+
+async function loadPrerollSec(): Promise<number> {
+  try {
+    const raw = await AsyncStorage.getItem(TTS_PREROLL_STORAGE_KEY);
+    if (raw != null) {
+      const n = parseFloat(raw);
+      if (isFinite(n) && n >= TTS_PREROLL_MIN_SEC && n <= TTS_PREROLL_MAX_SEC) {
+        return n;
+      }
+    }
+  } catch {}
+  return TTS_PREROLL_DEFAULT_SEC;
+}

 // --- Audio-Service ---

@@ -55,10 +133,30 @@ class AudioService {
  private recorder: AudioRecorderPlayer;
  private recordingPath: string = '';

+  // Audio queue for sequential TTS playback
+  private audioQueue: string[] = [];
+  private isPlaying: boolean = false;
+  private preloadedSound: Sound | null = null;
+  private preloadedPath: string = '';
+
+  // Speech gate: only send the recording if speech was actually detected
+  private speechDetected: boolean = false;
+  private speechStartTime: number = 0;
+
+  // PCM stream (XTTS): active session + cache buffer per messageId
+  private pcmStreamActive: boolean = false;
+  private pcmMessageId: string = '';
+  private pcmSampleRate: number = 24000;
+  private pcmChannels: number = 1;
+  private pcmBuffer: string[] = []; // base64 chunks used to build the WAV later
+  private pcmBytesCollected: number = 0;
+  private readonly PCM_MAX_CACHE_BYTES = 30 * 1024 * 1024; // 30MB
+
  // VAD State
  private vadEnabled: boolean = false;
  private lastSpeechTime: number = 0;
  private vadTimer: ReturnType<typeof setInterval> | null = null;
+  private maxDurationTimer: ReturnType<typeof setTimeout> | null = null;

  constructor() {
    this.recorder = new AudioRecorderPlayer();
@@ -108,6 +206,10 @@ class AudioService {
      // Laufende Wiedergabe stoppen (damit ARIA sich nicht selbst hoert)
      this.stopPlayback();

+      // Cleanup: delete old aria_recording_ and aria_tts_ files
+      // (guards against cache overflow in conversation mode over many cycles)
+      this._cleanupStaleCacheFiles().catch(() => {});
+
      this.recordingPath = `${RNFS.CachesDirectoryPath}/aria_recording_${Date.now()}.mp4`;

      // Aufnahme mit Metering starten
@@ -115,6 +217,8 @@ class AudioService {
        AudioEncoderAndroid: AudioEncoderAndroidType.AAC,
        AudioSourceAndroid: AudioSourceAndroidType.MIC,
        OutputFormatAndroid: OutputFormatAndroidType.MPEG_4,
+        AudioSamplingRateAndroid: 16000,
+        AudioChannelsAndroid: 1,
      }, true); // meteringEnabled = true

      // Metering-Callback
@@ -122,7 +226,21 @@ class AudioService {
        const db = e.currentMetering ?? -160;
        this.meterListeners.forEach(cb => cb(db));

-        // VAD: Stille erkennen
+        // Speech gate: detect whether someone is actually speaking
+        if (db > VAD_SPEECH_THRESHOLD_DB) {
+          if (!this.speechDetected && this.speechStartTime === 0) {
+            this.speechStartTime = Date.now();
+          }
+          if (this.speechStartTime > 0 && Date.now() - this.speechStartTime >= VAD_SPEECH_MIN_MS) {
+            this.speechDetected = true;
+          }
+        } else {
+          if (!this.speechDetected) {
+            this.speechStartTime = 0; // Reset while not yet recognized as speech
+          }
+        }
+
+        // VAD: detect silence (only once speech has been detected)
        if (this.vadEnabled) {
          if (db > VAD_SILENCE_THRESHOLD_DB) {
            this.lastSpeechTime = Date.now();
@@ -132,18 +250,30 @@ class AudioService {

      this.recordingStartTime = Date.now();
      this.lastSpeechTime = Date.now();
+      this.speechDetected = false;
+      this.speechStartTime = 0;
      this.setState('recording');

-      // VAD aktivieren
+      // Pause other apps during the recording (music, videos, etc.)
+      AudioFocus?.requestExclusive().catch(() => {});
+
+      // Enable VAD; the silence duration comes from AsyncStorage (configurable in Settings)
      this.vadEnabled = autoStop;
      if (autoStop) {
+        const vadSilenceMs = await loadVadSilenceMs();
+        console.log('[Audio] VAD-Stille:', vadSilenceMs, 'ms');
        this.vadTimer = setInterval(() => {
          const silenceDuration = Date.now() - this.lastSpeechTime;
-          if (silenceDuration >= VAD_SILENCE_DURATION_MS) {
+          if (silenceDuration >= vadSilenceMs) {
            console.log(`[Audio] VAD: ${silenceDuration}ms Stille — Auto-Stop`);
            this.silenceListeners.forEach(cb => cb());
          }
        }, 200);
+        // Emergency brake: force a stop after MAX_RECORDING_MS
+        this.maxDurationTimer = setTimeout(() => {
+          console.warn(`[Audio] Max-Dauer ${MAX_RECORDING_MS}ms erreicht — Zwangs-Stop`);
+          this.silenceListeners.forEach(cb => cb());
+        }, MAX_RECORDING_MS);
      }

      console.log('[Audio] Aufnahme gestartet (autoStop: %s)', autoStop);
@@ -168,12 +298,28 @@ class AudioService {
      clearInterval(this.vadTimer);
      this.vadTimer = null;
    }
+    if (this.maxDurationTimer) {
+      clearTimeout(this.maxDurationTimer);
+      this.maxDurationTimer = null;
+    }

    try {
      await this.recorder.stopRecorder();
      this.recorder.removeRecordBackListener();

+      // Release audio focus so other apps may resume
+      AudioFocus?.release().catch(() => {});
+
      const durationMs = Date.now() - this.recordingStartTime;
+      const hadSpeech = this.speechDetected;
+
+      // Speech gate: if no speech was detected → discard the recording
+      if (!hadSpeech) {
+        RNFS.unlink(this.recordingPath).catch(() => {});
+        this.setState('idle');
+        console.log('[Audio] Aufnahme verworfen — keine Sprache erkannt (nur Umgebungsgeraeusche)');
+        return null;
+      }

      // Audio-Datei als Base64 lesen
      const base64Data = await RNFS.readFile(this.recordingPath, 'base64');
@@ -182,7 +328,7 @@ class AudioService {
      RNFS.unlink(this.recordingPath).catch(() => {});

      this.setState('idle');
-      console.log(`[Audio] Aufnahme beendet (${durationMs}ms, ${Math.round(base64Data.length / 1024)}KB)`);
+      console.log(`[Audio] Aufnahme beendet (${durationMs}ms, ${Math.round(base64Data.length / 1024)}KB, Sprache erkannt)`);

      return {
        base64: base64Data,
@@ -198,47 +344,308 @@ class AudioService {

  // --- Wiedergabe ---

-  /** Base64-kodiertes Audio abspielen (z.B. TTS-Antwort von ARIA) */
+  /** Enqueue Base64-encoded audio and play it */
  async playAudio(base64Data: string): Promise<void> {
    if (!base64Data) return;

-    // Laufende Wiedergabe stoppen
-    this.stopPlayback();
-
-    try {
-      // Base64 -> temporaere WAV-Datei -> Sound abspielen
-      const tmpPath = `${RNFS.CachesDirectoryPath}/aria_tts_${Date.now()}.wav`;
-      await RNFS.writeFile(tmpPath, base64Data, 'base64');
-
-      this.currentSound = new Sound(tmpPath, '', (error) => {
-        if (error) {
-          console.error('[Audio] Fehler beim Laden:', error);
-          RNFS.unlink(tmpPath).catch(() => {});
-          return;
-        }
-        this.currentSound?.play((success) => {
-          if (success) {
-            console.log('[Audio] Wiedergabe abgeschlossen');
-          } else {
-            console.warn('[Audio] Wiedergabe fehlgeschlagen');
-          }
-          this.currentSound?.release();
-          this.currentSound = null;
-          RNFS.unlink(tmpPath).catch(() => {});
-        });
-      });
-    } catch (err) {
-      console.error('[Audio] Wiedergabefehler:', err);
+    this.audioQueue.push(base64Data);
+    if (!this.isPlaying) {
+      this._playNext();
    }
  }

-  /** Laufende Wiedergabe stoppen */
+  /** Persist Base64 audio. Returns a file:// path (or an empty string on error). */
+  async cacheAudio(base64Data: string, messageId: string): Promise<string> {
+    if (!base64Data || !messageId) return '';
+    try {
+      const dir = `${RNFS.DocumentDirectoryPath}/tts_cache`;
+      await RNFS.mkdir(dir).catch(() => {});
+      const path = `${dir}/${messageId}.wav`;
+      // If the file already exists (e.g. XTTS chunks) it should be appended to, not overwritten
+      const exists = await RNFS.exists(path);
+      if (exists) {
+        // Intended: load existing + new Base64 and join them (for now: overwrite).
+        // XTTS sends several chunks; with repeated overwrites only the last one remains.
+        // A real concatenation would require merging the WAV headers.
+        await RNFS.writeFile(path, base64Data, 'base64');
+      } else {
+        await RNFS.writeFile(path, base64Data, 'base64');
+      }
+      return `file://${path}`;
+    } catch (err) {
+      console.warn('[Audio] cacheAudio fehlgeschlagen:', err);
+      return '';
+    }
+  }
+
+  /** Receive one PCM chunk from an audio_pcm message.
+   *  silent=true → cache only, do not play (e.g. when TTS is muted on this device).
+   *  On final=true returns the cache path (file://), or '' if nothing was cached. */
+  async handlePcmChunk(payload: {
+    base64: string;
+    sampleRate?: number;
+    channels?: number;
+    messageId?: string;
+    chunk?: number;
+    final?: boolean;
+    silent?: boolean;
+  }): Promise<string> {
+    const silent = !!payload.silent;
+    if (!silent && !PcmStreamPlayer) {
+      console.warn('[Audio] PcmStreamPlayer Native Module nicht verfuegbar');
+      return '';
+    }
+
+    const messageId = payload.messageId || '';
+    const sampleRate = payload.sampleRate || 24000;
+    const channels = payload.channels || 1;
+    const base64 = payload.base64 || '';
+    const isFinal = !!payload.final;
+
+    // New stream? (messageId changed or no stream active)
+    if (!this.pcmStreamActive || this.pcmMessageId !== messageId) {
+      if (this.pcmStreamActive && !silent) {
+        try { await PcmStreamPlayer!.stop(); } catch {}
+        this.pcmBuffer = [];
+        this.pcmBytesCollected = 0;
+      }
+      this.pcmStreamActive = true;
+      this.pcmMessageId = messageId;
+      this.pcmSampleRate = sampleRate;
+      this.pcmChannels = channels;
+      this.pcmBuffer = [];
+      this.pcmBytesCollected = 0;
+      if (!silent) {
+        const prerollSec = await loadPrerollSec();
+        try {
+          await PcmStreamPlayer!.start(sampleRate, channels, prerollSec);
+        } catch (err) {
+          console.error('[Audio] PcmStreamPlayer.start fehlgeschlagen:', err);
+          this.pcmStreamActive = false;
+          return '';
+        }
+        AudioFocus?.requestDuck().catch(() => {});
+      }
+    }
+
+    // Chunk: always cache it; also play it only when !silent
+    if (base64) {
+      if (!silent) {
+        try { await PcmStreamPlayer!.writeChunk(base64); } catch (err) { console.warn('[Audio] writeChunk', err); }
+      }
+      if (messageId && this.pcmBytesCollected < this.PCM_MAX_CACHE_BYTES) {
+        this.pcmBuffer.push(base64);
+        this.pcmBytesCollected += Math.floor(base64.length * 0.75);
+      }
+    }
+
+    if (isFinal) {
+      if (!silent) {
+        // end() now resolves only once the native writer thread has finished
+        // (all samples played out); release AudioFocus only after that, so
+        // Spotify/YouTube do not turn back up while the pre-roll tail is
+        // still playing.
+        try { await PcmStreamPlayer!.end(); } catch {}
+        AudioFocus?.release().catch(() => {});
+      }
+      this.pcmStreamActive = false;
+
+      if (messageId && this.pcmBuffer.length > 0) {
+        const audioPath = await this._savePcmBufferAsWav(messageId);
+        this.pcmBuffer = [];
+        this.pcmBytesCollected = 0;
+        this.pcmMessageId = '';
+        return audioPath;
+      }
+      this.pcmMessageId = '';
+    }
+    return '';
+  }
+
+  /** Save the collected PCM chunks as a WAV file. Returns a file:// path ('' on failure). */
+  private async _savePcmBufferAsWav(messageId: string): Promise<string> {
+    try {
+      const dir = `${RNFS.DocumentDirectoryPath}/tts_cache`;
+      await RNFS.mkdir(dir).catch(() => {});
+      const path = `${dir}/${messageId}.wav`;
+
+      // WAV header for PCM s16le
+      const sampleRate = this.pcmSampleRate;
+      const channels = this.pcmChannels;
+      const bitsPerSample = 16;
+      const byteRate = sampleRate * channels * bitsPerSample / 8;
+      const blockAlign = channels * bitsPerSample / 8;
+      const dataSize = this.pcmBytesCollected;
+      const fileSize = 36 + dataSize;
+
+      // Build the 44-byte header
+      const header = new Uint8Array(44);
+      const dv = new DataView(header.buffer);
+      // "RIFF"
+      header[0] = 0x52; header[1] = 0x49; header[2] = 0x46; header[3] = 0x46;
+      dv.setUint32(4, fileSize, true);
+      // "WAVE"
+      header[8] = 0x57; header[9] = 0x41; header[10] = 0x56; header[11] = 0x45;
+      // "fmt "
+      header[12] = 0x66; header[13] = 0x6d; header[14] = 0x74; header[15] = 0x20;
+      dv.setUint32(16, 16, true);  // fmt chunk size
+      dv.setUint16(20, 1, true);    // PCM format
+      dv.setUint16(22, channels, true);
+      dv.setUint32(24, sampleRate, true);
+      dv.setUint32(28, byteRate, true);
+      dv.setUint16(32, blockAlign, true);
+      dv.setUint16(34, bitsPerSample, true);
+      // "data"
+      header[36] = 0x64; header[37] = 0x61; header[38] = 0x74; header[39] = 0x61;
+      dv.setUint32(40, dataSize, true);
+
+      // Encode the header as base64
+      let headerB64 = '';
+      const chunk = 1024;
+      for (let i = 0; i < header.length; i += chunk) {
+        headerB64 += String.fromCharCode(...Array.from(header.slice(i, i + chunk)));
+      }
+      headerB64 = btoaSafe(headerB64);
+
+      // Write the file: header followed by all PCM chunks
+      await RNFS.writeFile(path, headerB64, 'base64');
+      for (const b64 of this.pcmBuffer) {
+        await RNFS.appendFile(path, b64, 'base64');
+      }
+      console.log(`[Audio] PCM-Cache geschrieben: ${path} (${(dataSize / 1024).toFixed(0)}KB, ${this.pcmBuffer.length} chunks)`);
+      return `file://${path}`;
+    } catch (err) {
+      console.warn('[Audio] _savePcmBufferAsWav fehlgeschlagen:', err);
+      return '';
+    }
+  }
+
+  /** Load audio from a local file (file:// path), enqueue it and play it. */
+  async playFromPath(filePath: string): Promise<void> {
+    if (!filePath) return;
+    try {
+      const cleanPath = filePath.replace(/^file:\/\//, '');
+      if (!(await RNFS.exists(cleanPath))) {
+        console.warn('[Audio] Cache-Datei existiert nicht mehr:', cleanPath);
+        return;
+      }
+      const b64 = await RNFS.readFile(cleanPath, 'base64');
+      this.playAudio(b64);
+    } catch (err) {
+      console.warn('[Audio] playFromPath fehlgeschlagen:', err);
+    }
+  }
+
+  // Callbacks fired once all queued audio parts have been played
+  private playbackFinishedListeners: (() => void)[] = [];
+
+  onPlaybackFinished(callback: () => void): () => void {
+    this.playbackFinishedListeners.push(callback);
+    return () => {
+      this.playbackFinishedListeners = this.playbackFinishedListeners.filter(cb => cb !== callback);
+    };
+  }
+
+  /** Play the next audio item from the queue */
+  private async _playNext(): Promise<void> {
+    if (this.audioQueue.length === 0) {
+      this.isPlaying = false;
+      // Release audio focus so other apps return to full volume
+      AudioFocus?.release().catch(() => {});
+      // All audio parts played: notify the listeners
+      this.playbackFinishedListeners.forEach(cb => cb());
+      return;
+    }
+
+    // On the first playback start: duck other apps
+    if (!this.isPlaying) {
+      AudioFocus?.requestDuck().catch(() => {});
+    }
+    this.isPlaying = true;
+
+    // Use the preloaded sound if available, otherwise load it now
+    let sound: Sound;
+    let soundPath: string;
+
+    if (this.preloadedSound) {
+      sound = this.preloadedSound;
+      soundPath = this.preloadedPath;
+      this.preloadedSound = null;
+      this.preloadedPath = '';
+      // Remove the data from the queue (it was already preloaded)
+      this.audioQueue.shift();
+    } else {
+      const base64Data = this.audioQueue.shift()!;
+      try {
+        soundPath = `${RNFS.CachesDirectoryPath}/aria_tts_${Date.now()}.wav`;
+        await RNFS.writeFile(soundPath, base64Data, 'base64');
+        sound = await new Promise<Sound>((resolve, reject) => {
+          const s = new Sound(soundPath, '', (err) => err ? reject(err) : resolve(s));
+        });
+      } catch (err) {
+        console.error('[Audio] Laden fehlgeschlagen:', err);
+        this._playNext();
+        return;
+      }
+    }
+
+    this.currentSound = sound;
+
+    // Prepare the next audio item while this one is playing
+    this._preloadNext();
+
+    sound.play((success) => {
+      if (!success) console.warn('[Audio] Wiedergabe fehlgeschlagen');
+      sound.release();
+      this.currentSound = null;
+      RNFS.unlink(soundPath).catch(() => {});
+      this._playNext();
+    });
+  }
+
+  /** Preload the next audio item in the background (prevents stuttering) */
+  private async _preloadNext(): Promise<void> {
+    if (this.audioQueue.length === 0 || this.preloadedSound) return;
+
+    const base64Data = this.audioQueue[0]; // No shift: the item stays in the queue
+    try {
+      const tmpPath = `${RNFS.CachesDirectoryPath}/aria_tts_pre_${Date.now()}.wav`;
+      await RNFS.writeFile(tmpPath, base64Data, 'base64');
+      this.preloadedSound = await new Promise<Sound>((resolve, reject) => {
+        const s = new Sound(tmpPath, '', (err) => err ? reject(err) : resolve(s));
+      });
+      this.preloadedPath = tmpPath;
+    } catch {
+      this.preloadedSound = null;
+      this.preloadedPath = '';
+    }
+  }
+
+  /** Stop the running playback and clear the queue */
  stopPlayback(): void {
+    this.audioQueue = [];
+    this.isPlaying = false;
    if (this.currentSound) {
      this.currentSound.stop();
      this.currentSound.release();
      this.currentSound = null;
    }
+    if (this.preloadedSound) {
+      this.preloadedSound.release();
+      this.preloadedSound = null;
+      if (this.preloadedPath) RNFS.unlink(this.preloadedPath).catch(() => {});
+      this.preloadedPath = '';
+    }
+    // Also hard-stop the PCM stream (cancel/abort)
+    if (this.pcmStreamActive) {
+      PcmStreamPlayer?.stop().catch(() => {});
+      this.pcmStreamActive = false;
+      this.pcmBuffer = [];
+      this.pcmBytesCollected = 0;
+      this.pcmMessageId = '';
+    }
+    // Release audio focus
+    AudioFocus?.release().catch(() => {});
  }

  // --- Status & Callbacks ---
@@ -277,6 +684,46 @@ class AudioService {
      this.stateListeners.forEach(cb => cb(state));
    }
  }
+
+  /** Delete stale recording and TTS files from the cache (older than 30 s). */
+  private async _cleanupStaleCacheFiles(): Promise<void> {
+    try {
+      const files = await RNFS.readDir(RNFS.CachesDirectoryPath);
+      const now = Date.now();
+      for (const f of files) {
+        if (!f.isFile()) continue;
+        if (!f.name.startsWith('aria_recording_') && !f.name.startsWith('aria_tts_')) continue;
+        const age = now - (f.mtime ? f.mtime.getTime() : 0);
+        if (age > 30000) {
+          await RNFS.unlink(f.path).catch(() => {});
+        }
+      }
+    } catch {
+      // silent: cleanup is best-effort
+    }
+  }
+
+  /** Delete old TTS cache files that are no longer referenced (older than maxAgeDays, default 30). */
+  async cleanupOldTTSCache(keepMessageIds: Set<string>, maxAgeDays = 30): Promise<void> {
+    try {
+      const dir = `${RNFS.DocumentDirectoryPath}/tts_cache`;
+      if (!(await RNFS.exists(dir))) return;
+      const files = await RNFS.readDir(dir);
+      const maxAgeMs = maxAgeDays * 24 * 60 * 60 * 1000;
+      const now = Date.now();
+      for (const f of files) {
+        if (!f.isFile() || !f.name.endsWith('.wav')) continue;
+        const messageId = f.name.replace(/\.wav$/, '');
+        const age = now - (f.mtime ? f.mtime.getTime() : 0);
+        // Delete only if: no longer referenced AND older than maxAgeDays
+        if (!keepMessageIds.has(messageId) && age > maxAgeMs) {
+          await RNFS.unlink(f.path).catch(() => {});
+        }
+      }
+    } catch {
+      // silent
+    }
+  }
 }

 // Singleton
@@ -12,7 +12,7 @@ import AsyncStorage from '@react-native-async-storage/async-storage';

 export type ConnectionState = 'connecting' | 'connected' | 'disconnected';

-export type MessageType = 'chat' | 'audio' | 'file' | 'location' | 'mode' | 'log' | 'event';
+export type MessageType = 'chat' | 'audio' | 'file' | 'location' | 'mode' | 'log' | 'event' | 'update_available' | string;

 export interface RVSMessage {
  type: MessageType;
@@ -0,0 +1,158 @@
+/**
+ * Auto-Update Service: checks for and installs app updates via RVS
+ *
+ * Flow:
+ * 1. App sends "update_check" with its current version to RVS
+ * 2. RVS compares versions and sends "update_available" with a download URL
+ * 3. App shows a notification, the user confirms, then download + install
+ */
+
+import { Alert, Linking, Platform, NativeModules } from 'react-native';
+import RNFS from 'react-native-fs';
+import rvs, { RVSMessage } from './rvs';
+
+// Version from package.json (embedded at build time)
+const packageJson = require('../../package.json');
+const APP_VERSION = packageJson.version || '0.0.0.0';
+
+type UpdateCallback = (info: UpdateInfo) => void;
+
+export interface UpdateInfo {
+  version: string;
+  downloadUrl: string;
+  size: number;
+}
+
+class UpdateService {
+  private listeners: UpdateCallback[] = [];
+  private checking = false;
+  private downloading = false;
+
+  constructor() {
+    // Listen for update_available messages
+    rvs.onMessage((msg: RVSMessage) => {
+      if (msg.type === 'update_available') {
+        const info: UpdateInfo = {
+          version: (msg.payload.version as string) || '',
+          downloadUrl: (msg.payload.downloadUrl as string) || '',
+          size: (msg.payload.size as number) || 0,
+        };
+        if (info.version && this.isNewer(info.version)) {
+          console.log(`[Update] Neue Version verfuegbar: ${info.version} (aktuell: ${APP_VERSION})`);
+          this.listeners.forEach(cb => cb(info));
+        }
+      }
+    });
+  }
+
+  /** Check for an update at app start */
+  checkForUpdate(): void {
+    if (this.checking) return;
+    this.checking = true;
+
+    console.log(`[Update] Pruefe auf Updates (aktuell: ${APP_VERSION})`);
+    rvs.send('update_check' as any, { version: APP_VERSION });
+
+    setTimeout(() => { this.checking = false; }, 10000);
+  }
+
+  /** Register a callback; returns an unsubscribe function */
+  onUpdateAvailable(callback: UpdateCallback): () => void {
+    this.listeners.push(callback);
+    return () => {
+      this.listeners = this.listeners.filter(cb => cb !== callback);
+    };
+  }
+
+  /** Show the update dialog */
+  promptUpdate(info: UpdateInfo): void {
+    const sizeMB = (info.size / 1024 / 1024).toFixed(1);
+    Alert.alert(
+      'ARIA Update verfuegbar',
+      `Version ${info.version} (${sizeMB} MB)\n\nAktuell: ${APP_VERSION}\n\nJetzt herunterladen und installieren?`,
+      [
+        { text: 'Spaeter', style: 'cancel' },
+        {
+          text: 'Installieren',
+          onPress: () => this.downloadAndInstall(info),
+        },
+      ],
+    );
+  }
+
+  /** Download the APK over WebSocket and install it */
+  async downloadAndInstall(info: UpdateInfo): Promise<void> {
+    if (this.downloading) return;
+    this.downloading = true;
+
+    try {
+      console.log(`[Update] Fordere APK v${info.version} an...`);
+      Alert.alert('Download gestartet', `Version ${info.version} wird ueber RVS heruntergeladen...`);
+
+      // Request the APK over WebSocket
+      rvs.send('update_download' as any, {});
+
+      // Wait for update_data (one-shot); on timeout also unsubscribe so the listener does not leak
+      const apkData = await new Promise<{base64: string, fileName: string}>((resolve, reject) => {
+        const timeout = setTimeout(() => { unsub(); reject(new Error('Download-Timeout (60s)')); }, 60000);
+        const unsub = rvs.onMessage((msg: RVSMessage) => {
+          if ((msg.type as string) === 'update_data') {
+            clearTimeout(timeout);
+            unsub();
+            if (msg.payload.error) {
+              reject(new Error(msg.payload.error as string));
+            } else {
+              resolve({
+                base64: msg.payload.base64 as string,
+                fileName: msg.payload.fileName as string || `ARIA-${info.version}.apk`,
+              });
+            }
+          }
+        });
+      });
+
+      // Save the base64 payload as an APK file
+      const destPath = `${RNFS.CachesDirectoryPath}/${apkData.fileName}`;
+      await RNFS.writeFile(destPath, apkData.base64, 'base64');
+      const fileSize = await RNFS.stat(destPath);
+      console.log(`[Update] APK gespeichert: ${destPath} (${(parseInt(fileSize.size) / 1024 / 1024).toFixed(1)}MB)`);
+
+      // Install the APK via the native ApkInstaller module (FileProvider + Intent)
+      if (Platform.OS === 'android') {
+        try {
+          const { ApkInstaller } = NativeModules;
+          await ApkInstaller.install(destPath);
+        } catch (installErr: any) {
+          Alert.alert(
+            'APK heruntergeladen',
+            `Version ${info.version} gespeichert.\n\nBitte manuell installieren:\nDateimanager → ${apkData.fileName} antippen.\n\n(${installErr.message})`,
+          );
+        }
+      }
+    } catch (err: any) {
+      console.error(`[Update] Fehler: ${err.message}`);
+      Alert.alert('Update fehlgeschlagen', err.message);
+    } finally {
+      this.downloading = false;
+    }
+  }
+
+  /** Version comparison: dot-separated numeric segments, compared left to right; missing segments count as 0. */
+  private isNewer(remote: string): boolean {
+    const r = remote.split('.').map(Number);
+    const l = APP_VERSION.split('.').map(Number);
+    for (let i = 0; i < Math.max(r.length, l.length); i++) {
+      const diff = (r[i] || 0) - (l[i] || 0);
+      if (diff > 0) return true;
+      if (diff < 0) return false;
+    }
+    return false;
+  }
+
+  getCurrentVersion(): string {
+    return APP_VERSION;
+  }
+}
+
+const updateService = new UpdateService();
+export default updateService;
@@ -1,21 +1,13 @@
 /**
- * Wake Word Service — "ARIA" Erkennung
+ * Conversation mode: "ear button"
 *
- * Nutzt react-native-live-audio-stream fuer kontinuierliches Mikrofon-Monitoring.
- * Erkennt Sprache per Energie-Schwellwert und sendet kurze Audio-Clips
- * zur serverseitigen Wake-Word-Pruefung (openwakeword in der Bridge).
+ * When active: recording starts automatically after every ARIA reply (TTS finished).
+ * Like a walkie-talkie / natural conversation:
+ *   ARIA speaks → recording starts → user speaks → VAD stops → ARIA replies → ...
 *
- * Architektur:
- *   App (Mikrofon) → Energie-Erkennung → Audio-Buffer
- *   → RVS "wake_check" → Bridge → openwakeword → Bestaetigung
- *   → App startet Aufnahme
- *
- * Aktuell (Phase 1): Einfacher Tap-to-Talk + Auto-Stop.
- * Spaeter (Phase 2): Porcupine on-device "ARIA" Keyword.
+ * Phase 2 (planned): Porcupine "ARIA" wake word for passive listening.
 */

-import LiveAudioStream from 'react-native-live-audio-stream';
-
 type WakeWordCallback = () => void;
 type StateCallback = (state: WakeWordState) => void;

@@ -25,72 +17,40 @@ class WakeWordService {
  private state: WakeWordState = 'off';
  private wakeCallbacks: WakeWordCallback[] = [];
  private stateCallbacks: StateCallback[] = [];
-  private isInitialized = false;

-  /** Wake Word Erkennung starten */
+  /** Start conversation mode */
  async start(): Promise<boolean> {
    if (this.state === 'listening') return true;
-
-    try {
-      if (!this.isInitialized) {
-        LiveAudioStream.init({
-          sampleRate: 16000,
-          channels: 1,
-          bitsPerSample: 16,
-          audioSource: 6, // VOICE_RECOGNITION
-          bufferSize: 4096,
-        });
-        this.isInitialized = true;
+    console.log('[WakeWord] Gespraechsmodus aktiviert — starte sofort Aufnahme');
+    this.setState('listening');
+    // Start the first recording right away
+    setTimeout(() => {
+      if (this.state === 'listening') {
+        this.wakeCallbacks.forEach(cb => cb());
      }
+    }, 500);
+    return true;
+  }

-      // Audio-Stream starten und auf Energie pruefen
-      LiveAudioStream.start();
+  /** Stop conversation mode */
+  stop(): void {
+    console.log('[WakeWord] Gespraechsmodus deaktiviert');
+    this.setState('off');
+  }

-      LiveAudioStream.on('data', (base64Chunk: string) => {
-        if (this.state !== 'listening') return;
-
-        // Base64 → Int16 Array → RMS berechnen
-        const raw = this._base64ToInt16(base64Chunk);
-        const rms = this._calculateRMS(raw);
-
-        // Schwellwert: wenn laut genug → Wake Word erkannt
-        // Phase 1: Einfache Energie-Erkennung (jemand spricht)
-        // Phase 2: Porcupine "ARIA" Keyword
-        if (rms > 2000) {
-          this.setState('detected');
-          this.wakeCallbacks.forEach(cb => cb());
-          // Nach Detection kurz pausieren, Aufnahme uebernimmt das Mikrofon
-          this.stop();
-        }
-      });
-
-      this.setState('listening');
-      console.log('[WakeWord] Listening gestartet');
-      return true;
-    } catch (err) {
-      console.error('[WakeWord] Start fehlgeschlagen:', err);
-      return false;
+  /** After an ARIA reply (TTS finished): start recording automatically */
+  async resume(): Promise<void> {
+    if (this.state !== 'listening') return;
+    // Short pause so the TTS audio does not bleed into the microphone
+    await new Promise(resolve => setTimeout(resolve, 800));
+    if (this.state === 'listening') {
+      console.log('[WakeWord] TTS fertig — starte automatisch Aufnahme');
+      this.wakeCallbacks.forEach(cb => cb());
    }
  }

-  /** Wake Word Erkennung stoppen */
-  stop(): void {
-    if (this.state === 'off') return;
-    try {
-      LiveAudioStream.stop();
-    } catch {}
-    this.setState('off');
-    console.log('[WakeWord] Gestoppt');
-  }
-
-  /** Nach Aufnahme erneut starten */
-  async resume(): Promise<void> {
-    // Kurze Pause damit Aufnahme das Mikrofon freigeben kann
-    setTimeout(() => {
-      if (this.state === 'off') {
-        this.start();
-      }
-    }, 500);
+  isActive(): boolean {
+    return this.state === 'listening';
  }

  // --- Callbacks ---
@@ -113,32 +73,12 @@ class WakeWordService {
    return this.state;
  }

-  // --- Hilfsfunktionen ---
-
  private setState(state: WakeWordState): void {
    if (this.state !== state) {
      this.state = state;
      this.stateCallbacks.forEach(cb => cb(state));
    }
  }
-
-  private _base64ToInt16(base64: string): Int16Array {
-    const binary = atob(base64);
-    const bytes = new Uint8Array(binary.length);
-    for (let i = 0; i < binary.length; i++) {
-      bytes[i] = binary.charCodeAt(i);
-    }
-    return new Int16Array(bytes.buffer);
-  }
-
-  private _calculateRMS(samples: Int16Array): number {
-    if (samples.length === 0) return 0;
-    let sum = 0;
-    for (let i = 0; i < samples.length; i++) {
-      sum += samples[i] * samples[i];
-    }
-    return Math.sqrt(sum / samples.length);
-  }
 }

 const wakeWordService = new WakeWordService();
@@ -3,9 +3,12 @@
 # → localhost is aria-core
 ARIA_CORE_WS=ws://127.0.0.1:18789

-# Piper TTS Stimmen
-PIPER_RAMONA=/voices/de_DE-ramona-low.onnx
-PIPER_THORSTEN=/voices/de_DE-thorsten-high.onnx
-
 # Wake-Word
 WAKE_WORD=aria
+
+# Whisper STT: switched at runtime in Diagnostic (section "Whisper")
+# and stored in /shared/config/voice_config.json. The value here is only the
+# initial default on first start.
+# Options: tiny | base | small | medium | large-v3
+WHISPER_MODEL=medium
+WHISPER_LANGUAGE=de
@@ -1,6 +1,6 @@
 # ════════════════════════════════════════════════
 #  ARIA Voice Bridge — Dockerfile
-#  Whisper STT + Piper TTS + Wake-Word
+#  Whisper STT + Wake-Word (TTS via XTTS v2 remote)
 # ════════════════════════════════════════════════

 FROM python:3.12-slim
@@ -91,6 +91,39 @@ _ACTIVATION_MAP: dict[str, Mode] = {
    mode.config.activation_phrase.lower(): mode for mode in Mode
 }

+# ID mapping for API mode switches (e.g. the app's ModeSelector sends 'normal')
+_ID_MAP: dict[str, Mode] = {
+    "normal": Mode.NORMAL,
+    "nicht_stoeren": Mode.DND,
+    "dnd": Mode.DND,
+    "fluester": Mode.WHISPER,
+    "whisper": Mode.WHISPER,
+    "hangar": Mode.HANGAR,
+    "gaming": Mode.GAMING,
+}
+
+
+def mode_from_id(mode_id: str) -> Optional[Mode]:
+    """ID-based mapping for API mode switches (no activation phrase)."""
+    if not mode_id:
+        return None
+    return _ID_MAP.get(mode_id.strip().lower())
+
+
+# Canonical IDs for broadcasts (they match the app UI IDs in ModeSelector)
+_CANONICAL_ID: dict[Mode, str] = {
+    Mode.NORMAL: "normal",
+    Mode.DND: "nicht_stoeren",
+    Mode.WHISPER: "fluester",
+    Mode.HANGAR: "hangar",
+    Mode.GAMING: "gaming",
+}
+
+
+def canonical_id(mode: Mode) -> str:
+    """Canonical ID that the app, Diagnostic and Bridge all know."""
+    return _CANONICAL_ID.get(mode, mode.name.lower())
+

 def detect_mode_switch(text: str) -> Optional[Mode]:
     """Detects whether a text contains a mode switch.
@@ -5,8 +5,7 @@
 # STT: Whisper (local, no API needed)
 faster-whisper

-# TTS — Piper (offline, deutsche Stimmen)
-piper-tts
+# TTS: runs remotely via XTTS v2 on the gaming PC (no local deps needed)

 # WebSocket connection to aria-core
 websockets
@@ -0,0 +1,44 @@
+#!/bin/bash
+# ARIA Docker Cleanup
+#
+# Default:   docker builder prune + image prune (safe, deletes no volumes)
+# --full:    Full cleanup incl. --volumes (careful with unused volumes!)
+#
+# Usage:
+#   ./cleanup.sh           # safe cleanup
+#   ./cleanup.sh --full    # aggressive cleanup (incl. volumes)
+
+set -e
+
+FULL=0
+for arg in "$@"; do
+  case "$arg" in
+    --full|-f) FULL=1 ;;
+    -h|--help)
+      grep '^#' "$0" | sed 's/^# \{0,1\}//'
+      exit 0
+      ;;
+  esac
+done
+
+echo "── Docker Speicher VOR Cleanup ───────────────────"
+docker system df
+echo
+
+if [ "$FULL" = "1" ]; then
+  echo ">>> VOLLE Reinigung (inkl. ungenutzter Volumes)"
+  read -p "Wirklich? [y/N] " -n 1 -r REPLY
+  echo
+  [[ ! $REPLY =~ ^[Yy]$ ]] && { echo "Abgebrochen."; exit 0; }
+  docker system prune -a --volumes -f
+else
+  echo ">>> Sicherer Cleanup (Build-Cache + ungenutzte Images)"
+  docker builder prune -a -f
+  docker image prune -a -f
+fi
+
+echo
+echo "── Docker Speicher NACH Cleanup ──────────────────"
+docker system df
+echo
+df -h / | head -2
@@ -37,15 +37,76 @@ const state = {
 };
 const SESSION_KEY_FILE = "/data/active-session";
 // Ensure the /data directory exists (volume mount)
-try { fs.mkdirSync("/data", { recursive: true }); } catch {}
+try { fs.mkdirSync("/data", { recursive: true }); } catch (e) {
+  console.error(`[startup] /data mkdir fehlgeschlagen: ${e.message}`);
+}
+// sessionFromFile indicates whether the active key came from the file.
+// If true, resolveActiveSession must NOT auto-pick anymore (respect the choice).
+let sessionFromFile = false;
 let activeSessionKey = (() => {
  try {
    const saved = fs.readFileSync(SESSION_KEY_FILE, "utf-8").trim();
-    if (saved) { console.log(`[startup] Gespeicherte Session geladen: '${saved}'`); return saved; }
-  } catch {}
+    if (saved) {
+      console.log(`[startup] Gespeicherte Session geladen: '${saved}'`);
+      sessionFromFile = true;
+      return saved;
+    }
+  } catch (e) {
+    console.error(`[startup] SESSION_KEY_FILE read: ${e.code || e.message}`);
+  }
  console.log("[startup] Keine gespeicherte Session — Fallback 'main'");
  return "main";
 })();
+
+// ── Runtime-Config: /shared/config/runtime.json ─────────────
+// ENV values are defaults; values from runtime.json take precedence.
+// The Bridge and possibly other components read the same file.
+const RUNTIME_CONFIG_FILE = "/shared/config/runtime.json";
+const RUNTIME_CONFIG_FIELDS = [
+  "RVS_HOST", "RVS_PORT", "RVS_TLS", "RVS_TOKEN",
+  "ARIA_AUTH_TOKEN", "WHISPER_MODEL", "WHISPER_LANGUAGE",
+];
+function readRuntimeConfig() {
+  const envDefaults = {
+    RVS_HOST, RVS_PORT, RVS_TLS, RVS_TOKEN,
+    ARIA_AUTH_TOKEN: process.env.ARIA_AUTH_TOKEN || "",
+    WHISPER_MODEL: process.env.WHISPER_MODEL || "medium",
+    WHISPER_LANGUAGE: process.env.WHISPER_LANGUAGE || "de",
+  };
+  try {
+    const raw = fs.readFileSync(RUNTIME_CONFIG_FILE, "utf-8");
+    const parsed = JSON.parse(raw);
+    return { ...envDefaults, ...parsed };
+  } catch {
+    return envDefaults;
+  }
+}
+function writeRuntimeConfig(patch) {
+  let current = {};
+  try { current = JSON.parse(fs.readFileSync(RUNTIME_CONFIG_FILE, "utf-8")); } catch {}
+  for (const key of Object.keys(patch)) {
+    if (RUNTIME_CONFIG_FIELDS.includes(key)) current[key] = patch[key];
+  }
+  fs.mkdirSync("/shared/config", { recursive: true });
+  const tmp = RUNTIME_CONFIG_FILE + ".tmp";
+  fs.writeFileSync(tmp, JSON.stringify(current, null, 2));
+  fs.renameSync(tmp, RUNTIME_CONFIG_FILE);
+}
+
+// Atomic write: temp file + rename, loud logs on failure.
+function persistActiveSession(key) {
+  try {
+    const tmp = SESSION_KEY_FILE + ".tmp";
+    fs.writeFileSync(tmp, key);
+    fs.renameSync(tmp, SESSION_KEY_FILE);
+    sessionFromFile = true;
+    console.log(`[session] Aktive Session persistiert: '${key}'`);
+    return true;
+  } catch (e) {
+    console.error(`[session] FEHLER beim Persistieren von '${key}': ${e.message}`);
+    return false;
+  }
+}
 const logs = [];
 let gatewayWs = null;
 let rvsWs = null;
@@ -56,6 +117,12 @@ const browserClients = new Set();
 let pipelineActive = false;
 let pipelineStartTime = 0;

+// After chat:final, trailing agent events often still arrive. During this
+// window we suppress agent_activity broadcasts so the thinking indicator
+// does not start up again.
+let lastChatFinalAt = 0;
+const SETTLED_WINDOW_MS = 3000;
+
 function plog(message, level) {
  const elapsed = pipelineActive ? `+${Date.now() - pipelineStartTime}ms` : "";
  const entry = { ts: new Date().toISOString(), level: level || "info", source: "pipeline", message: `${elapsed ? `[${elapsed}] ` : ""}${message}` };
@@ -74,8 +141,8 @@ function pipelineStart(method, text) {
  pipelineStartTime = Date.now();
  if (pipelineTimeout) clearTimeout(pipelineTimeout);
  pipelineTimeout = setTimeout(() => {
-    if (pipelineActive) pipelineEnd(false, "Timeout — keine Antwort nach 60s");
-  }, 60000);
+    if (pipelineActive) pipelineEnd(false, "Timeout — keine Antwort nach 10min");
+  }, 600000);
  plog(`━━━ Pipeline Start: ${method} ━━━`);
  plog(`Nachricht: "${text}"`);
 }
@@ -91,6 +158,9 @@ function pipelineEnd(ok, detail) {
  }
  plog(`━━━ Pipeline Ende ━━━`);
  pipelineActive = false;
+  // ALWAYS reset the thinking indicator, including on timeout/error/abort
+  broadcast({ type: "agent_activity", activity: "idle" });
+  pendingMessageTime = 0;
 }

 // ── Auto-Restart bei Netzwerk-Namespace-Verlust ──────
@@ -257,8 +327,10 @@ async function connectGateway() {
      state.gateway.handshakeOk = false;
      gatewayWs = null;
      broadcastState();
+      // Avoid a stuck "ARIA denkt..." indicator if the gateway dies during a pipeline
+      if (pipelineActive) pipelineEnd(false, `Gateway-Verbindung verloren (${code})`);
+      else broadcast({ type: "agent_activity", activity: "idle" });
      checkGatewayHealth();
-      // Auto-Reconnect nach 5s
      setTimeout(connectGateway, 5000);
    });

@@ -319,10 +391,29 @@ function handleGatewayMessage(msg) {
    if (event === "agent") {
      const data = payload.data || {};
      const delta = data.delta || "";
-      if (delta && payload.stream === "assistant") {
+      const stream = payload.stream || "";
+
+      if (delta && stream === "assistant") {
        broadcast({ type: "chat_delta", delta, payload });
      }
-      // agent Events nicht einzeln loggen (zu viele)
+
+      // After chat:final, cleanup events still trickle in; suppress them so
+      // the thinking indicator does not start up again.
+      const settled = lastChatFinalAt && (Date.now() - lastChatFinalAt) < SETTLED_WINDOW_MS;
+
+      // Detect tool use and broadcast it
+      if (stream === "tool_use" || data.type === "tool_use") {
+        const toolName = data.name || data.tool || payload.tool || "";
+        if (toolName && !settled) {
+          broadcast({ type: "agent_activity", activity: "tool", tool: toolName, data });
+          log("info", "gateway", `Tool: ${toolName}`);
+        }
+      }
+
+      if (!settled) {
+        broadcast({ type: "agent_activity", activity: stream || "thinking" });
+      }
+      updateAgentActivity();
      return;
    }

@@ -335,9 +426,31 @@ function handleGatewayMessage(msg) {
        const runId = payload.runId || "";
        if (runId && seenFinalRuns.has(runId)) return; // Duplikat
        if (runId) { seenFinalRuns.add(runId); setTimeout(() => seenFinalRuns.delete(runId), 60000); }
+
+        // NO_REPLY → ARIA signals "do not answer": end the pipeline but show nothing
+        const trimmed = (text || "").trim().replace(/^["'`*.\s]+|["'`*.\s]+$/g, "").toUpperCase();
+        if (trimmed.startsWith("NO_REPLY")) {
+          log("info", "gateway", "NO_REPLY empfangen — still verworfen");
+          lastChatFinalAt = Date.now();
+          if (pipelineActive) pipelineEnd(true, "NO_REPLY (stumm)");
+          broadcast({ type: "agent_activity", activity: "idle" });
+          pendingMessageTime = 0;
+          updateAgentActivity();
+          return;
+        }
+
        log("info", "gateway", `ANTWORT: "${text.slice(0, 200)}"`);
+        lastChatFinalAt = Date.now();
        if (pipelineActive) pipelineEnd(true, `"${text.slice(0, 120)}"`);
        broadcast({ type: "chat_final", text, payload });
+        broadcast({ type: "agent_activity", activity: "idle" });
+        pendingMessageTime = 0; // Watchdog: reply received
+        updateAgentActivity();
+        // Write the reply to the backup log
+        try {
+          const entry = JSON.stringify({ ts: Date.now(), role: "assistant", text: text.slice(0, 2000), session: activeSessionKey }) + "\n";
+          fs.appendFileSync("/shared/config/chat_backup.jsonl", entry);
+        } catch {}
        return;
      }

@@ -350,6 +463,7 @@ function handleGatewayMessage(msg) {
        const error = payload.error || text || "Unbekannt";
        log("error", "gateway", `Chat-Fehler: ${error}`);
        if (pipelineActive) pipelineEnd(false, error);
+        else broadcast({ type: "agent_activity", activity: "idle" });
        broadcast({ type: "chat_error", error, payload });
        return;
      }
@@ -370,7 +484,9 @@ function handleGatewayMessage(msg) {
      if (runId) { seenFinalRuns.add(runId); setTimeout(() => seenFinalRuns.delete(runId), 60000); }
      const text = extractChatText(payload) || payload.text || "";
      log("info", "gateway", `ANTWORT: "${text.slice(0, 200)}"`);
+      lastChatFinalAt = Date.now();
      if (pipelineActive) pipelineEnd(true, `"${text.slice(0, 120)}"`);
+      else broadcast({ type: "agent_activity", activity: "idle" });
      broadcast({ type: "chat_final", text, payload });
      return;
    }
@@ -378,6 +494,7 @@ function handleGatewayMessage(msg) {
      const error = payload.error || payload.message || "Unbekannt";
      log("error", "gateway", `Chat-Fehler: ${error}`);
      if (pipelineActive) pipelineEnd(false, error);
+      else broadcast({ type: "agent_activity", activity: "idle" });
      broadcast({ type: "chat_error", error, payload });
      return;
    }
@@ -410,8 +527,17 @@ function sendToGateway(text, isPipeline) {
  const payload = JSON.stringify(msg);
  log("debug", "gateway", `RAW >>> ${payload}`);
  gatewayWs.send(payload);
+  pendingMessageTime = Date.now(); // Watchdog: message sent
+  // Write the message to the backup log right away (OpenClaw only persists after the run ends)
+  try {
+    fs.mkdirSync("/shared/config", { recursive: true });
+    const entry = JSON.stringify({ ts: Date.now(), role: "user", text, session: activeSessionKey }) + "\n";
+    fs.appendFileSync("/shared/config/chat_backup.jsonl", entry);
+  } catch {}
  log("info", "gateway", `chat.send [${reqId}]: "${text}"`);
  if (isPipeline) plog(`chat.send [${reqId}] an Gateway gesendet — warte auf ACK...`);
+
+  // Do NOT forward gateway messages to RVS (otherwise the Bridge triggers a duplicate ARIA request)
  return true;
 }

@@ -425,7 +551,13 @@ function connectRVS(forcePlain) {
    return;
  }

-  // TLS-Logik: wss zuerst, bei Fehler Fallback auf ws (wenn erlaubt)
+  // Close the old connection cleanly
+  if (rvsWs) {
+    try { rvsWs.removeAllListeners(); rvsWs.close(); } catch (_) {}
+    rvsWs = null;
+  }
+
+  // TLS logic: try wss first, fall back to ws on error (only if RVS_TLS_FALLBACK is "true")
  const useTls = RVS_TLS === "true" && !forcePlain;
  const proto = useTls ? "wss" : "ws";
  const url = `${proto}://${RVS_HOST}:${RVS_PORT}?token=${RVS_TOKEN}`;
@@ -434,7 +566,18 @@ function connectRVS(forcePlain) {
  broadcastState();
  log("info", "rvs", `Verbinde: ${proto}://${RVS_HOST}:${RVS_PORT}`);

-  const ws = new WebSocket(url);
+  let ws;
+  try {
+    ws = new WebSocket(url);
+  } catch (err) {
+    log("error", "rvs", `WebSocket erstellen fehlgeschlagen: ${err.message}`);
+    if (useTls && RVS_TLS_FALLBACK === "true") {
+      connectRVS(true);
+    }
+    return;
+  }
+
+  let fallbackTriggered = false;

  ws.on("open", () => {
    log("info", "rvs", `Verbunden (${proto})`);
@@ -442,6 +585,16 @@ function connectRVS(forcePlain) {
    state.rvs.lastError = null;
    rvsWs = ws;
    broadcastState();
+
+    // Keepalive: send a ping every 25 s so the connection does not die
+    const keepalive = setInterval(() => {
+      if (ws.readyState === WebSocket.OPEN) {
+        try { ws.ping(); } catch (_) {}
+      } else {
+        clearInterval(keepalive);
+      }
+    }, 25000);
+    ws._keepalive = keepalive;
  });

  ws.on("message", (raw) => {
@@ -449,13 +602,41 @@ function connectRVS(forcePlain) {
      const msg = JSON.parse(raw.toString());
      if (msg.type === "chat" && msg.payload) {
        const sender = msg.payload.sender || "?";
+        // Ignore our own messages (echo)
+        if (sender === "diagnostic") return;
        log("info", "rvs", `Chat von ${sender}: "${(msg.payload.text || "").slice(0, 100)}"`);
-        if (pipelineActive && sender !== "diagnostic") {
+        if (pipelineActive) {
          pipelineEnd(true, `Antwort via RVS von ${sender}: "${(msg.payload.text || "").slice(0, 120)}"`);
        }
        broadcast({ type: "rvs_chat", msg });
+      } else if (msg.type === "file_saved" && msg.payload) {
+        // Image/file upload from the app: show it in the chat
+        const name = msg.payload.name || "?";
+        const serverPath = msg.payload.serverPath || "";
+        const mimeType = msg.payload.mimeType || "";
+        log("info", "rvs", `Datei empfangen: ${name} (${serverPath})`);
+        // Broadcast as a user message with the path (Diagnostic renders images inline)
+        broadcast({ type: "rvs_chat", msg: {
+          type: "chat",
+          payload: { text: `Anhang: ${name}\n${serverPath}`, sender: "user" }
+        }});
      } else if (msg.type === "heartbeat") {
        // ignorieren
+      } else if (msg.type === "mode") {
+        // Mode broadcast from the Bridge → pass on to browser clients
+        log("info", "rvs", `Mode-Broadcast: ${msg.payload?.mode} (${msg.payload?.name})`);
+        broadcast({ type: "mode", payload: msg.payload });
+      } else if (msg.type === "voice_ready") {
+        // XTTS bridge reports that the voice has finished loading → pass on to the browser
+        const v = msg.payload?.voice || "";
+        const err = msg.payload?.error;
+        const ms = msg.payload?.loadMs;
+        if (err) {
+          log("warn", "rvs", `Voice-Ready Fehler fuer "${v}": ${err}`);
+        } else {
+          log("info", "rvs", `Voice "${v || "default"}" geladen${ms ? ` in ${(ms/1000).toFixed(1)}s` : ""}`);
+        }
+        broadcast({ type: "voice_ready", payload: msg.payload });
      } else {
        log("debug", "rvs", `Nachricht: ${JSON.stringify(msg).slice(0, 150)}`);
      }
@@ -464,10 +645,13 @@ function connectRVS(forcePlain) {

  ws.on("close", () => {
    log("warn", "rvs", "Verbindung geschlossen");
+    if (ws._keepalive) clearInterval(ws._keepalive);
    state.rvs.status = "disconnected";
-    rvsWs = null;
+    if (rvsWs === ws) rvsWs = null;
    broadcastState();
-    setTimeout(() => connectRVS(), 5000);
+    if (!fallbackTriggered) {
+      setTimeout(() => connectRVS(), 5000);
+    }
  });

  ws.on("error", (err) => {
@@ -475,31 +659,71 @@ function connectRVS(forcePlain) {
    state.rvs.lastError = err.message;
    broadcastState();

-    // TLS Fallback: wenn wss fehlschlaegt und Fallback erlaubt → ws versuchen
-    if (useTls && RVS_TLS_FALLBACK === "true") {
+    // TLS Fallback
+    if (useTls && RVS_TLS_FALLBACK === "true" && !fallbackTriggered) {
+      fallbackTriggered = true;
      log("warn", "rvs", "TLS fehlgeschlagen — Fallback auf ws://");
-      ws.removeAllListeners();
-      try { ws.close(); } catch (_) {}
+      try { ws.removeAllListeners(); ws.close(); } catch (_) {}
+      if (rvsWs === ws) rvsWs = null;
      connectRVS(true);
    }
  });
 }

-function sendToRVS(text, isPipeline) {
-  if (!rvsWs || rvsWs.readyState !== WebSocket.OPEN) {
-    log("error", "rvs", "Nicht verbunden");
-    if (isPipeline) pipelineEnd(false, "RVS nicht verbunden");
-    return false;
-  }
+function sendToRVS_withResponse(sendType, sendPayload, expectType, clientWs) {
+  if (!RVS_HOST || !RVS_TOKEN) return;
+  const proto = RVS_TLS === "true" ? "wss" : "ws";
+  const url = `${proto}://${RVS_HOST}:${RVS_PORT}?token=${RVS_TOKEN}`;
+  const freshWs = new WebSocket(url);
+  const timeout = setTimeout(() => {
+    try { freshWs.close(); } catch (_) {}
+    try { clientWs.send(JSON.stringify({ type: expectType, payload: { voices: [], error: "Timeout" }, timestamp: Date.now() })); } catch (_) {}
+  }, 15000);
+  freshWs.on("open", () => {
+    freshWs.send(JSON.stringify({ type: sendType, payload: sendPayload, timestamp: Date.now() }));
+  });
+  freshWs.on("message", (raw) => {
+    try {
+      const resp = JSON.parse(raw.toString());
+      if (resp.type === expectType) {
+        clearTimeout(timeout);
+        clientWs.send(JSON.stringify(resp));
+        setTimeout(() => { try { freshWs.close(); } catch (_) {} }, 1000);
+      }
+    } catch {}
+  });
+  freshWs.on("error", () => {});
+}

-  rvsWs.send(JSON.stringify({
+function sendToRVS_raw(msgObj) {
+  if (!RVS_HOST || !RVS_TOKEN) return;
+  const proto = RVS_TLS === "true" ? "wss" : "ws";
+  const url = `${proto}://${RVS_HOST}:${RVS_PORT}?token=${RVS_TOKEN}`;
+  const freshWs = new WebSocket(url);
+  freshWs.on("open", () => {
+    freshWs.send(JSON.stringify(msgObj));
+    setTimeout(() => { try { freshWs.close(); } catch (_) {} }, 5000);
+  });
+  freshWs.on("error", () => {});
+}
+
+function sendToRVS(text, isPipeline) {
+  // Send via the Gateway (reliable) AND to RVS so the app can see the message.
+  // The Bridge receives RVS messages from the app reliably,
+  // but the Diagnostic→RVS→Bridge route has zombie problems.
+  // Hence: Gateway for ARIA, RVS only for display in the app.
+
+  // 1. Send to the Gateway (so that ARIA replies)
+  const gatewayOk = sendToGateway(text, isPipeline);
+
+  // 2. Send to RVS (so that the app sees the message)
+  sendToRVS_raw({
    type: "chat",
    payload: { text, sender: "diagnostic" },
    timestamp: Date.now(),
-  }));
-  log("info", "rvs", `Gesendet via RVS: "${text}"`);
-  if (isPipeline) plog(`Nachricht an RVS gesendet — warte auf Antwort via RVS...`);
-  return true;
+  });
+
+  return gatewayOk;
 }

 // ── Claude Proxy Test ────────────────────────────────────
@@ -517,7 +741,7 @@ async function testProxy(prompt) {

    const modelsRes = await fetch(healthUrl, {
      headers: { "Authorization": "Bearer not-needed" },
-      signal: AbortSignal.timeout(10000),
+      signal: AbortSignal.timeout(30000),
    });

    if (!modelsRes.ok) {
@@ -544,7 +768,7 @@ async function testProxy(prompt) {
    }

    // Schritt 2: Chat Completion testen (kurzer Prompt)
-    const testPrompt = prompt || "Antworte mit genau einem Wort: Ping";
+    const testPrompt = prompt || "Antworte in einem Satz: Wer bist du und funktionierst du?";
    log("info", "proxy", `Sende Test-Prompt: "${testPrompt}"`);

    const chatRes = await fetch(`${PROXY_URL}/v1/chat/completions`, {
@@ -558,7 +782,7 @@ async function testProxy(prompt) {
        messages: [{ role: "user", content: testPrompt }],
        max_tokens: 200,
      }),
-      signal: AbortSignal.timeout(30000),
+      signal: AbortSignal.timeout(120000), // 2 min: a cold start takes time
    });

    if (!chatRes.ok) {
@@ -923,6 +1147,111 @@ function waitForMessage(ws, timeoutMs) {
  });
 }

+// ── Watchdog: stuck-run detection ────────────────────────
+
+let lastAgentActivity = Date.now();
+let watchdogWarned = false;
+let watchdogFixAttempted = false;
+let pendingMessageTime = 0; // Time the last message was sent (0 = nothing pending)
+
+function updateAgentActivity() {
+  lastAgentActivity = Date.now();
+  watchdogWarned = false;
+}
+
+// ── Disk-Space Monitor ───────────────────────────────
+// Periodically checks the host disk (via the mounted /shared) and
+// broadcasts a disk_status event whenever the level or usage changes.
+let lastDiskStatus = null;
+let currentDiskStatus = null; // Full status object for newly connected clients
+function checkDiskSpace() {
+  const { exec } = require("child_process");
+  exec("df -P -B1 /shared", (err, stdout) => { // -P: one line per filesystem, even for long device names
+    if (err) return;
+    const lines = stdout.trim().split("\n");
+    if (lines.length < 2) return;
+    const cols = lines[1].split(/\s+/);
+    // Filesystem  Size  Used  Avail  Use%  MountedOn
+    const total = parseInt(cols[1], 10);
+    const used = parseInt(cols[2], 10);
+    const avail = parseInt(cols[3], 10);
+    if (!total) return;
+    const pct = Math.round((used / total) * 100);
+    let level = "ok";
+    if (pct >= 95) level = "critical";
+    else if (pct >= 85) level = "warn";
+    else if (pct >= 70) level = "info";
+    const status = {
+      type: "disk_status",
+      level,
+      percent: pct,
+      usedBytes: used,
+      totalBytes: total,
+      availBytes: avail,
+    };
+    currentDiskStatus = status;
+    // Only broadcast when the level or the percentage has changed
+    const key = `${level}-${pct}`;
+    if (lastDiskStatus !== key) {
+      lastDiskStatus = key;
+      broadcast(status);
+      if (level !== "ok") {
+        log(level === "critical" ? "error" : "warn", "server",
+          `Disk ${pct}% belegt (${(used/1024/1024/1024).toFixed(1)}GB von ${(total/1024/1024/1024).toFixed(1)}GB)`);
+      }
+    }
+  });
+}
+// First check 2 s after start, then every 30 s
+setTimeout(checkDiskSpace, 2000);
+setInterval(checkDiskSpace, 30000);
+
+// Watchdog: checks every 30 s whether ARIA has replied to a sent message
+setInterval(async () => {
+  if (pendingMessageTime === 0) return; // No message pending
+  const waitingMs = Date.now() - pendingMessageTime;
+
+  // After 2 min without a reply to the pending message: warning
+  if (waitingMs > 120000 && !watchdogWarned) {
+    watchdogWarned = true;
+    log("warn", "server", `Watchdog: Keine ARIA-Aktivitaet seit ${Math.round(waitingMs / 1000)}s — moeglicherweise stuck`);
+    broadcast({ type: "watchdog", status: "warning", waitingMs, message: "ARIA reagiert nicht — moeglicherweise stuck Run" });
+  }
+
+  // After 5 min: run doctor --fix
+  if (waitingMs > 300000 && watchdogWarned && !watchdogFixAttempted) {
+    watchdogFixAttempted = true;
+    log("error", "server", "Watchdog: 5min ohne Antwort — fuehre openclaw doctor --fix aus");
+    broadcast({ type: "watchdog", status: "fixing", message: "Auto-Fix: openclaw doctor --fix" });
+    try {
+      await dockerExec("aria-core", "openclaw doctor --fix 2>/dev/null || true");
+      log("info", "server", "Watchdog: doctor --fix ausgefuehrt");
+      broadcast({ type: "watchdog", status: "fixed", message: "doctor --fix ausgefuehrt — warte auf Antwort..." });
+    } catch (err) {
+      log("error", "server", `Watchdog: doctor --fix fehlgeschlagen: ${err.message}`);
+    }
+  }
+
+  // After 8 min: restart the containers
+  if (waitingMs > 480000 && watchdogFixAttempted) {
+    log("error", "server", "Watchdog: 8min ohne Antwort — starte aria-core + aria-proxy neu");
+    broadcast({ type: "watchdog", status: "restarting", message: "Container-Restart: aria-core + aria-proxy" });
+    try {
+      const { execSync } = require("child_process");
+      execSync("docker restart aria-core aria-proxy", { timeout: 60000 });
+      log("info", "server", "Watchdog: Container neugestartet");
+      broadcast({ type: "watchdog", status: "restarted", message: "Container neugestartet — warte auf Gateway-Reconnect..." });
+      // The gateway will reconnect automatically
+    } catch (err) {
+      log("error", "server", `Watchdog: Container-Restart fehlgeschlagen: ${err.message}`);
+      broadcast({ type: "watchdog", status: "error", message: `Restart fehlgeschlagen: ${err.message}` });
+    }
+    pendingMessageTime = 0;
+    watchdogWarned = false;
+    watchdogFixAttempted = false;
+  }
+}, 30000);
+
 // ── HTTP Server + WebSocket fuer Browser ────────────────

 const htmlPath = path.join(__dirname, "index.html");
@@ -937,6 +1266,67 @@ const server = http.createServer((req, res) => {
  } else if (req.url === "/api/session") {
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ sessionKey: activeSessionKey }));
+  } else if (req.url === "/api/runtime-config" && req.method === "GET") {
+    // Central runtime config (ENV plus overrides from /shared/config/runtime.json)
+    res.writeHead(200, { "Content-Type": "application/json" });
+    res.end(JSON.stringify(readRuntimeConfig()));
+  } else if (req.url === "/api/runtime-config" && req.method === "POST") {
+    let body = "";
+    req.on("data", chunk => { body += chunk; if (body.length > 32768) req.destroy(); });
+    req.on("end", () => {
+      try {
+        const patch = JSON.parse(body);
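+        if (patch === null || typeof patch !== "object") throw new TypeError("Patch must be a JSON object"); else writeRuntimeConfig(patch);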
+        res.writeHead(200, { "Content-Type": "application/json" });
+        res.end(JSON.stringify({ ok: true, config: readRuntimeConfig() }));
+        log("info", "server", `Runtime-Config aktualisiert: ${Object.keys(patch).join(", ")}`);
+      } catch (err) {
+        res.writeHead(400, { "Content-Type": "application/json" });
+        res.end(JSON.stringify({ ok: false, error: err.message }));
+      }
+    });
+    return;
+  } else if (req.url === "/api/onboarding") {
+    // RVS credentials for the QR-code app onboarding (no auth check in this handler: the response contains the RVS token)
+    res.writeHead(200, { "Content-Type": "application/json" });
+    res.end(JSON.stringify({
+      rvsHost: RVS_HOST,
+      rvsPort: RVS_PORT,
+      rvsTLS: RVS_TLS === "true" || RVS_TLS === true,
+      rvsToken: RVS_TOKEN,
+    }));
+  } else if (req.url === "/api/cancel" && req.method === "POST") {
+    log("warn", "server", "HTTP /api/cancel — Cancel-Request (von Bridge)");
+    pendingMessageTime = 0;
+    watchdogWarned = false;
+    watchdogFixAttempted = false;
+    if (pipelineActive) pipelineEnd(false, "Vom Benutzer abgebrochen (App)");
+    else broadcast({ type: "agent_activity", activity: "idle" });
+    dockerExec("aria-core", "openclaw doctor --fix 2>/dev/null || true").catch(() => {});
+    res.writeHead(200, { "Content-Type": "application/json" });
+    res.end(JSON.stringify({ ok: true }));
+  } else if (req.url.startsWith("/shared/")) {
+    // Serve files from the shared volume (images, uploads)
+    let safePath = "";
+    try { safePath = path.resolve(decodeURIComponent(req.url.split("?")[0])); } catch (_) {} // malformed %-escapes fall through to 403
+    if (!safePath.startsWith("/shared/")) {
+      res.writeHead(403);
+      res.end("Forbidden");
+      return;
+    }
+    try {
+      if (!fs.existsSync(safePath)) { res.writeHead(404); res.end("Not Found"); return; }
+      const ext = path.extname(safePath).toLowerCase();
+      const mimeTypes = { ".jpg": "image/jpeg", ".jpeg": "image/jpeg", ".png": "image/png", ".gif": "image/gif",
+                          ".pdf": "application/pdf", ".txt": "text/plain", ".json": "application/json" };
+      const contentType = mimeTypes[ext] || "application/octet-stream";
+      const data = fs.readFileSync(safePath);
+      res.writeHead(200, { "Content-Type": contentType, "Content-Length": data.length });
+      res.end(data);
+    } catch (err) {
+      res.writeHead(500);
+      res.end("Error");
+    }
  } else {
    res.writeHead(404);
    res.end("Not Found");
@@ -949,6 +1339,8 @@ wss.on("connection", (ws) => {
  browserClients.add(ws);
  // Initialen State + letzte Logs senden
  ws.send(JSON.stringify({ type: "init", state, logs: logs.slice(-100) }));
+  // Also send the latest disk status so the client knows the free-space situation right away
+  if (currentDiskStatus) ws.send(JSON.stringify(currentDiskStatus));

  ws.on("message", (raw) => {
    try {
@@ -987,6 +1379,64 @@ wss.on("connection", (ws) => {
        if (ws._sshSock) ws._sshSock.write(msg.data);
      } else if (msg.action === "live_ssh_close") {
        if (ws._sshSock) { ws._sshSock.end(); ws._sshSock = null; }
+      } else if (msg.action === "send_file") {
+        // Send a file from Diagnostic to the Bridge via RVS
+        sendToRVS_raw({
+          type: "file",
+          payload: { name: msg.name, type: msg.type, size: msg.size, base64: msg.base64 },
+          timestamp: Date.now(),
+        });
+        log("info", "server", `Datei gesendet: ${msg.name} (${msg.type})`);
+      } else if (msg.action === "cancel_request") {
+        // Cancel the running request; doctor --fix terminates stuck runs
+        log("warn", "server", "Anfrage abgebrochen — fuehre doctor --fix aus");
+        pendingMessageTime = 0;
+        watchdogWarned = false;
+        watchdogFixAttempted = false;
+        if (pipelineActive) pipelineEnd(false, "Vom Benutzer abgebrochen");
+        broadcast({ type: "agent_activity", activity: "idle" });
+        dockerExec("aria-core", "openclaw doctor --fix 2>/dev/null || true").catch(() => {});
+      } else if (msg.action === "voice_upload") {
+        // Forward voice samples to the XTTS bridge via RVS and wait for confirmation
+        log("info", "server", `Voice-Upload '${msg.name}' (${(msg.samples || []).length} Samples) sende an RVS...`);
+        sendToRVS_withResponse("voice_upload", { name: msg.name, samples: msg.samples }, "xtts_voice_saved", ws);
+      } else if (msg.action === "xtts_list_voices") {
+        // Fresh connection that waits for the reply
+        sendToRVS_withResponse("xtts_list_voices", {}, "xtts_voices_list", ws);
+      } else if (msg.action === "xtts_delete_voice") {
+        // Forward to the XTTS bridge, which replies with the updated list
+        sendToRVS_raw({ type: "xtts_delete_voice", payload: { name: msg.name }, timestamp: Date.now() });
+        log("info", "server", `Voice-Delete '${msg.name}' an XTTS-Bridge gesendet`);
+      } else if (msg.action === "set_mode") {
+        // Mode switch: the Bridge handles it and broadcasts to all clients
+        sendToRVS_raw({ type: "mode", payload: { mode: msg.mode }, timestamp: Date.now() });
+        log("info", "server", `Mode-Wechsel angefordert: ${msg.mode}`);
+      } else if (msg.action === "get_voice_config") {
+        handleGetVoiceConfig(ws);
+      } else if (msg.action === "send_voice_config") {
+        // Persist the voice config and send it to the Bridge via RVS
+        let existing = {};
+        try { existing = JSON.parse(fs.readFileSync("/shared/config/voice_config.json", "utf-8")); } catch {}
+        const voiceConfig = {
+          ...existing,
+          ttsEnabled: msg.ttsEnabled !== false,
+          xttsVoice: msg.xttsVoice || "",
+        };
+        if (msg.whisperModel !== undefined) voiceConfig.whisperModel = msg.whisperModel;
+        try {
+          fs.mkdirSync("/shared/config", { recursive: true });
+          fs.writeFileSync("/shared/config/voice_config.json", JSON.stringify(voiceConfig, null, 2));
+        } catch {}
+        sendToRVS_raw({ type: "config", payload: voiceConfig, timestamp: Date.now() });
+        log("info", "server", `Voice-Config gespeichert: xttsVoice=${voiceConfig.xttsVoice || "default"}, whisper=${voiceConfig.whisperModel || "-"}`);
+      } else if (msg.action === "get_triggers") {
+        handleGetTriggers(ws);
+      } else if (msg.action === "save_triggers") {
+        handleSaveTriggers(ws, msg.triggers || []);
+      } else if (msg.action === "test_tts") {
+        handleTestTTS(ws, msg.text || "Test");
+      } else if (msg.action === "check_tts") {
+        handleCheckTTS(ws);
      } else if (msg.action === "check_desktop") {
        checkDesktopAvailable(ws);
      } else if (msg.action === "load_chat_history") {
@@ -995,6 +1445,8 @@ wss.on("connection", (ws) => {
        handleListSessions(ws);
      } else if (msg.action === "read_session") {
        handleReadSession(ws, msg.sessionPath);
+      } else if (msg.action === "export_session") {
+        handleExportSession(ws, msg.sessionPath, msg.sessionKey);
      } else if (msg.action === "delete_session") {
        handleDeleteSession(ws, msg.sessionPath);
      } else if (msg.action === "set_active_session") {
@@ -1113,6 +1565,78 @@ function startLiveSSH(clientWs) {
  createReq.end(createBody);
 }

+// ── Load voice config ─────────────────────────────────
+
+function handleGetVoiceConfig(clientWs) {
+  try {
+    const configPath = "/shared/config/voice_config.json";
+    if (fs.existsSync(configPath)) {
+      const config = JSON.parse(fs.readFileSync(configPath, "utf-8"));
+      clientWs.send(JSON.stringify({ type: "voice_config", ...config }));
+    } else {
+      clientWs.send(JSON.stringify({ type: "voice_config", ttsEnabled: true, xttsVoice: "" }));
+    }
+  } catch (err) {
+    clientWs.send(JSON.stringify({ type: "voice_config", ttsEnabled: true, xttsVoice: "" }));
+  }
+}
+
+// ── Highlight triggers (legacy UI; no longer evaluated since Piper was removed) ─
+const TRIGGERS_FILE = "/shared/config/highlight_triggers.json";
+
+async function handleGetTriggers(clientWs) {
+  try {
+    const triggers = fs.existsSync(TRIGGERS_FILE)
+      ? JSON.parse(fs.readFileSync(TRIGGERS_FILE, "utf-8"))
+      : [];
+    clientWs.send(JSON.stringify({ type: "trigger_list", triggers }));
+  } catch (err) {
+    clientWs.send(JSON.stringify({ type: "trigger_list", triggers: [], error: err.message }));
+  }
+}
+
+async function handleSaveTriggers(clientWs, triggers) {
+  try {
+    fs.mkdirSync("/shared/config", { recursive: true });
+    fs.writeFileSync(TRIGGERS_FILE, JSON.stringify(triggers, null, 2));
+    log("info", "server", `${triggers.length} Highlight-Trigger gespeichert`);
+    clientWs.send(JSON.stringify({ type: "trigger_list", triggers }));
+  } catch (err) {
+    log("error", "server", `Trigger speichern fehlgeschlagen: ${err.message}`);
+  }
+}
+
+// ── TTS diagnostics (XTTS) ────────────────────────────
+async function handleTestTTS(clientWs, text) {
+  try {
+    log("info", "server", `TTS-Test via XTTS: "${text}"`);
+    // Via RVS to the XTTS bridge: xtts_request with the test text
+    const requestId = crypto.randomUUID();
+    sendToRVS_raw({
+      type: "xtts_request",
+      payload: { text, language: "de", requestId, voice: "" },
+      timestamp: Date.now(),
+    });
+    clientWs.send(JSON.stringify({ type: "tts_result", ok: true, duration: "pending", size: "?" }));
+  } catch (err) {
+    clientWs.send(JSON.stringify({ type: "tts_result", ok: false, error: err.message }));
+  }
+}
+
+async function handleCheckTTS(clientWs) {
+  try {
+    // Query XTTS via RVS (xtts_list_voices); the reply is not awaited, so ok: true only means the request was issued
+    sendToRVS_raw({ type: "xtts_list_voices", payload: {}, timestamp: Date.now() });
+    clientWs.send(JSON.stringify({
+      type: "tts_status",
+      ok: true,
+      error: null,
+    }));
+  } catch (err) {
+    clientWs.send(JSON.stringify({ type: "tts_status", ok: false, error: err.message }));
+  }
+}
+
 function checkDesktopAvailable(clientWs) {
  // Pruefen ob VNC auf der VM laeuft (Port 5900/5901)
  const checkSock = net.connect({ host: "host.docker.internal", port: 5901 }, () => {
@@ -1149,17 +1673,17 @@ async function handleListSessions(clientWs) {
  try {
    log("info", "server", "Lade Sessions aus aria-core...");

-    // sessions.json als Index lesen + Datei-Details holen
+    // Read sessions.json as the index and fetch file details (incl. .reset.* archives)
    const raw = await dockerExec("aria-core", `
      cat ${SESSIONS_DIR}/sessions.json 2>/dev/null || echo '{}' &&
      echo '===FILE_DETAILS===' &&
-      for f in ${SESSIONS_DIR}/*.jsonl; do
+      for f in ${SESSIONS_DIR}/*.jsonl ${SESSIONS_DIR}/*.jsonl.reset.*; do
        [ -f "$f" ] || continue
        name=$(basename "$f")
-        lines=$(wc -l < "$f" 2>/dev/null || echo 0)
+        msgs=$(grep -cE '"role":"(user|assistant)"' "$f" 2>/dev/null || true); [ -n "$msgs" ] || msgs=0
        size=$(du -h "$f" 2>/dev/null | cut -f1)
        modified=$(stat -c '%Y' "$f" 2>/dev/null || echo 0)
-        echo "FILE:$name|LINES:$lines|SIZE:$size|MODIFIED:$modified"
+        echo "FILE:$name|LINES:$msgs|SIZE:$size|MODIFIED:$modified"
      done
    `.trim());

@@ -1214,8 +1738,29 @@ async function handleListSessions(clientWs) {
      delete fileDetails[filename];
    }

-    // Dateien die nicht im Index stehen (Waisen / Reset-Files)
+    // Files that are not in the index (orphans OR reset archives)
    for (const [filename, details] of Object.entries(fileDetails)) {
+      // .jsonl.reset.<ISO timestamp>Z → archived session (OpenClaw reset)
+      // Format: 528f4d70-...jsonl.reset.2026-04-18T09-49-44.814Z
+      const resetMatch = filename.match(/^([a-f0-9-]+)\.jsonl\.reset\.(.+Z)$/);
+      if (resetMatch) {
+        const id = resetMatch[1];
+        // Parse the ISO-8601 timestamp: 2026-04-18T09-49-44.814Z → 2026-04-18T09:49:44.814Z
+        const tsStr = resetMatch[2].replace(/T(\d{2})-(\d{2})-(\d{2})/, "T$1:$2:$3");
+        const resetAt = Math.floor(new Date(tsStr).getTime() / 1000) || parseInt(details.MODIFIED) || 0;
+        sessions.push({
+          path: `${SESSIONS_DIR}/${filename}`,
+          sessionKey: id.slice(0, 8) + "… (archiv)",
+          sessionId: id,
+          lines: parseInt(details.LINES) || 0,
+          size: details.SIZE || "?",
+          modified: resetAt,
+          archived: true,
+          resetAt,
+        });
+        continue;
+      }
+      // Genuine orphans (UUID.jsonl without an entry in sessions.json)
      const id = filename.replace(".jsonl", "");
      sessions.push({
        path: `${SESSIONS_DIR}/${filename}`,
@@ -1260,6 +1805,68 @@ async function handleReadSession(clientWs, sessionPath) {
  }
 }

+async function handleExportSession(clientWs, sessionPath, sessionKey) {
+  if (!sessionPath || sessionPath.includes("..") || !sessionPath.startsWith(SESSIONS_DIR)) {
+    clientWs.send(JSON.stringify({ type: "session_export", ok: false, error: "Ungueltiger Pfad" }));
+    return;
+  }
+  try {
+    const safePath = sessionPath.replace(/'/g, "");
+    const raw = await dockerExec("aria-core", `cat '${safePath}'`);
+    const lines = raw.split("\n").filter(l => l.trim());
+
+    const blocks = [];
+    for (const line of lines) {
+      let obj;
+      try { obj = JSON.parse(line); } catch { continue; }
+      if (obj.type !== "message" || !obj.message) continue;
+      const role = obj.message.role;
+      if (role !== "user" && role !== "assistant") continue;
+
+      let text = "";
+      const content = obj.message.content;
+      if (typeof content === "string") text = content;
+      else if (Array.isArray(content)) text = content.filter(c => c.type === "text").map(c => c.text || "").join("\n");
+      if (!text) continue;
+
+      if (role === "user") {
+        text = text.replace(/^Sender \(untrusted metadata\):[\s\S]*?```[\s\S]*?```\s*\n*/m, "").trim();
+        text = text.replace(/^\[.*?\]\s*/, "").trim();
+      } else {
+        text = text.replace(/^\[\[reply_to_\w+\]\]\s*/g, "").trim();
+      }
+      if (!text) continue;
+
+      const ts = obj.message.timestamp || obj.timestamp || 0;
+      const when = ts ? new Date(ts).toISOString().replace("T", " ").slice(0, 19) : "";
+      const heading = role === "user" ? "## 🧑 User" : "## 🤖 ARIA";
+      blocks.push(`${heading}${when ? ` — ${when}` : ""}\n\n${text}`);
+    }
+
+    const exportedAt = new Date().toISOString().replace("T", " ").slice(0, 19);
+    const title = sessionKey || sessionPath.split("/").pop().replace(".jsonl", "");
+    const markdown = [
+      `# Session: ${title}`,
+      ``,
+      `Exportiert: ${exportedAt}  `,
+      `Quelle: ${sessionPath}`,
+      ``,
+      `---`,
+      ``,
+      blocks.join("\n\n---\n\n"),
+      ``,
+    ].join("\n");
+
+    const safeKey = (sessionKey || "session").replace(/[^a-zA-Z0-9_-]/g, "_");
+    const filename = `${exportedAt.slice(0, 10)}_${safeKey}.md`;
+    clientWs.send(JSON.stringify({ type: "session_export", ok: true, filename, markdown }));
+    log("info", "server", `Session exportiert: ${filename} (${blocks.length} Nachrichten)`);
+  } catch (err) {
+    log("error", "server", `Session-Export fehlgeschlagen: ${err.message}`);
+    clientWs.send(JSON.stringify({ type: "session_export", ok: false, error: err.message }));
+  }
+}
+
 async function handleDeleteSession(clientWs, sessionPath) {
  if (!sessionPath || sessionPath.includes("..") || !sessionPath.startsWith(SESSIONS_DIR)) {
    clientWs.send(JSON.stringify({ type: "session_deleted", ok: false, error: "Ungueltiger Pfad" }));
@@ -1300,13 +1907,11 @@ async function handleDeleteSession(clientWs, sessionPath) {
 }

 // ── Session-Aufloesung: letzte aktive Session finden ────
+// Called after every gateway (re)connect. Must NEVER overwrite the explicitly
+// chosen session; auto-picks only on the very first start.
 async function resolveActiveSession() {
-  // Nur bei Fallback-Key "main" automatisch aufloesen — gespeicherte Wahl respektieren
-  const hasSavedSession = (() => {
-    try { return !!fs.readFileSync(SESSION_KEY_FILE, "utf-8").trim(); } catch { return false; }
-  })();
-  if (hasSavedSession && activeSessionKey !== "main") {
-    log("info", "server", `Gespeicherte Session '${activeSessionKey}' wird beibehalten`);
+  if (sessionFromFile) {
+    log("info", "server", `Session '${activeSessionKey}' aus /data — keine Auto-Wahl`);
    return;
  }

@@ -1325,10 +1930,19 @@ async function resolveActiveSession() {
  const keys = entries.map(e => (e.key || e.sessionKey || e.name || "?").replace(/^agent:main:/, ""));
  log("info", "server", `Verfuegbare Sessions: [${keys.join(", ")}]`);

-  // Neueste Session nehmen
+  // Take the newest session, but prefer user-defined ones.
+  // aria-bridge / aria-diagnostic are auto-created by the services;
+  // on first start a "real" session should be chosen instead,
+  // if one exists.
+  const AUTO_KEYS = new Set(["aria-bridge", "aria-diagnostic"]);
+  const normalise = (e) => (e.key || e.sessionKey || e.name || "").replace(/^agent:main:/, "");
+
+  const userEntries = entries.filter(e => !AUTO_KEYS.has(normalise(e)));
+  const pool = userEntries.length > 0 ? userEntries : entries;
+
  let newest = null;
  let newestTime = 0;
-  for (const entry of entries) {
+  for (const entry of pool) {
    const t = entry.updatedAt || entry.createdAt || 0;
    if (t >= newestTime) {
      newestTime = t;
@@ -1337,12 +1951,11 @@ async function resolveActiveSession() {
  }

  if (newest) {
-    const rawKey = newest.key || newest.sessionKey || newest.name || "";
-    const key = rawKey.replace(/^agent:main:/, "");
+    const key = normalise(newest);
    if (key) {
      activeSessionKey = key;
-      try { fs.writeFileSync(SESSION_KEY_FILE, activeSessionKey); } catch {}
-      log("info", "server", `Aktive Session auf neueste gewechselt: '${activeSessionKey}'`);
+      persistActiveSession(activeSessionKey);
+      log("info", "server", `Auto-Wahl Erststart: '${activeSessionKey}'`);
      for (const c of browserClients) {
        c.send(JSON.stringify({ type: "active_session", sessionKey: activeSessionKey }));
      }
@@ -1431,8 +2044,11 @@ function handleSetActiveSession(clientWs, sessionKey) {
    return;
  }
  activeSessionKey = sessionKey;
-  try { fs.writeFileSync(SESSION_KEY_FILE, activeSessionKey); } catch {}
-  log("info", "server", `Aktive Session: ${activeSessionKey}`);
+  const ok = persistActiveSession(activeSessionKey);
+  log("info", "server", `Aktive Session: ${activeSessionKey}${ok ? "" : " (WARN: nicht persistiert!)"}`);
+  if (!ok) {
+    clientWs.send(JSON.stringify({ type: "active_session", ok: false, sessionKey: activeSessionKey, error: "Persistierung fehlgeschlagen — /data Volume pruefen" }));
+  }
  // Allen Clients mitteilen
  for (const c of browserClients) {
    c.send(JSON.stringify({ type: "active_session", sessionKey: activeSessionKey }));
@@ -1448,7 +2064,7 @@ async function handleCreateSession(clientWs, sessionName) {
  try {
    // Session wird automatisch erstellt wenn man die erste Nachricht sendet
    activeSessionKey = sessionName;
-    try { fs.writeFileSync(SESSION_KEY_FILE, activeSessionKey); } catch {}
+    persistActiveSession(activeSessionKey);
    log("info", "server", `Neue Session erstellt und aktiviert: ${sessionName}`);
    // Allen Clients mitteilen
    for (const c of browserClients) {
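The hunks above call `persistActiveSession()` and act on its boolean result, but the helper's body is not part of this excerpt. A minimal sketch of how such a helper could look, assuming a write-to-temp-then-rename strategy. The function body, the `.tmp` suffix and the `keyFile` parameter are assumptions for illustration, not the project's actual code:

```javascript
"use strict";
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: writes the session key atomically and reports success.
// keyFile is a parameter here for testability; the real code presumably uses SESSION_KEY_FILE.
function persistActiveSession(sessionKey, keyFile) {
  const tmpFile = `${keyFile}.tmp`;
  try {
    // Write the complete content to a temp file first ...
    fs.writeFileSync(tmpFile, sessionKey, "utf-8");
    // ... then rename it over the target. Within one filesystem, rename is atomic on POSIX,
    // so a reader sees either the old key or the new key, never a half-written file.
    fs.renameSync(tmpFile, keyFile);
    return true;
  } catch (err) {
    // Best-effort cleanup of the temp file; the caller decides how to report the failure.
    try { fs.unlinkSync(tmpFile); } catch {}
    return false;
  }
}

// Usage: persist into a throwaway directory and read the key back.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "aria-session-"));
const keyFile = path.join(dir, "active-session");
const ok = persistActiveSession("my-session", keyFile);
// A target inside a missing directory cannot be written, so the helper returns false.
const missingOk = persistActiveSession("x", path.join(dir, "no-such-dir", "key"));
console.log(ok, fs.readFileSync(keyFile, "utf-8"), missingOk);
```

Returning a boolean instead of swallowing the error lets a caller such as `handleSetActiveSession` tell the client that the choice will not survive a restart.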
@@ -18,7 +18,8 @@ services:
      claude-max-api"
    volumes:
      - ~/.claude:/root/.claude                      # Claude CLI Auth (Credentials in /root/.claude/.credentials.json)
-      - ./aria-data/ssh:/root/.ssh:ro               # SSH Keys fuer VM-Zugriff (aria-wohnung)
+      - ./aria-data/ssh:/root/.ssh                    # SSH keys for VM access (aria-wohnung, rw for ARIA)
+      - aria-shared:/shared                          # Shared volume for file exchange (uploads from the app)
    environment:
      - HOST=0.0.0.0
      - SHELL=/bin/bash                              # Claude Code Bash-Tool braucht bash (nicht nur sh/ash)
@@ -58,6 +59,7 @@ services:
      - ./aria-data/ssh:/home/node/.ssh                # SSH Keys fuer VM-Zugriff
      - /tmp/.X11-unix:/tmp/.X11-unix
      - /var/run/docker.sock:/var/run/docker.sock  # VM von innen verwalten
+      - aria-shared:/shared                        # Shared volume for file exchange (Bridge <> Core)
    restart: unless-stopped
    networks:
      - aria-net
@@ -70,8 +72,8 @@ services:
      - aria
    network_mode: "service:aria"                   # Teilt Netzwerk mit aria-core → localhost:18789
    volumes:
-      - ./aria-data/voices:/voices:ro              # TTS Stimmen
      - ./aria-data/config/aria.env:/config/aria.env
+      - aria-shared:/shared                        # Shared volume for file exchange (Bridge <> Core)
      # Audio-Zugriff
      - /run/user/1000/pulse:/run/user/1000/pulse
      - /dev/snd:/dev/snd
@@ -97,6 +99,7 @@ services:
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./aria-data/config/diag-state:/data           # Persistenter State (aktive Session etc.)
+      - aria-shared:/shared                            # Shared Volume (Uploads + Config)
    environment:
      - ARIA_AUTH_TOKEN=${ARIA_AUTH_TOKEN:-}
      - PROXY_URL=http://proxy:3456
@@ -110,6 +113,7 @@ services:
 volumes:
  openclaw-config:                                 # Persistiert ~/.openclaw (Model, Auth, Sessions)
  claude-config:                                   # Persistiert ~/.claude (Permissions, Settings)
+  aria-shared:                                     # File exchange between Bridge and Core

 networks:
  aria-net:
@@ -1,32 +0,0 @@
-#!/bin/bash
-# ════════════════════════════════════════════════
-#  ARIA — Piper Stimmen herunterladen
-#  Ramona (Alltag) + Thorsten (epische Momente)
-# ════════════════════════════════════════════════
-
-set -e
-
-VOICES_DIR="aria-data/voices"
-BASE_URL="https://huggingface.co/rhasspy/piper-voices/resolve/main/de/de_DE"
-
-mkdir -p "$VOICES_DIR"
-cd "$VOICES_DIR"
-
-echo "Lade ARIA Stimmen..."
-echo ""
-
-echo "[1/4] Ramona (Modell)..."
-wget -q --show-progress "$BASE_URL/ramona/low/de_DE-ramona-low.onnx"
-
-echo "[2/4] Ramona (Config)..."
-wget -q --show-progress "$BASE_URL/ramona/low/de_DE-ramona-low.onnx.json"
-
-echo "[3/4] Thorsten (Modell)..."
-wget -q --show-progress "$BASE_URL/thorsten/high/de_DE-thorsten-high.onnx"
-
-echo "[4/4] Thorsten (Config)..."
-wget -q --show-progress "$BASE_URL/thorsten/high/de_DE-thorsten-high.onnx.json"
-
-echo ""
-echo "Stimmen geladen!"
-ls -lh *.onnx
@@ -0,0 +1,79 @@
+# ARIA Issues & Features
+
+## Done
+
+- [x] Image upload works (shared volume /shared/uploads/)
+- [x] Voice messages are shown as text (STT → chat bubble)
+- [x] Clear cache + auto-download of attachments
+- [x] ARIA reads messages aloud (TTS via Piper)
+- [x] Autoscroll to the latest message (inverted FlatList)
+- [x] Larger images in chat + fullscreen preview
+- [x] Ear button → conversation mode (auto-recording after ARIA's reply)
+- [x] Play button in ARIA messages for voice playback
+- [x] Chat search in the app (magnifier in the status bar)
+- [x] Watchdog with container restart (2 min warning → 5 min doctor --fix → 8 min restart)
+- [x] Cancel button in Diagnostic chat
+- [x] On-the-fly message backup (/shared/config/chat_backup.jsonl)
+- [x] Split long messages sentence by sentence for TTS
+- [x] RVS messages from the smartphone get through
+- [x] Voice settings (Ramona/Thorsten, speed per voice)
+- [x] Highlight triggers configurable in Diagnostic
+- [x] XTTS v2 integration (gaming PC, GPU, voice cloning)
+- [x] XTTS voice cloning (upload audio samples, custom voice)
+- [x] TTS engine selectable (Piper/XTTS) in Diagnostic + app
+- [x] Auto-update system (APK via RVS WebSocket)
+- [x] Auto-update: APK installation via FileProvider
+- [x] Auto-update: "Auf Updates pruefen" ("Check for updates") button in app settings
+- [x] Audio queue (sequential playback, no overlap)
+- [x] Text messages are answered by ARIA (Bridge chat handler fix)
+- [x] Multiple attachments + text before sending (pending preview)
+- [x] Paste support for images in Diagnostic chat
+- [x] Markdown cleanup for TTS (bold, italic, code, links, etc.)
+- [x] SSH volume read-write for the proxy (no more -F workaround)
+- [x] Diagnostic: export sessions as Markdown (download button)
+- [x] Speech gate: a recording is discarded when no speech is detected (keeps ambient noise from reaching Whisper)
+- [x] Session persistence: the chosen session survives container restarts (sessionFromFile flag, atomic write)
+- [x] Diagnostic: "ARIA denkt..." no longer gets stuck (pipelineEnd always broadcasts idle, including on timeout/error/disconnect)
+- [x] App: "ARIA denkt..." indicator + cancel button (Bridge mirrors agent_activity via RVS)
+- [x] Whisper STT: model selection in Diagnostic (tiny/base/small/medium/large-v3), hot reload in Bridge, default set to medium
+- [x] App: audio recording explicitly 16 kHz mono (saves a resample, optimal for Whisper)
+- [x] Streaming TTS (path A): XTTS → PCM stream → aria-bridge → app AudioTrack MODE_STREAM, no more WAV gaps
+- [x] Piper removed completely: XTTS v2 is the only remaining TTS engine (remote, GPU on the gaming PC). If XTTS is offline, ARIA is mute; this is deliberately accepted.
+- [x] Conversation mode: stricter speech gate (-28dB / 500ms), no more ambient noise
+- [x] Conversation mode: max duration 30 s per recording, cache cleanup of old files, messages array capped (500)
+- [x] Diagnostic: archived session versions (.reset.*) are shown + exportable. OpenClaw resets sessions on first use after a container restart, but the content is preserved in .reset.<timestamp> files
+- [x] tools/export-jsonl-to-md.js: CLI converter from any session JSONL to Markdown
+- [x] NO_REPLY filter in Bridge + Diagnostic: silently discarded (no chat, no TTS)
+- [x] Audio ducking + exclusive focus (Kotlin AudioFocusModule): other apps quieter during TTS, paused during recording
+- [x] Server-side TTS cleanup: code blocks removed, units spelled out (22GB → Gigabyte), abbreviations spelled letter by letter (CPU), URLs become "ein Link". The `<voice></voice>` tag is preferred when ARIA supplies it.
+- [x] QR code onboarding: Diagnostic generates the QR code, the app scans it (the existing QRScanner works out of the box)
+- [x] TTS audio cache in the filesystem: Piper audio is linked to the messageId and stored as WAV in DocumentDirectory/tts_cache, the play button plays from the cache instead of regenerating
+- [x] Config via Diagnostic: RVS credentials + ARIA auth token via /api/runtime-config, persisted in /shared/config/runtime.json, Bridge reads it at startup (overrides the ENV)
+
+## Open
+
+### Bugs (priority)
+- [ ] App: audio output occasionally just stops (mid-sentence or between chunks)
+- [ ] NO_REPLY is shown as "NO" in chat; it should be silently discarded (token not sanitized)
+
+### App Features
+- [ ] Wake word on-device (Porcupine "ARIA" keyword, phase 2: passive listening)
+- [ ] Load chat history more reliably (AsyncStorage race condition)
+- [ ] Background audio service (TTS even when the app is minimized)
+- [ ] Audio ducking: lower other apps' audio output while ARIA speaks (AudioFocus API)
+- [ ] Mute audio during recording/ear mode: other audio silent (like WhatsApp voice recording)
+- [ ] Increase the voice input timeout for longer texts
+- [ ] Embed generated TTS audio data in the chat message (or cache it locally), play button plays from cache instead of regenerating via XTTS. Base64 in the tag <soundfile></soundfile> (invisible) or a local file cache with a reference in the message.
+- [ ] QR code onboarding: Diagnostic generates a QR code with the RVS credentials, the app scans it; no more manual entry
+
+### TTS / Audio
+- [ ] Audio normalization (even out the volume between chunks)
+
+### Architecture
+- [ ] Images: use Claude Vision directly (currently only the file path is passed to ARIA)
+- [ ] Auto-compacting and memory/brain management (SQLite?)
+- [ ] Diagnostic: system info tab (container status, disk, RAM, CPU)
+- [ ] Resolve RVS zombie connections for good
+- [ ] Make all .env variables configurable via Diagnostic (no more file sync needed, since all ARIA containers run on the same VM). The fallback .env stays for the initial bootstrap.
+- [ ] XTTS container: small web UI for credentials/server config, or pushed centrally from Diagnostic via RVS
+- [ ] Root cause of the OpenClaw session reset: find out why sessions are discarded on the first chat.send after a container restart (check the abortedLastRun / systemSent theory, patch the flag preemptively if needed)
@@ -51,9 +51,36 @@ fi
 echo -e "  ${GREEN}✓${NC} Login erfolgreich"
 echo ""

+# ── Versionsnummern aktualisieren ─────────────
+echo -e "${GREEN}[1/5] Versionsnummern auf $VERSION setzen...${NC}"
+
+# package.json
+sed -i "s/\"version\": \"[^\"]*\"/\"version\": \"$VERSION\"/" android/package.json
+echo -e "  ${GREEN}✓${NC} package.json → $VERSION"
+
+# build.gradle: versionName + versionCode (computed from the version)
+# Supports 3-part (1.2.3) and 4-part (0.0.1.7) versions
+IFS='.' read -ra VER_PARTS <<< "$VERSION"
+V1=${VER_PARTS[0]:-0}; V2=${VER_PARTS[1]:-0}; V3=${VER_PARTS[2]:-0}; V4=${VER_PARTS[3]:-0}
+VERSION_CODE=$((V1 * 1000000 + V2 * 10000 + V3 * 100 + V4))
+# At least 1 (Android requires versionCode >= 1)
+[ "$VERSION_CODE" -lt 1 ] && VERSION_CODE=1
+sed -i "s/versionName \"[^\"]*\"/versionName \"$VERSION\"/" android/android/app/build.gradle
+sed -i "s/versionCode [0-9]*/versionCode $VERSION_CODE/" android/android/app/build.gradle
+echo -e "  ${GREEN}✓${NC} build.gradle → versionName $VERSION, versionCode $VERSION_CODE"
+
+# SettingsScreen: displayed version (any version format)
+sed -i "s/Version [0-9][0-9.]*[^<]*/Version $VERSION /" android/src/screens/SettingsScreen.tsx
+echo -e "  ${GREEN}✓${NC} SettingsScreen → Version $VERSION"
+
+echo ""
+
 # ── APK bauen ─────────────────────────────────
-echo -e "${GREEN}[1/4] APK bauen...${NC}"
+echo -e "${GREEN}[2/5] APK bauen (Cache leeren + Build)...${NC}"
 cd android
+# Clear the Metro + Gradle caches so the new version is embedded cleanly
+rm -rf node_modules/.cache 2>/dev/null
+(cd android && ./gradlew clean 2>/dev/null) || true
 ./build.sh release
 cd ..

@@ -70,7 +97,11 @@ echo -e "  ${GREEN}✓${NC} APK gebaut ($APK_SIZE)"
 echo ""

 # ── Git Tag ───────────────────────────────────
-echo -e "${GREEN}[2/4] Git Tag $TAG...${NC}"
+echo -e "${GREEN}[3/5] Git Tag $TAG...${NC}"
+
+# Commit the version changes
+git add android/package.json android/android/app/build.gradle android/src/screens/SettingsScreen.tsx
+git commit -m "release: bump version to $VERSION" 2>/dev/null || echo -e "  ${YELLOW}Keine Aenderungen zum Committen${NC}"

 if git rev-parse "$TAG" &>/dev/null; then
    echo -e "  ${YELLOW}Tag $TAG existiert bereits — überspringe${NC}"
@@ -79,7 +110,7 @@ else
    echo -e "  ${GREEN}✓${NC} Tag $TAG erstellt"
 fi

-git push origin "$TAG"
+git push origin main "$TAG"
 echo -e "  ${GREEN}✓${NC} Tag gepusht"
 echo ""

@@ -102,7 +133,7 @@ fi
 RELEASE_BODY_ESCAPED=$(printf '%s' "$RELEASE_BODY" | python3 -c 'import sys,json; print(json.dumps(sys.stdin.read()))' 2>/dev/null || printf '"%s"' "$RELEASE_BODY" | sed 's/"/\\"/g')

 # ── Gitea Release erstellen ───────────────────
-echo -e "${GREEN}[3/4] Gitea Release erstellen...${NC}"
+echo -e "${GREEN}[4/5] Gitea Release erstellen...${NC}"

 RELEASE_RESPONSE=$(curl -s -X POST \
    "$GITEA_URL/api/v1/repos/$GITEA_REPO/releases" \
@@ -127,7 +158,7 @@ echo -e "  ${GREEN}✓${NC} Release #$RELEASE_ID erstellt"
 echo ""

 # ── APK hochladen ─────────────────────────────
-echo -e "${GREEN}[4/4] APK hochladen...${NC}"
+echo -e "${GREEN}[5/5] APK hochladen...${NC}"

 UPLOAD_RESPONSE=$(curl -s -X POST \
    "$GITEA_URL/api/v1/repos/$GITEA_REPO/releases/$RELEASE_ID/assets?name=$APK_NAME" \
@@ -142,6 +173,24 @@ else
    exit 1
 fi

+# ── Auto-Update: APK auf RVS-Server kopieren ─
+RVS_UPDATE_HOST="${RVS_UPDATE_HOST:-}"
+if [ -n "$RVS_UPDATE_HOST" ]; then
+    echo -e "${GREEN}[Optional] APK auf RVS-Server kopieren (Auto-Update)...${NC}"
+    # Delete old APKs on the RVS, then upload the new one
+    ssh "$RVS_UPDATE_HOST" "rm -f ~/ARIA-AGENT/rvs/updates/ARIA-*.apk" 2>/dev/null
+    scp "$APK_PATH" "${RVS_UPDATE_HOST}:~/ARIA-AGENT/rvs/updates/${APK_NAME}" 2>/dev/null
+    if [ $? -eq 0 ]; then
+        echo -e "  ${GREEN}✓${NC} APK auf RVS-Server kopiert (alte Versionen geloescht)"
+    else
+        echo -e "  ${YELLOW}APK konnte nicht auf RVS kopiert werden (RVS_UPDATE_HOST=$RVS_UPDATE_HOST)${NC}"
+        echo -e "  ${YELLOW}Manuell: scp $APK_PATH $RVS_UPDATE_HOST:~/ARIA-AGENT/rvs/updates/${APK_NAME}${NC}"
+    fi
+else
+    echo -e "${YELLOW}Auto-Update uebersprungen (RVS_UPDATE_HOST nicht gesetzt)${NC}"
+    echo -e "${YELLOW}Setze RVS_UPDATE_HOST in .env fuer automatische Verteilung${NC}"
+fi
+
 # ── Fertig ────────────────────────────────────
 echo ""
 echo -e "${GREEN}╔═══════════════════════════════════════════════════╗${NC}"
@@ -149,4 +198,5 @@ echo -e "${GREEN}║   Release $TAG ist live!$(printf '%*s' $((27 - ${#TAG})) ''
 echo -e "${GREEN}╠═══════════════════════════════════════════════════╣${NC}"
 echo -e "${GREEN}║${NC}   $GITEA_URL/$GITEA_REPO/releases/tag/$TAG"
 echo -e "${GREEN}║${NC}   APK: $APK_NAME ($APK_SIZE)"
+echo -e "${GREEN}║${NC}   Auto-Update: ${RVS_UPDATE_HOST:-nicht konfiguriert}"
 echo -e "${GREEN}╚═══════════════════════════════════════════════════╝${NC}"
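The `versionCode` arithmetic in the release script packs up to four version components into a single integer. A small JavaScript sketch of the same formula makes the weighting and its limit visible. The function name `versionCodeOf` is made up for illustration, and the sketch assumes the components after the first stay below 100, because neighbouring components are only separated by a factor of 100:

```javascript
"use strict";

// Mirrors the shell formula: V1*1000000 + V2*10000 + V3*100 + V4, with a minimum of 1.
function versionCodeOf(version) {
  // Missing or non-numeric components count as 0, like the ${VER_PARTS[n]:-0} defaults.
  const parts = version.split(".").map((p) => parseInt(p, 10) || 0);
  const [v1 = 0, v2 = 0, v3 = 0, v4 = 0] = parts;
  const code = v1 * 1000000 + v2 * 10000 + v3 * 100 + v4;
  return Math.max(code, 1); // Android requires versionCode >= 1
}

console.log(versionCodeOf("0.0.5.4")); // 504
console.log(versionCodeOf("1.2.3"));   // 1020300
console.log(versionCodeOf("0.0.0.0")); // 1

// Limit of the scheme: a component >= 100 spills into its neighbour.
console.log(versionCodeOf("0.0.1.100") === versionCodeOf("0.0.2.0")); // true
```

The last line shows why the scheme only yields strictly increasing codes while every component after the first stays in the range 0 to 99.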
@@ -4,5 +4,7 @@ services:
    ports:
      - "${RVS_PORT:-443}:3000"
    restart: always
+    volumes:
+      - ./updates:/updates                # APK files for auto-update
    environment:
      - MAX_SESSIONS=10
@@ -1,14 +1,26 @@
 "use strict";

 const { WebSocketServer } = require("ws");
+const fs = require("fs");
+const path = require("path");

 // ── Konfiguration aus Umgebungsvariablen ────────────────────────────
 const PORT = parseInt(process.env.PORT || "3000", 10);
 const MAX_SESSIONS = parseInt(process.env.MAX_SESSIONS || "10", 10);
+const UPDATES_DIR = process.env.UPDATES_DIR || "/updates";
+// No polling: the APK is provided manually via git pull

 // Erlaubte Nachrichtentypen — alles andere wird verworfen
 const ALLOWED_TYPES = new Set([
  "chat", "audio", "file", "location", "mode", "log", "event", "heartbeat",
+  "file_request", "file_response", "file_saved", "stt_result", "config", "tts_request",
+  "xtts_request", "xtts_response", "xtts_list_voices", "xtts_voices_list", "voice_upload", "xtts_voice_saved",
+  "update_check", "update_available", "update_download", "update_data",
+  "agent_activity", "cancel_request",
+  "audio_pcm",
+  "xtts_delete_voice",
+  "voice_preload", "voice_ready",
+  "stt_request", "stt_response",
 ]);

 // Token-Raum: token -> { clients: Set<ws> }
@@ -45,6 +57,9 @@ const wss = new WebSocketServer({ port: PORT });

 wss.on("listening", () => {
  log(`RVS läuft auf Port ${PORT} | Max Sessions: ${MAX_SESSIONS}`);
+  // On startup, check whether an APK is available
+  const apkInfo = getLatestAPK();
+  if (apkInfo) log(`APK bereit: v${apkInfo.version} (${(fs.statSync(apkInfo.path).size / 1024 / 1024).toFixed(1)}MB)`);
 });

 wss.on("connection", (ws, req) => {
@@ -106,6 +121,52 @@ function registerClient(ws, token) {
      return;
    }

+    // Update check: reply directly to the requesting client (do not relay)
+    if (msg.type === "update_check") {
+      const clientVersion = msg.payload?.version || "0.0.0.0";
+      const apkInfo = getLatestAPK();
+      if (apkInfo && compareVersions(apkInfo.version, clientVersion) > 0) {
+        ws.send(JSON.stringify({
+          type: "update_available",
+          payload: {
+            version: apkInfo.version,
+            downloadUrl: `/update/latest.apk`,
+            size: fs.statSync(apkInfo.path).size,
+          },
+          timestamp: Date.now(),
+        }));
+      }
+      return;
+    }
+
+    // Update download: send the APK as Base64 over the WebSocket
+    if (msg.type === "update_download") {
+      const apkInfo = getLatestAPK();
+      if (!apkInfo) {
+        ws.send(JSON.stringify({ type: "update_data", payload: { error: "Keine APK verfuegbar" }, timestamp: Date.now() }));
+        return;
+      }
+      try {
+        const data = fs.readFileSync(apkInfo.path);
+        const base64 = data.toString("base64");
+        const sizeMB = (data.length / 1024 / 1024).toFixed(1);
+        log(`APK sende: v${apkInfo.version} (${sizeMB}MB) an Client`);
+        ws.send(JSON.stringify({
+          type: "update_data",
+          payload: {
+            version: apkInfo.version,
+            base64,
+            size: data.length,
+            fileName: `ARIA-v${apkInfo.version}.apk`,
+          },
+          timestamp: Date.now(),
+        }));
+      } catch (err) {
+        ws.send(JSON.stringify({ type: "update_data", payload: { error: err.message }, timestamp: Date.now() }));
+      }
+      return;
+    }
+
    // An alle anderen Clients im Raum weiterleiten
    for (const client of room.clients) {
      if (client !== ws && client.readyState === 1) {
@@ -166,6 +227,63 @@ wss.on("close", () => {
  clearInterval(cleanup);
 });

+// ── Auto-Update: APK-Erkennung + Push ──────────────────────────────
+
+let latestVersion = null;
+
+function getLatestAPK() {
+  try {
+    if (!fs.existsSync(UPDATES_DIR)) return null;
+    const files = fs.readdirSync(UPDATES_DIR)
+      .filter(f => f.endsWith(".apk"))
+      .map(f => {
+        // ARIA-v0.0.2.3.apk or ARIA-Cockpit-release.apk; every dot must be followed by digits
+        const match = f.match(/(\d+(?:\.\d+){2,})/);
+        return { file: f, path: path.join(UPDATES_DIR, f), version: match ? match[1] : null };
+      })
+      .filter(f => f.version)
+      .sort((a, b) => compareVersions(b.version, a.version)); // newest first
+
+    return files[0] || null;
+  } catch {
+    return null;
+  }
+}
+
+function compareVersions(a, b) {
+  const pa = a.split(".").map(Number);
+  const pb = b.split(".").map(Number);
+  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
+    const diff = (pa[i] || 0) - (pb[i] || 0);
+    if (diff !== 0) return diff;
+  }
+  return 0;
+}
+
+function notifyClientsAboutUpdate(apkInfo) {
+  const msg = JSON.stringify({
+    type: "update_available",
+    payload: {
+      version: apkInfo.version,
+      downloadUrl: `/update/latest.apk`,
+      size: fs.statSync(apkInfo.path).size,
+    },
+    timestamp: Date.now(),
+  });
+
+  // Send to all clients in all rooms
+  for (const [, room] of rooms) {
+    for (const client of room.clients) {
+      if (client.readyState === 1) {
+        client.send(msg);
+      }
+    }
+  }
+  log(`Update-Benachrichtigung gesendet: v${apkInfo.version} (${rooms.size} Raum/Raeume)`);
+}
+
+// No polling: the update check happens on demand (update_check message from the app)
+
 // ── Sauberes Herunterfahren ─────────────────────────────────────────

 process.on("SIGTERM", () => {
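The on-demand update check above boils down to two steps: pull a dotted version out of the APK file name, then compare it numerically with the client's version. A self-contained sketch of that logic follows. `extractVersion` is a hypothetical helper, and its pattern is an assumption that requires digits after every dot, so the dot in front of `apk` is never captured; `compareVersions` follows the function shown in the diff.

```javascript
"use strict";

// Hypothetical helper: returns the dotted version found in a file name, or null.
// (?:\.\d+){2,} demands at least three numeric components and cannot end on a dot.
function extractVersion(fileName) {
  const match = fileName.match(/(\d+(?:\.\d+){2,})/);
  return match ? match[1] : null;
}

// Component-wise numeric comparison; missing components count as 0.
function compareVersions(a, b) {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

const files = ["ARIA-v0.0.2.3.apk", "ARIA-v0.0.10.0.apk", "ARIA-Cockpit-release.apk"];
// Files without a version (null) are dropped, the rest is sorted newest first.
const versions = files.map(extractVersion).filter(Boolean);
versions.sort((a, b) => compareVersions(b, a));
console.log(versions);
console.log(compareVersions("0.0.10.0", "0.0.2.3") > 0);
console.log(compareVersions("1.2", "1.2.0.0") === 0);
```

Comparing component by component rather than as strings is what keeps `0.0.10.0` ahead of `0.0.2.3`; a plain string sort would order them the other way round.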
@@ -0,0 +1,74 @@
+#!/usr/bin/env node
+/**
+ * Exports an OpenClaw session JSONL (including .reset.*) as Markdown.
+ *
+ * Usage:
+ *   node export-jsonl-to-md.js <input.jsonl> [output.md]
+ *
+ * Or directly from the aria-core container:
+ *   docker exec aria-core cat /home/node/.openclaw/agents/main/sessions/<ID>.jsonl.reset.<TS> \
+ *     | node export-jsonl-to-md.js - > output.md
+ */
+
+const fs = require("fs");
+
+const inputArg = process.argv[2];
+const outputArg = process.argv[3];
+
+if (!inputArg) {
+  console.error("Usage: export-jsonl-to-md.js <input.jsonl|-> [output.md]");
+  process.exit(1);
+}
+
+const raw = inputArg === "-" ? fs.readFileSync(0, "utf-8") : fs.readFileSync(inputArg, "utf-8");
+const lines = raw.split("\n").filter(l => l.trim());
+
+const blocks = [];
+for (const line of lines) {
+  let obj;
+  try { obj = JSON.parse(line); } catch { continue; }
+  if (obj.type !== "message" || !obj.message) continue;
+  const role = obj.message.role;
+  if (role !== "user" && role !== "assistant") continue;
+
+  let text = "";
+  const content = obj.message.content;
+  if (typeof content === "string") text = content;
+  else if (Array.isArray(content)) text = content.filter(c => c.type === "text").map(c => c.text || "").join("\n");
+  if (!text) continue;
+
+  if (role === "user") {
+    text = text.replace(/^Sender \(untrusted metadata\):[\s\S]*?```[\s\S]*?```\s*\n*/m, "").trim();
+    text = text.replace(/^\[.*?\]\s*/, "").trim();
+  } else {
+    text = text.replace(/^\[\[reply_to_\w+\]\]\s*/g, "").trim();
+  }
+  if (!text) continue;
+
+  const ts = obj.message.timestamp || obj.timestamp || 0;
+  const when = ts ? new Date(ts).toISOString().replace("T", " ").slice(0, 19) : "";
+  const heading = role === "user" ? "## 🧑 User" : "## 🤖 ARIA";
+  blocks.push(`${heading}${when ? ` — ${when}` : ""}\n\n${text}`);
+}
+
+const exportedAt = new Date().toISOString().replace("T", " ").slice(0, 19);
+const title = inputArg === "-" ? "Session" : inputArg.split("/").pop().replace(/\.jsonl.*/, "");
+const md = [
+  `# Session: ${title}`,
+  ``,
+  `Exportiert: ${exportedAt}  `,
+  `Quelle: ${inputArg === "-" ? "stdin" : inputArg}`,
+  `Nachrichten: ${blocks.length}`,
+  ``,
+  `---`,
+  ``,
+  blocks.join("\n\n---\n\n"),
+  ``,
+].join("\n");
+
+if (outputArg) {
+  fs.writeFileSync(outputArg, md);
+  console.error(`OK: ${blocks.length} Nachrichten → ${outputArg}`);
+} else {
+  process.stdout.write(md);
+}
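The core of the converter is the step that turns one JSONL record into plain message text. A condensed sketch of that step is shown below. `extractMessage` is a name invented here, and the sample records are made-up illustrations of the structure the script expects (`type: "message"`, `message.role`, and `message.content` as either a string or an array of typed parts):

```javascript
"use strict";

// Hypothetical helper: returns { role, text } for user/assistant messages, else null.
function extractMessage(line) {
  let obj;
  try { obj = JSON.parse(line); } catch { return null; } // skip malformed lines
  if (!obj || obj.type !== "message" || !obj.message) return null;
  const { role, content } = obj.message;
  if (role !== "user" && role !== "assistant") return null;

  let text = "";
  if (typeof content === "string") {
    text = content;
  } else if (Array.isArray(content)) {
    // Keep only the text parts; other part types are dropped.
    text = content.filter((c) => c.type === "text").map((c) => c.text || "").join("\n");
  }
  text = text.trim();
  return text ? { role, text } : null;
}

const sample = [
  JSON.stringify({ type: "message", message: { role: "user", content: "Hallo ARIA" } }),
  JSON.stringify({ type: "message", message: { role: "assistant", content: [
    { type: "text", text: "Hallo!" }, { type: "tool_use", name: "x" },
  ] } }),
  JSON.stringify({ type: "session_start" }),
  "not json",
];
const messages = sample.map(extractMessage).filter(Boolean);
console.log(messages);
```

Returning `null` for everything that is not a printable user or assistant message keeps the Markdown assembly a plain `filter(Boolean)` away.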
@@ -0,0 +1,11 @@
+# ════════════════════════════════════════════════
+#  ARIA XTTS v2: configuration
+#  Copy to .env and adjust
+# ════════════════════════════════════════════════
+
+# RVS connection (same values as on the ARIA VM)
+RVS_HOST=mobil.hacker-net.de
+RVS_PORT=444
+RVS_TLS=true
+RVS_TLS_FALLBACK=true
+RVS_TOKEN=dein_token_hier
@@ -0,0 +1,81 @@
+# ════════════════════════════════════════════════
+#  ARIA Gamebox Stack: GPU F5-TTS + Whisper STT
+#  Runs on the gaming PC (RTX 3060)
+#  Connects to the RVS for TTS/STT requests
+# ════════════════════════════════════════════════
+#
+#  Prerequisites:
+#    - Docker Desktop with WSL2
+#    - NVIDIA Container Toolkit
+#    - .env with RVS connection data
+#
+#  Start: docker compose up -d
+# ════════════════════════════════════════════════
+
+services:
+
+  # ─── F5-TTS Bridge (GPU) ──────────────────────
+  # Replaces the former XTTS stack. Receives xtts_request via RVS,
+  # renders via F5-TTS with voice cloning, streams PCM to the app.
+  # Voice upload: stores the WAV and lets whisper-bridge transcribe the
+  # reference text, so the user does not have to type anything.
+  f5tts-bridge:
+    build: ./f5tts
+    container_name: aria-f5tts-bridge
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: 1
+              capabilities: [gpu]
+    volumes:
+      - ./voices:/voices                         # WAV + TXT reference
+      - f5tts-models:/root/.cache/huggingface    # Persist the model cache
+    environment:
+      - RVS_HOST=${RVS_HOST}
+      - RVS_PORT=${RVS_PORT:-443}
+      - RVS_TLS=${RVS_TLS:-true}
+      - RVS_TLS_FALLBACK=${RVS_TLS_FALLBACK:-true}
+      - RVS_TOKEN=${RVS_TOKEN}
+      - F5TTS_MODEL=${F5TTS_MODEL:-F5TTS_v1_Base}
+      - F5TTS_DEVICE=${F5TTS_DEVICE:-cuda}
+      - VOICES_DIR=/voices
+    restart: unless-stopped
+
+  # ─── Whisper STT (GPU) ────────────────────────
+  # Faster-Whisper on the Gamebox instead of on the VM (CPU), which is
+  # considerably faster. Connects to the RVS by itself via WebSocket,
+  # receives stt_request messages from aria-bridge there and answers
+  # with stt_response. In addition, f5tts-bridge uses Whisper
+  # internally for the reference transcription of voice uploads.
+  # Preloads the model at startup; on config broadcasts
+  # (Diagnostic → whisperModel) the model is hot-swapped at
+  # runtime.
+  whisper-bridge:
+    build: ./whisper
+    container_name: aria-whisper-bridge
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: 1
+              capabilities: [gpu]
+    environment:
+      - RVS_HOST=${RVS_HOST}
+      - RVS_PORT=${RVS_PORT:-443}
+      - RVS_TLS=${RVS_TLS:-true}
+      - RVS_TLS_FALLBACK=${RVS_TLS_FALLBACK:-true}
+      - RVS_TOKEN=${RVS_TOKEN}
+      - WHISPER_MODEL=${WHISPER_MODEL:-small}
+      - WHISPER_DEVICE=${WHISPER_DEVICE:-cuda}
+      - WHISPER_COMPUTE_TYPE=${WHISPER_COMPUTE_TYPE:-float16}
+      - WHISPER_LANGUAGE=${WHISPER_LANGUAGE:-de}
+    volumes:
+      - whisper-models:/root/.cache/huggingface
+    restart: unless-stopped
+
+volumes:
+  f5tts-models:
+  whisper-models:
@@ -0,0 +1,21 @@
+FROM nvidia/cuda:12.2.2-cudnn8-runtime-ubuntu22.04
+
+ENV DEBIAN_FRONTEND=noninteractive
+ENV PYTHONUNBUFFERED=1
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    python3 python3-pip ffmpeg git \
+    && rm -rf /var/lib/apt/lists/*
+
+WORKDIR /app
+
+# PyTorch CUDA wheels first (otherwise f5-tts pulls in CPU-only Torch)
+RUN pip3 install --no-cache-dir torch==2.3.1 torchaudio==2.3.1 \
+    --index-url https://download.pytorch.org/whl/cu121
+
+COPY requirements.txt .
+RUN pip3 install --no-cache-dir -r requirements.txt
+
+COPY bridge.py .
+
+CMD ["python3", "bridge.py"]
@@ -0,0 +1,580 @@
+#!/usr/bin/env python3
+"""
+ARIA F5-TTS Bridge: runs on the Gamebox (RTX 3060).
+
+Receives xtts_request via RVS → F5-TTS voice cloning on the GPU → streams
+16-bit PCM chunks back to the app as audio_pcm messages.
+
+Voice layout in VOICES_DIR:
+  {name}.wav   reference audio (6-10 s, 24 kHz mono recommended)
+  {name}.txt   reference text (UTF-8, what is spoken in the WAV)
+
+On voice_upload we internally send an stt_request to whisper-bridge
+and store the transcription as .txt, so the user does not have to enter any text.
+
+Env:
+  RVS_HOST, RVS_PORT, RVS_TLS, RVS_TLS_FALLBACK, RVS_TOKEN
+  F5TTS_MODEL   Default: F5TTS_v1_Base
+  F5TTS_DEVICE  Default: cuda
+  VOICES_DIR    Default: /voices
+"""
+import asyncio
+import base64
+import json
+import logging
+import os
+import re
+import subprocess
+import sys
+import tempfile
+import time
+import uuid
+from pathlib import Path
+from typing import Optional
+
+import numpy as np
+import soundfile as sf
+import websockets
+
+logging.basicConfig(
+    level=logging.INFO,
+    format="%(asctime)s [%(levelname)s] %(message)s",
+    datefmt="%H:%M:%S",
+)
+logger = logging.getLogger("f5tts-bridge")
+# Tone down the HuggingFace + Torch download logs a bit
+logging.getLogger("httpx").setLevel(logging.WARNING)
+logging.getLogger("urllib3").setLevel(logging.WARNING)
+
+RVS_HOST = os.getenv("RVS_HOST", "").strip()
+RVS_PORT = int(os.getenv("RVS_PORT", "443"))
+RVS_TLS = os.getenv("RVS_TLS", "true").lower() == "true"
+RVS_TLS_FALLBACK = os.getenv("RVS_TLS_FALLBACK", "true").lower() == "true"
+RVS_TOKEN = os.getenv("RVS_TOKEN", "").strip()
+
+F5TTS_MODEL = os.getenv("F5TTS_MODEL", "F5TTS_v1_Base")
+F5TTS_DEVICE = os.getenv("F5TTS_DEVICE", "cuda")
+VOICES_DIR = Path(os.getenv("VOICES_DIR", "/voices"))
+
+PCM_CHUNK_BYTES = 8192   # ~170ms @ 24kHz mono s16
+TARGET_SR = 24000        # F5-TTS native
+
+# ── Lazy F5-TTS Loader ──────────────────────────────────────
+
+_F5TTS_cls = None
+
+
+def _get_f5tts_cls():
+    """Lazy import so Torch warnings do not clutter the startup logs."""
+    global _F5TTS_cls
+    if _F5TTS_cls is None:
+        from f5_tts.api import F5TTS as _cls
+        _F5TTS_cls = _cls
+    return _F5TTS_cls
+
+
+class F5Runner:
+    """Holds the F5-TTS model. Synthesis runs in the executor (blocking)."""
+
+    def __init__(self) -> None:
+        self.model = None
+        self._lock = asyncio.Lock()
+
+    def _load_blocking(self) -> None:
+        cls = _get_f5tts_cls()
+        logger.info("Lade F5-TTS '%s' (device=%s)...", F5TTS_MODEL, F5TTS_DEVICE)
+        t0 = time.time()
+        self.model = cls(model=F5TTS_MODEL, device=F5TTS_DEVICE)
+        logger.info("F5-TTS geladen in %.1fs", time.time() - t0)
+
+    async def ensure_loaded(self) -> None:
+        async with self._lock:
+            if self.model is not None:
+                return
+            loop = asyncio.get_running_loop()
+            await loop.run_in_executor(None, self._load_blocking)
+
+    def _infer_blocking(self, gen_text: str, ref_wav: str, ref_text: str) -> tuple[np.ndarray, int]:
+        wav, sr, _ = self.model.infer(
+            ref_file=ref_wav,
+            ref_text=ref_text,
+            gen_text=gen_text,
+            remove_silence=True,
+            seed=-1,
+        )
+        # F5-TTS returns a float32 1-D array at its standard 24 kHz sample rate
+        if not isinstance(wav, np.ndarray):
+            wav = np.asarray(wav, dtype=np.float32)
+        if wav.ndim > 1:
+            wav = wav.squeeze()
+        return wav.astype(np.float32), int(sr)
+
+    async def synthesize(self, gen_text: str, ref_wav: str, ref_text: str) -> tuple[np.ndarray, int]:
+        await self.ensure_loaded()
+        loop = asyncio.get_running_loop()
+        return await loop.run_in_executor(None, self._infer_blocking, gen_text, ref_wav, ref_text)
+
+
+# ── Helpers ─────────────────────────────────────────────────
+
+_SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+|\n+")
+
+
+def split_sentences(text: str, max_len: int = 350) -> list[str]:
+    """Splits long text at sentence boundaries. Short texts stay as-is."""
+    text = text.strip()
+    if not text:
+        return []
+    if len(text) <= max_len:
+        return [text]
+    parts = [p.strip() for p in _SENTENCE_SPLIT.split(text) if p.strip()]
+    # Merge short fragments (up to max_len) so F5-TTS does not restart at every sentence boundary
+    merged: list[str] = []
+    buf = ""
+    for p in parts:
+        if len(buf) + len(p) + 1 <= max_len:
+            buf = f"{buf} {p}".strip()
+        else:
+            if buf:
+                merged.append(buf)
+            buf = p
+    if buf:
+        merged.append(buf)
+    return merged or [text]
+
+
+def float_to_pcm16(wav: np.ndarray) -> bytes:
+    """Float32 (-1..+1) → int16 little-endian bytes."""
+    wav = np.clip(wav, -1.0, 1.0)
+    pcm = (wav * 32767.0).astype(np.int16)
+    return pcm.tobytes()
+
+
+def sanitize_voice_name(name: str) -> str:
+    return re.sub(r"[^a-zA-Z0-9_-]", "_", name)
+
+
+def voice_paths(name: str) -> tuple[Path, Path]:
+    safe = sanitize_voice_name(name)
+    return VOICES_DIR / f"{safe}.wav", VOICES_DIR / f"{safe}.txt"
+
+
+def ensure_24k_mono_wav(src_wav: Path) -> Path:
+    """F5-TTS expects a 24kHz mono reference; ffmpeg converts it in place.
+
+    If the file already matches, nothing is changed. Otherwise it is
+    rewritten (the original is overwritten).
+    """
+    try:
+        info = sf.info(str(src_wav))
+        if info.samplerate == TARGET_SR and info.channels == 1:
+            return src_wav
+    except Exception:
+        pass
+    tmp_out = src_wav.with_suffix(".conv.wav")
+    cmd = ["ffmpeg", "-y", "-i", str(src_wav),
+           "-ar", str(TARGET_SR), "-ac", "1", "-f", "wav", str(tmp_out)]
+    r = subprocess.run(cmd, capture_output=True, timeout=30)
+    if r.returncode != 0:
+        logger.warning("ffmpeg-Konvertierung von %s fehlgeschlagen: %s",
+                       src_wav, r.stderr.decode(errors="replace")[:200])
+        try:
+            tmp_out.unlink()
+        except OSError:
+            pass
+        return src_wav
+    os.replace(tmp_out, src_wav)
+    return src_wav
+
+
+async def _send(ws, mtype: str, payload: dict) -> None:
+    try:
+        await ws.send(json.dumps({
+            "type": mtype,
+            "payload": payload,
+            "timestamp": int(time.time() * 1000),
+        }))
+    except Exception as e:
+        logger.warning("Send fehlgeschlagen (%s): %s", mtype, e)
+
+
+# ── Internal transcription via whisper-bridge ───────────────
+
+_pending_stt: dict[str, asyncio.Future] = {}
+_STT_TIMEOUT_S = 60.0
+
+
+async def request_transcription(ws, wav_path: Path, language: str = "de") -> Optional[str]:
+    """Sends an stt_request to the whisper-bridge (via RVS) and waits for the stt_response."""
+    try:
+        with open(wav_path, "rb") as f:
+            audio_b64 = base64.b64encode(f.read()).decode("ascii")
+    except Exception as e:
+        logger.error("Lesen %s fehlgeschlagen: %s", wav_path, e)
+        return None
+
+    request_id = str(uuid.uuid4())
+    loop = asyncio.get_running_loop()
+    fut: asyncio.Future = loop.create_future()
+    _pending_stt[request_id] = fut
+
+    try:
+        await _send(ws, "stt_request", {
+            "requestId": request_id,
+            "audio": audio_b64,
+            "mimeType": "audio/wav",
+            "model": "small",  # small is enough for the voice reference
+            "language": language,
+        })
+        return await asyncio.wait_for(fut, timeout=_STT_TIMEOUT_S)
+    except asyncio.TimeoutError:
+        logger.warning("Transkription Timeout fuer %s", wav_path.name)
+        return None
+    except Exception as e:
+        logger.warning("Transkription Fehler: %s", e)
+        return None
+    finally:
+        _pending_stt.pop(request_id, None)
+
+
+# ── TTS-Request Handler ─────────────────────────────────────
+
+# Queue so that parallel requests do not overlap (GPU throughput)
+_tts_queue: asyncio.Queue[tuple] = asyncio.Queue()
+
+
+async def _tts_worker(ws, runner: F5Runner) -> None:
+    """Serializes syntheses; otherwise the GPU can run out of memory."""
+    while True:
+        text, voice, request_id, message_id, language = await _tts_queue.get()
+        try:
+            await _do_tts(ws, runner, text, voice, request_id, message_id, language)
+        except Exception:
+            logger.exception("TTS-Worker Fehler")
+        finally:
+            _tts_queue.task_done()
+
+
+async def _do_tts(ws, runner: F5Runner, text: str, voice: str,
+                  request_id: str, message_id: str, language: str) -> None:
+    t0 = time.time()
+    ref_wav_path, ref_txt_path = voice_paths(voice) if voice else (None, None)
+    has_custom = bool(voice and ref_wav_path and ref_wav_path.exists() and ref_txt_path.exists())
+    if voice and not has_custom:
+        # If only the WAV exists but no txt, transcribe on the fly
+        if ref_wav_path and ref_wav_path.exists() and (not ref_txt_path or not ref_txt_path.exists()):
+            logger.info("Voice '%s' hat kein txt — transkribiere on-the-fly", voice)
+            text_ref = await request_transcription(ws, ref_wav_path, language)
+            if text_ref:
+                try:
+                    ref_txt_path.write_text(text_ref.strip(), encoding="utf-8")
+                    has_custom = True
+                    logger.info("Referenz-Text nachgezogen: '%s'", text_ref[:60])
+                except Exception as e:
+                    logger.warning("Referenz-Text speichern fehlgeschlagen: %s", e)
+        if not has_custom:
+            logger.warning("Voice '%s' nicht komplett (%s, txt=%s) — nehme Default",
+                           voice, ref_wav_path, (ref_txt_path and ref_txt_path.exists()))
+
+    if has_custom:
+        ref_wav_str = str(ref_wav_path)
+        ref_text = ref_txt_path.read_text(encoding="utf-8").strip()
+    else:
+        # Fallback: no custom voice. F5-TTS ALWAYS needs a reference,
+        # so we use default_ref.wav/txt if present, otherwise the first
+        # voice found in the directory.
+        default_wav = VOICES_DIR / "default_ref.wav"
+        default_txt = VOICES_DIR / "default_ref.txt"
+        if default_wav.exists() and default_txt.exists():
+            ref_wav_str = str(default_wav)
+            ref_text = default_txt.read_text(encoding="utf-8").strip()
+        else:
+            # Take any available voice pair
+            pair = next(
+                ((w, t) for w, t in (
+                    (v, v.with_suffix(".txt")) for v in VOICES_DIR.glob("*.wav")
+                ) if t.exists()),
+                None,
+            )
+            if not pair:
+                logger.error("Keine Referenz-Stimme im VOICES_DIR — TTS abgebrochen")
+                return
+            ref_wav_str, ref_text = str(pair[0]), pair[1].read_text(encoding="utf-8").strip()
+
+    sentences = split_sentences(text)
+    logger.info("F5-TTS: %d Satz(e), voice=%s (%s)", len(sentences), voice or "default", ref_wav_str)
+
+    chunk_index = 0
+    pcm_sr = TARGET_SR
+    for i, sent in enumerate(sentences):
+        try:
+            wav, sr = await runner.synthesize(sent, ref_wav_str, ref_text)
+            pcm_sr = sr
+            pcm_bytes = float_to_pcm16(wav)
+            # The first PCM chunk of the very first sentence gets a fade-in (masks
+            # possible warmup glitches). All other chunks stay as they are.
+            if i == 0 and chunk_index == 0:
+                pcm_bytes = _fade_in_pcm16(pcm_bytes, sr, 120)
+
+            # Split into chunks
+            for off in range(0, len(pcm_bytes), PCM_CHUNK_BYTES):
+                slice_ = pcm_bytes[off:off + PCM_CHUNK_BYTES]
+                await _send(ws, "audio_pcm", {
+                    "requestId": request_id,
+                    "messageId": message_id,
+                    "base64": base64.b64encode(slice_).decode("ascii"),
+                    "format": "pcm_s16le",
+                    "sampleRate": sr,
+                    "channels": 1,
+                    "voice": voice or "default",
+                    "chunk": chunk_index,
+                    "final": False,
+                })
+                chunk_index += 1
+        except Exception as e:
+            logger.exception("F5-TTS Synthese-Fehler (Satz %d)", i)
+            await _send(ws, "xtts_response", {
+                "requestId": request_id,
+                "error": str(e)[:200],
+            })
+            return
+
+    # Final marker
+    await _send(ws, "audio_pcm", {
+        "requestId": request_id,
+        "messageId": message_id,
+        "base64": "",
+        "format": "pcm_s16le",
+        "sampleRate": pcm_sr,
+        "channels": 1,
+        "voice": voice or "default",
+        "chunk": chunk_index,
+        "final": True,
+    })
+
+    logger.info("TTS komplett: %d Chunks, %.2fs render (voice=%s, text=%d chars)",
+                chunk_index, time.time() - t0, voice or "default", len(text))
+
+
+def _fade_in_pcm16(pcm: bytes, sr: int, fade_ms: int) -> bytes:
+    """Linear fade-in over the first fade_ms; masks warmup glitches."""
+    arr = np.frombuffer(pcm, dtype=np.int16).copy()
+    fade_samples = min(int((fade_ms / 1000.0) * sr), len(arr))
+    if fade_samples <= 0:
+        return pcm
+    ramp = np.linspace(0.0, 1.0, fade_samples, dtype=np.float32)
+    arr[:fade_samples] = (arr[:fade_samples].astype(np.float32) * ramp).astype(np.int16)
+    return arr.tobytes()
+
+
+# ── Voice Management Handlers ───────────────────────────────
+
+async def handle_voice_upload(ws, payload: dict) -> None:
+    name = (payload.get("name") or "").strip()
+    samples = payload.get("samples") or []
+    if not name or not samples:
+        logger.warning("voice_upload: ungueltig (name=%r, samples=%d)", name, len(samples))
+        return
+    logger.info("Voice-Upload: '%s' (%d Samples)", name, len(samples))
+
+    try:
+        VOICES_DIR.mkdir(parents=True, exist_ok=True)
+        safe = sanitize_voice_name(name)
+        wav_path = VOICES_DIR / f"{safe}.wav"
+        txt_path = VOICES_DIR / f"{safe}.txt"
+
+        # Concatenate samples byte-wise (assumes they are consecutive parts of one WAV file)
+        buffers = [base64.b64decode(s.get("base64", "")) for s in samples]
+        with open(wav_path, "wb") as f:
+            for b in buffers:
+                f.write(b)
+        size_kb = wav_path.stat().st_size / 1024
+        logger.info("Voice WAV gespeichert: %s (%.0fKB)", wav_path, size_kb)
+
+        # Normalize to 24kHz mono (in case the app delivers another format); ffmpeg blocks, so run it in the executor
+        await asyncio.get_running_loop().run_in_executor(None, ensure_24k_mono_wav, wav_path)
+
+        # Request transcription via whisper-bridge
+        logger.info("Transkribiere '%s' via whisper-bridge...", name)
+        text = await request_transcription(ws, wav_path, language="de")
+        if not text:
+            logger.warning("Transkription fehlgeschlagen — speichere Platzhalter-Text")
+            text = "Das ist ein Referenz Audio."
+        txt_path.write_text(text.strip(), encoding="utf-8")
+        logger.info("Voice '%s' komplett (txt: %s)", name, text[:80])
+
+        await _send(ws, "xtts_voice_saved", {
+            "name": name, "size": int(size_kb * 1024), "refText": text.strip(),
+        })
+        # Refresh the list
+        await handle_list_voices(ws)
+    except Exception as e:
+        logger.exception("voice_upload Fehler")
+        await _send(ws, "xtts_voice_saved", {"name": name, "error": str(e)[:200]})
+
+
+async def handle_list_voices(ws) -> None:
+    try:
+        voices = []
+        if VOICES_DIR.exists():
+            for wav in sorted(VOICES_DIR.glob("*.wav")):
+                txt = wav.with_suffix(".txt")
+                voices.append({
+                    "name": wav.stem,
+                    "file": wav.name,
+                    "size": wav.stat().st_size,
+                    "hasRefText": txt.exists(),
+                })
+        logger.info("Stimmen-Liste: %d", len(voices))
+        await _send(ws, "xtts_voices_list", {"voices": voices})
+    except Exception:
+        logger.exception("handle_list_voices Fehler")
+
+
+async def handle_delete_voice(ws, payload: dict) -> None:
+    name = (payload.get("name") or "").strip()
+    if not name:
+        return
+    try:
+        wav, txt = voice_paths(name)
+        for p in (wav, txt):
+            if p.exists():
+                p.unlink()
+                logger.info("Voice geloescht: %s", p)
+        await handle_list_voices(ws)
+    except Exception:
+        logger.exception("handle_delete_voice Fehler")
+
+
+# Last voice set via diagnostic (prevents endless preloads on every config)
+_last_diag_voice = ""
+
+
+async def handle_voice_preload(ws, payload: dict, runner: F5Runner) -> None:
+    voice = (payload.get("voice") or "").strip()
+    request_id = payload.get("requestId", "")
+    logger.info("Voice-Preload angefordert: '%s'", voice or "default")
+
+    try:
+        ref_wav, ref_txt = voice_paths(voice) if voice else (None, None)
+        if voice and (not ref_wav or not ref_wav.exists()):
+            await _send(ws, "voice_ready", {"voice": voice, "requestId": request_id, "error": "voice-file-not-found"})
+            return
+
+        # Ensure the reference text exists (in case only the WAV is present)
+        if voice and ref_txt and not ref_txt.exists():
+            text = await request_transcription(ws, ref_wav, language="de")
+            if text:
+                ref_txt.write_text(text.strip(), encoding="utf-8")
+                logger.info("Referenz-Text beim Preload nachgezogen")
+
+        # Dummy render for warmup (calls _do_tts directly, bypassing _tts_queue)
+        t0 = time.time()
+        await _do_tts(ws, runner, "ja.", voice, f"preload-{request_id}", "", "de")
+        ms = int((time.time() - t0) * 1000)
+        await _send(ws, "voice_ready", {"voice": voice, "requestId": request_id, "loadMs": ms})
+    except Exception as e:
+        logger.exception("Voice-Preload Fehler")
+        await _send(ws, "voice_ready", {"voice": voice, "requestId": request_id, "error": str(e)[:200]})
+
+
+# ── Main loop ───────────────────────────────────────────────
+
+async def run_loop(runner: F5Runner) -> None:
+    # Start the preload in the background so startup is not blocked
+    asyncio.create_task(runner.ensure_loaded())
+
+    use_tls = RVS_TLS
+    retry_s = 2
+    tls_fallback_tried = False
+    global _last_diag_voice
+
+    while True:
+        scheme = "wss" if use_tls else "ws"
+        url = f"{scheme}://{RVS_HOST}:{RVS_PORT}/ws?token={RVS_TOKEN}"
+        masked = url.replace(RVS_TOKEN, "***") if RVS_TOKEN else url
+
+        try:
+            logger.info("Verbinde zu RVS: %s", masked)
+            async with websockets.connect(url, ping_interval=20, ping_timeout=10, max_size=50 * 1024 * 1024) as ws:
+                logger.info("RVS verbunden")
+                retry_s = 2
+                tls_fallback_tried = False
+
+                # Start the TTS worker for this connection
+                worker = asyncio.create_task(_tts_worker(ws, runner))
+
+                try:
+                    async for raw in ws:
+                        try:
+                            msg = json.loads(raw)
+                        except Exception:
+                            continue
+                        mtype = msg.get("type", "")
+                        payload = msg.get("payload", {}) or {}
+
+                        if mtype == "xtts_request":
+                            await _tts_queue.put((
+                                payload.get("text", ""),
+                                payload.get("voice", "") or "",
+                                payload.get("requestId", ""),
+                                payload.get("messageId", ""),
+                                payload.get("language", "de"),
+                            ))
+                        elif mtype == "voice_upload":
+                            asyncio.create_task(handle_voice_upload(ws, payload))
+                        elif mtype == "xtts_list_voices":
+                            asyncio.create_task(handle_list_voices(ws))
+                        elif mtype == "xtts_delete_voice":
+                            asyncio.create_task(handle_delete_voice(ws, payload))
+                        elif mtype == "voice_preload":
+                            asyncio.create_task(handle_voice_preload(ws, payload, runner))
+                        elif mtype == "stt_response":
+                            # Response to our internal transcription request
+                            req_id = payload.get("requestId", "")
+                            fut = _pending_stt.get(req_id)
+                            if fut and not fut.done():
+                                if payload.get("error"):
+                                    fut.set_result(None)
+                                else:
+                                    fut.set_result(payload.get("text") or "")
+                        elif mtype == "config":
+                            v = (payload.get("xttsVoice") or "").strip()
+                            if v and v != _last_diag_voice:
+                                _last_diag_voice = v
+                                asyncio.create_task(handle_voice_preload(
+                                    ws, {"voice": v, "source": "diagnostic"}, runner,
+                                ))
+                            elif not v:
+                                _last_diag_voice = ""
+                finally:
+                    worker.cancel()
+                    try:
+                        await worker
+                    except asyncio.CancelledError:
+                        pass
+        except Exception as e:
+            logger.warning("Verbindung verloren: %s", e)
+            if use_tls and RVS_TLS_FALLBACK and not tls_fallback_tried:
+                logger.info("TLS fehlgeschlagen — Fallback auf ws://")
+                use_tls = False
+                tls_fallback_tried = True
+                continue
+            await asyncio.sleep(min(retry_s, 30))
+            retry_s = min(retry_s * 2, 30)
+
+
+async def main() -> None:
+    if not RVS_HOST:
+        logger.error("RVS_HOST nicht gesetzt — Abbruch")
+        sys.exit(1)
+    VOICES_DIR.mkdir(parents=True, exist_ok=True)
+    runner = F5Runner()
+    await run_loop(runner)
+
+
+if __name__ == "__main__":
+    try:
+        asyncio.run(main())
+    except KeyboardInterrupt:
+        sys.exit(0)
@@ -0,0 +1,5 @@
+f5-tts>=1.0.0
+websockets>=12.0
+numpy>=1.24
+soundfile>=0.12
+requests>=2.31
@@ -0,0 +1,14 @@
+FROM nvidia/cuda:12.2.2-cudnn8-runtime-ubuntu22.04
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    python3 python3-pip ffmpeg \
+    && rm -rf /var/lib/apt/lists/*
+
+WORKDIR /app
+
+COPY requirements.txt .
+RUN pip3 install --no-cache-dir -r requirements.txt
+
+COPY bridge.py .
+
+CMD ["python3", "bridge.py"]
@@ -0,0 +1,254 @@
+#!/usr/bin/env python3
+"""
+ARIA Whisper Bridge: runs on the Gamebox (RTX 3060).
+
+Receives stt_request via RVS → FFmpeg conversion → faster-whisper on GPU
+→ sends stt_response back to the aria-bridge.
+
+Env:
+  RVS_HOST, RVS_PORT, RVS_TLS, RVS_TLS_FALLBACK, RVS_TOKEN
+  WHISPER_MODEL          Default: small
+  WHISPER_DEVICE         Default: cuda
+  WHISPER_COMPUTE_TYPE   Default: float16
+  WHISPER_LANGUAGE       Default: de
+"""
+import asyncio
+import base64
+import json
+import logging
+import os
+import subprocess
+import sys
+import tempfile
+import time
+from typing import Optional
+
+import numpy as np
+import websockets
+from faster_whisper import WhisperModel
+
+logging.basicConfig(
+    level=logging.INFO,
+    format="%(asctime)s [%(levelname)s] %(message)s",
+    datefmt="%H:%M:%S",
+)
+logger = logging.getLogger("whisper-bridge")
+
+RVS_HOST = os.getenv("RVS_HOST", "").strip()
+RVS_PORT = int(os.getenv("RVS_PORT", "443"))
+RVS_TLS = os.getenv("RVS_TLS", "true").lower() == "true"
+RVS_TLS_FALLBACK = os.getenv("RVS_TLS_FALLBACK", "true").lower() == "true"
+RVS_TOKEN = os.getenv("RVS_TOKEN", "").strip()
+
+WHISPER_MODEL = os.getenv("WHISPER_MODEL", "small")
+WHISPER_DEVICE = os.getenv("WHISPER_DEVICE", "cuda")
+WHISPER_COMPUTE_TYPE = os.getenv("WHISPER_COMPUTE_TYPE", "float16")
+WHISPER_LANGUAGE = os.getenv("WHISPER_LANGUAGE", "de")
+
+ALLOWED_MODELS = {"tiny", "base", "small", "medium", "large-v3"}
+
+
+class WhisperRunner:
+    """Holds the Whisper model. Hot swap on config change via ensure_loaded()."""
+
+    def __init__(self) -> None:
+        self.model_size: str = WHISPER_MODEL
+        self.model: Optional[WhisperModel] = None
+        self._lock = asyncio.Lock()
+
+    def _load_blocking(self, size: str) -> None:
+        logger.info(
+            "Lade Whisper '%s' (device=%s, compute=%s)",
+            size, WHISPER_DEVICE, WHISPER_COMPUTE_TYPE,
+        )
+        t0 = time.time()
+        self.model = WhisperModel(
+            size, device=WHISPER_DEVICE, compute_type=WHISPER_COMPUTE_TYPE,
+        )
+        self.model_size = size
+        logger.info("Whisper '%s' geladen in %.1fs", size, time.time() - t0)
+
+    async def ensure_loaded(self, desired_size: str) -> None:
+        if desired_size not in ALLOWED_MODELS:
+            logger.warning("Ungueltiges Whisper-Modell '%s' — nutze %s", desired_size, WHISPER_MODEL)
+            desired_size = WHISPER_MODEL
+        async with self._lock:
+            if self.model is not None and self.model_size == desired_size:
+                return
+            loop = asyncio.get_running_loop()
+            await loop.run_in_executor(None, self._load_blocking, desired_size)
+
+    async def transcribe(self, audio: np.ndarray, language: str) -> tuple[str, float]:
+        if self.model is None:
+            return "", 0.0
+
+        def _run():
+            segments, info = self.model.transcribe(
+                audio, language=language, beam_size=5, vad_filter=True,
+            )
+            text = " ".join(seg.text.strip() for seg in segments)
+            return text, info.duration
+
+        loop = asyncio.get_running_loop()
+        return await loop.run_in_executor(None, _run)
+
+
+def ffmpeg_to_float32(audio_b64: str, mime_type: str) -> np.ndarray:
+    """Decodes any audio format → 16kHz mono float32 PCM."""
+    if "mp4" in mime_type or "m4a" in mime_type or "aac" in mime_type:
+        ext = ".mp4"
+    elif "wav" in mime_type:
+        ext = ".wav"
+    elif "ogg" in mime_type or "opus" in mime_type:
+        ext = ".ogg"
+    else:
+        ext = ".bin"
+
+    in_fh = tempfile.NamedTemporaryFile(suffix=ext, delete=False)
+    try:
+        in_fh.write(base64.b64decode(audio_b64))
+        in_fh.close()
+        out_path = in_fh.name + ".raw"
+        cmd = ["ffmpeg", "-y", "-i", in_fh.name, "-ar", "16000", "-ac", "1", "-f", "f32le", out_path]
+        result = subprocess.run(cmd, capture_output=True, timeout=30)
+        if result.returncode != 0:
+            logger.error("FFmpeg Fehler: %s", result.stderr.decode(errors="replace")[:300])
+            return np.zeros(0, dtype=np.float32)
+        try:
+            return np.fromfile(out_path, dtype=np.float32)
+        finally:
+            try:
+                os.unlink(out_path)
+            except OSError:
+                pass
+    finally:
+        try:
+            os.unlink(in_fh.name)
+        except OSError:
+            pass
+
+
+async def _send(ws, mtype: str, payload: dict) -> None:
+    try:
+        await ws.send(json.dumps({
+            "type": mtype,
+            "payload": payload,
+            "timestamp": int(time.time() * 1000),
+        }))
+    except Exception as e:
+        logger.warning("Send fehlgeschlagen (%s): %s", mtype, e)
+
+
+async def handle_stt_request(ws, payload: dict, runner: WhisperRunner) -> None:
+    request_id = payload.get("requestId", "")
+    audio_b64 = payload.get("audio", "")
+    mime_type = payload.get("mimeType", "audio/mp4")
+    model = payload.get("model") or WHISPER_MODEL
+    language = payload.get("language") or WHISPER_LANGUAGE
+
+    if not audio_b64:
+        await _send(ws, "stt_response", {"requestId": request_id, "error": "no-audio"})
+        return
+
+    try:
+        t_load = time.time()
+        await runner.ensure_loaded(model)
+        load_ms = int((time.time() - t_load) * 1000)
+
+        audio = await asyncio.get_running_loop().run_in_executor(None, ffmpeg_to_float32, audio_b64, mime_type)
+        if audio.size == 0:
+            await _send(ws, "stt_response", {"requestId": request_id, "error": "ffmpeg-failed"})
+            return
+        duration_s = len(audio) / 16000.0
+        logger.info("STT-Request: %.1fs Audio, model=%s, lang=%s", duration_s, runner.model_size, language)
+
+        t_stt = time.time()
+        text, detected_duration = await runner.transcribe(audio, language)
+        stt_ms = int((time.time() - t_stt) * 1000)
+
+        logger.info("STT-Ergebnis (%dms): '%s'", stt_ms, text[:100])
+
+        await _send(ws, "stt_response", {
+            "requestId": request_id,
+            "text": text.strip(),
+            "durationS": duration_s,
+            "sttMs": stt_ms,
+            "loadMs": load_ms,
+            "model": runner.model_size,
+        })
+    except Exception as e:
+        logger.exception("STT-Request fehlgeschlagen")
+        await _send(ws, "stt_response", {
+            "requestId": request_id,
+            "error": str(e)[:200],
+        })
+
+
+async def run_loop(runner: WhisperRunner) -> None:
+    # Preload the model so the first request is fast
+    try:
+        await runner.ensure_loaded(WHISPER_MODEL)
+    except Exception as e:
+        logger.error("Preload fehlgeschlagen: %s — Fortsetzung, wird bei erstem Request nachgeladen", e)
+
+    use_tls = RVS_TLS
+    retry_s = 2
+    tls_fallback_tried = False
+
+    while True:
+        scheme = "wss" if use_tls else "ws"
+        url = f"{scheme}://{RVS_HOST}:{RVS_PORT}/ws?token={RVS_TOKEN}"
+        masked = url.replace(RVS_TOKEN, "***") if RVS_TOKEN else url
+        try:
+            logger.info("Verbinde zu RVS: %s", masked)
+            async with websockets.connect(url, ping_interval=20, ping_timeout=10) as ws:
+                logger.info("RVS verbunden")
+                retry_s = 2
+                tls_fallback_tried = False
+                async for raw in ws:
+                    try:
+                        msg = json.loads(raw)
+                    except Exception:
+                        continue
+                    mtype = msg.get("type", "")
+                    payload = msg.get("payload", {}) or {}
+
+                    if mtype == "stt_request":
+                        req_id = payload.get("requestId", "?")
+                        audio_len = len(payload.get("audio", ""))
+                        logger.info("stt_request empfangen (id=%s, %dKB Audio)",
+                                    req_id[:8] if req_id != "?" else "?", audio_len // 1365)  # ~1365 base64 chars per KiB
+                        asyncio.create_task(handle_stt_request(ws, payload, runner))
+                    elif mtype == "config":
+                        new_model = payload.get("whisperModel")
+                        if new_model and new_model != runner.model_size:
+                            logger.info("Config-Broadcast: Whisper-Modell → %s", new_model)
+                            asyncio.create_task(runner.ensure_loaded(new_model))
+                    else:
+                        # Debug-log all other messages; helps when diagnosing
+                        # whether stt_request makes it through the RVS at all
+                        logger.debug("Unbeachteter Type: %s", mtype)
+        except Exception as e:
+            logger.warning("Verbindung verloren: %s", e)
+            if use_tls and RVS_TLS_FALLBACK and not tls_fallback_tried:
+                logger.info("TLS-Verbindung fehlgeschlagen — Fallback auf ws://")
+                use_tls = False
+                tls_fallback_tried = True
+                continue
+            await asyncio.sleep(min(retry_s, 30))
+            retry_s = min(retry_s * 2, 30)
+
+
+async def main() -> None:
+    if not RVS_HOST:
+        logger.error("RVS_HOST ist nicht gesetzt — Abbruch")
+        sys.exit(1)
+    runner = WhisperRunner()
+    await run_loop(runner)
+
+
+if __name__ == "__main__":
+    try:
+        asyncio.run(main())
+    except KeyboardInterrupt:
+        sys.exit(0)
@@ -0,0 +1,4 @@
+faster-whisper==1.0.3
+websockets>=12.0
+numpy>=1.24
+requests>=2.31