release: bump version to 0.0.2.8

docs: document .env.example with detailed comments, explain both tokens in README
- ARIA_AUTH_TOKEN: Gateway auth (who can talk to ARIA)
- RVS_TOKEN: Pairing token (same room in RVS relay)
- RVS_UPDATE_HOST: SSH target for auto-update APK copy
- All variables with German comments and examples

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:49:47 +02:00 · 2026-04-10 08:45:26 +02:00 · 2026-04-10 08:39:59 +02:00 · 2026-04-10 08:34:35 +02:00 · 2026-04-10 08:31:49 +02:00 · 2026-04-10 02:52:56 +02:00
9 changed files with 176 additions and 58 deletions
@@ -1,20 +1,50 @@
-# ARIA Environment Configuration
-# Copy to .env and fill in values
+# ════════════════════════════════════════════════
+#  ARIA — Umgebungsvariablen
+#  Kopieren nach .env und Werte eintragen
+# ════════════════════════════════════════════════

-# Auth token for ARIA Core (generate a long random string)
-# openssl rand -hex 32
+# ── ARIA Auth Token ──────────────────────────────
+# Authentifizierung fuer den OpenClaw Gateway (aria-core).
+# Wird von Diagnostic, Bridge und App genutzt um sich am Gateway anzumelden.
+# Alle Services die mit aria-core kommunizieren brauchen diesen Token.
+# Generieren: openssl rand -hex 32
 ARIA_AUTH_TOKEN=change-me-to-a-long-random-string

-# RVS — Rendezvous-Server (Bridge + App verbinden sich hierüber)
+# ── RVS — Rendezvous-Server ─────────────────────
+# Der RVS ist ein WebSocket-Relay im Rechenzentrum.
+# App, Bridge, Diagnostic und XTTS-Bridge verbinden sich hierueber.
+# Alle muessen den gleichen Host, Port und Token nutzen.
+
+# Hostname des RVS-Servers (z.B. rvs.example.de oder mobil.hacker-net.de)
 RVS_HOST=rvs.example.de
+
+# Port auf dem der RVS laeuft (muss mit rvs/docker-compose.yml uebereinstimmen)
 RVS_PORT=443
+
+# TLS (wss://) verwenden? true = verschluesselt, false = unverschluesselt (ws://)
 RVS_TLS=true
+
 # Bei TLS-Fehler automatisch auf ws:// (ohne TLS) fallback?
-# true = Fallback erlaubt, false = nur mit TLS verbinden
+# Nuetzlich wenn kein TLS-Zertifikat vorhanden (z.B. Entwicklung)
 RVS_TLS_FALLBACK=true
+
+# Pairing-Token: Wer den gleichen Token hat, landet im gleichen RVS-Room.
+# Wird von generate-token.sh automatisch generiert und hier eingetragen.
+# Die Android App bekommt den Token per QR-Code beim Pairing.
+# WICHTIG: Muss auf ARIA-VM, Gaming-PC (xtts/.env) und App identisch sein!
+# Generieren: ./generate-token.sh (traegt den Token automatisch ein)
 RVS_TOKEN=

-# Gitea (for release.sh — Kennwort wird interaktiv abgefragt)
+# ── Gitea — Release-Verwaltung ───────────────────
+# Wird von release.sh genutzt um APKs auf Gitea zu veroeffentlichen.
+# Kennwort wird beim Release interaktiv abgefragt (nicht in .env!).
 GITEA_URL=https://git.hacker-net.de
 GITEA_REPO=Hacker-Software/ARIA-AGENT
 GITEA_USER=duffyduck
+
+# ── Auto-Update — APK auf RVS-Server kopieren ───
+# SSH-Ziel fuer scp: release.sh kopiert die APK dorthin.
+# Der RVS-Server stellt sie dann per WebSocket an die App bereit.
+# Format: user@host (z.B. root@aria-rvs oder root@rvs.example.de)
+# Leer lassen = Auto-Update ueberspringen, APK manuell auf RVS kopieren.
+RVS_UPDATE_HOST=
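
The `RVS_HOST`, `RVS_PORT`, `RVS_TLS` and `RVS_TLS_FALLBACK` comments above describe how a client chooses between `wss://` and `ws://`. The following is a minimal sketch of that selection, not code from the repository: the helper name `rvsCandidateUrls`, the `RvsEnv` type and the default values are assumptions made for illustration.

```typescript
// Hypothetical helper (not part of the ARIA sources): derives the WebSocket
// URLs a client would try, in order, from the variables in .env.example.
type RvsEnv = Record<string, string | undefined>;

function rvsCandidateUrls(env: RvsEnv): string[] {
  const host = env.RVS_HOST ?? '';
  if (!host) throw new Error('RVS_HOST is not set');

  // Defaults below are assumptions for this sketch.
  const port = env.RVS_PORT ?? '443';
  const tls = (env.RVS_TLS ?? 'true') === 'true';
  const fallback = (env.RVS_TLS_FALLBACK ?? 'false') === 'true';

  // Without TLS there is only the unencrypted URL.
  if (!tls) return [`ws://${host}:${port}`];

  const urls = [`wss://${host}:${port}`];
  // Offer the unencrypted URL only when the fallback is explicitly allowed.
  if (fallback) urls.push(`ws://${host}:${port}`);
  return urls;
}
```

With the example values from the hunk (`RVS_HOST=rvs.example.de`, `RVS_PORT=443`, `RVS_TLS=true`, `RVS_TLS_FALLBACK=true`) the sketch yields `wss://rvs.example.de:443` followed by `ws://rvs.example.de:443`. Since the second URL is unencrypted, setups outside development may prefer `RVS_TLS_FALLBACK=false`.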
@@ -103,16 +103,31 @@ cd ~/ARIA-AGENT
 cp .env.example .env
 ```

-`.env` Datei editieren:
+`.env` Datei editieren (Details siehe `.env.example`):
 ```bash
+# Gateway-Auth: Alle Services die mit aria-core reden brauchen diesen Token
+# Diagnostic, Bridge, App nutzen ihn fuer den WebSocket-Handshake
 ARIA_AUTH_TOKEN=        # openssl rand -hex 32
+
+# RVS-Verbindung: Hostname + Port deines Rendezvous-Servers
 RVS_HOST=               # z.B. rvs.hackersoft.de
 RVS_PORT=443
 RVS_TLS=true
 RVS_TLS_FALLBACK=true
-RVS_TOKEN=              # wird von generate-token.sh automatisch gesetzt
+
+# Pairing-Token: Verbindet App, Bridge, Diagnostic und XTTS im gleichen RVS-Room
+# MUSS auf allen Geraeten identisch sein (ARIA-VM, Gaming-PC, App)
+# Wird von generate-token.sh automatisch generiert und eingetragen
+RVS_TOKEN=              # ./generate-token.sh
+
+# Optional: SSH-Host des RVS-Servers fuer Auto-Update (z.B. root@aria-rvs)
+RVS_UPDATE_HOST=
 ```

+**Zwei Tokens, zwei Zwecke:**
+- **ARIA_AUTH_TOKEN**: Authentifizierung am OpenClaw Gateway (aria-core). Wer diesen Token hat, kann ARIA Befehle geben.
+- **RVS_TOKEN**: Pairing-Token fuer den Rendezvous-Server. Alle Geraete mit dem gleichen Token landen im gleichen "Room" und koennen kommunizieren. Die App bekommt diesen Token per QR-Code.
+
 ### 2. Claude CLI einloggen (Proxy-Auth)

 Der Proxy-Container nutzt deine Claude Max Subscription. Die Credentials muessen
@@ -79,8 +79,8 @@ android {
        applicationId "com.ariacockpit"
        minSdkVersion rootProject.ext.minSdkVersion
        targetSdkVersion rootProject.ext.targetSdkVersion
-        versionCode 206
-        versionName "0.0.2.6"
+        versionCode 208
+        versionName "0.0.2.8"
        // Fallback fuer Libraries mit Product Flavors
        missingDimensionStrategy 'react-native-camera', 'general'
    }
@@ -1,6 +1,6 @@
 {
  "name": "aria-cockpit",
-  "version": "0.0.2.6",
+  "version": "0.0.2.8",
  "private": true,
  "scripts": {
    "android": "react-native run-android",
@@ -748,7 +748,7 @@ const SettingsScreen: React.FC = () => {
      <Text style={styles.sectionTitle}>{'\u00DC'}ber</Text>
      <View style={styles.card}>
        <Text style={styles.aboutTitle}>ARIA Cockpit</Text>
-        <Text style={styles.aboutVersion}>Version 0.0.2.6 </Text>
+        <Text style={styles.aboutVersion}>Version 0.0.2.8 </Text>
        <Text style={styles.aboutInfo}>
          Stefans Kommandozentrale f{'\u00FC'}r ARIA.{'\n'}
          Gebaut mit React Native + TypeScript.
@@ -58,6 +58,8 @@ class AudioService {
  // Audio-Queue fuer sequentielle TTS-Wiedergabe
  private audioQueue: string[] = [];
  private isPlaying: boolean = false;
+  private preloadedSound: Sound | null = null;
+  private preloadedPath: string = '';

  // VAD State
  private vadEnabled: boolean = false;
@@ -220,35 +222,62 @@ class AudioService {
    }

    this.isPlaying = true;
-    const base64Data = this.audioQueue.shift()!;

-    try {
-      const tmpPath = `${RNFS.CachesDirectoryPath}/aria_tts_${Date.now()}.wav`;
-      await RNFS.writeFile(tmpPath, base64Data, 'base64');
+    // Preloaded Sound verwenden wenn verfuegbar, sonst neu laden
+    let sound: Sound;
+    let soundPath: string;

-      this.currentSound = new Sound(tmpPath, '', (error) => {
-        if (error) {
-          console.error('[Audio] Fehler beim Laden:', error);
-          RNFS.unlink(tmpPath).catch(() => {});
-          this._playNext();
-          return;
-        }
-        this.currentSound?.play((success) => {
-          if (success) {
-            console.log('[Audio] Wiedergabe abgeschlossen');
-          } else {
-            console.warn('[Audio] Wiedergabe fehlgeschlagen');
-          }
-          this.currentSound?.release();
-          this.currentSound = null;
-          RNFS.unlink(tmpPath).catch(() => {});
-          // Naechstes Audio abspielen
-          this._playNext();
+    if (this.preloadedSound) {
+      sound = this.preloadedSound;
+      soundPath = this.preloadedPath;
+      this.preloadedSound = null;
+      this.preloadedPath = '';
+      // Daten aus Queue entfernen (wurde schon preloaded)
+      this.audioQueue.shift();
+    } else {
+      const base64Data = this.audioQueue.shift()!;
+      try {
+        soundPath = `${RNFS.CachesDirectoryPath}/aria_tts_${Date.now()}.wav`;
+        await RNFS.writeFile(soundPath, base64Data, 'base64');
+        sound = await new Promise<Sound>((resolve, reject) => {
+          const s = new Sound(soundPath, '', (err) => err ? reject(err) : resolve(s));
        });
-      });
-    } catch (err) {
-      console.error('[Audio] Wiedergabefehler:', err);
+      } catch (err) {
+        console.error('[Audio] Laden fehlgeschlagen:', err);
+        this._playNext();
+        return;
+      }
+    }
+
+    this.currentSound = sound;
+
+    // Naechstes Audio schon vorbereiten waehrend dieses abspielt
+    this._preloadNext();
+
+    sound.play((success) => {
+      if (!success) console.warn('[Audio] Wiedergabe fehlgeschlagen');
+      sound.release();
+      this.currentSound = null;
+      RNFS.unlink(soundPath).catch(() => {});
      this._playNext();
+    });
+  }
+
+  /** Naechstes Audio im Hintergrund vorladen (verhindert Stottern) */
+  private async _preloadNext(): Promise<void> {
+    if (this.audioQueue.length === 0 || this.preloadedSound) return;
+
+    const base64Data = this.audioQueue[0]; // Nicht shift — bleibt in Queue
+    try {
+      const tmpPath = `${RNFS.CachesDirectoryPath}/aria_tts_pre_${Date.now()}.wav`;
+      await RNFS.writeFile(tmpPath, base64Data, 'base64');
+      this.preloadedSound = await new Promise<Sound>((resolve, reject) => {
+        const s = new Sound(tmpPath, '', (err) => err ? reject(err) : resolve(s));
+      });
+      this.preloadedPath = tmpPath;
+    } catch {
+      this.preloadedSound = null;
+      this.preloadedPath = '';
    }
  }

@@ -261,6 +290,12 @@ class AudioService {
      this.currentSound.release();
      this.currentSound = null;
    }
+    if (this.preloadedSound) {
+      this.preloadedSound.release();
+      this.preloadedSound = null;
+      if (this.preloadedPath) RNFS.unlink(this.preloadedPath).catch(() => {});
+      this.preloadedPath = '';
+    }
  }

  // --- Status & Callbacks ---
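
In the hunk above, `_preloadNext()` is started without being awaited, so it could still be running when a short chunk finishes playing. In that case `_playNext` would take the non-preloaded branch, and a preloaded sound arriving later would no longer correspond to the head of the queue. The sketch below shows one possible way to tie a preloaded resource to the queue entry it was built from. `PreloadQueue`, `Loaded`, and the injected `load` and `dispose` functions are hypothetical stand-ins for `Sound` and `RNFS.writeFile`, not the project's API.

```typescript
// Sketch only: a queue that preloads its head item and reuses the result
// only when it still matches the item being taken.
interface Loaded<T> {
  item: string;
  resource: T;
}

class PreloadQueue<T> {
  private queue: string[] = [];
  private preloaded: Loaded<T> | null = null;
  private preloading: Promise<void> | null = null;

  constructor(
    private load: (item: string) => Promise<T>,
    private dispose: (resource: T) => void,
  ) {}

  enqueue(item: string): void {
    this.queue.push(item);
  }

  /** Take the next item, reusing the preloaded resource only if it matches. */
  async next(): Promise<T | null> {
    // Wait for an in-flight preload so it cannot land after the shift below.
    if (this.preloading) await this.preloading;

    const item = this.queue.shift();
    if (item === undefined) return null;

    const pre = this.preloaded;
    this.preloaded = null;
    if (pre && pre.item === item) return pre.resource;
    if (pre) this.dispose(pre.resource); // stale preload: release it
    return this.load(item);
  }

  /** Start loading the head of the queue without removing it. */
  preloadNext(): void {
    if (this.preloading || this.preloaded || this.queue.length === 0) return;
    const item = this.queue[0];
    this.preloading = this.load(item).then(
      (resource) => {
        this.preloaded = { item, resource };
        this.preloading = null;
      },
      () => {
        // Preload failed: next() will load the item on demand.
        this.preloading = null;
      },
    );
  }
}
```

In this sketch each item is loaded once, whether `preloadNext()` completes before or after the previous playback ends. Waiting for the in-flight preload in `next()` trades a short delay for that guarantee.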
@@ -18,7 +18,7 @@ services:
      claude-max-api"
    volumes:
      - ~/.claude:/root/.claude                      # Claude CLI Auth (Credentials in /root/.claude/.credentials.json)
-      - ./aria-data/ssh:/root/.ssh:ro               # SSH Keys fuer VM-Zugriff (aria-wohnung)
+      - ./aria-data/ssh:/root/.ssh                    # SSH Keys fuer VM-Zugriff (aria-wohnung, rw fuer ARIA)
      - aria-shared:/shared                          # Shared Volume fuer Datei-Austausch (Uploads von App)
    environment:
      - HOST=0.0.0.0
@@ -18,19 +18,35 @@
 - [x] RVS Nachrichten vom Smartphone gehen durch
 - [x] Stimmen-Einstellungen (Ramona/Thorsten, Speed pro Stimme)
 - [x] Highlight-Trigger konfigurierbar in Diagnostic
+- [x] XTTS v2 Integration (Gaming-PC, GPU, Voice Cloning)
+- [x] XTTS Voice Cloning (Audio-Samples hochladen, eigene Stimme)
+- [x] TTS Engine waehlbar (Piper/XTTS) in Diagnostic + App
+- [x] Auto-Update System (APK via RVS WebSocket)
+- [x] Audio-Queue (sequentielle Wiedergabe, kein Ueberlappen)

 ## Offen

-### TTS / Stimmen
-- [ ] TTS Engine waehlbar: Piper (CPU, schnell) oder Coqui XTTS v2 (GPU, natuerlicher)
-- [ ] Piper Voices Download ueber Diagnostic (neue Sprachen/Stimmen)
-- [ ] Coqui XTTS v2 Integration (braucht GPU, bessere deutsche Stimme)
+### Bugs (Prioritaet)
+- [ ] Session-Persistenz: Bei Container-Restart wird immer aria-bridge geladen statt die zuletzt gewaehlte Session. Wird nicht persistent gespeichert.
+- [ ] App: Textnachrichten, Bilder und Anhaenge werden von ARIA nicht beantwortet — nur Sprachnachrichten funktionieren.
+- [ ] App: Audioausgabe hoert ab und zu einfach auf (mitten im Satz oder zwischen Chunks)
+- [ ] Auto-Update: release.sh kopiert APK nicht auf den RVS-Server (rvs/updates/ bleibt leer)
+- [ ] App: Kein Auto-Scroll zur letzten Nachricht beim App-Start (soll direkt springen, nicht animiert scrollen)
+- [ ] App: Bei neuen Nachrichten soll automatisch zur letzten Nachricht gescrollt werden

-### App
+### App Features
+- [ ] App: Zu Anhaengen noch Text/Sprache hinzufuegen koennen (z.B. Bild senden + "Was siehst du?")
 - [ ] Wake Word on-device (Porcupine "ARIA" Keyword, Phase 2)
 - [ ] Chat-History zuverlaessiger laden (AsyncStorage Race Condition)
+- [ ] Background Audio Service (TTS auch bei minimierter App)
+
+### TTS / Audio
+- [ ] XTTS Audio-Streaming verbessern (minimales Stottern bei Chunk-Uebergaengen)
+- [ ] Audio-Normalisierung (Lautstaerke zwischen Chunks angleichen)
+- [ ] Piper Voices Download ueber Diagnostic (neue Sprachen/Stimmen)

 ### Architektur
 - [ ] Bilder: Claude Vision direkt nutzen (aktuell nur Dateipfad an ARIA)
 - [ ] Auto-Compacting und Memory/Brain Verwaltung (SQLite?)
 - [ ] Diagnostic: System-Info Tab (Container-Status, Disk, RAM, CPU)
+- [ ] RVS Zombie-Connections endgueltig loesen (WebRTC statt WebSocket?)
@@ -100,44 +100,66 @@ async function handleTTSRequest(payload) {
  // Markdown entfernen
  const cleanText = text.replace(/\*\*([^*]+)\*\*/g, "$1").trim();

-  // Text in Saetze aufteilen (sequentiell rendern fuer korrekte Reihenfolge)
-  const sentences = cleanText.split(/(?<=[.!?])\s+/).map(s => s.trim()).filter(s => s.length > 0);
-  if (sentences.length === 0) return;
+  // Text in Saetze aufteilen, dann zu Chunks von 2-3 Saetzen zusammenfassen
+  // (mehr Kontext = konsistentere Stimme/Lautstaerke, aber nicht zu lang fuer WebSocket)
+  const sentences = cleanText.split(/(?<=[.!?])\s+/)
+    .map(s => s.trim())
+    .filter(s => s.length > 0)
+    .map(s => s.replace(/[.]+$/, '')); // Punkt am Ende entfernen

-  log(`TTS-Request: "${cleanText.slice(0, 60)}..." (${sentences.length} Saetze, voice: ${voice || "default"}, lang: ${language || "de"})`);
+  const MAX_CHUNK_CHARS = 150; // Max ~150 Zeichen pro Chunk (schnelles Rendering, Preloading reicht)
+  const chunks = [];
+  let currentChunk = '';
+  for (const sentence of sentences) {
+    if (currentChunk && (currentChunk.length + sentence.length + 2) > MAX_CHUNK_CHARS) {
+      chunks.push(currentChunk);
+      currentChunk = sentence;
+    } else {
+      currentChunk = currentChunk ? currentChunk + ', ' + sentence : sentence;
+    }
+  }
+  if (currentChunk) chunks.push(currentChunk);
+  if (chunks.length === 0) return;
+
+  log(`TTS-Request: "${cleanText.slice(0, 60)}..." (${sentences.length} Saetze → ${chunks.length} Chunks, voice: ${voice || "default"}, lang: ${language || "de"})`);

  try {
    const voiceSample = voice ? path.join(VOICES_DIR, `${voice}.wav`) : null;
    const hasCustomVoice = voiceSample && fs.existsSync(voiceSample);

-    // Jeden Satz sequentiell rendern und sofort senden
-    for (let i = 0; i < sentences.length; i++) {
-      const sentence = sentences[i];
+    // Streaming: Chunk rendern → sofort senden → naechster Chunk
+    // App spielt mit Preloading-Queue nahtlos ab
+    let sentCount = 0;
+
+    for (let i = 0; i < chunks.length; i++) {
+      const chunk = chunks[i];
      try {
-        const audioBuffer = await callXTTSAPI(sentence, language || "de", hasCustomVoice ? voiceSample : null);
+        const audioBuffer = await callXTTSAPI(chunk, language || "de", hasCustomVoice ? voiceSample : null);

        if (audioBuffer && audioBuffer.length > 100) {
-          const base64 = audioBuffer.toString("base64");
-          log(`TTS [${i + 1}/${sentences.length}]: ${audioBuffer.length} bytes (${(audioBuffer.length / 1024).toFixed(0)}KB) — "${sentence.slice(0, 40)}..."`);
+          log(`TTS [${i + 1}/${chunks.length}]: ${(audioBuffer.length / 1024).toFixed(0)}KB — "${chunk.slice(0, 50)}"`);

          sendToRVS({
            type: "xtts_response",
            payload: {
              requestId: `${requestId || ""}_${i}`,
-              base64,
+              base64: audioBuffer.toString("base64"),
              mimeType: "audio/wav",
              voice: voice || "default",
              engine: "xtts",
+              part: i + 1,
+              totalParts: chunks.length,
            },
            timestamp: Date.now(),
          });
+          sentCount++;
        }
-      } catch (sentenceErr) {
-        log(`TTS [${i + 1}/${sentences.length}] Fehler: ${sentenceErr.message} — ueberspringe`);
+      } catch (chunkErr) {
+        log(`TTS [${i + 1}/${chunks.length}] Fehler: ${chunkErr.message} — ueberspringe`);
      }
    }

-    log(`TTS komplett: ${sentences.length} Saetze gerendert`);
+    log(`TTS komplett: ${sentCount}/${chunks.length} Chunks gestreamt`);
  } catch (err) {
    log(`TTS Fehler: ${err.message}`);
    sendToRVS({
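
The grouping step in `handleTTSRequest` above (split into sentences, strip trailing dots, join with commas up to `MAX_CHUNK_CHARS`) can be sketched as a standalone function for easier testing. The name `groupSentences` and the `maxChars` parameter are assumptions made for this illustration. Unlike the hunk, the sketch filters empty strings after the dots are removed.

```typescript
// Sketch of the sentence-grouping logic as a pure function.
function groupSentences(text: string, maxChars = 150): string[] {
  const sentences = text
    .split(/(?<=[.!?])\s+/) // split after sentence-ending punctuation
    .map((s) => s.trim().replace(/[.]+$/, '')) // drop trailing dots
    .filter((s) => s.length > 0);

  const chunks: string[] = [];
  let current = '';
  for (const sentence of sentences) {
    // The "+ 2" accounts for the ", " separator between sentences.
    if (current && current.length + sentence.length + 2 > maxChars) {
      chunks.push(current);
      current = sentence;
    } else {
      current = current ? `${current}, ${sentence}` : sentence;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

For example, `groupSentences('Hallo Welt. Wie geht es dir? Gut.')` returns the single chunk `'Hallo Welt, Wie geht es dir?, Gut'`, while a limit of 20 characters yields three chunks. A single sentence longer than the limit is not split further, so it still becomes one oversized chunk.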
Author	SHA1	Message	Date
duffyduck	054e4057d8	release: bump version to 0.0.2.8	2026-04-10 08:49:47 +02:00
duffyduck	3943e79bb1	docs: document .env.example with detailed comments, explain both tokens in README - ARIA_AUTH_TOKEN: Gateway auth (who can talk to ARIA) - RVS_TOKEN: Pairing token (same room in RVS relay) - RVS_UPDATE_HOST: SSH target for auto-update APK copy - All variables with German comments and examples Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 08:45:26 +02:00
duffyduck	87f4317c15	docs: add auto-update APK not reaching RVS bug to issue.md Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 08:39:59 +02:00
duffyduck	50aa793910	fix: Proxy SSH volume read-write (ARIA can manage keys without -F workaround) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 08:34:35 +02:00
duffyduck	5efc9865a8	docs: add 6 new bugs/features to issue.md - Session persistence on container restart - App: text/image/attachment messages not working (only voice) - App: audio stops randomly - App: auto-scroll to last message on start + new messages - App: add text/voice to attachments - Prioritized bugs section Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 08:31:49 +02:00
duffyduck	949c573c49	fix: XTTS chunk size 150 chars (faster render, preload overlaps playback) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 02:52:56 +02:00
duffyduck	f7f450a09d	fix: XTTS streaming mode - send each chunk immediately, comma between sentences - Back to streaming: render chunk → send immediately → next chunk - App plays with preloading queue (no waiting for all chunks) - Comma instead of dot between sentences in chunk (no "Punkt" read aloud) - Sentence-ending dots already removed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 02:48:50 +02:00
duffyduck	81f7c38383	fix: XTTS splits concatenated audio into ~8s parts (seamless with preload) - All chunks rendered and PCM concatenated (consistent voice) - Split into ~8 second WAV parts (not per-sentence) - 8s is long enough for preload overlap, small enough for WebSocket - Parts include part/totalParts metadata Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 02:41:14 +02:00
duffyduck	2c785cb37a	feat: XTTS concatenates chunks into seamless WAV (no stuttering) - All chunks rendered sequentially, PCM data concatenated - Single WAV with proper header sent back (no queue needed in app) - If total > 800KB, split into parts (WebSocket limit) - Eliminates stuttering between sentences Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 02:40:16 +02:00
duffyduck	57e65b061c	docs: update issue.md with XTTS streaming as next priority Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 02:38:21 +02:00
duffyduck	aa54765b03	release: bump version to 0.0.2.7	2026-04-10 02:24:58 +02:00
duffyduck	8929bc99bb	fix: XTTS groups sentences into ~250 char chunks for consistent voice quality - 2-3 sentences per chunk (more context = stable voice/volume) - Max 250 chars per chunk (keeps WebSocket packets manageable) - Dots re-added between sentences within a chunk (natural pauses) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 02:23:29 +02:00
duffyduck	0428c06612	fix: Audio preloading to prevent stuttering, remove trailing dots for XTTS - Preload next audio while current plays (eliminates gap between sentences) - Remove trailing dots from sentences (XTTS reads them aloud) - stopPlayback cleans up preloaded audio Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 02:21:19 +02:00