release: bump version to 0.0.3.8

feat: Whisper model selector + 16kHz mono recording
- App: AudioSamplingRateAndroid 16000 + AudioChannelsAndroid 1 → Whisper receives its target format directly, no more resampling
- Bridge: STTEngine.reload() reloads the model at runtime (tiny/base/small/medium/large-v3)
- Bridge: config message triggers a hot reload when whisperModel changes
- Bridge: default is now 'medium' (better than 'small' at similar latency)
- Diagnostic: new section "Whisper (Spracherkennung)" with a dropdown, auto-save on selection; the saved value is restored on load
- Diagnostic/Server: send_voice_config merges whisperModel into voice_config.json
- aria.env.example: WHISPER_MODEL + WHISPER_LANGUAGE documented

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
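The hot-reload behaviour described in the message can be sketched in isolation. This is a minimal illustration, not the Bridge code itself: `ReloadableEngine`, `apply_config` and the injected `loader` callable are hypothetical names that stand in for `STTEngine` and the real model constructor, so the sketch runs without faster-whisper installed.

```python
from typing import Callable, Optional

# Model names accepted by the reload guard (taken from the commit message).
ALLOWED_MODELS = {"tiny", "base", "small", "medium", "large-v3"}


class ReloadableEngine:
    """Hypothetical stand-in for an STT engine; `loader` replaces the model constructor."""

    def __init__(self, model_size: str, loader: Callable[[str], object]) -> None:
        self.model_size = model_size
        self._loader = loader
        self.model: Optional[object] = loader(model_size)

    def reload(self, model_size: str) -> bool:
        """Return True only if a different, valid model was actually loaded."""
        if model_size == self.model_size and self.model is not None:
            return False  # same model already loaded, nothing to do
        if model_size not in ALLOWED_MODELS:
            return False  # unknown name: keep the current model
        self.model_size = model_size
        self.model = None  # drop the old model before loading the new one
        try:
            self.model = self._loader(model_size)
            return True
        except Exception:
            return False  # load failed; model stays None, as in the diff


def apply_config(engine: ReloadableEngine, payload: dict) -> bool:
    """Trigger a reload only when the payload names a new model."""
    new_model = payload.get("whisperModel")
    if new_model and new_model != engine.model_size:
        return engine.reload(new_model)
    return False
```

In the Bridge the blocking load runs in an executor so the event loop stays responsive. The sketch leaves that out and only shows the decision logic: unchanged or invalid names are ignored, and only a successful load counts as a change.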
2026-04-18 11:41:12 +02:00 · 2026-04-18 11:37:27 +02:00 · 2026-04-18 11:22:02 +02:00 · 2026-04-18 11:14:15 +02:00 · 2026-04-18 11:11:12 +02:00 · 2026-04-18 11:03:26 +02:00
12 changed files with 422 additions and 78 deletions
@@ -306,7 +306,8 @@ aria-core → Antwort → Gateway → Diagnostic → RVS → App
 ### Features

 - **STT**: faster-whisper (local, offline, 16kHz mono)
-- **TTS**: Piper (Ramona + Thorsten, offline)
+- **TTS**: Piper (Ramona + Thorsten, offline) or XTTS v2 (remote, GPU, voice cloning)
+- **Markdown cleanup**: Strips **bold**, *italic*, `code`, links, lists etc. before TTS (natural speech)
 - **Wake word**: openwakeword (local microphone on the VM)
 - **App audio**: Base64 audio from app → FFmpeg → Whisper STT → text to aria-core
 - **Modes**: Normal, Do Not Disturb, Whisper, Hangar, Gaming
@@ -367,15 +368,17 @@ API-Endpoint fuer andere Services: `GET http://localhost:3001/api/session`

 - Text chat with ARIA
 - **Voice recording**: Push-to-talk (hold) or tap-to-talk (tap, auto-stop on silence)
+- **Conversation mode** (ear button): Recording starts automatically after every ARIA reply, like a natural back-and-forth conversation without pressing buttons
 - **VAD (Voice Activity Detection)**: Detects 1.8s of silence and stops automatically
 - **STT (Speech-to-Text)**: Audio is transcribed by Whisper in the Bridge; the transcribed text appears in the chat
-- **TTS playback**: ARIA answers through the speaker (Piper or XTTS v2)
+- **TTS playback**: ARIA answers through the speaker (Piper or XTTS v2), audio queue with preloading
 - **Play button**: Any ARIA message can be read aloud again
 - **Chat search**: Magnifier in the status bar filters messages live
-- **File and image upload**: Images inline in the chat (tap for fullscreen), files with icon + name + size
+- **Multiple attachments**: Collect images + files, add text, then send them together
+- **Paste support**: Paste images from the clipboard (Diagnostic)
 - **Attachments**: Bridge stores them in the shared volume, ARIA can access them, re-download via RVS
 - **Settings**: TTS engine, voices, speed per voice, storage location, auto-download, GPS
-- **Auto-update**: Checks for a new version on startup, download + installation via RVS
+- **Auto-update**: Checks for a new version on startup + via button, download + installation via RVS (FileProvider)
 - GPS position (optional)
 - QR code scanner for token pairing

@@ -709,6 +712,11 @@ docker exec aria-core ssh aria-wohnung hostname
 - [x] Auto-update system (APK via RVS)
 - [x] Chat search, play button, cancel button
 - [x] XTTS v2 integration (GPU, voice cloning, remote via RVS)
+- [x] Conversation mode (ear button, automatic recording after ARIA reply)
+- [x] Multiple attachments + text before sending + paste support
+- [x] Markdown cleanup for TTS
+- [x] Auto-update with FileProvider + update check button
+- [x] Inverted FlatList (reliable scroll-to-bottom)

 ### Phase 2 — ARIA wird produktiv

@@ -79,8 +79,8 @@ android {
        applicationId "com.ariacockpit"
        minSdkVersion rootProject.ext.minSdkVersion
        targetSdkVersion rootProject.ext.targetSdkVersion
-        versionCode 304
-        versionName "0.0.3.4"
+        versionCode 308
+        versionName "0.0.3.8"
        // Fallback fuer Libraries mit Product Flavors
        missingDimensionStrategy 'react-native-camera', 'general'
    }
@@ -1,6 +1,6 @@
 {
  "name": "aria-cockpit",
-  "version": "0.0.3.4",
+  "version": "0.0.3.8",
  "private": true,
  "scripts": {
    "android": "react-native run-android",
@@ -5,7 +5,7 @@
 * Datei- und Kamera-Upload.
 */

-import React, { useState, useEffect, useRef, useCallback } from 'react';
+import React, { useState, useEffect, useRef, useCallback, useMemo } from 'react';
 import {
  View,
  Text,
@@ -96,6 +96,7 @@ const ChatScreen: React.FC = () => {
  const [searchQuery, setSearchQuery] = useState('');
  const [searchVisible, setSearchVisible] = useState(false);
  const [pendingAttachments, setPendingAttachments] = useState<{file: any, isPhoto: boolean}[]>([]);
+  const [agentActivity, setAgentActivity] = useState<{activity: string, tool: string}>({activity: 'idle', tool: ''});

  const flatListRef = useRef<FlatList>(null);
  const messageIdCounter = useRef(0);
@@ -250,6 +251,13 @@ const ChatScreen: React.FC = () => {
      if (message.type === 'audio' && message.payload.base64) {
        audioService.playAudio(message.payload.base64 as string);
      }
+
+      // Thinking-Indicator Status von der Bridge
+      if (message.type === 'agent_activity') {
+        const activity = (message.payload.activity as string) || 'idle';
+        const tool = (message.payload.tool as string) || '';
+        setAgentActivity({ activity, tool });
+      }
    });

    const unsubState = rvs.onStateChange((state) => {
@@ -369,33 +377,8 @@ const ChatScreen: React.FC = () => {
    return () => { if (saveTimer.current) clearTimeout(saveTimer.current); };
  }, [messages]);

-  // Auto-Scroll: immer zur letzten Nachricht springen
-  const shouldAutoScroll = useRef(true);
-  const lastMessageCount = useRef(0);
-
-  // Bei neuen Nachrichten oder App-Start: nach unten springen
-  useEffect(() => {
-    if (messages.length > 0 && shouldAutoScroll.current) {
-      const isInitial = lastMessageCount.current === 0;
-      // Mehrfach versuchen (FlatList braucht Zeit zum Rendern)
-      const delays = isInitial ? [100, 300, 600, 1000] : [100, 300];
-      for (const delay of delays) {
-        setTimeout(() => {
-          flatListRef.current?.scrollToEnd({ animated: !isInitial });
-        }, delay);
-      }
-    }
-    lastMessageCount.current = messages.length;
-  }, [messages]);
-
-  const handleScrollBeginDrag = useCallback(() => {
-    shouldAutoScroll.current = false;
-  }, []);
-  const handleScrollEndDrag = useCallback((e: any) => {
-    const { contentOffset, contentSize, layoutMeasurement } = e.nativeEvent;
-    const isAtBottom = contentOffset.y + layoutMeasurement.height >= contentSize.height - 50;
-    shouldAutoScroll.current = isAtBottom;
-  }, []);
+  // Inverted FlatList: neueste Nachrichten unten, kein manuelles Scrollen noetig
+  const invertedMessages = useMemo(() => [...messages].reverse(), [messages]);

  // GPS-Position holen (optional)
  const getCurrentLocation = useCallback((): Promise<{ lat: number; lon: number } | null> => {
@@ -449,6 +432,12 @@ const ChatScreen: React.FC = () => {
    });
  }, [inputText, getCurrentLocation, pendingAttachments, sendPendingAttachments]);

+  // Anfrage abbrechen — sofort lokalen Indicator weg, Bridge triggert doctor --fix
+  const cancelRequest = useCallback(() => {
+    setAgentActivity({ activity: 'idle', tool: '' });
+    rvs.send('cancel_request' as any, {});
+  }, []);
+
  // Sprachaufnahme abgeschlossen
  const handleVoiceRecording = useCallback(async (result: RecordingResult) => {
    const location = await getCurrentLocation();
@@ -684,13 +673,12 @@ const ChatScreen: React.FC = () => {
      {/* Nachrichtenliste */}
      <FlatList
        ref={flatListRef}
-        data={searchQuery ? messages.filter(m => m.text.toLowerCase().includes(searchQuery.toLowerCase())) : messages}
+        inverted
+        data={searchQuery ? messages.filter(m => m.text.toLowerCase().includes(searchQuery.toLowerCase())).reverse() : invertedMessages}
        keyExtractor={item => item.id}
        renderItem={renderMessage}
        contentContainerStyle={styles.messageList}
        showsVerticalScrollIndicator={false}
-        onScrollBeginDrag={handleScrollBeginDrag}
-        onScrollEndDrag={handleScrollEndDrag}
        ListEmptyComponent={
          <View style={styles.emptyContainer}>
            <Text style={styles.emptyIcon}>{'\uD83E\uDD16'}</Text>
@@ -700,6 +688,22 @@ const ChatScreen: React.FC = () => {
        }
      />

+      {/* Thinking-Indicator */}
+      {agentActivity.activity !== 'idle' && (
+        <View style={styles.thinkingBar}>
+          <Text style={styles.thinkingText}>
+            {agentActivity.activity === 'tool' && agentActivity.tool
+              ? `\uD83D\uDD27 ${agentActivity.tool}`
+              : agentActivity.activity === 'assistant'
+              ? '\u270D\uFE0F ARIA schreibt...'
+              : '\uD83D\uDCAD ARIA denkt...'}
+          </Text>
+          <TouchableOpacity style={styles.thinkingCancel} onPress={cancelRequest}>
+            <Text style={styles.thinkingCancelText}>Abbrechen</Text>
+          </TouchableOpacity>
+        </View>
+      )}
+
      {/* Pending Anhaenge Vorschau */}
      {pendingAttachments.length > 0 && (
        <View style={styles.pendingBar}>
@@ -996,6 +1000,33 @@ const styles = StyleSheet.create({
  wakeWordIcon: {
    fontSize: 16,
  },
+  thinkingBar: {
+    flexDirection: 'row',
+    alignItems: 'center',
+    justifyContent: 'space-between',
+    backgroundColor: '#1E1E2E',
+    paddingHorizontal: 12,
+    paddingVertical: 6,
+    borderTopWidth: 1,
+    borderTopColor: '#2A2A3E',
+  },
+  thinkingText: {
+    color: '#FFD60A',
+    fontSize: 12,
+    flex: 1,
+  },
+  thinkingCancel: {
+    paddingHorizontal: 10,
+    paddingVertical: 4,
+    borderWidth: 1,
+    borderColor: '#FF3B30',
+    borderRadius: 4,
+  },
+  thinkingCancelText: {
+    color: '#FF3B30',
+    fontSize: 11,
+    fontWeight: 'bold',
+  },
  pendingBar: {
    flexDirection: 'row',
    alignItems: 'center',
@@ -42,6 +42,8 @@ const AUDIO_ENCODING = 'audio/wav';
 // VAD (Voice Activity Detection) — Stille-Erkennung
 const VAD_SILENCE_THRESHOLD_DB = -45;  // dB unter dem als "Stille" gilt
 const VAD_SILENCE_DURATION_MS = 1800;  // ms Stille bevor Auto-Stop
+const VAD_SPEECH_THRESHOLD_DB = -35;   // dB ueber dem als "Sprache" gilt (Sprach-Gate)
+const VAD_SPEECH_MIN_MS = 300;         // ms Sprache bevor Aufnahme zaehlt

 // --- Audio-Service ---

@@ -61,6 +63,10 @@ class AudioService {
  private preloadedSound: Sound | null = null;
  private preloadedPath: string = '';

+  // Sprach-Gate: Aufnahme erst senden wenn tatsaechlich gesprochen wurde
+  private speechDetected: boolean = false;
+  private speechStartTime: number = 0;
+
  // VAD State
  private vadEnabled: boolean = false;
  private lastSpeechTime: number = 0;
@@ -121,6 +127,8 @@ class AudioService {
        AudioEncoderAndroid: AudioEncoderAndroidType.AAC,
        AudioSourceAndroid: AudioSourceAndroidType.MIC,
        OutputFormatAndroid: OutputFormatAndroidType.MPEG_4,
+        AudioSamplingRateAndroid: 16000,
+        AudioChannelsAndroid: 1,
      }, true); // meteringEnabled = true

      // Metering-Callback
@@ -128,7 +136,21 @@ class AudioService {
        const db = e.currentMetering ?? -160;
        this.meterListeners.forEach(cb => cb(db));

-        // VAD: Stille erkennen
+        // Sprach-Gate: Erkennen ob tatsaechlich gesprochen wird
+        if (db > VAD_SPEECH_THRESHOLD_DB) {
+          if (!this.speechDetected && this.speechStartTime === 0) {
+            this.speechStartTime = Date.now();
+          }
+          if (this.speechStartTime > 0 && Date.now() - this.speechStartTime >= VAD_SPEECH_MIN_MS) {
+            this.speechDetected = true;
+          }
+        } else {
+          if (!this.speechDetected) {
+            this.speechStartTime = 0; // Reset wenn noch nicht als Sprache erkannt
+          }
+        }
+
+        // VAD: Stille erkennen (nur wenn Sprache erkannt wurde)
        if (this.vadEnabled) {
          if (db > VAD_SILENCE_THRESHOLD_DB) {
            this.lastSpeechTime = Date.now();
@@ -138,6 +160,8 @@ class AudioService {

      this.recordingStartTime = Date.now();
      this.lastSpeechTime = Date.now();
+      this.speechDetected = false;
+      this.speechStartTime = 0;
      this.setState('recording');

      // VAD aktivieren
@@ -180,6 +204,15 @@ class AudioService {
      this.recorder.removeRecordBackListener();

      const durationMs = Date.now() - this.recordingStartTime;
+      const hadSpeech = this.speechDetected;
+
+      // Sprach-Gate: Wenn keine Sprache erkannt → Aufnahme verwerfen
+      if (!hadSpeech) {
+        RNFS.unlink(this.recordingPath).catch(() => {});
+        this.setState('idle');
+        console.log('[Audio] Aufnahme verworfen — keine Sprache erkannt (nur Umgebungsgeraeusche)');
+        return null;
+      }

      // Audio-Datei als Base64 lesen
      const base64Data = await RNFS.readFile(this.recordingPath, 'base64');
@@ -188,7 +221,7 @@ class AudioService {
      RNFS.unlink(this.recordingPath).catch(() => {});

      this.setState('idle');
-      console.log(`[Audio] Aufnahme beendet (${durationMs}ms, ${Math.round(base64Data.length / 1024)}KB)`);
+      console.log(`[Audio] Aufnahme beendet (${durationMs}ms, ${Math.round(base64Data.length / 1024)}KB, Sprache erkannt)`);

      return {
        base64: base64Data,
@@ -21,8 +21,14 @@ class WakeWordService {
  /** Gespraechsmodus starten */
  async start(): Promise<boolean> {
    if (this.state === 'listening') return true;
-    console.log('[WakeWord] Gespraechsmodus aktiviert — Aufnahme startet nach ARIA-Antwort');
+    console.log('[WakeWord] Gespraechsmodus aktiviert — starte sofort Aufnahme');
    this.setState('listening');
+    // Sofort erste Aufnahme starten
+    setTimeout(() => {
+      if (this.state === 'listening') {
+        this.wakeCallbacks.forEach(cb => cb());
+      }
+    }, 500);
    return true;
  }

@@ -9,3 +9,10 @@ PIPER_THORSTEN=/voices/de_DE-thorsten-high.onnx

 # Wake-Word
 WAKE_WORD=aria
+
+# Whisper STT: switched at runtime in the Diagnostic (section "Whisper")
+# and stored in /shared/config/voice_config.json. The value here is only the
+# initial default on first start.
+# Options: tiny | base | small | medium | large-v3
+WHISPER_MODEL=medium
+WHISPER_LANGUAGE=de
@@ -63,7 +63,7 @@ RVS_TLS = os.getenv("RVS_TLS", "true")       # true = wss://, false = ws://
 RVS_TLS_FALLBACK = os.getenv("RVS_TLS_FALLBACK", "true")  # Bei TLS-Fehler ws:// versuchen
 RVS_TOKEN = os.getenv("RVS_TOKEN", "")       # Pairing-Token (gleich wie in der App)
 DIAGNOSTIC_URL = os.getenv("DIAGNOSTIC_URL", "http://127.0.0.1:3001")  # Diagnostic API
-WHISPER_MODEL = os.getenv("WHISPER_MODEL", "small")
+WHISPER_MODEL = os.getenv("WHISPER_MODEL", "medium")
 WHISPER_LANGUAGE = os.getenv("WHISPER_LANGUAGE", "de")

 # Audio-Parameter
@@ -330,6 +330,25 @@ class STTEngine:
        self.model = WhisperModel(self.model_size, device="cpu", compute_type="int8")
        logger.info("Whisper-Modell geladen")

+    def reload(self, model_size: str) -> bool:
+        """Laedt ein anderes Whisper-Modell (bei Config-Aenderung)."""
+        if model_size == self.model_size and self.model is not None:
+            return False
+        allowed = {"tiny", "base", "small", "medium", "large-v3"}
+        if model_size not in allowed:
+            logger.warning("Ungueltiges Whisper-Modell: %s (erlaubt: %s)", model_size, allowed)
+            return False
+        logger.info("Lade Whisper-Modell neu: %s -> %s", self.model_size, model_size)
+        self.model_size = model_size
+        self.model = None
+        try:
+            self.model = WhisperModel(model_size, device="cpu", compute_type="int8")
+            logger.info("Whisper-Modell '%s' geladen", model_size)
+            return True
+        except Exception:
+            logger.exception("Whisper-Modell '%s' konnte nicht geladen werden", model_size)
+            return False
+
    def transcribe(self, audio_data: np.ndarray) -> str:
        """Transkribiert Audio-Daten zu Text.

@@ -502,6 +521,7 @@ class ARIABridge:
        # Komponenten
        self.voice_engine = VoiceEngine(VOICES_DIR)
        self.tts_enabled = True
+        vc: dict = {}
        # Gespeicherte Voice-Config laden
        try:
            vc_path = "/shared/config/voice_config.json"
@@ -520,8 +540,10 @@ class ARIABridge:
                logger.info("Voice-Config geladen: %s", vc)
        except Exception as e:
            logger.warning("Voice-Config laden fehlgeschlagen: %s", e)
+        # Whisper-Modell: Config hat Vorrang, dann env/Default (medium)
+        whisper_model = vc.get("whisperModel") or self.config.get("WHISPER_MODEL", WHISPER_MODEL)
        self.stt_engine = STTEngine(
-            model_size=self.config.get("WHISPER_MODEL", WHISPER_MODEL),
+            model_size=whisper_model,
            language=self.config.get("WHISPER_LANGUAGE", WHISPER_LANGUAGE),
        )
        self.wake_word = WakeWordDetector()
@@ -530,6 +552,9 @@ class ARIABridge:
        self.ws_core: Optional[websockets.WebSocketClientProtocol] = None
        self.ws_rvs: Optional[websockets.WebSocketClientProtocol] = None

+        # Letzter gesendeter agent_activity-State (zum Entduplizieren)
+        self._last_activity_state: Optional[tuple] = None
+
    def initialize(self) -> None:
        """Initialisiert alle Komponenten.

@@ -734,8 +759,18 @@ class ARIABridge:
        if event_name == "agent":
            data = payload.get("data", {})
            delta = data.get("delta", "")
-            if delta and payload.get("stream") == "assistant":
+            stream = payload.get("stream", "")
+            if delta and stream == "assistant":
                logger.debug("[core] Delta: '%s'", delta[:40])
+            # Activity-Signal zur App (entdupliziert)
+            tool_name = data.get("name") or data.get("tool") or payload.get("tool") or ""
+            if stream == "tool_use" or data.get("type") == "tool_use":
+                activity = "tool"
+            elif stream == "assistant":
+                activity = "assistant"
+            else:
+                activity = "thinking"
+            await self._emit_activity(activity, tool_name)
            return

        # ── chat Events: Snapshots mit state=delta|final|error ──
@@ -744,6 +779,7 @@ class ARIABridge:

            if state == "final":
                text = self._extract_chat_text(payload)
+                await self._emit_activity("idle", "")
                if not text:
                    logger.warning("[core] chat final ohne Text: %s", json.dumps(payload)[:200])
                    return
@@ -754,6 +790,7 @@ class ARIABridge:
            if state == "error":
                error = payload.get("error", "Unbekannt")
                logger.error("[core] Chat-Fehler: %s", error)
+                await self._emit_activity("idle", "")
                await self._send_to_rvs({
                    "type": "chat",
                    "payload": {
@@ -1063,6 +1100,12 @@ class ARIABridge:
                await self.send_to_core(text, source="app")
            return

+        if msg_type == "cancel_request":
+            logger.info("[rvs] Cancel-Request von App — rufe Diagnostic /api/cancel auf")
+            await self._cancel_via_diagnostic()
+            await self._emit_activity("idle", "")
+            return
+
        elif msg_type == "xtts_response":
            # XTTS-Audio vom Gaming-PC empfangen → an App weiterleiten
            audio_b64 = payload.get("base64", "")
@@ -1142,6 +1185,15 @@ class ARIABridge:
                self.voice_engine.speech_speed["thorsten"] = max(0.3, min(2.0, float(payload["speedThorsten"])))
                logger.info("[rvs] Speed Thorsten: %.1f", self.voice_engine.speech_speed["thorsten"])
                changed = True
+            whisper_reloaded = False
+            if "whisperModel" in payload:
+                new_model = payload["whisperModel"]
+                if new_model and new_model != self.stt_engine.model_size:
+                    logger.info("[rvs] Whisper-Modell Wechsel: %s -> %s (laedt...)", self.stt_engine.model_size, new_model)
+                    loop = asyncio.get_event_loop()
+                    whisper_reloaded = await loop.run_in_executor(None, self.stt_engine.reload, new_model)
+                    if whisper_reloaded:
+                        changed = True
            # Persistent speichern in Shared Volume
            if changed:
                try:
@@ -1154,6 +1206,7 @@ class ARIABridge:
                        "xttsVoice": getattr(self, "xtts_voice", ""),
                        "speedRamona": self.voice_engine.speech_speed.get("ramona", 1.0),
                        "speedThorsten": self.voice_engine.speech_speed.get("thorsten", 1.0),
+                        "whisperModel": self.stt_engine.model_size,
                    }
                    with open("/shared/config/voice_config.json", "w") as f:
                        json.dump(config_data, f, indent=2)
@@ -1396,6 +1449,36 @@ class ARIABridge:

    # ── Log-Streaming an die App ─────────────────────────────

+    async def _cancel_via_diagnostic(self) -> None:
+        """Ruft das Diagnostic /api/cancel an — dort laeuft die volle Abbruch-Logik
+        (openclaw doctor --fix mit Docker-Socket)."""
+        def _do_request():
+            try:
+                req = urllib.request.Request(
+                    f"{self._diagnostic_url}/api/cancel",
+                    method="POST",
+                    data=b"",
+                )
+                with urllib.request.urlopen(req, timeout=5) as resp:
+                    return resp.status
+            except Exception as e:
+                return f"error: {e}"
+
+        status = await asyncio.get_event_loop().run_in_executor(None, _do_request)
+        logger.info("[cancel] Diagnostic /api/cancel: %s", status)
+
+    async def _emit_activity(self, activity: str, tool: str = "") -> None:
+        """Sendet agent_activity an die App — nur wenn sich der State geaendert hat."""
+        state = (activity, tool)
+        if state == self._last_activity_state:
+            return
+        self._last_activity_state = state
+        await self._send_to_rvs({
+            "type": "agent_activity",
+            "payload": {"activity": activity, "tool": tool},
+            "timestamp": int(asyncio.get_event_loop().time() * 1000),
+        })
+
    async def send_log_to_app(self, source: str, message: str, level: str = "info") -> None:
        """Sendet einen Log-Eintrag an die App (erscheint im Log-Viewer)."""
        await self._send_to_rvs({
@@ -499,6 +499,30 @@
      </div>
    </div>

+    <!-- Whisper (STT) -->
+    <div class="settings-section">
+      <h2>Whisper (Spracherkennung)</h2>
+      <div style="font-size:11px;color:#8888AA;margin-bottom:8px;">
+        Aenderungen werden sofort an die Bridge gesendet und das Modell neu geladen
+        (kann bei medium/large 10-30s dauern — waehrend dieser Zeit ist STT kurz pausiert).
+      </div>
+      <div class="card" style="max-width:500px;">
+        <div style="display:flex;align-items:center;gap:12px;margin-bottom:8px;">
+          <label style="color:#8888AA;font-size:12px;min-width:80px;">Modell:</label>
+          <select id="diag-whisper-model" onchange="sendVoiceConfig()" style="flex:1;background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;">
+            <option value="tiny">tiny (39MB, schnell, niedrige Qualitaet)</option>
+            <option value="base">base (74MB, schnell, ok)</option>
+            <option value="small">small (244MB, mittel)</option>
+            <option value="medium" selected>medium (769MB, gut — Empfehlung)</option>
+            <option value="large-v3">large-v3 (1.5GB, beste Qualitaet, langsam auf CPU)</option>
+          </select>
+        </div>
+        <div style="font-size:10px;color:#555570;">
+          Tipp: <code>medium</code> ist der beste Kompromiss fuer CPU. <code>large-v3</code> nur bei GPU sinnvoll.
+        </div>
+      </div>
+    </div>
+
    <!-- Highlight-Trigger -->
    <div class="settings-section">
      <h2>Highlight-Trigger</h2>
@@ -763,6 +787,11 @@
          }
          xttsSelect.value = xttsVoice;
          toggleXTTSPanel();
+          // Whisper-Modell wiederherstellen (falls gesetzt)
+          if (msg.whisperModel) {
+            const wSel = document.getElementById('diag-whisper-model');
+            if (wSel) wSel.value = msg.whisperModel;
+          }
          return;
        }

@@ -891,6 +920,18 @@
          else alert('Loeschen fehlgeschlagen: ' + (msg.error || '?'));
          return;
        }
+        if (msg.type === 'session_export') {
+          if (!msg.ok) { alert('Export fehlgeschlagen: ' + (msg.error || '?')); return; }
+          const blob = new Blob([msg.markdown], { type: 'text/markdown;charset=utf-8' });
+          const url = URL.createObjectURL(blob);
+          const a = document.createElement('a');
+          a.href = url;
+          a.download = msg.filename;
+          document.body.appendChild(a);
+          a.click();
+          setTimeout(() => { URL.revokeObjectURL(url); a.remove(); }, 100);
+          return;
+        }
        if (msg.type === 'active_session') {
          updateActiveSessionBar(msg.sessionKey);
          loadSessions(); // Tabelle neu rendern
@@ -1392,7 +1433,8 @@
      const speedThorsten = parseFloat(document.getElementById('diag-speed-thorsten').value);
      const ttsEngine = document.getElementById('diag-tts-engine').value;
      const xttsVoice = document.getElementById('diag-xtts-voice').value;
-      send({ action: 'send_voice_config', defaultVoice, highlightVoice, ttsEnabled, speedRamona, speedThorsten, ttsEngine, xttsVoice });
+      const whisperModel = document.getElementById('diag-whisper-model').value;
+      send({ action: 'send_voice_config', defaultVoice, highlightVoice, ttsEnabled, speedRamona, speedThorsten, ttsEngine, xttsVoice, whisperModel });
    }

    // ── Highlight-Trigger ────────────────────────
@@ -1679,7 +1721,8 @@
              + `<td style="padding:4px 6px;color:#8888AA;font-size:10px;">${date}</td>`
              + `<td style="padding:4px 6px;white-space:nowrap;">`
              + (isActive ? '' : `<button class="btn secondary" onclick="event.stopPropagation();activateSession('${escapeHtml(s.sessionKey)}')" style="padding:2px 6px;font-size:10px;color:#34C759;margin-right:2px;" title="Aktivieren">&#9654;</button>`)
-              + `<button class="btn secondary" onclick="event.stopPropagation();deleteSession('${escapeHtml(s.path)}')" style="padding:2px 6px;font-size:10px;color:#FF6B6B;" title="Loeschen">X</button>`
+              + `<button class="btn secondary" onclick="event.stopPropagation();deleteSession('${escapeHtml(s.path)}')" style="padding:2px 6px;font-size:10px;color:#FF6B6B;margin-right:2px;" title="Loeschen">X</button>`
+              + `<button class="btn secondary" onclick="event.stopPropagation();exportSession('${escapeHtml(s.path)}','${escapeHtml(s.sessionKey)}')" style="padding:2px 6px;font-size:10px;color:#8888AA;" title="Als Markdown exportieren">&#x2B07;</button>`
              + `</td></tr>`;
      }
      html += '</table>';
@@ -1743,6 +1786,10 @@
      send({ action: 'delete_session', sessionPath: path });
    }

+    function exportSession(path, sessionKey) {
+      send({ action: 'export_session', sessionPath: path, sessionKey });
+    }
+
    function activateSession(sessionKey) {
      send({ action: 'set_active_session', sessionKey });
    }
@@ -37,15 +37,41 @@ const state = {
 };
 const SESSION_KEY_FILE = "/data/active-session";
 // /data Verzeichnis sicherstellen (Volume Mount)
-try { fs.mkdirSync("/data", { recursive: true }); } catch {}
+try { fs.mkdirSync("/data", { recursive: true }); } catch (e) {
+  console.error(`[startup] /data mkdir fehlgeschlagen: ${e.message}`);
+}
+// sessionFromFile zeigt an, ob der aktive Key aus der Datei kam.
+// Wenn true, darf resolveActiveSession NICHT mehr auto-picken (Wahl respektieren).
+let sessionFromFile = false;
 let activeSessionKey = (() => {
  try {
    const saved = fs.readFileSync(SESSION_KEY_FILE, "utf-8").trim();
-    if (saved) { console.log(`[startup] Gespeicherte Session geladen: '${saved}'`); return saved; }
-  } catch {}
+    if (saved) {
+      console.log(`[startup] Gespeicherte Session geladen: '${saved}'`);
+      sessionFromFile = true;
+      return saved;
+    }
+  } catch (e) {
+    console.error(`[startup] SESSION_KEY_FILE read: ${e.code || e.message}`);
+  }
  console.log("[startup] Keine gespeicherte Session — Fallback 'main'");
  return "main";
 })();
+
+// Atomic write: temp-file + rename, laute Logs bei Fehler.
+function persistActiveSession(key) {
+  try {
+    const tmp = SESSION_KEY_FILE + ".tmp";
+    fs.writeFileSync(tmp, key);
+    fs.renameSync(tmp, SESSION_KEY_FILE);
+    sessionFromFile = true;
+    console.log(`[session] Aktive Session persistiert: '${key}'`);
+    return true;
+  } catch (e) {
+    console.error(`[session] FEHLER beim Persistieren von '${key}': ${e.message}`);
+    return false;
+  }
+}
 const logs = [];
 let gatewayWs = null;
 let rvsWs = null;
@@ -91,6 +117,9 @@ function pipelineEnd(ok, detail) {
  }
  plog(`━━━ Pipeline Ende ━━━`);
  pipelineActive = false;
+  // Thinking-Indikator IMMER zuruecksetzen — auch bei Timeout/Fehler/Abbruch
+  broadcast({ type: "agent_activity", activity: "idle" });
+  pendingMessageTime = 0;
 }

 // ── Auto-Restart bei Netzwerk-Namespace-Verlust ──────
@@ -257,8 +286,10 @@ async function connectGateway() {
      state.gateway.handshakeOk = false;
      gatewayWs = null;
      broadcastState();
+      // Stuck "ARIA denkt..." vermeiden, falls Gateway waehrend Pipeline abkackt
+      if (pipelineActive) pipelineEnd(false, `Gateway-Verbindung verloren (${code})`);
+      else broadcast({ type: "agent_activity", activity: "idle" });
      checkGatewayHealth();
-      // Auto-Reconnect nach 5s
      setTimeout(connectGateway, 5000);
    });

@@ -372,6 +403,7 @@ function handleGatewayMessage(msg) {
        const error = payload.error || text || "Unbekannt";
        log("error", "gateway", `Chat-Fehler: ${error}`);
        if (pipelineActive) pipelineEnd(false, error);
+        else broadcast({ type: "agent_activity", activity: "idle" });
        broadcast({ type: "chat_error", error, payload });
        return;
      }
@@ -393,6 +425,7 @@ function handleGatewayMessage(msg) {
      const text = extractChatText(payload) || payload.text || "";
      log("info", "gateway", `ANTWORT: "${text.slice(0, 200)}"`);
      if (pipelineActive) pipelineEnd(true, `"${text.slice(0, 120)}"`);
+      else broadcast({ type: "agent_activity", activity: "idle" });
      broadcast({ type: "chat_final", text, payload });
      return;
    }
@@ -400,6 +433,7 @@ function handleGatewayMessage(msg) {
      const error = payload.error || payload.message || "Unbekannt";
      log("error", "gateway", `Chat-Fehler: ${error}`);
      if (pipelineActive) pipelineEnd(false, error);
+      else broadcast({ type: "agent_activity", activity: "idle" });
      broadcast({ type: "chat_error", error, payload });
      return;
    }
@@ -1109,6 +1143,16 @@ const server = http.createServer((req, res) => {
  } else if (req.url === "/api/session") {
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ sessionKey: activeSessionKey }));
+  } else if (req.url === "/api/cancel" && req.method === "POST") {
+    log("warn", "server", "HTTP /api/cancel — Cancel-Request (von Bridge)");
+    pendingMessageTime = 0;
+    watchdogWarned = false;
+    watchdogFixAttempted = false;
+    if (pipelineActive) pipelineEnd(false, "Vom Benutzer abgebrochen (App)");
+    else broadcast({ type: "agent_activity", activity: "idle" });
+    dockerExec("aria-core", "openclaw doctor --fix 2>/dev/null || true").catch(() => {});
+    res.writeHead(200, { "Content-Type": "application/json" });
+    res.end(JSON.stringify({ ok: true }));
  } else if (req.url.startsWith("/shared/")) {
    // Serve files from the shared volume (images, uploads)
    const filePath = decodeURIComponent(req.url);
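
For reference, a client could call the new endpoint roughly as sketched below, using only Node's built-in `http` module. The host and port are assumptions (3001 is taken from the README's `/api/session` example), and `buildCancelRequest` / `sendCancel` are hypothetical names. The actual Bridge client is not part of this diff.

```javascript
const http = require("http");

// Assumed defaults: Diagnostic reachable on localhost:3001.
function buildCancelRequest(host = "localhost", port = 3001) {
  return {
    host,
    port,
    path: "/api/cancel",
    method: "POST", // the route only matches POST
    headers: { "Content-Length": 0 },
  };
}

// Sends the request and resolves with the parsed JSON body ({ ok: true }).
function sendCancel(options = buildCancelRequest()) {
  return new Promise((resolve, reject) => {
    const req = http.request(options, (res) => {
      let body = "";
      res.setEncoding("utf8");
      res.on("data", (chunk) => { body += chunk; });
      res.on("end", () => {
        try { resolve(JSON.parse(body)); } catch (err) { reject(err); }
      });
    });
    req.on("error", reject);
    req.end();
  });
}
```

The route compares `req.url` with strict equality, so a URL with a query string such as `/api/cancel?x=1` would not match it.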
@@ -1209,7 +1253,11 @@ wss.on("connection", (ws) => {
        handleGetVoiceConfig(ws);
      } else if (msg.action === "send_voice_config") {
        // Persist the voice config and send it to the Bridge via RVS
+        // Read the existing config so that fields this call does not set are merged in
+        let existing = {};
+        try { existing = JSON.parse(fs.readFileSync("/shared/config/voice_config.json", "utf-8")); } catch {}
        const voiceConfig = {
+          ...existing,
          defaultVoice: msg.defaultVoice || "ramona",
          highlightVoice: msg.highlightVoice || "thorsten",
          ttsEnabled: msg.ttsEnabled !== false,
@@ -1218,12 +1266,13 @@ wss.on("connection", (ws) => {
          speedRamona: msg.speedRamona || 1.0,
          speedThorsten: msg.speedThorsten || 1.0,
        };
+        if (msg.whisperModel !== undefined) voiceConfig.whisperModel = msg.whisperModel;
        try {
          fs.mkdirSync("/shared/config", { recursive: true });
          fs.writeFileSync("/shared/config/voice_config.json", JSON.stringify(voiceConfig, null, 2));
        } catch {}
        sendToRVS_raw({ type: "config", payload: voiceConfig, timestamp: Date.now() });
-        log("info", "server", `Voice-Config gespeichert+gesendet: default=${voiceConfig.defaultVoice}, highlight=${voiceConfig.highlightVoice}, tts=${voiceConfig.ttsEnabled}`);
+        log("info", "server", `Voice-Config gespeichert+gesendet: default=${voiceConfig.defaultVoice}, whisper=${voiceConfig.whisperModel || "-"}`);
      } else if (msg.action === "get_triggers") {
        handleGetTriggers(ws);
      } else if (msg.action === "save_triggers") {
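
The merge order in this hunk matters: stored values are spread first, the explicit voice fields are then always rewritten, and `whisperModel` is only touched when the message carries it. A small sketch of that behaviour as a pure function is shown below; `mergeVoiceConfig` is a hypothetical name, and only the fields visible in the hunk are included.

```javascript
// Mirrors the object construction from the hunk (fields elided by the diff are omitted).
function mergeVoiceConfig(existing, msg) {
  const merged = {
    ...existing, // keeps stored fields that this call does not set
    defaultVoice: msg.defaultVoice || "ramona",
    highlightVoice: msg.highlightVoice || "thorsten",
    ttsEnabled: msg.ttsEnabled !== false,
    speedRamona: msg.speedRamona || 1.0,
    speedThorsten: msg.speedThorsten || 1.0,
  };
  if (msg.whisperModel !== undefined) merged.whisperModel = msg.whisperModel;
  return merged;
}

const stored = { defaultVoice: "thorsten", whisperModel: "medium" };

// A voice-only save keeps the stored Whisper model.
const afterVoiceSave = mergeVoiceConfig(stored, { defaultVoice: "ramona" });

// A message carrying only whisperModel updates the model, but the voice
// fields fall back to their defaults because they are always rewritten.
const afterModelSave = mergeVoiceConfig(stored, { whisperModel: "small" });
```

As written, the spread only protects fields that have no explicit assignment. A caller that wants to change just the model therefore still needs to send the current voice fields, otherwise they are reset to the defaults.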
@@ -1240,6 +1289,8 @@ wss.on("connection", (ws) => {
        handleListSessions(ws);
      } else if (msg.action === "read_session") {
        handleReadSession(ws, msg.sessionPath);
+      } else if (msg.action === "export_session") {
+        handleExportSession(ws, msg.sessionPath, msg.sessionKey);
      } else if (msg.action === "delete_session") {
        handleDeleteSession(ws, msg.sessionPath);
      } else if (msg.action === "set_active_session") {
@@ -1622,6 +1673,68 @@ async function handleReadSession(clientWs, sessionPath) {
  }
 }

+async function handleExportSession(clientWs, sessionPath, sessionKey) {
+  if (!sessionPath || sessionPath.includes("..") || !sessionPath.startsWith(SESSIONS_DIR)) {
+    clientWs.send(JSON.stringify({ type: "session_export", ok: false, error: "Ungueltiger Pfad" }));
+    return;
+  }
+  try {
+    const safePath = sessionPath.replace(/'/g, "");
+    const raw = await dockerExec("aria-core", `cat '${safePath}'`);
+    const lines = raw.split("\n").filter(l => l.trim());
+
+    const blocks = [];
+    for (const line of lines) {
+      let obj;
+      try { obj = JSON.parse(line); } catch { continue; }
+      if (obj.type !== "message" || !obj.message) continue;
+      const role = obj.message.role;
+      if (role !== "user" && role !== "assistant") continue;
+
+      let text = "";
+      const content = obj.message.content;
+      if (typeof content === "string") text = content;
+      else if (Array.isArray(content)) text = content.filter(c => c.type === "text").map(c => c.text || "").join("\n");
+      if (!text) continue;
+
+      if (role === "user") {
+        text = text.replace(/^Sender \(untrusted metadata\):[\s\S]*?```[\s\S]*?```\s*\n*/m, "").trim();
+        text = text.replace(/^\[.*?\]\s*/, "").trim();
+      } else {
+        text = text.replace(/^\[\[reply_to_\w+\]\]\s*/g, "").trim();
+      }
+      if (!text) continue;
+
+      const ts = obj.message.timestamp || obj.timestamp || 0;
+      const when = ts ? new Date(ts).toISOString().replace("T", " ").slice(0, 19) : "";
+      const heading = role === "user" ? "## 🧑 User" : "## 🤖 ARIA";
+      blocks.push(`${heading}${when ? ` — ${when}` : ""}\n\n${text}`);
+    }
+
+    const exportedAt = new Date().toISOString().replace("T", " ").slice(0, 19);
+    const title = sessionKey || sessionPath.split("/").pop().replace(".jsonl", "");
+    const markdown = [
+      `# Session: ${title}`,
+      ``,
+      `Exportiert: ${exportedAt}  `,
+      `Quelle: ${sessionPath}`,
+      ``,
+      `---`,
+      ``,
+      blocks.join("\n\n---\n\n"),
+      ``,
+    ].join("\n");
+
+    const safeKey = (sessionKey || "session").replace(/[^a-zA-Z0-9_-]/g, "_");
+    const filename = `${exportedAt.slice(0, 10)}_${safeKey}.md`;
+    clientWs.send(JSON.stringify({ type: "session_export", ok: true, filename, markdown }));
+    log("info", "server", `Session exportiert: ${filename} (${blocks.length} Nachrichten)`);
+  } catch (err) {
+    log("error", "server", `Session-Export fehlgeschlagen: ${err.message}`);
+    clientWs.send(JSON.stringify({ type: "session_export", ok: false, error: err.message }));
+  }
+}
+
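
The filtering part of `handleExportSession` can be exercised without Docker by isolating it as a pure function. The sketch below assumes the JSONL record shape implied by the field accesses in the hunk (`type`, `message.role`, `message.content`). `extractMessages` is a hypothetical name, and the metadata-prefix stripping is left out for brevity.

```javascript
// Turn raw JSONL session text into [{ role, text, ts }] entries.
function extractMessages(raw) {
  const out = [];
  for (const line of raw.split("\n")) {
    if (!line.trim()) continue;
    let obj;
    try { obj = JSON.parse(line); } catch { continue; } // skip corrupt lines
    if (obj.type !== "message" || !obj.message) continue;
    const { role, content } = obj.message;
    if (role !== "user" && role !== "assistant") continue;

    // Content is either a plain string or an array of typed parts.
    let text = "";
    if (typeof content === "string") text = content;
    else if (Array.isArray(content)) {
      text = content.filter((c) => c.type === "text").map((c) => c.text || "").join("\n");
    }
    text = text.trim();
    if (!text) continue;
    out.push({ role, text, ts: obj.message.timestamp || obj.timestamp || 0 });
  }
  return out;
}

// Sample input: one user message, one broken line, one non-message record,
// and one assistant message with mixed content parts.
const sample = [
  JSON.stringify({ type: "message", message: { role: "user", content: "Hallo" } }),
  "not json",
  JSON.stringify({ type: "tool", message: { role: "assistant", content: "skip" } }),
  JSON.stringify({ type: "message", message: { role: "assistant", content: [{ type: "text", text: "Hi" }, { type: "image" }] } }),
].join("\n");
const messages = extractMessages(sample);
```

Only the two real chat messages survive. The broken line and the non-message record are dropped silently, which matches the `continue` paths in the handler.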
 async function handleDeleteSession(clientWs, sessionPath) {
  if (!sessionPath || sessionPath.includes("..") || !sessionPath.startsWith(SESSIONS_DIR)) {
    clientWs.send(JSON.stringify({ type: "session_deleted", ok: false, error: "Ungueltiger Pfad" }));
@@ -1662,13 +1775,11 @@ async function handleDeleteSession(clientWs, sessionPath) {
 }

 // ── Session resolution: find the last active session ────
+// Called after each gateway (re)connect. Must NEVER overwrite the explicitly
+// chosen session; auto-pick only on the very first start.
 async function resolveActiveSession() {
-  // Only auto-resolve for the fallback key "main"; respect the saved choice
-  const hasSavedSession = (() => {
-    try { return !!fs.readFileSync(SESSION_KEY_FILE, "utf-8").trim(); } catch { return false; }
-  })();
-  if (hasSavedSession && activeSessionKey !== "main") {
-    log("info", "server", `Gespeicherte Session '${activeSessionKey}' wird beibehalten`);
+  if (sessionFromFile) {
+    log("info", "server", `Session '${activeSessionKey}' aus /data — keine Auto-Wahl`);
    return;
  }

@@ -1687,10 +1798,19 @@ async function resolveActiveSession() {
  const keys = entries.map(e => (e.key || e.sessionKey || e.name || "?").replace(/^agent:main:/, ""));
  log("info", "server", `Verfuegbare Sessions: [${keys.join(", ")}]`);

-  // Pick the newest session
+  // Pick the newest session, but prefer user-defined ones.
+  // aria-bridge / aria-diagnostic are auto-created by the services;
+  // on first start a "real" session should be chosen instead,
+  // if one exists.
+  const AUTO_KEYS = new Set(["aria-bridge", "aria-diagnostic"]);
+  const normalise = (e) => (e.key || e.sessionKey || e.name || "").replace(/^agent:main:/, "");
+
+  const userEntries = entries.filter(e => !AUTO_KEYS.has(normalise(e)));
+  const pool = userEntries.length > 0 ? userEntries : entries;
+
  let newest = null;
  let newestTime = 0;
-  for (const entry of entries) {
+  for (const entry of pool) {
    const t = entry.updatedAt || entry.createdAt || 0;
    if (t >= newestTime) {
      newestTime = t;
@@ -1699,12 +1819,11 @@ async function resolveActiveSession() {
  }

  if (newest) {
-    const rawKey = newest.key || newest.sessionKey || newest.name || "";
-    const key = rawKey.replace(/^agent:main:/, "");
+    const key = normalise(newest);
    if (key) {
      activeSessionKey = key;
-      try { fs.writeFileSync(SESSION_KEY_FILE, activeSessionKey); } catch {}
-      log("info", "server", `Aktive Session auf neueste gewechselt: '${activeSessionKey}'`);
+      persistActiveSession(activeSessionKey);
+      log("info", "server", `Auto-Wahl Erststart: '${activeSessionKey}'`);
      for (const c of browserClients) {
        c.send(JSON.stringify({ type: "active_session", sessionKey: activeSessionKey }));
      }
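
The auto-pick rule from this hunk can be checked in isolation with a sketch like the one below. `pickNewestSession` is a hypothetical wrapper, and the assignment `newest = entry` inside the loop is assumed, since the diff elides that line.

```javascript
const AUTO_KEYS = new Set(["aria-bridge", "aria-diagnostic"]);
const normalise = (e) => (e.key || e.sessionKey || e.name || "").replace(/^agent:main:/, "");

// Returns the key of the newest session, preferring user-created sessions
// over the ones the services create automatically.
function pickNewestSession(entries) {
  const userEntries = entries.filter((e) => !AUTO_KEYS.has(normalise(e)));
  const pool = userEntries.length > 0 ? userEntries : entries;

  let newest = null;
  let newestTime = 0;
  for (const entry of pool) {
    const t = entry.updatedAt || entry.createdAt || 0;
    if (t >= newestTime) { // ">=" means the later entry wins on ties
      newestTime = t;
      newest = entry; // assumed: this line is elided in the diff
    }
  }
  return newest ? normalise(newest) : "";
}

const picked = pickNewestSession([
  { key: "agent:main:aria-bridge", updatedAt: 300 },
  { key: "agent:main:projekt-x", updatedAt: 100 },
  { key: "agent:main:notizen", updatedAt: 200 },
]);
const fallback = pickNewestSession([{ key: "agent:main:aria-bridge", updatedAt: 300 }]);
```

Here `picked` is `"notizen"` even though `aria-bridge` has the most recent timestamp, and `fallback` is `"aria-bridge"` because no user session exists.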
@@ -1793,8 +1912,11 @@ function handleSetActiveSession(clientWs, sessionKey) {
    return;
  }
  activeSessionKey = sessionKey;
-  try { fs.writeFileSync(SESSION_KEY_FILE, activeSessionKey); } catch {}
-  log("info", "server", `Aktive Session: ${activeSessionKey}`);
+  const ok = persistActiveSession(activeSessionKey);
+  log("info", "server", `Aktive Session: ${activeSessionKey}${ok ? "" : " (WARN: nicht persistiert!)"}`);
+  if (!ok) {
+    clientWs.send(JSON.stringify({ type: "active_session", ok: false, sessionKey: activeSessionKey, error: "Persistierung fehlgeschlagen — /data Volume pruefen" }));
+  }
  // Notify all clients
  for (const c of browserClients) {
    c.send(JSON.stringify({ type: "active_session", sessionKey: activeSessionKey }));
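
`persistActiveSession()` itself is not shown in this diff. The commit message describes it as an atomic write (temp file plus rename) that reports failures. A minimal sketch of that technique, under the assumption that it returns a boolean as the call site above suggests, is given below. `persistKeyAtomically` and the temp-file naming are hypothetical.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical stand-in for persistActiveSession(): write to a temp file in
// the same directory, then rename it over the target. A rename within one
// filesystem replaces the target in a single step, so readers never see a
// half-written file.
function persistKeyAtomically(file, key) {
  const tmp = `${file}.tmp`;
  try {
    fs.writeFileSync(tmp, key, "utf-8");
    fs.renameSync(tmp, file);
    return true;
  } catch (err) {
    console.error(`Persisting session key failed: ${err.message}`);
    return false;
  }
}

// Usage against a throwaway directory:
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "aria-session-"));
const file = path.join(dir, "active_session");
const ok = persistKeyAtomically(file, "projekt-x");

// A missing parent directory (for example an unmounted volume) yields false.
const missing = persistKeyAtomically(path.join(dir, "nope", "active_session"), "x");
```

Returning a boolean instead of throwing lets callers such as `handleSetActiveSession` keep the in-memory session and still warn the client that the choice will not survive a restart.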
@@ -1810,7 +1932,7 @@ async function handleCreateSession(clientWs, sessionName) {
  try {
    // The session is created automatically when the first message is sent
    activeSessionKey = sessionName;
-    try { fs.writeFileSync(SESSION_KEY_FILE, activeSessionKey); } catch {}
+    persistActiveSession(activeSessionKey);
    log("info", "server", `Neue Session erstellt und aktiviert: ${sessionName}`);
    // Notify all clients
    for (const c of browserClients) {
@@ -6,9 +6,9 @@
 - [x] Voice messages are shown as text (STT → chat bubble)
 - [x] Clear cache + auto-download of attachments
 - [x] ARIA reads messages aloud (TTS via Piper)
-- [x] Autoscroll to the latest message
+- [x] Autoscroll to the latest message (inverted FlatList)
 - [x] Larger images in chat + full-screen preview
-- [x] Ear button crash fixed (LiveAudioStream removed, Phase 1 placeholder)
+- [x] Ear button → conversation mode (auto-recording after ARIA's reply)
 - [x] Play button in ARIA messages for voice playback
 - [x] Chat search in the app (magnifier in the status bar)
 - [x] Watchdog with container restart (2 min warning → 5 min doctor --fix → 8 min restart)
@@ -22,28 +22,34 @@
 - [x] XTTS voice cloning (upload audio samples, custom voice)
 - [x] TTS engine selectable (Piper/XTTS) in Diagnostic + app
 - [x] Auto-update system (APK via RVS WebSocket)
+- [x] Auto-update: APK installation via FileProvider
+- [x] Auto-update: "Auf Updates pruefen" (check for updates) button in the app settings
 - [x] Audio queue (sequential playback, no overlap)
+- [x] Text messages are answered by ARIA (Bridge chat handler fix)
+- [x] Multiple attachments + text before sending (pending preview)
+- [x] Paste support for images in the Diagnostic chat
+- [x] Markdown cleanup for TTS (bold, italic, code, links, etc.)
+- [x] SSH volume read-write for the proxy (no more -F workaround)
+- [x] Diagnostic: export sessions as Markdown (download button)
+- [x] Speech gate: a recording is discarded when no speech is detected (keeps ambient noise from reaching Whisper)
+- [x] Session persistence: the chosen session survives container restarts (sessionFromFile flag, atomic write)
+- [x] Diagnostic: "ARIA denkt..." no longer gets stuck (pipelineEnd always broadcasts idle, including on timeout/error/disconnect)
+- [x] App: "ARIA denkt..." indicator + cancel button (Bridge mirrors agent_activity via RVS)
+- [x] Whisper STT: model selection in Diagnostic (tiny/base/small/medium/large-v3), hot reload in the Bridge, default set to medium
+- [x] App: audio recording explicitly 16 kHz mono (saves a resample, optimal for Whisper)

 ## Open

 ### Bugs (priority)
-- [ ] Session persistence: on container restart, aria-bridge is always loaded instead of the last chosen session. The choice is not stored persistently.
-- [x] App: text messages are answered by ARIA (Bridge chat handler fix)
 - [ ] App: audio output occasionally just stops (mid-sentence or between chunks)
-- [x] Auto-update: APK installation via FileProvider (content:// URI)
-- [x] Auto-update: "Auf Updates pruefen" (check for updates) button in the app settings
-- [x] App: auto-scroll to the latest message on app start (direct, no animation)
-- [x] App: automatically scroll to the latest message when new messages arrive

 ### App Features
-- [x] App: add text to attachments before sending (pending preview + optional text)
-- [x] Conversation mode (ear button): auto-recording after ARIA's reply (walkie-talkie)
 - [ ] Wake word on-device (Porcupine "ARIA" keyword, Phase 2: passive listening)
 - [ ] Load chat history more reliably (AsyncStorage race condition)
 - [ ] Background audio service (TTS even when the app is minimised)

 ### TTS / Audio
-- [ ] Improve XTTS audio streaming (minimal stutter at chunk transitions)
+- [ ] XTTS audio streaming (PCM stream instead of WAV files, eliminates the stutter entirely)
 - [ ] Audio normalisation (level the volume between chunks)
 - [ ] Download Piper voices via Diagnostic (new languages/voices)

@@ -51,4 +57,4 @@
 - [ ] Images: use Claude Vision directly (currently only the file path is passed to ARIA)
 - [ ] Auto-compacting and memory/brain management (SQLite?)
 - [ ] Diagnostic: system info tab (container status, disk, RAM, CPU)
-- [ ] Finally resolve RVS zombie connections (WebRTC instead of WebSocket?)
+- [ ] Finally resolve RVS zombie connections
@@ -16,6 +16,7 @@ const ALLOWED_TYPES = new Set([
  "file_request", "file_response", "file_saved", "stt_result", "config", "tts_request",
  "xtts_request", "xtts_response", "xtts_list_voices", "xtts_voices_list", "voice_upload", "xtts_voice_saved",
  "update_check", "update_available", "update_download", "update_data",
+  "agent_activity", "cancel_request",
 ]);

 // Token room: token -> { clients: Set<ws> }
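
The two new types only matter because the relay works from an allow-list. How RVS applies `ALLOWED_TYPES` is not visible in this hunk, so the sketch below assumes that messages with an unknown `type` are simply dropped. `parseRelayable` is a hypothetical name, and the set is a shortened subset of the real list.

```javascript
// Shortened subset of the relay's allow-list, including the two new types.
const ALLOWED_TYPES = new Set(["config", "stt_result", "agent_activity", "cancel_request"]);

// Assumed behaviour: return the parsed message if it may be relayed, else null.
function parseRelayable(raw) {
  let msg;
  try { msg = JSON.parse(raw); } catch { return null; } // not JSON at all
  if (!msg || typeof msg.type !== "string") return null;
  return ALLOWED_TYPES.has(msg.type) ? msg : null;
}

const relayed = parseRelayable('{"type":"cancel_request"}');
const dropped = parseRelayable('{"type":"shell_exec"}');
```

Under that assumption, an `agent_activity` or `cancel_request` message would have been dropped before this change, which is why the thinking indicator and the cancel button in the app depend on the RVS update.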
Author	SHA1	Message	Date
duffyduck	cd390a4115	release: bump version to 0.0.3.8	2026-04-18 11:41:12 +02:00
duffyduck	a65ed579d2	feat: Whisper model selector + 16kHz mono recording - App: AudioSamplingRateAndroid 16000 + AudioChannelsAndroid 1 → Whisper receives its target format directly, no more resampling - Bridge: STTEngine.reload() reloads the model at runtime (tiny/base/small/medium/large-v3) - Bridge: config message triggers a hot reload when whisperModel changes - Bridge: default set to 'medium' (better than 'small' at similar latency) - Diagnostic: new section "Whisper (Spracherkennung)" with dropdown, auto-save on selection; the saved value is set on load - Diagnostic/Server: send_voice_config merges whisperModel into voice_config.json - aria.env.example: WHISPER_MODEL + WHISPER_LANGUAGE documented Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 11:37:27 +02:00
duffyduck	2ad1f57382	feat: Thinking indicator + cancel button in the app - Bridge: _emit_activity() mirrors OpenClaw agent events to RVS as agent_activity and deduplicates state changes. chat:final/error send idle. - Bridge: new cancel_request handler calls Diagnostic /api/cancel over HTTP. - Diagnostic: new POST /api/cancel endpoint (same logic as WS cancel). - RVS: agent_activity + cancel_request in ALLOWED_TYPES. - App: yellow indicator above the input bar with text depending on the activity, red cancel button. Cancel sends cancel_request via RVS. - issue.md: completed bug fixes + features consolidated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 11:22:02 +02:00
duffyduck	58e3cfd3e6	feat: Session export as markdown in Diagnostic - ⬇ button per session row, also exports inactive sessions - Server parses JSONL, extracts user/assistant messages with timestamp - Metadata prefix is removed, Markdown with # Session header is generated - Browser download via Blob + download attribute Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 11:14:15 +02:00
duffyduck	7de4ee8f5b	fix: Stuck "ARIA denkt..." indicator after pipeline ends - pipelineEnd() now broadcasts agent_activity: idle unconditionally - chat:error and chat:final paths broadcast idle outside of active pipeline - Gateway close event ends active pipeline + broadcasts idle - Prevents indicator from hanging after timeout/error/disconnect Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 11:11:12 +02:00
duffyduck	213edac3a7	fix: Session persistence - respect user choice across container restarts - sessionFromFile flag prevents auto-pick after first start - Atomic write (temp + rename) with loud error logging - Auto-pick filters out aria-bridge/aria-diagnostic when user sessions exist - handleSetActiveSession reports persistence failures to client Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 11:03:26 +02:00
duffyduck	acc13aef6b	fix: Speech gate - only send recording if actual speech detected - VAD_SPEECH_THRESHOLD_DB = -35 (louder than silence threshold) - Needs 300ms of speech before counting as real speech - Recording discarded if only background noise detected - Prevents sending garbage to Whisper in conversation mode Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 18:20:05 +02:00
duffyduck	4bbc6f7787	release: bump version to 0.0.3.7	2026-04-11 13:18:17 +02:00
duffyduck	20f2ea1829	fix: Conversation mode starts recording immediately when ear button tapped Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 13:15:26 +02:00
duffyduck	2d23f0668b	docs: update README with conversation mode, multi-attachments, markdown cleanup - Conversation mode (ear button) documented in App Features - Multiple attachments + paste support - Markdown cleanup for TTS - Auto-Update FileProvider + check button - Roadmap: 22 items in Phase 1 completed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 12:43:09 +02:00
duffyduck	d6030a06b7	docs: update issue.md - move completed items, clean up open list 28 items completed, 10 remaining open Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 12:23:04 +02:00
duffyduck	0df76e2af6	release: bump version to 0.0.3.6	2026-04-11 12:19:00 +02:00
duffyduck	f80fe1df93	fix: Inverted FlatList - newest messages always visible at bottom - No more scrollToEnd/scrollToIndex needed - FlatList inverted=true with reversed data - New messages appear at bottom automatically - User scrolls up to see history (natural chat behavior) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 12:17:32 +02:00
duffyduck	cff421bc53	release: bump version to 0.0.3.5	2026-04-11 12:13:41 +02:00
duffyduck	bca925d385	fix: Use scrollToIndex with viewPosition:1 for reliable bottom scroll - scrollToIndex targets last message at bottom of viewport - onScrollToIndexFailed fallback to scrollToEnd - More reliable than scrollToEnd with dynamic heights Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 12:12:24 +02:00