fix: text selection, adaptive VAD threshold + barge-in during voice recording

Bug 1: text selection in bubbles no longer worked. MessageText had nested <Text onPress={...}> elements for custom link styling. They intercepted the long-press gesture, so selecting and copying text was no longer possible. Now there is only a single <Text selectable dataDetectorType="all">; Android makes URLs, phone numbers, and email addresses clickable via system detection.

Bug 2: VAD did not reliably detect silence (recording ran endlessly). The fixed values (-45 dB silence / -28 dB speech) did not suit every environment. In louder rooms the background level sat above the silence threshold and lastSpeechTime was updated continuously, so VAD never fired and the recording ran until the 120 s max duration. Now adaptive: the first 5 mic samples (~500 ms) form the baseline; silence threshold = baseline + 6 dB, speech threshold = baseline + 12 dB. A toast shows the calibrated values when recording starts. Falls back to -38 dB / -22 dB if the microphone delivers no metering updates.

Bug 3: barge-in ("oh, forget it"). If a new voice message is recorded while ARIA is responding, ARIA's current activity (TTS + thinking/tool) is cancelled immediately, before the new message is sent, as in a real conversation where you are allowed to interrupt the other person.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
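The adaptive calibration described under Bug 2 could be sketched roughly as follows. This is an illustration only, not the code from this commit: the names `VadThresholds`, `calibrateThresholds`, and the constants are hypothetical, and averaging the first five samples is an assumption (the message only says they "form the baseline").

```typescript
// Hypothetical sketch of adaptive VAD thresholds; names are not from the commit.
interface VadThresholds {
  silenceDb: number;   // levels at or below this count as silence
  speechDb: number;    // levels at or above this count as speech
  calibrated: boolean; // false when the fallback values are used
}

const CALIBRATION_SAMPLES = 5;  // ~500 ms, assuming one metering update per 100 ms
const SILENCE_MARGIN_DB = 6;    // silence threshold = baseline + 6 dB
const SPEECH_MARGIN_DB = 12;    // speech threshold  = baseline + 12 dB

// Fallback from the commit message, used when no usable metering arrives.
const FALLBACK: VadThresholds = { silenceDb: -38, speechDb: -22, calibrated: false };

function calibrateThresholds(meteringDb: number[]): VadThresholds {
  // Only the first few samples are used; NaN/Infinity readings are discarded.
  const window = meteringDb
    .slice(0, CALIBRATION_SAMPLES)
    .filter((db) => Number.isFinite(db));
  if (window.length < CALIBRATION_SAMPLES) return FALLBACK;

  // Assumption: the baseline is the arithmetic mean of the calibration window.
  const baseline = window.reduce((sum, db) => sum + db, 0) / window.length;
  return {
    silenceDb: baseline + SILENCE_MARGIN_DB,
    speechDb: baseline + SPEECH_MARGIN_DB,
    calibrated: true,
  };
}

// Quiet room: baseline -50 dB gives thresholds of -44 dB / -38 dB.
const quiet = calibrateThresholds([-50, -52, -48, -50, -50]);
// Noisy room: baseline -40 dB gives thresholds of -34 dB / -28 dB.
const noisy = calibrateThresholds([-40, -40, -40, -40, -40]);
console.log(quiet, noisy);
```

With the old fixed -45 dB silence threshold, the noisy room in this example (background at -40 dB) would never register as silent; with thresholds relative to the baseline it does.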
2026-05-06 21:49:48 +02:00
parent fa0667088a
commit 406f4cb3cc
3 changed files with 83 additions and 94 deletions
@@ -504,6 +504,8 @@ const ChatScreen: React.FC = () => {
      const result = await audioService.stopRecording();
      if (result && result.durationMs > 500) {
        // User spoke within the window → send voice message
+        // Barge-in: cancel any ARIA activity that is still running.
+        interruptAriaIfBusy();
        const location = await getCurrentLocation();
        const userMsg: ChatMessage = {
          id: nextId(),
@@ -648,8 +650,28 @@ const ChatScreen: React.FC = () => {
    rvs.send('cancel_request' as any, {});
  }, []);

+  // Barge-in: if the user records a new voice message while ARIA is
+  // working or speaking, cancel the old activity immediately: silence TTS
+  // and abort the aria-core run via cancel_request. This lets the user say
+  // "oh, forget it, do X instead", as in a real conversation.
+  const interruptAriaIfBusy = useCallback(() => {
+    const speaking = audioService.isPlayingAudio();
+    const thinking = agentActivity.activity !== 'idle';
+    if (!speaking && !thinking) return false;
+    console.log('[Chat] Barge-In: speaking=%s thinking=%s — interrupting ARIA',
+                speaking, thinking);
+    if (speaking) audioService.haltAllPlayback('user spricht (barge-in)');
+    if (thinking) {
+      setAgentActivity({ activity: 'idle', tool: '' });
+      rvs.send('cancel_request' as any, {});
+    }
+    return true;
+  }, [agentActivity]);
+
  // Voice recording finished
  const handleVoiceRecording = useCallback(async (result: RecordingResult) => {
+    // Barge-in: cancel any ARIA activity that is still running.
+    interruptAriaIfBusy();
    const location = await getCurrentLocation();

    const userMsg: ChatMessage = {
@@ -668,7 +690,7 @@ const ChatScreen: React.FC = () => {
      speed: ttsSpeedRef.current,
      ...(location && { location }),
    });
-  }, [getCurrentLocation]);
+  }, [getCurrentLocation, interruptAriaIfBusy]);

  // Select file → add to pending list
  const handleFileSelected = useCallback(async (file: FileData) => {