ARIA-AGENT

Author	SHA1	Message	Date
duffyduck	6651f5937d	feat(audio): wake word in parallel with TTS using AcousticEchoCanceler You can now say "Computer" while ARIA is still talking: TTS goes silent and a new recording starts. Previously you had to wait or tap the voice button manually. Native (OpenWakeWordModule.kt): - AudioRecord source switched from MIC to VOICE_COMMUNICATION (enables echo cancellation + noise suppression on most devices) - Additionally, AcousticEchoCanceler/NoiseSuppressor/AutomaticGainControl are explicitly enabled when available; more robust on devices where the VOICE_COMMUNICATION source does not bring the effects along automatically - releaseAudioEffects() in stop/dispose JS (wakeword.ts): - New API: startBargeListening / stopBargeListening, which activate the wake word in parallel without leaving the 'conversing' state - onWakeDetected now distinguishes: in 'conversing' → barge-in callback (not the normal wake callback); otherwise the standard path. - onBargeIn subscriber API + isBargeListening getter Lifecycle wiring (audio.ts + ChatScreen): - audioService.onPlaybackStarted callback (new) - ChatScreen: on TTS start → wakeWord.startBargeListening - ChatScreen: on TTS end → wakeWord.stopBargeListening (otherwise no AudioRecord is available for the next recording) - ChatScreen: on BargeIn → haltAllPlayback + cancel_request + 150ms pause + start new recording issue.md + README updated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 22:50:09 +02:00
duffyduck	e9e7dd804f	docs: issue.md + README updated with audioRequestId fix + ready sound issue.md: three new Done entries (placeholder race fixed via audioRequestId, mic-open toast only after recording start, ready sound with toggle). New Open entry: wake word in parallel with TTS using AcousticEchoCanceler. README: wake-word usage extended with ding-dong + "🎤 sprich jetzt" ("speak now") toast. Roadmap extended with the two new features. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 22:44:03 +02:00
duffyduck	2a56ac0290	docs: issue.md + README updated with current features issue.md: openWakeWord, ABI split, underrun protection, conversation focus, PhoneStateListener, voice-override fix, image+text merge, Diagnostic UI, adaptive VAD, configurable max recording length, barge-in, push-to-talk refactor, settings sub-screens, text-selection fix moved to Done. Porcupine-related open bugs removed (engine switched). New Open entries: STT placeholder replacement, custom ONNX upload, pause+resume on phone call. README: push-to-talk mention removed, VAD description changed to adaptive + new default of 5 min, new bullets for barge-in + call pause, roadmap extended. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 22:13:53 +02:00
duffyduck	309df9d851	fix(wake-word): embedding output is rank 4, not rank 2; trigger works now Root cause why no wake word ever triggered: the Google speech embedding model returns (1,1,1,96), not (1,96). The cast `as Array<FloatArray>` threw a ClassCastException that was swallowed by the try/catch, so the pipeline silently went nowhere. Additionally: - WW input frame count is now read from the model metadata (varies per keyword; hey_jarvis=16, computer_v2 possibly different) - "Computer" added as a wake word (community model from fwartner/home-assistant-wakewords-collection) "ARIA" as a wake word: no pre-trained model exists. It would have to be trained via the openWakeWord Colab notebook (~1h on a free GPU). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 13:24:47 +02:00
duffyduck	55cfb752a2	feat(app): wake word fully on-device via openWakeWord (ONNX) Picovoice/Porcupine removed; the new stack is openWakeWord (Apache 2.0, on-device, ONNX Runtime). No API key, no license fees, audio never leaves the device. Custom wake words can be trained for free via the openWakeWord notebook. Pipeline (all in OpenWakeWordModule.kt): 1. AudioRecord 16kHz mono int16 in 1280-sample chunks (80ms) 2. melspectrogram.onnx → 32-mel frames (mel/10 + 2 as in Python) 3. embedding_model.onnx, 76-frame sliding window (stride 8) → 96-dim 4. hey_jarvis.onnx (or another keyword) on the last 16 embeddings 5. Sigmoid score, threshold/patience/debounce filter 6. RN event "WakeWordDetected" emitted Bundled models in assets/openwakeword/: hey_jarvis (default), alexa, hey_mycroft, hey_rhasspy. External service API (start/stop/configure/onWakeWord/...) stays identical; ChatScreen unchanged. build.gradle: com.microsoft.onnxruntime:onnxruntime-android:1.17.1 package.json: @picovoice/porcupine-react-native + voice-processor removed SettingsScreen: AccessKey field gone, new keyword list with labels README: wake-word section completely rewritten (no more Picovoice) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 12:56:33 +02:00
duffyduck	44d2c6b4fe	fix(app): Spotify bounce between ARIA responses + wake-word docs AudioFocus is now released with an 800ms delay; if ARIA sends a second response right after, or the recording transitions into TTS, the release is cancelled. Spotify/YouTube therefore no longer get a microsecond gap to come back up while ARIA is speaking. README: new section on wake-word setup with Picovoice (7-day trial, console link, instructions for custom keywords), and the outdated wake-word limitation removed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-25 22:49:45 +02:00
duffyduck	4f494daffb	docs: BigVGAN warning made explicit; it does not work with our Vocos setup The BigVGAN variant of the aihpi F5-TTS checkpoint is not simply an "optionally better" fallback; it is incompatible with the default Vocos vocoder that the f5-tts library loads. Output becomes NaN, the app stays silent. Stefan tried it: app silent, 10 minutes of debugging. The README was worded too loosely ("usually higher quality"); it is now clearly marked as "CURRENTLY DOES NOT WORK". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 19:49:54 +02:00
duffyduck	5ba89c7191	docs: README section for the German F5-TTS fine-tune (aihpi) Config table with the concrete Diagnostic values for the German fine-tune aihpi/F5-TTS-German: model architecture, hf:// paths, recommended cfg_strength / nfe_step. Plus a note on the BigVGAN variant as an alternative. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 19:34:36 +02:00
duffyduck	578ade3544	docs: README + issue.md brought up to date with F5-TTS, Whisper Gamebox, app settings README: - Architecture diagram: Gamebox stack with f5tts-bridge + whisper-bridge - Voice Bridge: STT primarily remote (Gamebox), TTS via F5-TTS - Diagnostic section: voice status, disk-full banner, auto-transcription - App features: VAD tolerance/pre-roll/audio pause configurable - XTTS section replaced by "Gamebox-Stack — F5-TTS + Whisper" - Roadmap Phase 1: all recent completions added issue.md: all completed items from the latest iterations included (pre-roll, decimal TTS, voice_ready, Whisper Gamebox, F5-TTS, AudioFocus pause, VAD setting, ...). Open list reduced to the current state. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 14:51:27 +02:00
duffyduck	f801d99748	feat: Piper removed entirely; XTTS v2 is now the only TTS Breaking change: if the XTTS bridge (gaming PC) is offline, ARIA stays silent. Chat responses still arrive, but without audio. This is deliberately accepted; XTTS simply sounds vastly better. Bridge (aria_bridge.py): - from piper import ... removed - VoiceEngine class removed entirely (synthesize, speak, select_voice) - EPIC_TRIGGERS + load_epic_triggers removed (highlight-voice feature is pointless without Piper) - self.voice_engine, voice_name, requested_voice calls gone - _process_core_response: always XTTS, no fallback - tts_request handler: always XTTS - config handler: only ttsEnabled + xttsVoice + whisperModel - import wave removed bridge/requirements.txt: piper-tts removed bridge/Dockerfile: comment updated docker-compose.yml: ./aria-data/voices mount removed aria-data/config/aria.env.example: PIPER_RAMONA/PIPER_THORSTEN removed get-voices.sh: deleted entirely (was only a Piper downloader) Diagnostic UI (index.html): - Piper panel (default voice / highlight voice / speed sliders) gone - TTS engine dropdown gone (always XTTS) - TTS diagnosis tab now shows only XTTS status + test button - sendVoiceConfig now sends only ttsEnabled/xttsVoice/whisperModel - toggleXTTSPanel kept as a no-op legacy stub (JS calls stay safe) Diagnostic Server (server.js): - handleSendVoiceConfig: now only ttsEnabled + xttsVoice + whisperModel - handleTestTTS: via xtts_request (no longer a Piper subprocess) - handleCheckTTS: via xtts_list_voices over RVS - handleGetVoiceConfig/Defaults cleaned up - Highlight-trigger UI stays but is no longer evaluated by the Bridge (dead code in the UI, possibly for an XTTS voice switch later) README + issue.md updated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 22:24:42 +02:00
duffyduck	271fc4edf6	docs: cleanup.sh + README updates for latest features - cleanup.sh: safe (default) + aggressive (--full) Docker cleanup with disk-usage report before/after - README: Phase 1 list, Diagnostic features and app features extended with the new items (Speech Gate, session persistence, session export, app thinking indicator, Whisper model selection, 16kHz recording) - README: new section "Docker-Cleanup" with cleanup.sh usage Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 11:46:12 +02:00
duffyduck	2d23f0668b	docs: update README with conversation mode, multi-attachments, markdown cleanup - Conversation mode (ear button) documented in App Features - Multiple attachments + paste support - Markdown cleanup for TTS - Auto-Update FileProvider + check button - Roadmap: 22 items in Phase 1 completed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 12:43:09 +02:00
duffyduck	3943e79bb1	docs: document .env.example with detailed comments, explain both tokens in README - ARIA_AUTH_TOKEN: Gateway auth (who can talk to ARIA) - RVS_TOKEN: Pairing token (same room in RVS relay) - RVS_UPDATE_HOST: SSH target for auto-update APK copy - All variables with German comments and examples Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 08:45:26 +02:00
duffyduck	3461f45207	docs: update README with XTTS v2 setup details, voice cloning guide - Architecture diagram for XTTS flow (Gaming-PC ↔ RVS ↔ ARIA-VM) - Port 8020 (not 8000), token must match, model caching - Voice cloning step-by-step guide - TTS engine switching (Piper/XTTS) with fallback - Known limitation: RVS zombie connections Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 01:49:08 +02:00
duffyduck	cb33a20694	docs: update README with XTTS, auto-update, watchdog, TTS settings - Architecture: Added XTTS v2 (Gaming-PC) and auto-update flow - Diagnostic: Thinking indicator, cancel button, TTS tab, voice cloning - App: Play button, chat search, auto-update, voice speed settings - RVS: Auto-update APK distribution over WebSocket - Watchdog: 2min warning → 5min doctor --fix → 8min container restart - Roadmap: Phase 1 fully completed, updated Phase 2+3 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 19:46:16 +02:00
duffyduck	5c8d11824e	fixed long chats not loading to the end; attachments saved in a local folder on Android; if a file is missing, re-download it from the shared folder via the RVS server; Android app: added settings for the local storage path; updated README	2026-03-29 12:51:38 +02:00
duffyduck	db053c2dbd	fixed sst to milliseconds and autoscroll the third, attachments added shared volume, added attachments to chats, updated README	2026-03-29 12:34:28 +02:00
duffyduck	f1f297b3a7	fixed voice button in APK and updated README	2026-03-29 11:41:32 +02:00
duffyduck	2227e49993	updated Android build environment and setup.sh	2026-03-29 11:32:37 +02:00
duffyduck	dbd97d3cf4	added audio wake word and recording, edited README	2026-03-29 11:29:15 +02:00
duffyduck	1ee800f451	updated README and increased timeout	2026-03-16 01:05:32 +01:00
duffyduck	2e4a12c812	added Claude CLI log and test, optimized log windows via separate tabs, updated README changelog	2026-03-12 01:25:35 +01:00
duffyduck	c5d835ea09	- `aria-data/config/AGENT.md`: ARIA's personality and safety rules - `aria-data/config/USER.md`: Stefan's preferences - `aria-data/config/TOOLING.md`: VM tooling list - `aria-data/skills/README.md`: skill-building guide ### Known issues - Android release build: `EMFILE: too many open files`. Fix: `CI=true` in `build.sh` - JDK 21 incompatible with AGP 8.1. Fix: automatic fallback to JDK 17 - `react-native-screens` > 3.27.0 incompatible with RN 0.73.4. Fix: version pinned	2026-03-11 23:13:28 +01:00
duffyduck	71f9ae221c	added claude cli to proxy	2026-03-11 22:41:26 +01:00
duffyduck	dd12a49aaf	changed Claude proxy name and added WS support in Android app	2026-03-11 22:35:26 +01:00
duffyduck	e951fc712f	TLS fallback (Bridge → RVS) Audio rendering for app (Piper TTS via RVS) Chat persistence (AsyncStorage, 500 messages)	2026-03-10 18:40:03 +01:00
duffyduck	afcd45d32f	Docker & infrastructure: OpenClaw image fix, libportaudio2, aria.env.example Wake-word fix: openwakeword API bug fixed get-voices.sh: new script + README step	2026-03-10 14:08:28 +01:00
duffyduck	c67da1d085	version 0.0.0.3	2026-03-09 00:31:21 +01:00
duffyduck	5eb3ebf199	first release 0.0.0.2	2026-03-08 23:31:46 +01:00
Stefan Hacker	36eed69fa9	first commit	2026-03-08 19:08:10 +01:00
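
Commit 44d2c6b4fe describes releasing AudioFocus only after an 800ms delay and cancelling that release if another response or a TTS hand-over arrives first. The following is a minimal sketch of that idea in Python, not the app's actual Android code. The class name `DelayedRelease`, its methods, and the `release` callback are hypothetical; only the 800ms delay comes from the commit message.

```python
import threading
from typing import Callable


class DelayedRelease:
    """Run a release callback after a delay unless it is cancelled first.

    Hypothetical helper for illustration; 0.8 s mirrors the 800ms
    mentioned in the commit message.
    """

    def __init__(self, release: Callable[[], None], delay_s: float = 0.8) -> None:
        self._release = release
        self._delay_s = delay_s
        self._generation = 0          # bumped on every schedule/cancel
        self._lock = threading.Lock()

    def schedule(self) -> None:
        """Start (or restart) the countdown to the release."""
        with self._lock:
            self._generation += 1
            generation = self._generation
        timer = threading.Timer(self._delay_s, self._fire, args=(generation,))
        timer.daemon = True
        timer.start()

    def cancel(self) -> None:
        """Invalidate any pending release, e.g. when new audio starts."""
        with self._lock:
            self._generation += 1

    def _fire(self, generation: int) -> None:
        # Only the most recent, uncancelled schedule may release.
        with self._lock:
            if generation != self._generation:
                return
        self._release()
```

In this sketch, playback-finished would call `schedule()` and any new playback or recording-to-TTS transition would call `cancel()`, so back-to-back responses never open a gap for other media apps.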

30 Commits
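
Step 5 of the pipeline in commit 55cfb752a2 (sigmoid score, then a threshold/patience/debounce filter) can be sketched as follows. This is an illustrative Python version under stated assumptions, not the code in OpenWakeWordModule.kt: it assumes the keyword model emits a raw logit, that "patience" means a number of consecutive frames above the threshold, and that "debounce" is a quiet period after a trigger. The class name and all default values are hypothetical.

```python
import math
from dataclasses import dataclass, field


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


@dataclass
class WakeWordFilter:
    threshold: float = 0.5    # assumed default, not taken from the repo
    patience: int = 2         # consecutive frames required above threshold
    debounce_s: float = 2.0   # minimum time between two triggers
    _streak: int = field(default=0, init=False)
    _last_trigger: float = field(default=-math.inf, init=False)

    def update(self, logit: float, now_s: float) -> bool:
        """Feed one model output; return True when a wake event should fire."""
        if sigmoid(logit) < self.threshold:
            self._streak = 0          # streak broken, start over
            return False
        self._streak += 1
        if self._streak < self.patience:
            return False              # not enough consecutive hits yet
        if now_s - self._last_trigger < self.debounce_s:
            return False              # still inside the debounce window
        self._last_trigger = now_s
        self._streak = 0
        return True
```

With 1280-sample chunks at 16kHz, `update` would be called roughly every 80ms; a `True` result corresponds to the point where the module emits the "WakeWordDetected" event.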