Hacker-Software/ARIA-AGENT - ARIA-AGENT - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
duffyduck	6ab6196739	feat: Streaming TTS — PCM-Stream statt WAV-Chunks (Weg A) Pipeline: XTTS-Server → xtts-bridge → aria-bridge → RVS → App AudioTrack XTTS-Bridge (Gaming-PC): - streamXTTSAsPCM(): liest /tts_to_audio/ Response inkrementell, parst WAV-Header (samplerate/channels), teilt PCM in 8KB-Chunks (~170ms bei 24kHz s16 mono) und sendet jeden als audio_pcm. - Finaler Chunk mit final=true nach letztem Text-Chunk aria-bridge: - audio_pcm Handler leitet payload 1:1 weiter, filled messageId aus requestId → messageId Map falls XTTS-Bridge messageId nicht hatte - Alter xtts_response Pfad bleibt als Legacy-Fallback (WAV) RVS: audio_pcm in ALLOWED_TYPES Android Native: - PcmStreamPlayerModule (Kotlin): AudioTrack MODE_STREAM mit Writer-Thread und BlockingQueue. start(rate, ch) / writeChunk(b64) / end() / stop() - 8x MinBufferSize grosszuegig dimensioniert, glatt auch bei Netz-Aussetzern - Registered im MainApplication via PcmStreamPlayerPackage App JS: - audioService.handlePcmChunk(): erkennt neue Session (messageId-Wechsel), started nativen Stream, cached PCM-Bytes pro Message. Bei final=true Stream sauber schliessen + _savePcmBufferAsWav → WAV-File im tts_cache/<messageId>.wav - _savePcmBufferAsWav: baut 44-byte WAV-Header (PCM s16le, korrekte samplerate/channels), haengt alle gesammelten base64-PCM-Chunks an - stopPlayback beendet auch aktiven PCM-Stream - ChatScreen routet type=audio_pcm an handlePcmChunk, bei final setzt audioPath in der Message Play-Button: falls messageId einen audioPath hat → WAV aus Cache (Sound-basiert), egal ob Original-TTS Piper oder XTTS war. Audio-Focus: - requestDuck() beim Stream-Start, release() bei Stream-Ende - Andere Apps (Spotify etc.) werden leiser waehrend ARIA spricht Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 22:01:27 +02:00
duffyduck	764619f076	fix: Comprehensive markdown/formatting cleanup for TTS (Piper + XTTS) - Remove bold, italic, `code`, code blocks, links, headers, quotes, lists - Replace newlines with natural pauses (period/comma) - Remove quotation marks, empty brackets - Fixes text being swallowed/garbled by TTS engines Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 11:47:04 +02:00
duffyduck	949c573c49	fix: XTTS chunk size 150 chars (faster render, preload overlaps playback) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 02:52:56 +02:00
duffyduck	f7f450a09d	fix: XTTS streaming mode - send each chunk immediately, comma between sentences - Back to streaming: render chunk → send immediately → next chunk - App plays with preloading queue (no waiting for all chunks) - Comma instead of dot between sentences in chunk (no "Punkt" read aloud) - Sentence-ending dots already removed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 02:48:50 +02:00
duffyduck	81f7c38383	fix: XTTS splits concatenated audio into ~8s parts (seamless with preload) - All chunks rendered and PCM concatenated (consistent voice) - Split into ~8 second WAV parts (not per-sentence) - 8s is long enough for preload overlap, small enough for WebSocket - Parts include part/totalParts metadata Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 02:41:14 +02:00
duffyduck	2c785cb37a	feat: XTTS concatenates chunks into seamless WAV (no stuttering) - All chunks rendered sequentially, PCM data concatenated - Single WAV with proper header sent back (no queue needed in app) - If total > 800KB, split into parts (WebSocket limit) - Eliminates stuttering between sentences Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 02:40:16 +02:00
duffyduck	8929bc99bb	fix: XTTS groups sentences into ~250 char chunks for consistent voice quality - 2-3 sentences per chunk (more context = stable voice/volume) - Max 250 chars per chunk (keeps WebSocket packets manageable) - Dots re-added between sentences within a chunk (natural pauses) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 02:23:29 +02:00
duffyduck	0428c06612	fix: Audio preloading to prevent stuttering, remove trailing dots for XTTS - Preload next audio while current plays (eliminates gap between sentences) - Remove trailing dots from sentences (XTTS reads them aloud) - stopPlayback cleans up preloaded audio Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 02:21:19 +02:00
duffyduck	b3d3b8b6bc	fix: XTTS bridge splits text into sentences sequentially - XTTS-Bridge does sentence splitting (not ARIA-Bridge) - Sequential rendering: correct order guaranteed - Each sentence sent as separate xtts_response - Markdown removal before splitting - App starts playback after first sentence (faster UX) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 02:03:29 +02:00
duffyduck	16847ce6f7	fix: TTS toggle global above engine selector, health check /docs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 00:27:55 +02:00
duffyduck	a1e1ee31bd	fix: XTTS bridge port 8020, longer startup wait - XTTS API runs on port 8020 (not 8000) - Bridge waits up to 5min for model download (30x10s) - Health check uses / instead of /docs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 23:39:45 +02:00
duffyduck	a242693751	feat: XTTS v2 integration, auto-update system, TTS engine abstraction - XTTS v2: Docker setup for Gaming-PC (GPU), bridge via RVS relay - XTTS: Voice cloning UI in Diagnostic (multi-file upload) - XTTS: Engine selectable (Piper local vs XTTS remote) with fallback - Auto-Update: RVS serves APK over WebSocket (no HTTP needed) - Auto-Update: App checks version on start, prompts install - Auto-Update: release.sh copies APK to RVS via scp - Bridge: TTS engine abstraction (piper/xtts), config persistent - Bridge: xtts_response handler, tts_request on-demand - Diagnostic: TTS engine dropdown, XTTS voice panel, voice cloning - App: Play button on ARIA messages, chat search, update service - Wake word: Disabled LiveAudioStream (crash fix), Phase 1 placeholder - Watchdog: Container restart after 8min stuck - Chat backup: on-the-fly to /shared/config/chat_backup.jsonl Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 19:42:10 +02:00