Compare commits

..

84 Commits

Author SHA1 Message Date
duffyduck 099b9651a6 release: bump version to 0.0.4.0 2026-04-18 12:28:08 +02:00
duffyduck 76d72a1eef feat: Archivierte Session-Versionen (OpenClaw .reset.* Files) in Diagnostic
OpenClaw resettet Sessions beim ersten chat.send nach Container-Restart
(wenn abortedLastRun / systemSent Inkonsistenz erkannt wurde) und
benennt die alte .jsonl in .jsonl.reset.<timestamp>.Z um. Der Inhalt
war also gar nicht verloren, nur unsichtbar.

Diagnostic:
- handleListSessions scannt jetzt auch *.jsonl.reset.* Files
- Reset-Files bekommen archived:true + resetAt-Timestamp
- Neue UI-Sektion "Archivierte Versionen" (collapsible <details>)
  mit Export-Button, zeigt aufklappbar alle gesicherten alten Sessions
- Aktivieren ist fuer Archive deaktiviert (zerstoert aktive Session)
- Loeschen + Export stehen zur Verfuegung

tools/export-jsonl-to-md.js:
- Standalone Node-Script zum Konvertieren beliebiger .jsonl (auch reset-Files)
- Nutzbar via stdin, exakt gleiche Export-Logik wie Diagnostic
- Fuer Rettungsaktionen direkt auf der VM

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:24:54 +02:00
duffyduck 87deede078 fix: Session Msgs-Counter zaehlt echte Nachrichten, nicht alle Zeilen
Vorher: wc -l auf der .jsonl — zaehlt auch Tool-Calls, Run-Events,
Metadata-Eintraege mit. Diagnostic zeigte z.B. "10 Msgs" fuer eine
Session mit 6 echten User/Assistant-Nachrichten.

Jetzt: grep -cE '"role":"(user|assistant)"' — zaehlt nur echte
Konversations-Messages. Matcht wie der Export und die Chat-History
das interpretieren.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:09:01 +02:00
duffyduck 6fec8588c1 fix: Gespraechsmodus - strenger Speech-Gate + Crash-Prevention
Probleme:
- Hintergrundgeraeusche wurden als Sprache erkannt und an Whisper geschickt
- App stuerzte nach laengerem Zuhoeren ab (OOM / Cache-Ueberlauf)

Aenderungen:
- VAD_SPEECH_THRESHOLD_DB -35 -> -28 (filtert Raum-Ambient)
- VAD_SPEECH_MIN_MS 300 -> 500 (keine Huestler/Klopfer mehr)
- Max-Aufnahmedauer 30s (Notbremse gegen Runaway-Loops)
- _cleanupStaleCacheFiles(): alte aria_recording_/aria_tts_ Files (>30s)
  werden vor jeder neuen Aufnahme geloescht
- ChatScreen: capMessages() begrenzt Messages-Array auf 500 Eintraege
  (OOM-Schutz in langen Gespraechen)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:05:15 +02:00
duffyduck aafdbcd57a fix: Thinking indicator beim Seitenladen nicht mehr sichtbar
Zwei display:-Deklarationen im inline-style der Diagnostic-Chat-Leiste
haben sich gegenseitig ueberschrieben — 'display:flex' war die zweite
und hat 'display:none' aushebelt. Indicator war so beim Seitenaufbau
sichtbar bis JS ein idle-Event empfing.

- HTML: 'display:flex' aus inline-style entfernt
- JS: beim Anzeigen explizit display='flex' setzen (statt 'block')

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:58:39 +02:00
duffyduck 08da28f475 release: bump version to 0.0.3.9 2026-04-18 11:52:53 +02:00
duffyduck 8c1014d281 fix: Thinking indicator respringt nach chat:final durch trailing events
Nach chat:final kommen oft noch agent-Events rein (Core raeumt nach),
die den Thinking-Indicator wieder anspringen liessen.

- Diagnostic: 3s-Settled-Window nach chat:final, agent_activity-Broadcasts
  werden in dem Fenster unterdrueckt (idle kommt weiter durch).
- Bridge: Gleiches Fenster in _emit_activity() — App bekommt keine
  trailing thinking/tool-Events mehr nach dem finalen Antwort.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:51:22 +02:00
duffyduck 271fc4edf6 docs: cleanup.sh + README updates for latest features
- cleanup.sh: sicherer (default) + aggressiver (--full) Docker-Cleanup
  mit Speicher-Report vor/nach
- README: Phase-1-Liste, Diagnostic-Features und App-Features um die
  neuen Punkte ergaenzt (Speech Gate, Session-Persistenz, Session-Export,
  App Thinking-Indicator, Whisper-Modellauswahl, 16kHz-Aufnahme)
- README: Neuer Abschnitt "Docker-Cleanup" mit cleanup.sh Usage

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:46:12 +02:00
duffyduck cd390a4115 release: bump version to 0.0.3.8 2026-04-18 11:41:12 +02:00
duffyduck a65ed579d2 feat: Whisper model selector + 16kHz mono recording
- App: AudioSamplingRateAndroid 16000 + AudioChannelsAndroid 1
  → Whisper bekommt direkt sein Ziel-Format, kein Resample mehr
- Bridge: STTEngine.reload() laedt Modell zur Laufzeit neu
  (tiny/base/small/medium/large-v3)
- Bridge: Config-Message triggert Hot-Reload wenn whisperModel sich aendert
- Bridge: Default auf 'medium' (besser als 'small' bei aehnlicher Latenz)
- Diagnostic: Neue Sektion "Whisper (Spracherkennung)" mit Dropdown,
  auto-save bei Auswahl, beim Laden wird der gespeicherte Wert gesetzt
- Diagnostic/Server: send_voice_config merged whisperModel in voice_config.json
- aria.env.example: WHISPER_MODEL + WHISPER_LANGUAGE dokumentiert

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:37:27 +02:00
duffyduck 2ad1f57382 feat: Thinking indicator + cancel button in the app
- Bridge: _emit_activity() spiegelt OpenClaw agent events als agent_activity
  an RVS, dedupliziert State-Wechsel. chat:final/error senden idle.
- Bridge: Neuer cancel_request-Handler ruft Diagnostic /api/cancel per HTTP.
- Diagnostic: Neuer POST /api/cancel Endpoint (gleiche Logik wie WS-Cancel).
- RVS: agent_activity + cancel_request in ALLOWED_TYPES.
- App: Gelber Indicator ueber der Input-Bar mit Text je nach Activity,
  roter Abbrechen-Button. Cancel sendet cancel_request via RVS.
- issue.md: Erledigte Bugfixes + Features konsolidiert.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:22:02 +02:00
duffyduck 58e3cfd3e6 feat: Session export as markdown in Diagnostic
- ⬇ Button per Session-Zeile — exportiert auch inaktive Sessions
- Server parst JSONL, extrahiert User/Assistant-Nachrichten mit Timestamp
- Metadata-Prefix wird entfernt, Markdown mit # Session-Header generiert
- Browser-Download via Blob + download-Attribut

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:14:15 +02:00
duffyduck 7de4ee8f5b fix: Stuck "ARIA denkt..." indicator after pipeline ends
- pipelineEnd() now broadcasts agent_activity: idle unconditionally
- chat:error and chat:final paths broadcast idle outside of active pipeline
- Gateway close event ends active pipeline + broadcasts idle
- Prevents indicator from hanging after timeout/error/disconnect

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:11:12 +02:00
duffyduck 213edac3a7 fix: Session persistence - respect user choice across container restarts
- sessionFromFile flag prevents auto-pick after first start
- Atomic write (temp + rename) with loud error logging
- Auto-pick filters out aria-bridge/aria-diagnostic when user sessions exist
- handleSetActiveSession reports persistence failures to client

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:03:26 +02:00
duffyduck acc13aef6b fix: Speech gate - only send recording if actual speech detected
- VAD_SPEECH_THRESHOLD_DB = -35 (louder than silence threshold)
- Needs 300ms of speech before counting as real speech
- Recording discarded if only background noise detected
- Prevents sending garbage to Whisper in conversation mode

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 18:20:05 +02:00
duffyduck 4bbc6f7787 release: bump version to 0.0.3.7 2026-04-11 13:18:17 +02:00
duffyduck 20f2ea1829 fix: Conversation mode starts recording immediately when ear button tapped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 13:15:26 +02:00
duffyduck 2d23f0668b docs: update README with conversation mode, multi-attachments, markdown cleanup
- Conversation mode (ear button) documented in App Features
- Multiple attachments + paste support
- Markdown cleanup for TTS
- Auto-Update FileProvider + check button
- Roadmap: 22 items in Phase 1 completed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:43:09 +02:00
duffyduck d6030a06b7 docs: update issue.md - move completed items, clean up open list
28 items completed, 10 remaining open

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:23:04 +02:00
duffyduck 0df76e2af6 release: bump version to 0.0.3.6 2026-04-11 12:19:00 +02:00
duffyduck f80fe1df93 fix: Inverted FlatList - newest messages always visible at bottom
- No more scrollToEnd/scrollToIndex needed
- FlatList inverted=true with reversed data
- New messages appear at bottom automatically
- User scrolls up to see history (natural chat behavior)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:17:32 +02:00
duffyduck cff421bc53 release: bump version to 0.0.3.5 2026-04-11 12:13:41 +02:00
duffyduck bca925d385 fix: Use scrollToIndex with viewPosition:1 for reliable bottom scroll
- scrollToIndex targets last message at bottom of viewport
- onScrollToIndexFailed fallback to scrollToEnd
- More reliable than scrollToEnd with dynamic heights

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:12:24 +02:00
duffyduck 9abde89805 release: bump version to 0.0.3.4 2026-04-11 12:09:23 +02:00
duffyduck ea4f639fcb fix: Auto-scroll retry with multiple delays (100, 300, 600, 1000ms)
FlatList needs time to render - single setTimeout(150) was unreliable.
Now tries 4 times on initial load, 2 times for new messages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:07:54 +02:00
duffyduck 64cd5f7d52 release: bump version to 0.0.3.3 2026-04-11 12:04:37 +02:00
duffyduck 843ebe1d8f fix: Remove duplicate closure ending in ChatScreen (build error)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:03:20 +02:00
duffyduck 764619f076 fix: Comprehensive markdown/formatting cleanup for TTS (Piper + XTTS)
- Remove **bold**, *italic*, `code`, code blocks, links, headers, quotes, lists
- Replace newlines with natural pauses (period/comma)
- Remove quotation marks, empty brackets
- Fixes text being swallowed/garbled by TTS engines

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:47:04 +02:00
duffyduck e3a0cfb55a docs: mark conversation mode as done, keep Porcupine as Phase 2
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:42:33 +02:00
duffyduck 2929749314 feat: Conversation mode (ear button) - auto-record after ARIA speaks
- Ear button activates conversation mode (green dot)
- After TTS playback finishes → 800ms pause → auto-start recording
- VAD stops recording on silence → sends to ARIA → ARIA answers → TTS → loop
- Like a natural conversation / walkie-talkie mode
- Audio service fires onPlaybackFinished when queue empty

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:40:55 +02:00
duffyduck 51b9512f4e docs: mark scroll bugs as fixed in issue.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:37:53 +02:00
duffyduck ffcfa44eef fix: Auto-scroll to last message on app start + new messages
- useEffect on messages array instead of onContentSizeChange
- Instant jump (no animation) when loading history
- Animated scroll for single new messages
- Scroll pauses when user scrolls up, resumes at bottom

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:37:30 +02:00
duffyduck 6363da97b1 feat: Multiple attachments + paste support (App + Diagnostic)
App:
- Multiple pending attachments (horizontal scroll preview)
- Individual remove (X) or clear all
- Send button shows when any attachment pending
- All files sent before text message

Diagnostic:
- Clip icon for file selection (multiple)
- Paste images/files from clipboard (Ctrl+V)
- Pending preview with thumbnails
- Files sent via RVS before text message

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:34:33 +02:00
duffyduck 07ed2cdcf6 docs: mark attachment text feature as done in issue.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 10:06:13 +02:00
duffyduck 5ad68b7dfc feat: Attachments not sent immediately - add text/voice before sending
- File/photo selection stores as pending (not sent immediately)
- Preview bar shows pending attachment above input field
- User can add text message before sending (e.g. "Was siehst du?")
- Send button appears when attachment is pending (even without text)
- Placeholder changes to "Text zum Anhang (optional)..."
- X button to cancel pending attachment
- File + text sent together (file first, then chat message)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 10:05:50 +02:00
duffyduck 8a6ee018ea docs: mark text message bug as fixed in issue.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 09:59:48 +02:00
duffyduck b42590ff95 docs: mark auto-update bugs as fixed in issue.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 09:54:57 +02:00
duffyduck 056b579c47 release: bump version to 0.0.3.2 2026-04-11 09:53:54 +02:00
duffyduck 576e612cd0 fix: release.sh clears Metro + Gradle cache before build (version consistency)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 09:51:41 +02:00
duffyduck c2faa06a15 release: bump version to 0.0.3.1 2026-04-10 23:19:40 +02:00
duffyduck d3ed3556eb fix: Bridge chat handler was missing send_to_core (text messages ignored)
The chat handler checked sender but never forwarded the text to aria-core.
Only voice messages worked because they went through the audio→STT→send_to_core path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 23:13:29 +02:00
duffyduck d960d125c0 release: bump version to 0.0.3.0 2026-04-10 09:07:20 +02:00
duffyduck 89d5d7ec0a release: bump version to 0.0.2.9 2026-04-10 09:01:47 +02:00
duffyduck ea0c13936b fix: release.sh deletes old APKs on RVS before uploading new one
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 09:00:45 +02:00
duffyduck 773c976822 fix: Auto-update APK install via FileProvider + dynamic version
- Native ApkInstallerModule: FileProvider content:// URI for Android 7+
- REQUEST_INSTALL_PACKAGES permission in AndroidManifest
- file_paths.xml for FileProvider cache access
- APP_VERSION reads from package.json (not hardcoded)
- "Auf Updates pruefen" button in Settings
- Version display reads from package.json dynamically

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:59:52 +02:00
duffyduck cd05ed2379 docs: add auto-update FileProvider bug + update check button to issue.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:55:30 +02:00
duffyduck 054e4057d8 release: bump version to 0.0.2.8 2026-04-10 08:49:47 +02:00
duffyduck 3943e79bb1 docs: document .env.example with detailed comments, explain both tokens in README
- ARIA_AUTH_TOKEN: Gateway auth (who can talk to ARIA)
- RVS_TOKEN: Pairing token (same room in RVS relay)
- RVS_UPDATE_HOST: SSH target for auto-update APK copy
- All variables with German comments and examples

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:45:26 +02:00
duffyduck 87f4317c15 docs: add auto-update APK not reaching RVS bug to issue.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:39:59 +02:00
duffyduck 50aa793910 fix: Proxy SSH volume read-write (ARIA can manage keys without -F workaround)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:34:35 +02:00
duffyduck 5efc9865a8 docs: add 6 new bugs/features to issue.md
- Session persistence on container restart
- App: text/image/attachment messages not working (only voice)
- App: audio stops randomly
- App: auto-scroll to last message on start + new messages
- App: add text/voice to attachments
- Prioritized bugs section

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:31:49 +02:00
duffyduck 949c573c49 fix: XTTS chunk size 150 chars (faster render, preload overlaps playback)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:52:56 +02:00
duffyduck f7f450a09d fix: XTTS streaming mode - send each chunk immediately, comma between sentences
- Back to streaming: render chunk → send immediately → next chunk
- App plays with preloading queue (no waiting for all chunks)
- Comma instead of dot between sentences in chunk (no "Punkt" read aloud)
- Sentence-ending dots already removed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:48:50 +02:00
duffyduck 81f7c38383 fix: XTTS splits concatenated audio into ~8s parts (seamless with preload)
- All chunks rendered and PCM concatenated (consistent voice)
- Split into ~8 second WAV parts (not per-sentence)
- 8s is long enough for preload overlap, small enough for WebSocket
- Parts include part/totalParts metadata

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:41:14 +02:00
duffyduck 2c785cb37a feat: XTTS concatenates chunks into seamless WAV (no stuttering)
- All chunks rendered sequentially, PCM data concatenated
- Single WAV with proper header sent back (no queue needed in app)
- If total > 800KB, split into parts (WebSocket limit)
- Eliminates stuttering between sentences

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:40:16 +02:00
duffyduck 57e65b061c docs: update issue.md with XTTS streaming as next priority
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:38:21 +02:00
duffyduck aa54765b03 release: bump version to 0.0.2.7 2026-04-10 02:24:58 +02:00
duffyduck 8929bc99bb fix: XTTS groups sentences into ~250 char chunks for consistent voice quality
- 2-3 sentences per chunk (more context = stable voice/volume)
- Max 250 chars per chunk (keeps WebSocket packets manageable)
- Dots re-added between sentences within a chunk (natural pauses)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:23:29 +02:00
duffyduck 0428c06612 fix: Audio preloading to prevent stuttering, remove trailing dots for XTTS
- Preload next audio while current plays (eliminates gap between sentences)
- Remove trailing dots from sentences (XTTS reads them aloud)
- stopPlayback cleans up preloaded audio

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:21:19 +02:00
duffyduck a7eb3cf433 release: bump version to 0.0.2.6 2026-04-10 02:11:04 +02:00
duffyduck e4e0e793a8 fix: Audio queue for sequential TTS playback (no overlap/skip)
- Audio packets queued instead of stopping previous
- _playNext() plays sequentially, each sentence after the previous
- stopPlayback() clears queue
- Fixes overlapping/skipping with XTTS sentence-by-sentence rendering

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:09:35 +02:00
duffyduck b3d3b8b6bc fix: XTTS bridge splits text into sentences sequentially
- XTTS-Bridge does sentence splitting (not ARIA-Bridge)
- Sequential rendering: correct order guaranteed
- Each sentence sent as separate xtts_response
- Markdown removal before splitting
- App starts playback after first sentence (faster UX)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:03:29 +02:00
duffyduck 06bc456221 fix: XTTS splits long text into sentences before sending (WebSocket size limit)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 01:56:25 +02:00
duffyduck 3461f45207 docs: update README with XTTS v2 setup details, voice cloning guide
- Architecture diagram for XTTS flow (Gaming-PC ↔ RVS ↔ ARIA-VM)
- Port 8020 (not 8000), token must match, model caching
- Voice cloning step-by-step guide
- TTS engine switching (Piper/XTTS) with fallback
- Known limitation: RVS zombie connections

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 01:49:08 +02:00
duffyduck a17d4acc13 fix: XTTS bridge shares /voices volume with XTTS server
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 01:40:41 +02:00
duffyduck 62fd9193a1 fix: XTTS voice dropdown shows saved voice after page reload
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 01:34:00 +02:00
duffyduck 2329645df4 fix: XTTS voices list + upload use fresh RVS connection with response wait
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 01:24:55 +02:00
duffyduck 8a435ddf6c fix: voice upload uses send() via server, not client-side sendToRVS_raw
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 01:15:29 +02:00
duffyduck 25b754ba31 fix: voice upload Base64 conversion (chunked, no stack overflow)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 01:08:32 +02:00
duffyduck b734593bf2 fix: Bridge _send_to_rvs ping-check before send, force reconnect on zombie
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 00:37:22 +02:00
duffyduck 16847ce6f7 fix: TTS toggle global above engine selector, health check /docs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 00:27:55 +02:00
duffyduck 6300829317 fix: XTTS model cache volume path /app/xtts_models
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:44:29 +02:00
duffyduck a1e1ee31bd fix: XTTS bridge port 8020, longer startup wait
- XTTS API runs on port 8020 (not 8000)
- Bridge waits up to 5min for model download (30x10s)
- Health check uses / instead of /docs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:39:45 +02:00
duffyduck 7ed70b876d updated image public path 2026-04-07 23:06:26 +02:00
duffyduck 3ca85da906 release: bump version to 0.0.2.5 2026-04-05 20:12:56 +02:00
duffyduck d6a89168ef release: bump version to 0.0.2.4 2026-04-05 19:51:19 +02:00
duffyduck cb33a20694 docs: update README with XTTS, auto-update, watchdog, TTS settings
- Architecture: Added XTTS v2 (Gaming-PC) and auto-update flow
- Diagnostic: Thinking indicator, cancel button, TTS tab, voice cloning
- App: Play button, chat search, auto-update, voice speed settings
- RVS: Auto-update APK distribution over WebSocket
- Watchdog: 2min warning → 5min doctor --fix → 8min container restart
- Roadmap: Phase 1 fully completed, updated Phase 2+3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 19:46:16 +02:00
duffyduck a242693751 feat: XTTS v2 integration, auto-update system, TTS engine abstraction
- XTTS v2: Docker setup for Gaming-PC (GPU), bridge via RVS relay
- XTTS: Voice cloning UI in Diagnostic (multi-file upload)
- XTTS: Engine selectable (Piper local vs XTTS remote) with fallback
- Auto-Update: RVS serves APK over WebSocket (no HTTP needed)
- Auto-Update: App checks version on start, prompts install
- Auto-Update: release.sh copies APK to RVS via scp
- Bridge: TTS engine abstraction (piper/xtts), config persistent
- Bridge: xtts_response handler, tts_request on-demand
- Diagnostic: TTS engine dropdown, XTTS voice panel, voice cloning
- App: Play button on ARIA messages, chat search, update service
- Wake word: Disabled LiveAudioStream (crash fix), Phase 1 placeholder
- Watchdog: Container restart after 8min stuck
- Chat backup: on-the-fly to /shared/config/chat_backup.jsonl

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 19:42:10 +02:00
duffyduck 81ca3cc7a7 Ohr-Button Absturz gefixt (LiveAudioStream entfernt, Phase 1 , Play-Button in ARIA-Nachrichten fuer Sprachwiedergabe
- [x] Chat-Suche in der App (Lupe in Statusleiste)
- [x] Watchdog mit Container-Restart (2min Warnung → 5min doctor --fix → 8min Restart),Abbrechen-Button im Diagnostic Chat
- [x] Nachrichten Backup on-the-fly (/shared/config/chat_backup.jsonl)
- [x] Grosse Nachrichten satzweise aufteilen fuer TTS
- [x] RVS Nachrichten vom Smartphone gehen durch
2026-04-01 23:45:25 +02:00
duffyduck 1a32098c9e release: bump version to 0.0.2.3 2026-04-01 23:45:15 +02:00
duffyduck fa4c32270b sst immer 2026-03-29 19:18:41 +02:00
duffyduck 9c43b875f4 release: bump version to 0.0.2.2 2026-03-29 19:04:31 +02:00
duffyduck 63560e290b two speed 2026-03-29 19:03:40 +02:00
duffyduck 1ab8a6a2fe addes speed config for voice 2026-03-29 18:50:09 +02:00
33 changed files with 2612 additions and 396 deletions
+37 -7
View File
@@ -1,20 +1,50 @@
# ARIA Environment Configuration
# Copy to .env and fill in values
# ════════════════════════════════════════════════
# ARIA — Umgebungsvariablen
# Kopieren nach .env und Werte eintragen
# ════════════════════════════════════════════════
# Auth token for ARIA Core (generate a long random string)
# openssl rand -hex 32
# ── ARIA Auth Token ──────────────────────────────
# Authentifizierung fuer den OpenClaw Gateway (aria-core).
# Wird von Diagnostic, Bridge und App genutzt um sich am Gateway anzumelden.
# Alle Services die mit aria-core kommunizieren brauchen diesen Token.
# Generieren: openssl rand -hex 32
ARIA_AUTH_TOKEN=change-me-to-a-long-random-string
# RVS — Rendezvous-Server (Bridge + App verbinden sich hierüber)
# ── RVS — Rendezvous-Server ─────────────────────
# Der RVS ist ein WebSocket-Relay im Rechenzentrum.
# App, Bridge, Diagnostic und XTTS-Bridge verbinden sich hierueber.
# Alle muessen den gleichen Host, Port und Token nutzen.
# Hostname des RVS-Servers (z.B. rvs.example.de oder mobil.hacker-net.de)
RVS_HOST=rvs.example.de
# Port auf dem der RVS laeuft (muss mit rvs/docker-compose.yml uebereinstimmen)
RVS_PORT=443
# TLS (wss://) verwenden? true = verschluesselt, false = unverschluesselt (ws://)
RVS_TLS=true
# Bei TLS-Fehler automatisch auf ws:// (ohne TLS) fallback?
# true = Fallback erlaubt, false = nur mit TLS verbinden
# Nuetzlich wenn kein TLS-Zertifikat vorhanden (z.B. Entwicklung)
RVS_TLS_FALLBACK=true
# Pairing-Token: Wer den gleichen Token hat, landet im gleichen RVS-Room.
# Wird von generate-token.sh automatisch generiert und hier eingetragen.
# Die Android App bekommt den Token per QR-Code beim Pairing.
# WICHTIG: Muss auf ARIA-VM, Gaming-PC (xtts/.env) und App identisch sein!
# Generieren: ./generate-token.sh (traegt den Token automatisch ein)
RVS_TOKEN=
# Gitea (for release.sh — Kennwort wird interaktiv abgefragt)
# ── Gitea — Release-Verwaltung ───────────────────
# Wird von release.sh genutzt um APKs auf Gitea zu veroeffentlichen.
# Kennwort wird beim Release interaktiv abgefragt (nicht in .env!).
GITEA_URL=https://git.hacker-net.de
GITEA_REPO=Hacker-Software/ARIA-AGENT
GITEA_USER=duffyduck
# ── Auto-Update — APK auf RVS-Server kopieren ───
# SSH-Ziel fuer scp: release.sh kopiert die APK dorthin.
# Der RVS-Server stellt sie dann per WebSocket an die App bereit.
# Format: user@host (z.B. root@aria-rvs oder root@rvs.example.de)
# Leer lassen = Auto-Update ueberspringen, APK manuell auf RVS kopieren.
RVS_UPDATE_HOST=
+1
View File
@@ -36,6 +36,7 @@ android/local.properties
android/package-lock.json
*.apk
*.aab
rvs/updates/*.apk
# ── Tauri / Desktop Build ───────────────────────
desktop/src-tauri/target/
+193 -26
View File
@@ -29,11 +29,18 @@ ARIA hat zwei Rollen:
┌─────────────────────────────────────────────────────────┐
│ RVS — Rendezvous-Server │
│ Node.js WebSocket Relay (Docker, Rechenzentrum) │
│ Reiner Relay — kennt keine Tokens, leitet durch
│ Relay + Auto-Update (APK-Verteilung)
│ rvs/docker-compose.yml │
└───────────────────────┬─────────────────────────────────┘
│ WebSocket Tunnel
└───────────┬───────────────────────────┬─────────────────┘
│ WebSocket Tunnel │ WebSocket Tunnel
┌───────────────────────────┐
│ Gaming-PC (optional) │
│ RTX 3060, Docker+WSL2 │
│ XTTS v2 (natuerliche │
│ Stimmen, Voice Cloning) │
│ xtts/docker-compose.yml │
└───────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ ARIA-VM (Proxmox, Debian 13) — ARIAs Wohnung │
│ Basissystem + Docker. Rest richtet ARIA selbst ein. │
@@ -66,13 +73,14 @@ ARIA hat zwei Rollen:
└─────────────────────────────────────────────────────────┘
```
**Drei separate Deployments:**
**Vier separate Deployments:**
| Was | Wo | Wie |
|-----|----|-----|
| RVS | Rechenzentrum | `cd rvs && docker compose up -d` |
| ARIA Core | Debian 13 VM | `docker compose up -d && ./aria-setup.sh` |
| Android App | Stefans Handy | APK installieren, QR-Code scannen |
| XTTS v2 (optional) | Gaming-PC (GPU) | `cd xtts && docker compose up -d` |
| Android App | Stefans Handy | APK installieren (Auto-Update via RVS) |
---
@@ -95,16 +103,31 @@ cd ~/ARIA-AGENT
cp .env.example .env
```
`.env` Datei editieren:
`.env` Datei editieren (Details siehe `.env.example`):
```bash
# Gateway-Auth: Alle Services die mit aria-core reden brauchen diesen Token
# Diagnostic, Bridge, App nutzen ihn fuer den WebSocket-Handshake
ARIA_AUTH_TOKEN= # openssl rand -hex 32
# RVS-Verbindung: Hostname + Port deines Rendezvous-Servers
RVS_HOST= # z.B. rvs.hackersoft.de
RVS_PORT=443
RVS_TLS=true
RVS_TLS_FALLBACK=true
RVS_TOKEN= # wird von generate-token.sh automatisch gesetzt
# Pairing-Token: Verbindet App, Bridge, Diagnostic und XTTS im gleichen RVS-Room
# MUSS auf allen Geraeten identisch sein (ARIA-VM, Gaming-PC, App)
# Wird von generate-token.sh automatisch generiert und eingetragen
RVS_TOKEN= # ./generate-token.sh
# Optional: SSH-Host des RVS-Servers fuer Auto-Update (z.B. root@aria-rvs)
RVS_UPDATE_HOST=
```
**Zwei Tokens, zwei Zwecke:**
- **ARIA_AUTH_TOKEN**: Authentifizierung am OpenClaw Gateway (aria-core). Wer diesen Token hat, kann ARIA Befehle geben.
- **RVS_TOKEN**: Pairing-Token fuer den Rendezvous-Server. Alle Geraete mit dem gleichen Token landen im gleichen "Room" und koennen kommunizieren. Die App bekommt diesen Token per QR-Code.
### 2. Claude CLI einloggen (Proxy-Auth)
Der Proxy-Container nutzt deine Claude Max Subscription. Die Credentials muessen
@@ -283,7 +306,8 @@ aria-core → Antwort → Gateway → Diagnostic → RVS → App
### Features
- **STT**: faster-whisper (lokal, offline, 16kHz mono)
- **TTS**: Piper (Ramona + Thorsten, offline)
- **TTS**: Piper (Ramona + Thorsten, offline) oder XTTS v2 (remote, GPU, Voice Cloning)
- **Markdown-Bereinigung**: Entfernt **fett**, *kursiv*, `code`, Links, Listen etc. vor TTS (natuerliche Sprache)
- **Wake-Word**: openwakeword (lokales Mikrofon auf der VM)
- **App-Audio**: Base64 Audio von App → FFmpeg → Whisper STT → Text an aria-core
- **Modi**: Normal, Nicht stoeren, Fluestern, Hangar, Gaming
@@ -314,13 +338,19 @@ Erreichbar unter `http://<VM-IP>:3001`. Teilt das Netzwerk mit aria-core.
### Features
- **Status-Karten**: Gateway (Handshake), RVS (TLS-Fallback), Proxy (Auth)
- **Chat-Test**: Nachrichten direkt an ARIA senden (Gateway oder via RVS)
- **Session-Verwaltung**: Sessions auflisten, wechseln, erstellen, loeschen
- **Chat-Test**: Nachrichten direkt an ARIA senden (Gateway oder via RVS), Vollbild-Modus
- **"ARIA denkt..." Indikator**: Zeigt live was ARIA gerade tut (Denken, Tool, Schreiben)
- **Abbrechen-Button**: Stoppt laufende Anfragen + doctor --fix
- **Session-Verwaltung**: Sessions auflisten, wechseln, erstellen, loeschen, als Markdown exportieren (⬇ Button)
- **Chat-History**: Wird beim Laden und Session-Wechsel angezeigt (read-only aus JSONL)
- **TTS-Diagnose Tab**: Stimmen testen, Status pruefen, Fehler anzeigen
- **Einstellungen**: TTS-Engine (Piper/XTTS), Stimmen, Speed, Highlight-Trigger, Betriebsmodi, Whisper-Modell (tiny…large-v3, Hot-Reload)
- **XTTS Voice Cloning**: Audio-Samples hochladen, eigene Stimme erstellen
- **Claude Login**: Browser-Terminal zum Einloggen in den Proxy
- **Core Terminal**: Shell in aria-core (openclaw CLI)
- **Container-Logs**: Echtzeit-Logs aller Container (gefiltert nach Tab)
- **Container-Logs**: Echtzeit-Logs aller Container (gefiltert nach Tab + Pipeline)
- **SSH Terminal**: Direkter SSH-Zugang zu aria-wohnung
- **Watchdog**: Erkennt stuck Runs (2min Warnung → 5min doctor --fix → 8min Container-Restart)
### Session-Verwaltung
@@ -338,12 +368,19 @@ API-Endpoint fuer andere Services: `GET http://localhost:3001/api/session`
- Text-Chat mit ARIA
- **Sprachaufnahme**: Push-to-Talk (halten) oder Tap-to-Talk (tippen, Auto-Stop bei Stille)
- **Gespraechsmodus** (Ohr-Button): Nach jeder ARIA-Antwort startet automatisch die Aufnahme — wie ein natuerliches Gespraech hin und her, ohne Buttons druecken
- **VAD (Voice Activity Detection)**: Erkennt 1.8s Stille und stoppt automatisch
- **STT (Speech-to-Text)**: Audio wird in der Bridge per Whisper transkribiert, transkribierter Text erscheint im Chat
- **Wake Word**: Toggle-Button (Ohr-Symbol) aktiviert kontinuierliches Mikrofon-Monitoring
- **TTS-Wiedergabe**: ARIA antwortet per Lautsprecher (Ramona/Thorsten)
- **Datei- und Bild-Upload**: Bilder inline im Chat, Dateien mit Icon + Name + Groesse
- **Anhaenge**: Bridge speichert Dateien in Shared Volume (`/shared/uploads/`), ARIA kann darauf zugreifen
- **Speech Gate**: Aufnahme wird verworfen wenn keine Sprache erkannt (kein Rauschen an Whisper)
- **STT (Speech-to-Text)**: Audio wird als 16kHz mono aufgenommen und in der Bridge per Whisper transkribiert, transkribierter Text erscheint im Chat
- **"ARIA denkt..." Indicator**: Zeigt live den Status vom Core (Denken, Tool, Schreiben) + Abbrechen-Button
- **TTS-Wiedergabe**: ARIA antwortet per Lautsprecher (Piper oder XTTS v2), Audio-Queue mit Preloading
- **Play-Button**: Jede ARIA-Nachricht kann nochmal vorgelesen werden
- **Chat-Suche**: Lupe in der Statusleiste filtert Nachrichten live
- **Mehrere Anhaenge**: Bilder + Dateien sammeln, Text hinzufuegen, dann zusammen senden
- **Paste-Support**: Bilder aus Zwischenablage einfuegen (Diagnostic)
- **Anhaenge**: Bridge speichert in Shared Volume, ARIA kann darauf zugreifen, Re-Download ueber RVS
- **Einstellungen**: TTS Engine, Stimmen, Speed pro Stimme, Speicherort, Auto-Download, GPS
- **Auto-Update**: Prueft beim Start + per Button auf neue Version, Download + Installation ueber RVS (FileProvider)
- GPS-Position (optional)
- QR-Code Scanner fuer Token-Pairing
@@ -374,19 +411,42 @@ cd android
```
Das Script macht alles in einem Schritt:
1. Fragt Gitea-Kennwort ab (wird nirgends gespeichert)
2. Baut die Release-APK
3. Erstellt Git Tag + pusht
4. Erstellt Gitea Release
5. Laedt APK als Asset hoch
1. Setzt Versionsnummern (package.json, build.gradle, SettingsScreen)
2. Fragt Gitea-Kennwort ab (wird nirgends gespeichert)
3. Baut die Release-APK
4. Git Commit + Tag + Push
5. Erstellt Gitea Release + laedt APK hoch
6. Kopiert APK auf RVS-Server (Auto-Update, optional)
Voraussetzung in `.env`:
```bash
GITEA_URL=https://gitea.hackersoft.de
GITEA_REPO=stefan/aria-agent
GITEA_USER=stefan
RVS_UPDATE_HOST=root@aria-rvs # Optional: fuer Auto-Update
```
### Docker-Cleanup
Das Bridge-Image zieht grosse ML-Deps (faster-whisper, ctranslate2, onnxruntime,
openwakeword, piper-tts) — bei jedem Rebuild waechst der Docker-Build-Cache. Wenn
die VM voll laeuft:
```bash
./cleanup.sh # sicher: Build-Cache + ungenutzte Images
./cleanup.sh --full # aggressiv: zusaetzlich ungenutzte Volumes (mit Rueckfrage)
```
### Auto-Update
Die App prueft beim Start ob eine neuere Version auf dem RVS liegt.
Der Update-Flow:
1. `./release.sh 0.0.3.0` → APK wird auf RVS kopiert (via scp)
2. Alternativ: `git pull` auf dem RVS-Server → APK in `rvs/updates/`
3. App sendet `update_check` mit aktueller Version
4. RVS vergleicht → sendet `update_available`
5. App zeigt Dialog → Download ueber WebSocket → Installation
### Audio-Pipeline (Spracheingabe)
```
@@ -454,6 +514,11 @@ aria-data/
│ ├── aria.env ← Voice Bridge Config
│ └── diag-state/ ← Diagnostic persistenter State
│ (im Shared Volume /shared/config/):
│ ├── voice_config.json ← TTS-Einstellungen (Stimme, Speed, Engine)
│ ├── highlight_triggers.json ← Highlight-Trigger Woerter
│ └── chat_backup.jsonl ← Nachrichten-Backup (on-the-fly)
└── ssh/ ← SSH Keys fuer VM-Zugriff
├── id_ed25519 ← Private Key (generiert von aria-setup.sh)
├── id_ed25519.pub ← Public Key (muss in VM authorized_keys!)
@@ -469,7 +534,7 @@ tar -czf aria-backup-$(date +%Y%m%d).tar.gz aria-data/
## RVS — Rendezvous-Server
Laeuft im Rechenzentrum. Reiner Relay — kennt keine Tokens, speichert nichts.
Laeuft im Rechenzentrum. WebSocket Relay + Auto-Update Server.
Wer sich mit dem gleichen Token verbindet, landet im gleichen Room.
```bash
@@ -477,10 +542,90 @@ cd rvs
docker compose up -d
```
**Features:**
- WebSocket Relay (alle Message-Types: chat, audio, file, config, xtts, update, etc.)
- Auto-Update: APK-Verteilung an Apps ueber WebSocket
- Heartbeat + tote Verbindungen aufraeumen
**Auto-Update APK bereitstellen:**
```bash
# APK in updates/ legen (manuell oder via release.sh)
cp ARIA-v0.0.3.0.apk ~/ARIA-AGENT/rvs/updates/
# RVS erkennt die Version aus dem Dateinamen
```
**Multi-Instanz:** Mehrere ARIA-VMs koennen denselben RVS nutzen — jede mit eigenem Token.
---
## XTTS v2 — GPU TTS Server (optional)
Laeuft auf einem separaten Rechner mit NVIDIA GPU (z.B. Gaming-PC mit RTX 3060).
Verbindet sich ueber RVS mit der ARIA-Infrastruktur — kein VPN noetig, funktioniert
ueber verschiedene Netze hinweg.
### Architektur
```
Gaming-PC (Windows, RTX 3060, Docker Desktop + WSL2)
├── aria-xtts XTTS v2 GPU Server (Port 8020 intern)
└── aria-xtts-bridge RVS-Relay (empfaengt Requests, sendet Audio)
└── Beide teilen ./voices/ Volume fuer Voice Cloning
↕ RVS (Rechenzentrum, WebSocket Relay)
ARIA-VM
└── aria-bridge: tts_engine="xtts" → xtts_request via RVS → wartet auf xtts_response
```
### Voraussetzungen
- Docker Desktop mit WSL2 (Windows) oder Docker mit NVIDIA Runtime (Linux)
- NVIDIA Container Toolkit
- GPU mit mindestens 4GB VRAM (6GB+ empfohlen)
- **Gleicher RVS_TOKEN wie auf der ARIA-VM!**
### Setup
```bash
cd xtts
cp .env.example .env
# .env mit RVS-Verbindungsdaten fuellen (gleicher Token wie ARIA-VM!)
docker compose up -d
# Erster Start laedt ~2GB Model herunter (danach gecacht)
```
**Wichtig:** Der XTTS-Server laeuft intern auf Port **8020** (nicht 8000).
Das Model wird im Volume `xtts-models` gecacht und muss nur einmal geladen werden.
### Features
- **Natuerliche Stimmen**: Deutlich bessere Qualitaet als Piper
- **Voice Cloning**: Eigene Stimme mit 6-10s Audio-Sample (~2s Latenz auf RTX 3060)
- **16 Sprachen**: Deutsch, Englisch, Franzoesisch, etc.
- **Fallback**: Wenn XTTS nicht erreichbar, nutzt die Bridge automatisch Piper
### TTS-Engine umschalten
In der Diagnostic unter Einstellungen → Sprachausgabe:
- **TTS aktiv**: Global An/Aus
- **TTS Engine**: Piper (lokal, CPU, schnell) oder XTTS v2 (remote, GPU, natuerlich)
- **Piper**: Standard-Stimme, Highlight-Stimme, Speed pro Stimme
- **XTTS**: Stimmen-Auswahl, Voice Cloning
### Stimme klonen
1. TTS Engine auf "XTTS v2" stellen
2. "Stimme klonen" → Audio-Dateien hochladen (WAV/MP3, 1-10 Dateien, min. 6-10s gesamt)
3. Name vergeben → "Stimme erstellen"
4. "Laden" klicken → neue Stimme in der Auswahl
5. Stimme auswaehlen → Config wird automatisch gespeichert
> **Tipp:** Fuer beste Ergebnisse: saubere Aufnahme, eine Stimme, kein Hintergrund,
> 10-30 Sekunden Gesamtlaenge. Mehrere kurze Dateien werden zusammengefuegt.
---
## Docker Volumes
| Volume | Pfad im Container | Zweck |
@@ -491,7 +636,7 @@ docker compose up -d
| `./aria-data/ssh` (bind) | `/root/.ssh`, `/home/node/.ssh` | SSH Keys |
| `./aria-data/brain` (bind) | `/home/node/.openclaw/workspace/memory` | Gedaechtnis |
| `./aria-data/skills` (bind) | `/home/node/.openclaw/workspace/skills` | Skills |
| `aria-shared` | `/shared` (Core + Bridge) | Datei-Austausch (Uploads von App) |
| `aria-shared` | `/shared` (Core + Bridge + Proxy + Diag) | Datei-Austausch, Config, Uploads |
| `./aria-data/config/diag-state` (bind) | `/data` (Diagnostic) | Persistenter State (aktive Session) |
---
@@ -549,6 +694,8 @@ docker exec aria-core ssh aria-wohnung hostname
- **Wake Word nur auf VM**: Die Bridge hoert auf "ARIA" ueber das lokale Mikrofon der VM.
In der App gibt es Energy-basierte Erkennung (Phase 1). On-device "ARIA"-Keyword (Porcupine) ist Phase 2.
- **Audio-Format**: App nimmt AAC/MP4 auf, Bridge konvertiert via FFmpeg zu 16kHz PCM.
- **RVS Zombie-Connections**: WebSocket-Verbindungen sterben gelegentlich ohne Fehlermeldung.
Bridge hat Ping-Check (5s), Diagnostic nutzt frische Verbindungen pro Request.
- **Bildanalyse eingeschraenkt**: Bilder werden in `/shared/uploads/` gespeichert. ARIA kann
sie per Bash/Read-Tool oeffnen, aber Claude Vision (direkte Bildanalyse) ist ueber den
Proxy-Pfad (`claude --print`) noch nicht moeglich. ARIA sieht den Dateipfad, nicht das Bild.
@@ -569,8 +716,26 @@ docker exec aria-core ssh aria-wohnung hostname
- [x] Android App (Chat + Sprache + Uploads)
- [x] Tool-Permissions (alle Tools freigeschaltet)
- [x] SSH-Zugriff auf VM (aria-wohnung)
- [x] Diagnostic Web-UI
- [x] Diagnostic Web-UI + Einstellungen
- [x] Session-Verwaltung + Chat-History
- [x] Stimmen-Einstellungen (Ramona/Thorsten, Speed, Highlight-Trigger)
- [x] TTS satzweise fuer lange Texte
- [x] Datei-/Bild-Upload mit Shared Volume
- [x] Watchdog (stuck Run Erkennung + Auto-Fix + Container-Restart)
- [x] Auto-Update System (APK via RVS)
- [x] Chat-Suche, Play-Button, Abbrechen-Button
- [x] XTTS v2 Integration (GPU, Voice Cloning, remote ueber RVS)
- [x] Gespraechsmodus (Ohr-Button, automatische Aufnahme nach ARIA-Antwort)
- [x] Mehrere Anhaenge + Text vor dem Senden + Paste-Support
- [x] Markdown-Bereinigung fuer TTS
- [x] Auto-Update mit FileProvider + Update-Check Button
- [x] Inverted FlatList (zuverlaessiges Scroll-to-Bottom)
- [x] Speech Gate (VAD verwirft Aufnahme ohne erkannte Sprache)
- [x] Session-Persistenz ueber Container-Restarts (sessionFromFile + atomic write)
- [x] Session-Export als Markdown-Datei (Download-Button pro Session)
- [x] "ARIA denkt..."-Indicator + Abbrechen-Button in App (via Bridge → RVS)
- [x] Whisper-Modell waehlbar in Diagnostic (tiny…large-v3, Hot-Reload)
- [x] App-Aufnahme explizit 16kHz mono (optimal fuer Whisper, kein Resample)
### Phase 2 — ARIA wird produktiv
@@ -578,7 +743,8 @@ docker exec aria-core ssh aria-wohnung hostname
- [ ] Gitea-Integration
- [ ] VM einrichten (Desktop, Browser, Tools)
- [ ] Heartbeat (periodische Selbst-Checks)
- [ ] Lokales LLM als Wächter (Triage vor Claude-Call)
- [ ] Lokales LLM als Waechter (Triage vor Claude-Call)
- [ ] Auto-Compacting / Memory-Verwaltung
### Phase 3 — Erweiterungen
@@ -586,3 +752,4 @@ docker exec aria-core ssh aria-wohnung hostname
- [ ] Desktop Client (Tauri)
- [ ] bKVM Remote IT-Support
- [ ] Porcupine Wake Word (on-device "ARIA" in der App)
- [ ] Claude Vision direkt (Bildanalyse ohne Dateipfad-Umweg)
+2 -2
View File
@@ -79,8 +79,8 @@ android {
applicationId "com.ariacockpit"
minSdkVersion rootProject.ext.minSdkVersion
targetSdkVersion rootProject.ext.targetSdkVersion
versionCode 201
versionName "0.0.2.1"
versionCode 400
versionName "0.0.4.0"
// Fallback fuer Libraries mit Product Flavors
missingDimensionStrategy 'react-native-camera', 'general'
}
@@ -3,6 +3,7 @@
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.CAMERA" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.REQUEST_INSTALL_PACKAGES" />
<application
android:name=".MainApplication"
@@ -24,5 +25,15 @@
<category android:name="android.intent.category.LAUNCHER" />
</intent-filter>
</activity>
<provider
android:name="androidx.core.content.FileProvider"
android:authorities="${applicationId}.fileprovider"
android:exported="false"
android:grantUriPermissions="true">
<meta-data
android:name="android.support.FILE_PROVIDER_PATHS"
android:resource="@xml/file_paths" />
</provider>
</application>
</manifest>
@@ -0,0 +1,44 @@
package com.ariacockpit
import android.content.Intent
import android.net.Uri
import android.os.Build
import androidx.core.content.FileProvider
import com.facebook.react.bridge.ReactApplicationContext
import com.facebook.react.bridge.ReactContextBaseJavaModule
import com.facebook.react.bridge.ReactMethod
import com.facebook.react.bridge.Promise
import java.io.File
class ApkInstallerModule(reactContext: ReactApplicationContext) : ReactContextBaseJavaModule(reactContext) {
override fun getName() = "ApkInstaller"
@ReactMethod
fun install(filePath: String, promise: Promise) {
try {
val file = File(filePath)
if (!file.exists()) {
promise.reject("FILE_NOT_FOUND", "APK nicht gefunden: $filePath")
return
}
val context = reactApplicationContext
val uri: Uri = if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.N) {
FileProvider.getUriForFile(context, "${context.packageName}.fileprovider", file)
} else {
Uri.fromFile(file)
}
val intent = Intent(Intent.ACTION_VIEW).apply {
setDataAndType(uri, "application/vnd.android.package-archive")
addFlags(Intent.FLAG_ACTIVITY_NEW_TASK)
addFlags(Intent.FLAG_GRANT_READ_URI_PERMISSION)
}
context.startActivity(intent)
promise.resolve(true)
} catch (e: Exception) {
promise.reject("INSTALL_ERROR", e.message, e)
}
}
}
@@ -0,0 +1,16 @@
package com.ariacockpit
import com.facebook.react.ReactPackage
import com.facebook.react.bridge.NativeModule
import com.facebook.react.bridge.ReactApplicationContext
import com.facebook.react.uimanager.ViewManager
class ApkInstallerPackage : ReactPackage {
override fun createNativeModules(reactContext: ReactApplicationContext): List<NativeModule> {
return listOf(ApkInstallerModule(reactContext))
}
override fun createViewManagers(reactContext: ReactApplicationContext): List<ViewManager<*, *>> {
return emptyList()
}
}
@@ -18,8 +18,7 @@ class MainApplication : Application(), ReactApplication {
object : DefaultReactNativeHost(this) {
override fun getPackages(): List<ReactPackage> =
PackageList(this).packages.apply {
// Packages that cannot be autolinked yet can be added manually here, for example:
// add(MyReactNativePackage())
add(ApkInstallerPackage())
}
override fun getJSMainModuleName(): String = "index"
@@ -0,0 +1,4 @@
<?xml version="1.0" encoding="utf-8"?>
<paths>
<cache-path name="cache" path="." />
</paths>
+2 -3
View File
@@ -1,6 +1,6 @@
{
"name": "aria-cockpit",
"version": "0.0.2.1",
"version": "0.0.4.0",
"private": true,
"scripts": {
"android": "react-native run-android",
@@ -24,8 +24,7 @@
"react-native-camera-kit": "^13.0.0",
"@react-native-async-storage/async-storage": "^1.21.0",
"react-native-fs": "^2.20.0",
"react-native-audio-recorder-player": "^3.6.7",
"react-native-live-audio-stream": "^1.1.1"
"react-native-audio-recorder-player": "^3.6.7"
},
"devDependencies": {
"typescript": "^5.3.3",
+300 -100
View File
@@ -5,7 +5,7 @@
* Datei- und Kamera-Upload.
*/
import React, { useState, useEffect, useRef, useCallback } from 'react';
import React, { useState, useEffect, useRef, useCallback, useMemo } from 'react';
import {
View,
Text,
@@ -16,6 +16,7 @@ import {
Platform,
StyleSheet,
Image,
ScrollView,
Modal,
} from 'react-native';
import AsyncStorage from '@react-native-async-storage/async-storage';
@@ -23,6 +24,7 @@ import RNFS from 'react-native-fs';
import rvs, { RVSMessage, ConnectionState } from '../services/rvs';
import audioService from '../services/audio';
import wakeWordService from '../services/wakeword';
import updateService from '../services/updater';
import VoiceButton from '../components/VoiceButton';
import FileUpload, { FileData } from '../components/FileUpload';
import CameraUpload, { PhotoData } from '../components/CameraUpload';
@@ -52,6 +54,12 @@ interface ChatMessage {
const CHAT_STORAGE_KEY = 'aria_chat_messages';
const MAX_STORED_MESSAGES = 500;
const MAX_MEMORY_MESSAGES = 500;
// Hilfe: Messages-Array auf Max kappen (aelteste raus) — verhindert OOM
// im Gespraechsmodus bei sehr vielen Nachrichten.
const capMessages = (msgs: ChatMessage[]): ChatMessage[] =>
msgs.length > MAX_MEMORY_MESSAGES ? msgs.slice(-MAX_MEMORY_MESSAGES) : msgs;
const DEFAULT_ATTACHMENT_DIR = `${RNFS.DocumentDirectoryPath}/chat_attachments`;
const STORAGE_PATH_KEY = 'aria_attachment_storage_path';
@@ -91,6 +99,10 @@ const ChatScreen: React.FC = () => {
const [gpsEnabled, setGpsEnabled] = useState(false);
const [wakeWordActive, setWakeWordActive] = useState(false);
const [fullscreenImage, setFullscreenImage] = useState<string | null>(null);
const [searchQuery, setSearchQuery] = useState('');
const [searchVisible, setSearchVisible] = useState(false);
const [pendingAttachments, setPendingAttachments] = useState<{file: any, isPhoto: boolean}[]>([]);
const [agentActivity, setAgentActivity] = useState<{activity: string, tool: string}>({activity: 'idle', tool: ''});
const flatListRef = useRef<FlatList>(null);
const messageIdCounter = useRef(0);
@@ -212,12 +224,12 @@ const ChatScreen: React.FC = () => {
if (sender === 'diagnostic') {
const diagText = (message.payload.text as string) || '';
if (diagText) {
setMessages(prev => [...prev, {
setMessages(prev => capMessages([...prev, {
id: nextId(),
sender: 'user',
text: diagText,
timestamp: message.timestamp,
}]);
}]));
}
return;
}
@@ -237,7 +249,7 @@ const ChatScreen: React.FC = () => {
timestamp: ts,
attachments: message.payload.attachments as Attachment[] | undefined,
};
return [...prev, ariaMsg];
return capMessages([...prev, ariaMsg]);
});
}
@@ -245,6 +257,13 @@ const ChatScreen: React.FC = () => {
if (message.type === 'audio' && message.payload.base64) {
audioService.playAudio(message.payload.base64 as string);
}
// Thinking-Indicator Status von der Bridge
if (message.type === 'agent_activity') {
const activity = (message.payload.activity as string) || 'idle';
const tool = (message.payload.tool as string) || '';
setAgentActivity({ activity, tool });
}
});
const unsubState = rvs.onStateChange((state) => {
@@ -260,12 +279,30 @@ const ChatScreen: React.FC = () => {
};
}, []);
// Wake Word: "ARIA" Erkennung → Auto-Aufnahme starten
// Auto-Update: Bei App-Start pruefen
useEffect(() => {
const unsubUpdate = updateService.onUpdateAvailable((info) => {
updateService.promptUpdate(info);
});
// Nach 5s pruefen (RVS muss erst verbunden sein)
const timer = setTimeout(() => updateService.checkForUpdate(), 5000);
return () => { unsubUpdate(); clearTimeout(timer); };
}, []);
// Gespraechsmodus: Nach TTS-Wiedergabe automatisch Aufnahme starten
useEffect(() => {
const unsubPlayback = audioService.onPlaybackFinished(() => {
if (wakeWordService.isActive()) {
wakeWordService.resume();
}
});
return () => unsubPlayback();
}, []);
// Wake Word / Gespraechsmodus: Auto-Aufnahme starten
useEffect(() => {
const unsubWake = wakeWordService.onWakeWord(async () => {
console.log('[Chat] Wake Word erkannt — starte Auto-Aufnahme');
// TTS stoppen damit ARIA sich nicht selbst hoert
audioService.stopPlayback();
console.log('[Chat] Gespraechsmodus — starte Auto-Aufnahme');
// Aufnahme mit Auto-Stop (VAD) starten
const started = await audioService.startRecording(true);
if (!started) {
@@ -287,7 +324,7 @@ const ChatScreen: React.FC = () => {
timestamp: Date.now(),
attachments: [{ type: 'audio', name: 'Sprachaufnahme' }],
};
setMessages(prev => [...prev, userMsg]);
setMessages(prev => capMessages([...prev, userMsg]));
rvs.send('audio', {
base64: result.base64,
durationMs: result.durationMs,
@@ -346,22 +383,8 @@ const ChatScreen: React.FC = () => {
return () => { if (saveTimer.current) clearTimeout(saveTimer.current); };
}, [messages]);
// Auto-Scroll wird ueber onContentSizeChange der FlatList gesteuert
const shouldAutoScroll = useRef(true);
const handleContentSizeChange = useCallback(() => {
if (shouldAutoScroll.current) {
flatListRef.current?.scrollToEnd({ animated: false });
}
}, []);
const handleScrollBeginDrag = useCallback(() => {
shouldAutoScroll.current = false;
}, []);
const handleScrollEndDrag = useCallback((e: any) => {
// Auto-Scroll wieder aktivieren wenn User ganz unten ist
const { contentOffset, contentSize, layoutMeasurement } = e.nativeEvent;
const isAtBottom = contentOffset.y + layoutMeasurement.height >= contentSize.height - 50;
shouldAutoScroll.current = isAtBottom;
}, []);
// Inverted FlatList: neueste Nachrichten unten, kein manuelles Scrollen noetig
const invertedMessages = useMemo(() => [...messages].reverse(), [messages]);
// GPS-Position holen (optional)
const getCurrentLocation = useCallback((): Promise<{ lat: number; lon: number } | null> => {
@@ -387,6 +410,13 @@ const ChatScreen: React.FC = () => {
const sendTextMessage = useCallback(async () => {
const text = inputText.trim();
// Wenn pending Anhaenge vorhanden → Anhaenge + Text zusammen senden
if (pendingAttachments.length > 0) {
sendPendingAttachments(text);
return;
}
if (!text) return;
setInputText('');
@@ -399,14 +429,20 @@ const ChatScreen: React.FC = () => {
text,
timestamp: Date.now(),
};
setMessages(prev => [...prev, userMsg]);
setMessages(prev => capMessages([...prev, userMsg]));
// An RVS senden
rvs.send('chat', {
text,
...(location && { location }),
});
}, [inputText, getCurrentLocation]);
}, [inputText, getCurrentLocation, pendingAttachments, sendPendingAttachments]);
// Anfrage abbrechen — sofort lokalen Indicator weg, Bridge triggert doctor --fix
const cancelRequest = useCallback(() => {
setAgentActivity({ activity: 'idle', tool: '' });
rvs.send('cancel_request' as any, {});
}, []);
// Sprachaufnahme abgeschlossen
const handleVoiceRecording = useCallback(async (result: RecordingResult) => {
@@ -418,7 +454,7 @@ const ChatScreen: React.FC = () => {
text: '🎙 Spracheingabe wird verarbeitet...',
timestamp: Date.now(),
};
setMessages(prev => [...prev, userMsg]);
setMessages(prev => capMessages([...prev, userMsg]));
rvs.send('audio', {
base64: result.base64,
@@ -428,88 +464,91 @@ const ChatScreen: React.FC = () => {
});
}, [getCurrentLocation]);
// Datei senden
// Datei auswaehlen → zur Pending-Liste hinzufuegen
const handleFileSelected = useCallback(async (file: FileData) => {
setShowFileUpload(false);
const location = await getCurrentLocation();
setPendingAttachments(prev => [...prev, { file, isPhoto: false }]);
}, []);
const isImage = file.type.startsWith('image/');
const msgId = nextId();
let imageUri = isImage && file.base64 ? `data:${file.type};base64,${file.base64}` : file.uri;
const userMsg: ChatMessage = {
id: msgId,
sender: 'user',
text: 'Anhang empfangen',
timestamp: Date.now(),
attachments: [{
type: isImage ? 'image' : 'file',
name: file.name,
size: file.size,
uri: imageUri,
mimeType: file.type,
}],
};
setMessages(prev => [...prev, userMsg]);
// Anhang auf Disk speichern fuer Persistenz
if (file.base64) {
persistAttachment(file.base64, msgId, file.name).then(filePath => {
setMessages(prev => prev.map(m =>
m.id === msgId ? { ...m, attachments: m.attachments?.map(a => ({ ...a, uri: filePath })) } : m
));
}).catch(() => {});
}
rvs.send('file', {
name: file.name,
type: file.type,
size: file.size,
base64: file.base64,
...(location && { location }),
});
}, [getCurrentLocation]);
// Foto senden
// Foto auswaehlen → zur Pending-Liste hinzufuegen
const handlePhotoSelected = useCallback(async (photo: PhotoData) => {
setShowCameraUpload(false);
setPendingAttachments(prev => [...prev, { file: photo, isPhoto: true }]);
}, []);
// Alle Pending Anhaenge + Text senden
const sendPendingAttachments = useCallback(async (messageText: string) => {
if (pendingAttachments.length === 0) return;
const location = await getCurrentLocation();
const msgId = nextId();
const dataUri = photo.base64 ? `data:${photo.type};base64,${photo.base64}` : undefined;
// Alle Attachments fuer die Chat-Nachricht sammeln
const attachments: Attachment[] = [];
for (const { file, isPhoto } of pendingAttachments) {
const isImage = isPhoto || (file.type && file.type.startsWith('image/'));
const name = isPhoto ? file.fileName : file.name;
const base64 = file.base64 || '';
const mimeType = file.type || '';
const imageUri = isImage && base64 ? `data:${mimeType};base64,${base64}` : file.uri;
attachments.push({
type: isImage ? 'image' : 'file',
name,
size: file.size,
uri: imageUri,
mimeType,
});
}
// Chat-Nachricht mit allen Anhaengen
const userMsg: ChatMessage = {
id: msgId,
sender: 'user',
text: 'Anhang empfangen',
text: messageText || `${pendingAttachments.length} Anhang/Anhaenge`,
timestamp: Date.now(),
attachments: [{
type: 'image',
name: photo.fileName,
uri: dataUri,
mimeType: photo.type,
}],
attachments,
};
setMessages(prev => [...prev, userMsg]);
setMessages(prev => capMessages([...prev, userMsg]));
// Foto auf Disk speichern fuer Persistenz
if (photo.base64) {
persistAttachment(photo.base64, msgId, photo.fileName).then(filePath => {
setMessages(prev => prev.map(m =>
m.id === msgId ? { ...m, attachments: m.attachments?.map(a => ({ ...a, uri: filePath })) } : m
));
}).catch(() => {});
// Alle Dateien an RVS senden + auf Disk speichern
for (const { file, isPhoto } of pendingAttachments) {
const name = isPhoto ? file.fileName : file.name;
const base64 = file.base64 || '';
const mimeType = file.type || '';
// Auf Disk speichern
if (base64) {
persistAttachment(base64, msgId + '_' + name, name).then(filePath => {
setMessages(prev => prev.map(m =>
m.id === msgId ? { ...m, attachments: m.attachments?.map(a =>
a.name === name && !a.uri?.startsWith('file://') ? { ...a, uri: filePath } : a
)} : m
));
}).catch(() => {});
}
// An RVS senden
rvs.send('file', {
name,
type: mimeType,
size: file.size,
base64,
...(isPhoto && file.width && { width: file.width, height: file.height }),
...(location && { location }),
});
}
rvs.send('file', {
name: photo.fileName,
type: photo.type,
base64: photo.base64,
width: photo.width,
height: photo.height,
...(location && { location }),
});
}, [getCurrentLocation]);
// Text als separate Nachricht (damit ARIA weiss was zu tun ist)
if (messageText) {
rvs.send('chat', {
text: messageText,
...(location && { location }),
});
}
setPendingAttachments([]);
setInputText('');
}, [pendingAttachments, getCurrentLocation]);
// --- Rendering ---
@@ -581,6 +620,18 @@ const ChatScreen: React.FC = () => {
{item.text}
</Text>
)}
{/* Play-Button fuer ARIA-Nachrichten */}
{!isUser && item.text.length > 0 && (
<TouchableOpacity
style={styles.playButton}
onPress={() => {
// TTS-Request an Bridge senden
rvs.send('tts_request' as any, { text: item.text, voice: '' });
}}
>
<Text style={styles.playButtonText}>{'\uD83D\uDD0A'}</Text>
</TouchableOpacity>
)}
<Text style={styles.timestamp}>{time}</Text>
</View>
);
@@ -603,19 +654,37 @@ const ChatScreen: React.FC = () => {
{connectionState === 'connected' ? 'Verbunden' :
connectionState === 'connecting' ? 'Verbinde...' : 'Getrennt'}
</Text>
<TouchableOpacity onPress={() => setSearchVisible(!searchVisible)} style={{marginLeft: 'auto', paddingHorizontal: 8}}>
<Text style={{fontSize: 16}}>{'\uD83D\uDD0D'}</Text>
</TouchableOpacity>
</View>
{/* Suchleiste */}
{searchVisible && (
<View style={styles.searchBar}>
<TextInput
style={styles.searchInput}
value={searchQuery}
onChangeText={setSearchQuery}
placeholder="Chat durchsuchen..."
placeholderTextColor="#555570"
autoFocus
/>
<TouchableOpacity onPress={() => { setSearchVisible(false); setSearchQuery(''); }}>
<Text style={{color: '#FF3B30', fontSize: 14, paddingHorizontal: 8}}>X</Text>
</TouchableOpacity>
</View>
)}
{/* Nachrichtenliste */}
<FlatList
ref={flatListRef}
data={messages}
inverted
data={searchQuery ? messages.filter(m => m.text.toLowerCase().includes(searchQuery.toLowerCase())).reverse() : invertedMessages}
keyExtractor={item => item.id}
renderItem={renderMessage}
contentContainerStyle={styles.messageList}
showsVerticalScrollIndicator={false}
onContentSizeChange={handleContentSizeChange}
onScrollBeginDrag={handleScrollBeginDrag}
onScrollEndDrag={handleScrollEndDrag}
ListEmptyComponent={
<View style={styles.emptyContainer}>
<Text style={styles.emptyIcon}>{'\uD83E\uDD16'}</Text>
@@ -625,6 +694,56 @@ const ChatScreen: React.FC = () => {
}
/>
{/* Thinking-Indicator */}
{agentActivity.activity !== 'idle' && (
<View style={styles.thinkingBar}>
<Text style={styles.thinkingText}>
{agentActivity.activity === 'tool' && agentActivity.tool
? `\uD83D\uDD27 ${agentActivity.tool}`
: agentActivity.activity === 'assistant'
? '\u270D\uFE0F ARIA schreibt...'
: '\uD83D\uDCAD ARIA denkt...'}
</Text>
<TouchableOpacity style={styles.thinkingCancel} onPress={cancelRequest}>
<Text style={styles.thinkingCancelText}>Abbrechen</Text>
</TouchableOpacity>
</View>
)}
{/* Pending Anhaenge Vorschau */}
{pendingAttachments.length > 0 && (
<View style={styles.pendingBar}>
<ScrollView horizontal showsHorizontalScrollIndicator={false} style={{flex: 1}}>
{pendingAttachments.map((att, idx) => (
<View key={idx} style={styles.pendingItem}>
{att.file.type?.startsWith('image/') || att.isPhoto ? (
<Image
source={{ uri: att.file.base64
? `data:${att.file.type};base64,${att.file.base64}`
: att.file.uri }}
style={styles.pendingThumb}
/>
) : (
<View style={[styles.pendingThumb, {justifyContent: 'center', alignItems: 'center'}]}>
<Text style={{fontSize: 20}}>{'\uD83D\uDCC4'}</Text>
</View>
)}
<TouchableOpacity
style={styles.pendingRemove}
onPress={() => setPendingAttachments(prev => prev.filter((_, i) => i !== idx))}
>
<Text style={{color: '#fff', fontSize: 10, fontWeight: 'bold'}}>X</Text>
</TouchableOpacity>
</View>
))}
</ScrollView>
<Text style={{color: '#8888AA', fontSize: 11, marginLeft: 8}}>{pendingAttachments.length}</Text>
<TouchableOpacity onPress={() => setPendingAttachments([])}>
<Text style={{color: '#FF3B30', fontSize: 14, paddingHorizontal: 8}}>Alle X</Text>
</TouchableOpacity>
</View>
)}
{/* Eingabebereich */}
<View style={styles.inputContainer}>
{/* Datei-Buttons */}
@@ -647,7 +766,7 @@ const ChatScreen: React.FC = () => {
style={styles.textInput}
value={inputText}
onChangeText={setInputText}
placeholder="Nachricht an ARIA..."
placeholder={pendingAttachments.length > 0 ? "Text zu den Anhaengen (optional)..." : "Nachricht an ARIA..."}
placeholderTextColor="#555570"
multiline
maxLength={4000}
@@ -656,7 +775,7 @@ const ChatScreen: React.FC = () => {
/>
{/* Senden oder Sprache */}
{inputText.trim() ? (
{inputText.trim() || pendingAttachments.length > 0 ? (
<TouchableOpacity style={styles.sendButton} onPress={sendTextMessage}>
<Text style={styles.sendIcon}>{'\u2B06\uFE0F'}</Text>
</TouchableOpacity>
@@ -887,6 +1006,87 @@ const styles = StyleSheet.create({
wakeWordIcon: {
fontSize: 16,
},
thinkingBar: {
flexDirection: 'row',
alignItems: 'center',
justifyContent: 'space-between',
backgroundColor: '#1E1E2E',
paddingHorizontal: 12,
paddingVertical: 6,
borderTopWidth: 1,
borderTopColor: '#2A2A3E',
},
thinkingText: {
color: '#FFD60A',
fontSize: 12,
flex: 1,
},
thinkingCancel: {
paddingHorizontal: 10,
paddingVertical: 4,
borderWidth: 1,
borderColor: '#FF3B30',
borderRadius: 4,
},
thinkingCancelText: {
color: '#FF3B30',
fontSize: 11,
fontWeight: 'bold',
},
pendingBar: {
flexDirection: 'row',
alignItems: 'center',
backgroundColor: '#1E1E2E',
paddingHorizontal: 12,
paddingVertical: 8,
borderTopWidth: 1,
borderTopColor: '#2A2A3E',
},
pendingItem: {
position: 'relative',
marginRight: 8,
},
pendingThumb: {
width: 50,
height: 50,
borderRadius: 6,
backgroundColor: '#0D0D1A',
},
pendingRemove: {
position: 'absolute',
top: -4,
right: -4,
width: 18,
height: 18,
borderRadius: 9,
backgroundColor: '#FF3B30',
justifyContent: 'center',
alignItems: 'center',
},
searchBar: {
flexDirection: 'row',
alignItems: 'center',
backgroundColor: '#12122A',
paddingHorizontal: 12,
paddingVertical: 6,
borderBottomWidth: 1,
borderBottomColor: '#1E1E2E',
},
searchInput: {
flex: 1,
color: '#FFFFFF',
fontSize: 14,
paddingVertical: 4,
},
playButton: {
alignSelf: 'flex-end',
paddingHorizontal: 8,
paddingVertical: 2,
marginTop: 4,
},
playButtonText: {
fontSize: 16,
},
fullscreenOverlay: {
flex: 1,
backgroundColor: 'rgba(0,0,0,0.95)',
+50 -28
View File
@@ -74,7 +74,8 @@ const SettingsScreen: React.FC = () => {
const [ttsEnabled, setTtsEnabled] = useState(true);
const [defaultVoice, setDefaultVoice] = useState('ramona');
const [highlightVoice, setHighlightVoice] = useState('thorsten');
const [speechSpeed, setSpeechSpeed] = useState(1.0);
const [speedRamona, setSpeedRamona] = useState(1.0);
const [speedThorsten, setSpeedThorsten] = useState(1.0);
const [editingPath, setEditingPath] = useState(false);
const [tempPath, setTempPath] = useState('');
@@ -104,8 +105,11 @@ const SettingsScreen: React.FC = () => {
AsyncStorage.getItem('aria_highlight_voice').then(saved => {
if (saved) setHighlightVoice(saved);
});
AsyncStorage.getItem('aria_speech_speed').then(saved => {
if (saved) setSpeechSpeed(parseFloat(saved));
AsyncStorage.getItem('aria_speed_ramona').then(saved => {
if (saved) setSpeedRamona(parseFloat(saved));
});
AsyncStorage.getItem('aria_speed_thorsten').then(saved => {
if (saved) setSpeedThorsten(parseFloat(saved));
});
}, []);
@@ -525,41 +529,49 @@ const SettingsScreen: React.FC = () => {
</View>
</View>
{/* Sprechgeschwindigkeit */}
{/* Sprechgeschwindigkeit Ramona */}
<View style={{marginTop: 16}}>
<Text style={styles.toggleLabel}>Sprechgeschwindigkeit: {speechSpeed.toFixed(1)}x</Text>
<View style={{flexDirection: 'row', alignItems: 'center', gap: 8, marginTop: 8}}>
<Text style={{color: '#555570', fontSize: 11}}>0.5x</Text>
<View style={{flex: 1}}>
<TouchableOpacity
style={{height: 30, justifyContent: 'center'}}
onPress={(e) => {
const layout = e.nativeEvent;
// Einfacher Tap-basierter Slider
}}
>
<View style={{height: 4, backgroundColor: '#2A2A3E', borderRadius: 2}}>
<View style={{height: 4, backgroundColor: '#0096FF', borderRadius: 2, width: `${((speechSpeed - 0.5) / 1.5) * 100}%`}} />
</View>
</TouchableOpacity>
</View>
<Text style={{color: '#555570', fontSize: 11}}>2.0x</Text>
</View>
<Text style={styles.toggleLabel}>Ramona Speed: {speedRamona.toFixed(1)}x</Text>
<View style={{flexDirection: 'row', justifyContent: 'space-around', marginTop: 8}}>
{[0.5, 0.75, 1.0, 1.25, 1.5, 2.0].map(speed => (
<TouchableOpacity
key={speed}
onPress={() => {
setSpeechSpeed(speed);
AsyncStorage.setItem('aria_speech_speed', String(speed));
rvs.send('config' as any, { speechSpeed: speed });
setSpeedRamona(speed);
AsyncStorage.setItem('aria_speed_ramona', String(speed));
rvs.send('config' as any, { speedRamona: speed });
}}
style={{
paddingHorizontal: 10, paddingVertical: 6, borderRadius: 6,
backgroundColor: speechSpeed === speed ? '#0096FF' : '#1E1E2E',
backgroundColor: speedRamona === speed ? '#0096FF' : '#1E1E2E',
}}
>
<Text style={{color: speechSpeed === speed ? '#fff' : '#8888AA', fontSize: 12, fontWeight: '600'}}>
<Text style={{color: speedRamona === speed ? '#fff' : '#8888AA', fontSize: 12, fontWeight: '600'}}>
{speed}x
</Text>
</TouchableOpacity>
))}
</View>
</View>
{/* Sprechgeschwindigkeit Thorsten */}
<View style={{marginTop: 16}}>
<Text style={styles.toggleLabel}>Thorsten Speed: {speedThorsten.toFixed(1)}x</Text>
<View style={{flexDirection: 'row', justifyContent: 'space-around', marginTop: 8}}>
{[0.5, 0.75, 1.0, 1.25, 1.5, 2.0].map(speed => (
<TouchableOpacity
key={speed}
onPress={() => {
setSpeedThorsten(speed);
AsyncStorage.setItem('aria_speed_thorsten', String(speed));
rvs.send('config' as any, { speedThorsten: speed });
}}
style={{
paddingHorizontal: 10, paddingVertical: 6, borderRadius: 6,
backgroundColor: speedThorsten === speed ? '#0096FF' : '#1E1E2E',
}}
>
<Text style={{color: speedThorsten === speed ? '#fff' : '#8888AA', fontSize: 12, fontWeight: '600'}}>
{speed}x
</Text>
</TouchableOpacity>
@@ -736,11 +748,21 @@ const SettingsScreen: React.FC = () => {
<Text style={styles.sectionTitle}>{'\u00DC'}ber</Text>
<View style={styles.card}>
<Text style={styles.aboutTitle}>ARIA Cockpit</Text>
<Text style={styles.aboutVersion}>Version 0.0.2.1 </Text>
<Text style={styles.aboutVersion}>Version {require('../../package.json').version}</Text>
<Text style={styles.aboutInfo}>
Stefans Kommandozentrale f{'\u00FC'}r ARIA.{'\n'}
Gebaut mit React Native + TypeScript.
</Text>
<TouchableOpacity
style={[styles.connectButton, {marginTop: 12}]}
onPress={() => {
const updateService = require('../services/updater').default;
updateService.checkForUpdate();
Alert.alert('Update-Check', 'Pruefe auf neue Version...');
}}
>
<Text style={styles.connectButtonText}>Auf Updates pr{'\u00FC'}fen</Text>
</TouchableOpacity>
</View>
{/* Platz am Ende */}
+168 -31
View File
@@ -42,6 +42,11 @@ const AUDIO_ENCODING = 'audio/wav';
// VAD (Voice Activity Detection) — Stille-Erkennung
const VAD_SILENCE_THRESHOLD_DB = -45; // dB unter dem als "Stille" gilt
const VAD_SILENCE_DURATION_MS = 1800; // ms Stille bevor Auto-Stop
const VAD_SPEECH_THRESHOLD_DB = -28; // dB ueber dem als "Sprache" gilt (Sprach-Gate) — hoeher = weniger Umgebungsgeraeusche
const VAD_SPEECH_MIN_MS = 500; // ms Sprache bevor Aufnahme zaehlt — laenger = keine Huestler/Klopfer mehr
// Max-Dauer einer Aufnahme in Gespraechsmodus (Notbremse gegen Runaway-Loops)
const MAX_RECORDING_MS = 30000;
// --- Audio-Service ---
@@ -55,10 +60,21 @@ class AudioService {
private recorder: AudioRecorderPlayer;
private recordingPath: string = '';
// Audio-Queue fuer sequentielle TTS-Wiedergabe
private audioQueue: string[] = [];
private isPlaying: boolean = false;
private preloadedSound: Sound | null = null;
private preloadedPath: string = '';
// Sprach-Gate: Aufnahme erst senden wenn tatsaechlich gesprochen wurde
private speechDetected: boolean = false;
private speechStartTime: number = 0;
// VAD State
private vadEnabled: boolean = false;
private lastSpeechTime: number = 0;
private vadTimer: ReturnType<typeof setInterval> | null = null;
private maxDurationTimer: ReturnType<typeof setTimeout> | null = null;
constructor() {
this.recorder = new AudioRecorderPlayer();
@@ -108,6 +124,10 @@ class AudioService {
// Laufende Wiedergabe stoppen (damit ARIA sich nicht selbst hoert)
this.stopPlayback();
// Aufraeumen: Alte aria_recording_ und aria_tts_ Files loeschen
// (Schutz gegen Cache-Ueberlauf im Gespraechsmodus bei vielen Zyklen)
this._cleanupStaleCacheFiles().catch(() => {});
this.recordingPath = `${RNFS.CachesDirectoryPath}/aria_recording_${Date.now()}.mp4`;
// Aufnahme mit Metering starten
@@ -115,6 +135,8 @@ class AudioService {
AudioEncoderAndroid: AudioEncoderAndroidType.AAC,
AudioSourceAndroid: AudioSourceAndroidType.MIC,
OutputFormatAndroid: OutputFormatAndroidType.MPEG_4,
AudioSamplingRateAndroid: 16000,
AudioChannelsAndroid: 1,
}, true); // meteringEnabled = true
// Metering-Callback
@@ -122,7 +144,21 @@ class AudioService {
const db = e.currentMetering ?? -160;
this.meterListeners.forEach(cb => cb(db));
// VAD: Stille erkennen
// Sprach-Gate: Erkennen ob tatsaechlich gesprochen wird
if (db > VAD_SPEECH_THRESHOLD_DB) {
if (!this.speechDetected && this.speechStartTime === 0) {
this.speechStartTime = Date.now();
}
if (this.speechStartTime > 0 && Date.now() - this.speechStartTime >= VAD_SPEECH_MIN_MS) {
this.speechDetected = true;
}
} else {
if (!this.speechDetected) {
this.speechStartTime = 0; // Reset wenn noch nicht als Sprache erkannt
}
}
// VAD: Stille erkennen (nur wenn Sprache erkannt wurde)
if (this.vadEnabled) {
if (db > VAD_SILENCE_THRESHOLD_DB) {
this.lastSpeechTime = Date.now();
@@ -132,6 +168,8 @@ class AudioService {
this.recordingStartTime = Date.now();
this.lastSpeechTime = Date.now();
this.speechDetected = false;
this.speechStartTime = 0;
this.setState('recording');
// VAD aktivieren
@@ -144,6 +182,11 @@ class AudioService {
this.silenceListeners.forEach(cb => cb());
}
}, 200);
// Notbremse: Nach MAX_RECORDING_MS zwangsweise stoppen
this.maxDurationTimer = setTimeout(() => {
console.warn(`[Audio] Max-Dauer ${MAX_RECORDING_MS}ms erreicht — Zwangs-Stop`);
this.silenceListeners.forEach(cb => cb());
}, MAX_RECORDING_MS);
}
console.log('[Audio] Aufnahme gestartet (autoStop: %s)', autoStop);
@@ -168,12 +211,25 @@ class AudioService {
clearInterval(this.vadTimer);
this.vadTimer = null;
}
if (this.maxDurationTimer) {
clearTimeout(this.maxDurationTimer);
this.maxDurationTimer = null;
}
try {
await this.recorder.stopRecorder();
this.recorder.removeRecordBackListener();
const durationMs = Date.now() - this.recordingStartTime;
const hadSpeech = this.speechDetected;
// Sprach-Gate: Wenn keine Sprache erkannt → Aufnahme verwerfen
if (!hadSpeech) {
RNFS.unlink(this.recordingPath).catch(() => {});
this.setState('idle');
console.log('[Audio] Aufnahme verworfen — keine Sprache erkannt (nur Umgebungsgeraeusche)');
return null;
}
// Audio-Datei als Base64 lesen
const base64Data = await RNFS.readFile(this.recordingPath, 'base64');
@@ -182,7 +238,7 @@ class AudioService {
RNFS.unlink(this.recordingPath).catch(() => {});
this.setState('idle');
console.log(`[Audio] Aufnahme beendet (${durationMs}ms, ${Math.round(base64Data.length / 1024)}KB)`);
console.log(`[Audio] Aufnahme beendet (${durationMs}ms, ${Math.round(base64Data.length / 1024)}KB, Sprache erkannt)`);
return {
base64: base64Data,
@@ -198,47 +254,110 @@ class AudioService {
// --- Wiedergabe ---
/** Base64-kodiertes Audio abspielen (z.B. TTS-Antwort von ARIA) */
/** Base64-kodiertes Audio in die Queue stellen und abspielen */
async playAudio(base64Data: string): Promise<void> {
if (!base64Data) return;
// Laufende Wiedergabe stoppen
this.stopPlayback();
try {
// Base64 -> temporaere WAV-Datei -> Sound abspielen
const tmpPath = `${RNFS.CachesDirectoryPath}/aria_tts_${Date.now()}.wav`;
await RNFS.writeFile(tmpPath, base64Data, 'base64');
this.currentSound = new Sound(tmpPath, '', (error) => {
if (error) {
console.error('[Audio] Fehler beim Laden:', error);
RNFS.unlink(tmpPath).catch(() => {});
return;
}
this.currentSound?.play((success) => {
if (success) {
console.log('[Audio] Wiedergabe abgeschlossen');
} else {
console.warn('[Audio] Wiedergabe fehlgeschlagen');
}
this.currentSound?.release();
this.currentSound = null;
RNFS.unlink(tmpPath).catch(() => {});
});
});
} catch (err) {
console.error('[Audio] Wiedergabefehler:', err);
this.audioQueue.push(base64Data);
if (!this.isPlaying) {
this._playNext();
}
}
/** Laufende Wiedergabe stoppen */
// Callback wenn alle Audio-Teile abgespielt sind
private playbackFinishedListeners: (() => void)[] = [];
onPlaybackFinished(callback: () => void): () => void {
this.playbackFinishedListeners.push(callback);
return () => {
this.playbackFinishedListeners = this.playbackFinishedListeners.filter(cb => cb !== callback);
};
}
/** Naechstes Audio aus der Queue abspielen */
private async _playNext(): Promise<void> {
if (this.audioQueue.length === 0) {
this.isPlaying = false;
// Alle Audio-Teile abgespielt → Listener benachrichtigen
this.playbackFinishedListeners.forEach(cb => cb());
return;
}
this.isPlaying = true;
// Preloaded Sound verwenden wenn verfuegbar, sonst neu laden
let sound: Sound;
let soundPath: string;
if (this.preloadedSound) {
sound = this.preloadedSound;
soundPath = this.preloadedPath;
this.preloadedSound = null;
this.preloadedPath = '';
// Daten aus Queue entfernen (wurde schon preloaded)
this.audioQueue.shift();
} else {
const base64Data = this.audioQueue.shift()!;
try {
soundPath = `${RNFS.CachesDirectoryPath}/aria_tts_${Date.now()}.wav`;
await RNFS.writeFile(soundPath, base64Data, 'base64');
sound = await new Promise<Sound>((resolve, reject) => {
const s = new Sound(soundPath, '', (err) => err ? reject(err) : resolve(s));
});
} catch (err) {
console.error('[Audio] Laden fehlgeschlagen:', err);
this._playNext();
return;
}
}
this.currentSound = sound;
// Naechstes Audio schon vorbereiten waehrend dieses abspielt
this._preloadNext();
sound.play((success) => {
if (!success) console.warn('[Audio] Wiedergabe fehlgeschlagen');
sound.release();
this.currentSound = null;
RNFS.unlink(soundPath).catch(() => {});
this._playNext();
});
}
/** Naechstes Audio im Hintergrund vorladen (verhindert Stottern) */
private async _preloadNext(): Promise<void> {
if (this.audioQueue.length === 0 || this.preloadedSound) return;
const base64Data = this.audioQueue[0]; // Nicht shift — bleibt in Queue
try {
const tmpPath = `${RNFS.CachesDirectoryPath}/aria_tts_pre_${Date.now()}.wav`;
await RNFS.writeFile(tmpPath, base64Data, 'base64');
this.preloadedSound = await new Promise<Sound>((resolve, reject) => {
const s = new Sound(tmpPath, '', (err) => err ? reject(err) : resolve(s));
});
this.preloadedPath = tmpPath;
} catch {
this.preloadedSound = null;
this.preloadedPath = '';
}
}
/** Laufende Wiedergabe stoppen + Queue leeren */
stopPlayback(): void {
this.audioQueue = [];
this.isPlaying = false;
if (this.currentSound) {
this.currentSound.stop();
this.currentSound.release();
this.currentSound = null;
}
if (this.preloadedSound) {
this.preloadedSound.release();
this.preloadedSound = null;
if (this.preloadedPath) RNFS.unlink(this.preloadedPath).catch(() => {});
this.preloadedPath = '';
}
}
// --- Status & Callbacks ---
@@ -277,6 +396,24 @@ class AudioService {
this.stateListeners.forEach(cb => cb(state));
}
}
/** Alte Aufnahme- und TTS-Files aus dem Cache loeschen (>30s alt). */
private async _cleanupStaleCacheFiles(): Promise<void> {
try {
const files = await RNFS.readDir(RNFS.CachesDirectoryPath);
const now = Date.now();
for (const f of files) {
if (!f.isFile()) continue;
if (!f.name.startsWith('aria_recording_') && !f.name.startsWith('aria_tts_')) continue;
const age = now - (f.mtime ? f.mtime.getTime() : 0);
if (age > 30000) {
await RNFS.unlink(f.path).catch(() => {});
}
}
} catch {
// silent — cleanup ist best-effort
}
}
}
// Singleton
+1 -1
View File
@@ -12,7 +12,7 @@ import AsyncStorage from '@react-native-async-storage/async-storage';
export type ConnectionState = 'connecting' | 'connected' | 'disconnected';
export type MessageType = 'chat' | 'audio' | 'file' | 'location' | 'mode' | 'log' | 'event';
export type MessageType = 'chat' | 'audio' | 'file' | 'location' | 'mode' | 'log' | 'event' | 'update_available' | string;
export interface RVSMessage {
type: MessageType;
+158
View File
@@ -0,0 +1,158 @@
/**
* Auto-Update Service — prueft und installiert App-Updates via RVS
*
* Flow:
* 1. App sendet "update_check" mit aktueller Version an RVS
* 2. RVS vergleicht → sendet "update_available" mit Download-URL
* 3. App zeigt Benachrichtigung → User bestaetigt → Download + Install
*/
import { Alert, Linking, Platform, NativeModules } from 'react-native';
import RNFS from 'react-native-fs';
import rvs, { RVSMessage } from './rvs';
// Version aus package.json (wird beim Build eingebettet)
const packageJson = require('../../package.json');
const APP_VERSION = packageJson.version || '0.0.0.0';
type UpdateCallback = (info: UpdateInfo) => void;
export interface UpdateInfo {
version: string;
downloadUrl: string;
size: number;
}
class UpdateService {
private listeners: UpdateCallback[] = [];
private checking = false;
private downloading = false;
constructor() {
// Auf update_available Nachrichten lauschen
rvs.onMessage((msg: RVSMessage) => {
if (msg.type === 'update_available' as any) {
const info: UpdateInfo = {
version: (msg.payload.version as string) || '',
downloadUrl: (msg.payload.downloadUrl as string) || '',
size: (msg.payload.size as number) || 0,
};
if (info.version && this.isNewer(info.version)) {
console.log(`[Update] Neue Version verfuegbar: ${info.version} (aktuell: ${APP_VERSION})`);
this.listeners.forEach(cb => cb(info));
}
}
});
}
/** Bei App-Start Update pruefen */
checkForUpdate(): void {
if (this.checking) return;
this.checking = true;
console.log(`[Update] Pruefe auf Updates (aktuell: ${APP_VERSION})`);
rvs.send('update_check' as any, { version: APP_VERSION });
setTimeout(() => { this.checking = false; }, 10000);
}
/** Callback registrieren */
onUpdateAvailable(callback: UpdateCallback): () => void {
this.listeners.push(callback);
return () => {
this.listeners = this.listeners.filter(cb => cb !== callback);
};
}
/** Update-Dialog anzeigen */
promptUpdate(info: UpdateInfo): void {
const sizeMB = (info.size / 1024 / 1024).toFixed(1);
Alert.alert(
'ARIA Update verfuegbar',
`Version ${info.version} (${sizeMB} MB)\n\nAktuell: ${APP_VERSION}\n\nJetzt herunterladen und installieren?`,
[
{ text: 'Spaeter', style: 'cancel' },
{
text: 'Installieren',
onPress: () => this.downloadAndInstall(info),
},
],
);
}
/** APK ueber WebSocket herunterladen und installieren */
async downloadAndInstall(info: UpdateInfo): Promise<void> {
if (this.downloading) return;
this.downloading = true;
try {
console.log(`[Update] Fordere APK v${info.version} an...`);
Alert.alert('Download gestartet', `Version ${info.version} wird ueber RVS heruntergeladen...`);
// APK ueber WebSocket anfordern
rvs.send('update_download' as any, {});
// Auf update_data warten (einmalig)
const apkData = await new Promise<{base64: string, fileName: string}>((resolve, reject) => {
const timeout = setTimeout(() => reject(new Error('Download-Timeout (60s)')), 60000);
const unsub = rvs.onMessage((msg: RVSMessage) => {
if ((msg.type as string) === 'update_data') {
clearTimeout(timeout);
unsub();
if (msg.payload.error) {
reject(new Error(msg.payload.error as string));
} else {
resolve({
base64: msg.payload.base64 as string,
fileName: msg.payload.fileName as string || `ARIA-${info.version}.apk`,
});
}
}
});
});
// Base64 als APK-Datei speichern
const destPath = `${RNFS.CachesDirectoryPath}/${apkData.fileName}`;
await RNFS.writeFile(destPath, apkData.base64, 'base64');
const fileSize = await RNFS.stat(destPath);
console.log(`[Update] APK gespeichert: ${destPath} (${(parseInt(fileSize.size) / 1024 / 1024).toFixed(1)}MB)`);
// APK installieren via natives ApkInstaller Module (FileProvider + Intent)
if (Platform.OS === 'android') {
try {
const { ApkInstaller } = NativeModules;
await ApkInstaller.install(destPath);
} catch (installErr: any) {
Alert.alert(
'APK heruntergeladen',
`Version ${info.version} gespeichert.\n\nBitte manuell installieren:\nDateimanager → ${apkData.fileName} antippen.\n\n(${installErr.message})`,
);
}
}
} catch (err: any) {
console.error(`[Update] Fehler: ${err.message}`);
Alert.alert('Update fehlgeschlagen', err.message);
} finally {
this.downloading = false;
}
}
/** Versionsvergleich */
private isNewer(remote: string): boolean {
const r = remote.split('.').map(Number);
const l = APP_VERSION.split('.').map(Number);
for (let i = 0; i < Math.max(r.length, l.length); i++) {
const diff = (r[i] || 0) - (l[i] || 0);
if (diff > 0) return true;
if (diff < 0) return false;
}
return false;
}
getCurrentVersion(): string {
return APP_VERSION;
}
}
const updateService = new UpdateService();
export default updateService;
+30 -90
View File
@@ -1,21 +1,13 @@
/**
* Wake Word Service — "ARIA" Erkennung
* Gespraechsmodus — "Ohr-Button"
*
* Nutzt react-native-live-audio-stream fuer kontinuierliches Mikrofon-Monitoring.
* Erkennt Sprache per Energie-Schwellwert und sendet kurze Audio-Clips
* zur serverseitigen Wake-Word-Pruefung (openwakeword in der Bridge).
* Wenn aktiv: Nach jeder ARIA-Antwort (TTS fertig) startet automatisch die Aufnahme.
* Wie ein Walkie-Talkie / natuerliches Gespraech:
* ARIA spricht → Aufnahme startet → User spricht → VAD stoppt → ARIA antwortet → ...
*
* Architektur:
* App (Mikrofon) → Energie-Erkennung → Audio-Buffer
* → RVS "wake_check" → Bridge → openwakeword → Bestaetigung
* → App startet Aufnahme
*
* Aktuell (Phase 1): Einfacher Tap-to-Talk + Auto-Stop.
* Spaeter (Phase 2): Porcupine on-device "ARIA" Keyword.
* Phase 2 (geplant): Porcupine "ARIA" Wake Word fuer passives Lauschen.
*/
import LiveAudioStream from 'react-native-live-audio-stream';
type WakeWordCallback = () => void;
type StateCallback = (state: WakeWordState) => void;
@@ -25,72 +17,40 @@ class WakeWordService {
private state: WakeWordState = 'off';
private wakeCallbacks: WakeWordCallback[] = [];
private stateCallbacks: StateCallback[] = [];
private isInitialized = false;
/** Wake Word Erkennung starten */
/** Gespraechsmodus starten */
async start(): Promise<boolean> {
if (this.state === 'listening') return true;
try {
if (!this.isInitialized) {
LiveAudioStream.init({
sampleRate: 16000,
channels: 1,
bitsPerSample: 16,
audioSource: 6, // VOICE_RECOGNITION
bufferSize: 4096,
});
this.isInitialized = true;
console.log('[WakeWord] Gespraechsmodus aktiviert — starte sofort Aufnahme');
this.setState('listening');
// Sofort erste Aufnahme starten
setTimeout(() => {
if (this.state === 'listening') {
this.wakeCallbacks.forEach(cb => cb());
}
}, 500);
return true;
}
// Audio-Stream starten und auf Energie pruefen
LiveAudioStream.start();
/** Gespraechsmodus stoppen */
stop(): void {
console.log('[WakeWord] Gespraechsmodus deaktiviert');
this.setState('off');
}
LiveAudioStream.on('data', (base64Chunk: string) => {
if (this.state !== 'listening') return;
// Base64 → Int16 Array → RMS berechnen
const raw = this._base64ToInt16(base64Chunk);
const rms = this._calculateRMS(raw);
// Schwellwert: wenn laut genug → Wake Word erkannt
// Phase 1: Einfache Energie-Erkennung (jemand spricht)
// Phase 2: Porcupine "ARIA" Keyword
if (rms > 2000) {
this.setState('detected');
this.wakeCallbacks.forEach(cb => cb());
// Nach Detection kurz pausieren, Aufnahme uebernimmt das Mikrofon
this.stop();
}
});
this.setState('listening');
console.log('[WakeWord] Listening gestartet');
return true;
} catch (err) {
console.error('[WakeWord] Start fehlgeschlagen:', err);
return false;
/** Nach ARIA-Antwort (TTS fertig): Aufnahme automatisch starten */
async resume(): Promise<void> {
if (this.state !== 'listening') return;
// Kurze Pause damit TTS-Audio nicht ins Mikrofon geht
await new Promise(resolve => setTimeout(resolve, 800));
if (this.state === 'listening') {
console.log('[WakeWord] TTS fertig — starte automatisch Aufnahme');
this.wakeCallbacks.forEach(cb => cb());
}
}
/** Wake Word Erkennung stoppen */
stop(): void {
if (this.state === 'off') return;
try {
LiveAudioStream.stop();
} catch {}
this.setState('off');
console.log('[WakeWord] Gestoppt');
}
/** Nach Aufnahme erneut starten */
async resume(): Promise<void> {
// Kurze Pause damit Aufnahme das Mikrofon freigeben kann
setTimeout(() => {
if (this.state === 'off') {
this.start();
}
}, 500);
isActive(): boolean {
return this.state === 'listening';
}
// --- Callbacks ---
@@ -113,32 +73,12 @@ class WakeWordService {
return this.state;
}
// --- Hilfsfunktionen ---
private setState(state: WakeWordState): void {
if (this.state !== state) {
this.state = state;
this.stateCallbacks.forEach(cb => cb(state));
}
}
private _base64ToInt16(base64: string): Int16Array {
const binary = atob(base64);
const bytes = new Uint8Array(binary.length);
for (let i = 0; i < binary.length; i++) {
bytes[i] = binary.charCodeAt(i);
}
return new Int16Array(bytes.buffer);
}
private _calculateRMS(samples: Int16Array): number {
if (samples.length === 0) return 0;
let sum = 0;
for (let i = 0; i < samples.length; i++) {
sum += samples[i] * samples[i];
}
return Math.sqrt(sum / samples.length);
}
}
const wakeWordService = new WakeWordService();
+7
View File
@@ -9,3 +9,10 @@ PIPER_THORSTEN=/voices/de_DE-thorsten-high.onnx
# Wake-Word
WAKE_WORD=aria
# Whisper STT — wird zur Laufzeit in der Diagnostic (Sektion "Whisper") umgeschaltet
# und in /shared/config/voice_config.json gespeichert. Der Wert hier ist nur der
# Initial-Default beim ersten Start.
# Optionen: tiny | base | small | medium | large-v3
WHISPER_MODEL=medium
WHISPER_LANGUAGE=de
+255 -24
View File
@@ -38,6 +38,7 @@ import websockets
from faster_whisper import WhisperModel
from openwakeword.model import Model as WakeWordModel
from piper import PiperVoice
from piper.config import SynthesisConfig
from modes import Mode, detect_mode_switch, should_speak
@@ -62,7 +63,7 @@ RVS_TLS = os.getenv("RVS_TLS", "true") # true = wss://, false = ws://
RVS_TLS_FALLBACK = os.getenv("RVS_TLS_FALLBACK", "true") # Bei TLS-Fehler ws:// versuchen
RVS_TOKEN = os.getenv("RVS_TOKEN", "") # Pairing-Token (gleich wie in der App)
DIAGNOSTIC_URL = os.getenv("DIAGNOSTIC_URL", "http://127.0.0.1:3001") # Diagnostic API
WHISPER_MODEL = os.getenv("WHISPER_MODEL", "small")
WHISPER_MODEL = os.getenv("WHISPER_MODEL", "medium")
WHISPER_LANGUAGE = os.getenv("WHISPER_LANGUAGE", "de")
# Audio-Parameter
@@ -131,6 +132,7 @@ class VoiceEngine:
self.voices: dict[str, PiperVoice] = {}
self.default_voice = "ramona"
self.highlight_voice = "thorsten"
self.speech_speed = {"ramona": 1.0, "thorsten": 1.0}
def initialize(self) -> None:
"""Laedt die Piper-Stimmen aus dem Voices-Verzeichnis."""
@@ -199,11 +201,23 @@ class VoiceEngine:
return None
try:
# Langen Text in Saetze aufteilen (Piper hat Limits bei langen Texten)
# Markdown + Sonderzeichen entfernen fuer natuerliche Sprache
import re
sentences = re.split(r'(?<=[.!?])\s+', text.strip())
# Markdown-Formatierung entfernen
sentences = [re.sub(r'\*\*([^*]+)\*\*', r'\1', s).strip() for s in sentences if s.strip()]
clean = text.strip()
clean = re.sub(r'\*\*([^*]+)\*\*', r'\1', clean) # **fett**
clean = re.sub(r'\*([^*]+)\*', r'\1', clean) # *kursiv*
clean = re.sub(r'`[^`]+`', '', clean) # `code`
clean = re.sub(r'```[\s\S]*?```', '', clean) # Code-Bloecke
clean = re.sub(r'\[([^\]]+)\]\([^)]+\)', r'\1', clean) # [text](url)
clean = re.sub(r'#{1,6}\s*', '', clean) # ### Ueberschriften
clean = re.sub(r'>\s*', '', clean) # > Zitate
clean = re.sub(r'[-*]\s+', '', clean) # Listen
clean = re.sub(r'\n{2,}', '. ', clean) # Absaetze
clean = re.sub(r'\n', ', ', clean) # Zeilenumbrueche
clean = re.sub(r'\s{2,}', ' ', clean) # Mehrfach-Leerzeichen
clean = re.sub(r'["""„]', '', clean) # Anfuehrungszeichen
sentences = re.split(r'(?<=[.!?])\s+', clean)
sentences = [s.strip() for s in sentences if s.strip()]
if not sentences:
return None
@@ -216,8 +230,10 @@ class VoiceEngine:
continue
with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
tmp_path = tmp.name
speed = self.speech_speed.get(voice_name, 1.0)
syn_config = SynthesisConfig(length_scale=1.0 / max(0.3, speed))
with wave.open(tmp_path, "wb") as wav_file:
voice.synthesize_wav(sentence, wav_file)
voice.synthesize_wav(sentence, wav_file, syn_config=syn_config)
with wave.open(tmp_path, "rb") as wav_file:
if sample_rate is None:
sample_rate = wav_file.getframerate()
@@ -314,6 +330,25 @@ class STTEngine:
self.model = WhisperModel(self.model_size, device="cpu", compute_type="int8")
logger.info("Whisper-Modell geladen")
def reload(self, model_size: str) -> bool:
"""Laedt ein anderes Whisper-Modell (bei Config-Aenderung)."""
if model_size == self.model_size and self.model is not None:
return False
allowed = {"tiny", "base", "small", "medium", "large-v3"}
if model_size not in allowed:
logger.warning("Ungueltiges Whisper-Modell: %s (erlaubt: %s)", model_size, allowed)
return False
logger.info("Lade Whisper-Modell neu: %s -> %s", self.model_size, model_size)
self.model_size = model_size
self.model = None
try:
self.model = WhisperModel(model_size, device="cpu", compute_type="int8")
logger.info("Whisper-Modell '%s' geladen", model_size)
return True
except Exception:
logger.exception("Whisper-Modell '%s' konnte nicht geladen werden", model_size)
return False
def transcribe(self, audio_data: np.ndarray) -> str:
"""Transkribiert Audio-Daten zu Text.
@@ -486,6 +521,7 @@ class ARIABridge:
# Komponenten
self.voice_engine = VoiceEngine(VOICES_DIR)
self.tts_enabled = True
vc: dict = {}
# Gespeicherte Voice-Config laden
try:
vc_path = "/shared/config/voice_config.json"
@@ -494,12 +530,20 @@ class ARIABridge:
vc = json.load(f)
self.voice_engine.default_voice = vc.get("defaultVoice", "ramona")
self.voice_engine.highlight_voice = vc.get("highlightVoice", "thorsten")
self.voice_engine.speech_speed = {
"ramona": vc.get("speedRamona", 1.0),
"thorsten": vc.get("speedThorsten", 1.0),
}
self.tts_enabled = vc.get("ttsEnabled", True)
self.tts_engine_type = vc.get("ttsEngine", "piper")
self.xtts_voice = vc.get("xttsVoice", "")
logger.info("Voice-Config geladen: %s", vc)
except Exception as e:
logger.warning("Voice-Config laden fehlgeschlagen: %s", e)
# Whisper-Modell: Config hat Vorrang, dann env/Default (medium)
whisper_model = vc.get("whisperModel") or self.config.get("WHISPER_MODEL", WHISPER_MODEL)
self.stt_engine = STTEngine(
model_size=self.config.get("WHISPER_MODEL", WHISPER_MODEL),
model_size=whisper_model,
language=self.config.get("WHISPER_LANGUAGE", WHISPER_LANGUAGE),
)
self.wake_word = WakeWordDetector()
@@ -508,6 +552,12 @@ class ARIABridge:
self.ws_core: Optional[websockets.WebSocketClientProtocol] = None
self.ws_rvs: Optional[websockets.WebSocketClientProtocol] = None
# Letzter gesendeter agent_activity-State (zum Entduplizieren)
self._last_activity_state: Optional[tuple] = None
# Zeitstempel des letzten chat:final — waehrend 3s danach werden
# trailing Agent-Events unterdrueckt (Core raeumt manchmal nach).
self._last_chat_final_at: float = 0.0
def initialize(self) -> None:
"""Initialisiert alle Komponenten.
@@ -522,19 +572,20 @@ class ARIABridge:
# Voice-Engine IMMER laden — rendert Audio fuer die App (auch ohne Soundkarte)
self.voice_engine.initialize()
# STT IMMER laden — verarbeitet Audio von der App (braucht kein Sounddevice)
self.stt_engine.initialize()
# Audio-Hardware pruefen (fuer lokales Mikro/Lautsprecher)
self.audio_available = False
try:
devices = sd.query_devices()
# Pruefen ob ein Output-Device existiert
sd.query_devices(kind='output')
self.audio_available = True
logger.info("Audio-Geraet gefunden — Wake-Word und lokale TTS aktiv")
self.stt_engine.initialize()
self.wake_word.initialize()
except (sd.PortAudioError, Exception):
logger.warning("Kein Audio-Geraet — Wake-Word und lokale TTS deaktiviert")
logger.info("Piper TTS rendert Audio fuer die App (via RVS)")
logger.warning("Kein Audio-Geraet — Wake-Word und lokale Wiedergabe deaktiviert")
logger.info("TTS rendert fuer App (via RVS), STT verarbeitet App-Audio")
logger.info("Alle Komponenten initialisiert")
logger.info("aria-core: %s", self.ws_url)
@@ -711,8 +762,18 @@ class ARIABridge:
if event_name == "agent":
data = payload.get("data", {})
delta = data.get("delta", "")
if delta and payload.get("stream") == "assistant":
stream = payload.get("stream", "")
if delta and stream == "assistant":
logger.debug("[core] Delta: '%s'", delta[:40])
# Activity-Signal zur App (entdupliziert)
tool_name = data.get("name") or data.get("tool") or payload.get("tool") or ""
if stream == "tool_use" or data.get("type") == "tool_use":
activity = "tool"
elif stream == "assistant":
activity = "assistant"
else:
activity = "thinking"
await self._emit_activity(activity, tool_name)
return
# ── chat Events: Snapshots mit state=delta|final|error ──
@@ -721,6 +782,8 @@ class ARIABridge:
if state == "final":
text = self._extract_chat_text(payload)
self._last_chat_final_at = asyncio.get_event_loop().time()
await self._emit_activity("idle", "")
if not text:
logger.warning("[core] chat final ohne Text: %s", json.dumps(payload)[:200])
return
@@ -731,6 +794,8 @@ class ARIABridge:
if state == "error":
error = payload.get("error", "Unbekannt")
logger.error("[core] Chat-Fehler: %s", error)
self._last_chat_final_at = asyncio.get_event_loop().time()
await self._emit_activity("idle", "")
await self._send_to_rvs({
"type": "chat",
"payload": {
@@ -837,17 +902,47 @@ class ARIABridge:
# TTS-Audio rendern und an die App senden (wenn Modus es erlaubt)
if getattr(self, 'tts_enabled', True) and should_speak(self.current_mode, is_critical):
audio_data = self.voice_engine.synthesize(text, voice_name)
if audio_data:
audio_b64 = base64.b64encode(audio_data).decode("ascii")
await self._send_to_rvs({
"type": "audio",
"payload": {
"base64": audio_b64,
"mimeType": "audio/wav",
"voice": voice_name,
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
tts_engine = getattr(self, 'tts_engine_type', 'piper')
if tts_engine == "xtts":
# XTTS: Ganzen Text senden, XTTS-Bridge teilt satzweise auf
xtts_voice = getattr(self, 'xtts_voice', '')
try:
await self._send_to_rvs({
"type": "xtts_request",
"payload": {
"text": text,
"voice": xtts_voice,
"language": "de",
"requestId": str(uuid.uuid4()),
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
logger.info("[core] XTTS-Request gesendet (%s): '%s'", xtts_voice or "default", text[:60])
except Exception as e:
logger.warning("[core] XTTS-Request fehlgeschlagen: %s — Fallback auf Piper", e)
# Fallback auf Piper
audio_data = self.voice_engine.synthesize(text, voice_name)
if audio_data:
audio_b64 = base64.b64encode(audio_data).decode("ascii")
await self._send_to_rvs({
"type": "audio",
"payload": {"base64": audio_b64, "mimeType": "audio/wav", "voice": voice_name},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
else:
# Piper: Lokal rendern
audio_data = self.voice_engine.synthesize(text, voice_name)
if audio_data:
audio_b64 = base64.b64encode(audio_data).decode("ascii")
await self._send_to_rvs({
"type": "audio",
"payload": {
"base64": audio_b64,
"mimeType": "audio/wav",
"voice": voice_name,
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
logger.info("[core] TTS-Audio gesendet: %d bytes (%s)", len(audio_data), voice_name)
@@ -1004,6 +1099,61 @@ class ARIABridge:
sender = payload.get("sender", "")
if sender in ("aria", "stt"):
return
text = payload.get("text", "")
if text:
logger.info("[rvs] App-Chat: '%s'", text[:80])
await self.send_to_core(text, source="app")
return
if msg_type == "cancel_request":
logger.info("[rvs] Cancel-Request von App — rufe Diagnostic /api/cancel auf")
await self._cancel_via_diagnostic()
await self._emit_activity("idle", "")
return
elif msg_type == "xtts_response":
# XTTS-Audio vom Gaming-PC empfangen → an App weiterleiten
audio_b64 = payload.get("base64", "")
error = payload.get("error", "")
if error:
logger.warning("[rvs] XTTS Fehler: %s", error)
return
if audio_b64:
logger.info("[rvs] XTTS-Audio empfangen: %dKB", len(audio_b64) // 1365)
await self._send_to_rvs({
"type": "audio",
"payload": {
"base64": audio_b64,
"mimeType": payload.get("mimeType", "audio/wav"),
"voice": payload.get("voice", "xtts"),
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
return
elif msg_type == "tts_request":
# App fordert TTS-Audio fuer einen Text an (Play-Button)
text = payload.get("text", "")
requested_voice = payload.get("voice", "")
if text:
voice_name = requested_voice or self.voice_engine.select_voice(text)
audio_data = self.voice_engine.synthesize(text, voice_name)
if audio_data:
audio_b64 = base64.b64encode(audio_data).decode("ascii")
try:
await self._send_to_rvs({
"type": "audio",
"payload": {
"base64": audio_b64,
"mimeType": "audio/wav",
"voice": voice_name,
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
logger.info("[rvs] TTS on-demand: %d bytes (%s)", len(audio_data), voice_name)
except Exception as e:
logger.warning("[rvs] TTS on-demand senden fehlgeschlagen: %s", e)
return
elif msg_type == "config":
# Konfiguration von App/Diagnostic empfangen + persistent speichern
@@ -1024,6 +1174,31 @@ class ARIABridge:
self.tts_enabled = bool(payload["ttsEnabled"])
logger.info("[rvs] TTS %s", "aktiviert" if self.tts_enabled else "deaktiviert")
changed = True
if "ttsEngine" in payload:
self.tts_engine_type = payload["ttsEngine"]
logger.info("[rvs] TTS-Engine: %s", self.tts_engine_type)
changed = True
if "xttsVoice" in payload:
self.xtts_voice = payload["xttsVoice"]
logger.info("[rvs] XTTS-Stimme: %s", self.xtts_voice)
changed = True
if "speedRamona" in payload:
self.voice_engine.speech_speed["ramona"] = max(0.3, min(2.0, float(payload["speedRamona"])))
logger.info("[rvs] Speed Ramona: %.1f", self.voice_engine.speech_speed["ramona"])
changed = True
if "speedThorsten" in payload:
self.voice_engine.speech_speed["thorsten"] = max(0.3, min(2.0, float(payload["speedThorsten"])))
logger.info("[rvs] Speed Thorsten: %.1f", self.voice_engine.speech_speed["thorsten"])
changed = True
whisper_reloaded = False
if "whisperModel" in payload:
new_model = payload["whisperModel"]
if new_model and new_model != self.stt_engine.model_size:
logger.info("[rvs] Whisper-Modell Wechsel: %s -> %s (laedt...)", self.stt_engine.model_size, new_model)
loop = asyncio.get_event_loop()
whisper_reloaded = await loop.run_in_executor(None, self.stt_engine.reload, new_model)
if whisper_reloaded:
changed = True
# Persistent speichern in Shared Volume
if changed:
try:
@@ -1032,6 +1207,11 @@ class ARIABridge:
"defaultVoice": self.voice_engine.default_voice,
"highlightVoice": self.voice_engine.highlight_voice,
"ttsEnabled": getattr(self, "tts_enabled", True),
"ttsEngine": getattr(self, "tts_engine_type", "piper"),
"xttsVoice": getattr(self, "xtts_voice", ""),
"speedRamona": self.voice_engine.speech_speed.get("ramona", 1.0),
"speedThorsten": self.voice_engine.speech_speed.get("thorsten", 1.0),
"whisperModel": self.stt_engine.model_size,
}
with open("/shared/config/voice_config.json", "w") as f:
json.dump(config_data, f, indent=2)
@@ -1249,10 +1429,24 @@ class ARIABridge:
pass
async def _send_to_rvs(self, message: dict) -> None:
"""Sendet eine Nachricht an die App (via RVS)."""
"""Sendet eine Nachricht an die App (via RVS) mit Verbindungs-Check."""
if self.ws_rvs is None:
return
# Ping-Check: Verbindung wirklich aktiv?
try:
pong = await self.ws_rvs.ping()
await asyncio.wait_for(pong, timeout=5)
except Exception:
logger.warning("[rvs] Ping fehlgeschlagen — Verbindung tot, erzwinge Reconnect")
try:
await self.ws_rvs.close()
except Exception:
pass
self.ws_rvs = None
# Reconnect wird vom connect_to_rvs Loop uebernommen
return
try:
await self.ws_rvs.send(json.dumps(message))
except Exception:
@@ -1260,6 +1454,43 @@ class ARIABridge:
# ── Log-Streaming an die App ─────────────────────────────
async def _cancel_via_diagnostic(self) -> None:
"""Ruft das Diagnostic /api/cancel an — dort laeuft die volle Abbruch-Logik
(openclaw doctor --fix mit Docker-Socket)."""
def _do_request():
try:
req = urllib.request.Request(
f"{self._diagnostic_url}/api/cancel",
method="POST",
data=b"",
)
with urllib.request.urlopen(req, timeout=5) as resp:
return resp.status
except Exception as e:
return f"error: {e}"
status = await asyncio.get_event_loop().run_in_executor(None, _do_request)
logger.info("[cancel] Diagnostic /api/cancel: %s", status)
async def _emit_activity(self, activity: str, tool: str = "") -> None:
"""Sendet agent_activity an die App — nur wenn sich der State geaendert hat.
Trailing Agent-Events nach chat:final werden 3s lang unterdrueckt
(nur 'idle' kommt immer durch)."""
if activity != "idle" and self._last_chat_final_at > 0:
since_final = asyncio.get_event_loop().time() - self._last_chat_final_at
if since_final < 3.0:
return
state = (activity, tool)
if state == self._last_activity_state:
return
self._last_activity_state = state
await self._send_to_rvs({
"type": "agent_activity",
"payload": {"activity": activity, "tool": tool},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
async def send_log_to_app(self, source: str, message: str, level: str = "info") -> None:
"""Sendet einen Log-Eintrag an die App (erscheint im Log-Viewer)."""
await self._send_to_rvs({
Executable
+44
View File
@@ -0,0 +1,44 @@
#!/bin/bash
# ARIA Docker Cleanup
#
# Standard: docker builder prune + image prune (sicher, loescht keine Volumes)
# --full: Volle Reinigung inkl. --volumes (Vorsicht bei ungenutzten Volumes!)
#
# Usage:
# ./cleanup.sh # sicherer Cleanup
# ./cleanup.sh --full # aggressiver Cleanup (inkl. Volumes)
set -e
FULL=0
for arg in "$@"; do
case "$arg" in
--full|-f) FULL=1 ;;
-h|--help)
grep '^#' "$0" | sed 's/^# \{0,1\}//'
exit 0
;;
esac
done
echo "── Docker Speicher VOR Cleanup ───────────────────"
docker system df
echo
if [ "$FULL" = "1" ]; then
echo ">>> VOLLE Reinigung (inkl. ungenutzter Volumes)"
read -p "Wirklich? [y/N] " -n 1 -r REPLY
echo
[[ ! $REPLY =~ ^[Yy]$ ]] && { echo "Abgebrochen."; exit 0; }
docker system prune -a --volumes -f
else
echo ">>> Sicherer Cleanup (Build-Cache + ungenutzte Images)"
docker builder prune -a -f
docker image prune -a -f
fi
echo
echo "── Docker Speicher NACH Cleanup ──────────────────"
docker system df
echo
df -h / | head -2
+354 -27
View File
@@ -201,11 +201,18 @@
<button class="btn secondary" onclick="toggleChatFullscreen()" id="btn-chat-fs" style="padding:4px 10px;font-size:11px;">Vollbild</button>
</div>
<div class="chat-box" id="chat-box"></div>
<div id="thinking-indicator" style="display:none;padding:6px 10px;font-size:12px;color:#FFD60A;background:#1E1E2E;border-radius:0 0 6px 6px;margin-top:-8px;margin-bottom:8px;">
<span style="animation:pulse 1s infinite;">&#x1F4AD;</span> <span id="thinking-text">ARIA denkt...</span>
<div id="thinking-indicator" style="display:none;padding:6px 10px;font-size:12px;color:#FFD60A;background:#1E1E2E;border-radius:0 0 6px 6px;margin-top:-8px;margin-bottom:8px;align-items:center;justify-content:space-between;">
<span><span style="animation:pulse 1s infinite;">&#x1F4AD;</span> <span id="thinking-text">ARIA denkt...</span></span>
<button class="btn secondary" onclick="cancelRequest()" style="padding:2px 10px;font-size:11px;color:#FF3B30;border-color:#FF3B30;">Abbrechen</button>
</div>
<div id="diag-pending-attachments" style="display:none;padding:6px 10px;background:#1E1E2E;border-radius:6px 6px 0 0;margin-bottom:-4px;display:flex;gap:6px;flex-wrap:wrap;align-items:center;">
</div>
<div class="input-row">
<input type="text" id="chat-input" placeholder="Nachricht an ARIA...">
<label class="btn secondary" style="padding:6px 10px;cursor:pointer;font-size:14px;" title="Datei anhaengen">
&#x1F4CE;
<input type="file" id="diag-file-input" multiple accept="image/*,application/pdf,.doc,.docx,.txt" style="display:none;" onchange="handleDiagFileSelect(this.files)">
</label>
<input type="text" id="chat-input" placeholder="Nachricht an ARIA..." onpaste="handleDiagPaste(event)">
<button class="btn" id="btn-gw" onclick="testGateway()">Gateway senden</button>
<button class="btn" id="btn-rvs" onclick="testRVS()">Via RVS senden</button>
</div>
@@ -400,6 +407,23 @@
<div class="settings-section">
<h2>Sprachausgabe</h2>
<div class="card" style="max-width:500px;">
<!-- TTS aktiv (global fuer alle Engines) -->
<div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">TTS aktiv:</label>
<label class="toggle"><input type="checkbox" id="diag-tts-enabled" checked onchange="sendVoiceConfig()"><span class="slider"></span></label>
</div>
<!-- TTS Engine Auswahl -->
<div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">TTS Engine:</label>
<select id="diag-tts-engine" onchange="sendVoiceConfig();toggleXTTSPanel()" style="background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;">
<option value="piper">Piper (lokal, CPU, schnell)</option>
<option value="xtts">XTTS v2 (remote, GPU, natuerlich)</option>
</select>
</div>
<!-- Piper Stimmen (nur bei Engine=piper) -->
<div id="piper-panel">
<div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">Standard-Stimme:</label>
<select id="diag-default-voice" onchange="sendVoiceConfig()" style="background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;">
@@ -414,9 +438,87 @@
<option value="ramona">Ramona (weiblich)</option>
</select>
</div>
<div style="display:flex;align-items:center;gap:12px;">
<label style="color:#8888AA;font-size:12px;">TTS aktiv:</label>
<label class="toggle"><input type="checkbox" id="diag-tts-enabled" checked onchange="sendVoiceConfig()"><span class="slider"></span></label>
<div style="margin-bottom:4px;">
<label style="color:#8888AA;font-size:12px;">Ramona Speed: <span id="speed-ramona-label">1.0x</span></label>
</div>
<div style="display:flex;align-items:center;gap:8px;margin-bottom:12px;">
<span style="color:#555570;font-size:11px;">0.5x</span>
<input type="range" id="diag-speed-ramona" min="0.5" max="2.0" step="0.1" value="1.0"
oninput="document.getElementById('speed-ramona-label').textContent=this.value+'x'"
onchange="sendVoiceConfig()"
style="flex:1;accent-color:#0096FF;">
<span style="color:#555570;font-size:11px;">2.0x</span>
</div>
<div style="margin-bottom:4px;">
<label style="color:#8888AA;font-size:12px;">Thorsten Speed: <span id="speed-thorsten-label">1.0x</span></label>
</div>
<div style="display:flex;align-items:center;gap:8px;">
<span style="color:#555570;font-size:11px;">0.5x</span>
<input type="range" id="diag-speed-thorsten" min="0.5" max="2.0" step="0.1" value="1.0"
oninput="document.getElementById('speed-thorsten-label').textContent=this.value+'x'"
onchange="sendVoiceConfig()"
style="flex:1;accent-color:#0096FF;">
<span style="color:#555570;font-size:11px;">2.0x</span>
</div>
</div><!-- /piper-panel -->
<!-- XTTS Panel (nur bei Engine=xtts) -->
<div id="xtts-panel" style="display:none;">
<div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">XTTS Stimme:</label>
<select id="diag-xtts-voice" onchange="sendVoiceConfig()" style="background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;">
<option value="">Standard (XTTS Default)</option>
</select>
<button class="btn secondary" onclick="loadXTTSVoices()" style="padding:4px 10px;font-size:11px;">Laden</button>
</div>
<!-- Voice Cloning -->
<div style="background:#1E1E2E;border-radius:8px;padding:12px;margin-top:8px;">
<div style="color:#0096FF;font-size:13px;font-weight:600;margin-bottom:8px;">Stimme klonen</div>
<div style="color:#8888AA;font-size:11px;margin-bottom:8px;">
Lade ein oder mehrere Audio-Samples hoch (WAV/MP3, min. 6-10 Sekunden).
Mehrere Dateien werden automatisch zusammengefuegt.
</div>
<div style="margin-bottom:8px;">
<input type="text" id="xtts-clone-name" placeholder="Name fuer die Stimme..." style="background:#0D0D1A;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;color:#fff;font-size:13px;width:100%;box-sizing:border-box;">
</div>
<div style="margin-bottom:8px;">
<input type="file" id="xtts-clone-files" accept="audio/*" multiple style="color:#8888AA;font-size:12px;">
</div>
<div style="display:flex;gap:8px;">
<button class="btn" onclick="uploadVoiceSamples()" style="flex:1;">Stimme erstellen</button>
</div>
<div id="xtts-clone-status" style="font-size:11px;color:#555570;margin-top:6px;"></div>
</div>
<!-- XTTS Status -->
<div style="margin-top:8px;font-size:11px;color:#555570;" id="xtts-status">
XTTS-Server: Nicht verbunden (starte xtts/ auf dem Gaming-PC)
</div>
</div>
</div>
</div>
<!-- Whisper (STT) -->
<div class="settings-section">
<h2>Whisper (Spracherkennung)</h2>
<div style="font-size:11px;color:#8888AA;margin-bottom:8px;">
Aenderungen werden sofort an die Bridge gesendet und das Modell neu geladen
(kann bei medium/large 10-30s dauern — waehrend dieser Zeit ist STT kurz pausiert).
</div>
<div class="card" style="max-width:500px;">
<div style="display:flex;align-items:center;gap:12px;margin-bottom:8px;">
<label style="color:#8888AA;font-size:12px;min-width:80px;">Modell:</label>
<select id="diag-whisper-model" onchange="sendVoiceConfig()" style="flex:1;background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;">
<option value="tiny">tiny (39MB, schnell, niedrige Qualitaet)</option>
<option value="base">base (74MB, schnell, ok)</option>
<option value="small">small (244MB, mittel)</option>
<option value="medium" selected>medium (769MB, gut — Empfehlung)</option>
<option value="large-v3">large-v3 (1.5GB, beste Qualitaet, langsam auf CPU)</option>
</select>
</div>
<div style="font-size:10px;color:#555570;">
Tipp: <code>medium</code> ist der beste Kompromiss fuer CPU. <code>large-v3</code> nur bei GPU sinnvoll.
</div>
</div>
</div>
@@ -642,10 +744,54 @@
return;
}
if (msg.type === 'xtts_voices_list') {
const select = document.getElementById('diag-xtts-voice');
// Behalte erste Option (Default)
while (select.options.length > 1) select.remove(1);
for (const v of (msg.payload?.voices || [])) {
const opt = document.createElement('option');
opt.value = v.name;
opt.textContent = `${v.name} (${(v.size / 1024).toFixed(0)}KB)`;
select.appendChild(opt);
}
document.getElementById('xtts-status').textContent = `XTTS: ${msg.payload?.voices?.length || 0} Stimme(n) verfuegbar`;
document.getElementById('xtts-status').style.color = '#34C759';
return;
}
if (msg.type === 'xtts_voice_saved') {
document.getElementById('xtts-clone-status').textContent = `Stimme "${msg.payload?.name}" gespeichert!`;
document.getElementById('xtts-clone-status').style.color = '#34C759';
loadXTTSVoices(); // Liste neu laden
return;
}
if (msg.type === 'voice_config') {
document.getElementById('diag-default-voice').value = msg.defaultVoice || 'ramona';
document.getElementById('diag-highlight-voice').value = msg.highlightVoice || 'thorsten';
document.getElementById('diag-tts-enabled').checked = msg.ttsEnabled !== false;
const sr = msg.speedRamona || 1.0;
const st = msg.speedThorsten || 1.0;
document.getElementById('diag-speed-ramona').value = sr;
document.getElementById('speed-ramona-label').textContent = sr + 'x';
document.getElementById('diag-speed-thorsten').value = st;
document.getElementById('speed-thorsten-label').textContent = st + 'x';
document.getElementById('diag-tts-engine').value = msg.ttsEngine || 'piper';
// XTTS-Voice setzen — Option hinzufuegen falls nicht vorhanden
const xttsSelect = document.getElementById('diag-xtts-voice');
const xttsVoice = msg.xttsVoice || '';
if (xttsVoice && !Array.from(xttsSelect.options).some(o => o.value === xttsVoice)) {
const opt = document.createElement('option');
opt.value = xttsVoice;
opt.textContent = xttsVoice;
xttsSelect.appendChild(opt);
}
xttsSelect.value = xttsVoice;
toggleXTTSPanel();
// Whisper-Modell wiederherstellen (falls gesetzt)
if (msg.whisperModel) {
const wSel = document.getElementById('diag-whisper-model');
if (wSel) wSel.value = msg.whisperModel;
}
return;
}
@@ -774,6 +920,18 @@
else alert('Loeschen fehlgeschlagen: ' + (msg.error || '?'));
return;
}
if (msg.type === 'session_export') {
if (!msg.ok) { alert('Export fehlgeschlagen: ' + (msg.error || '?')); return; }
const blob = new Blob([msg.markdown], { type: 'text/markdown;charset=utf-8' });
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = msg.filename;
document.body.appendChild(a);
a.click();
setTimeout(() => { URL.revokeObjectURL(url); a.remove(); }, 100);
return;
}
if (msg.type === 'active_session') {
updateActiveSessionBar(msg.sessionKey);
loadSessions(); // Tabelle neu rendern
@@ -828,21 +986,39 @@
}
}
function sendDiagAttachments() {
// Alle pending Dateien an RVS senden
for (const f of diagPendingFiles) {
send({ action: 'send_file', name: f.name, type: f.type, size: f.size, base64: f.base64 });
}
if (diagPendingFiles.length > 0) {
addChat('sent', `${diagPendingFiles.length} Anhang/Anhaenge`, 'Datei');
}
diagPendingFiles = [];
renderDiagPending();
}
function testGateway() {
const input = document.getElementById('chat-input');
const text = input.value.trim();
if (!text) return;
addChat('sent', text, 'Gateway direkt');
send({ action: 'test_gateway', text });
if (!text && diagPendingFiles.length === 0) return;
if (diagPendingFiles.length > 0) sendDiagAttachments();
if (text) {
addChat('sent', text, 'Gateway direkt');
send({ action: 'test_gateway', text });
}
input.value = '';
}
function testRVS() {
const input = document.getElementById('chat-input');
const text = input.value.trim();
if (!text) return;
addChat('sent', text, 'via RVS');
send({ action: 'test_rvs', text });
if (!text && diagPendingFiles.length === 0) return;
if (diagPendingFiles.length > 0) sendDiagAttachments();
if (text) {
addChat('sent', text, 'via RVS');
send({ action: 'test_rvs', text });
}
input.value = '';
}
@@ -1128,7 +1304,11 @@
label = 'ARIA schreibt...';
}
indicators.forEach(el => { if (el) el.style.display = 'block'; });
indicators.forEach((el, i) => {
if (!el) return;
// Haupt-Indicator ist flex (Abbrechen-Button rechts), Vollbild-Variante block
el.style.display = i === 0 ? 'flex' : 'block';
});
texts.forEach(el => { if (el) el.textContent = label; });
// Auto-Hide nach 2min (falls idle Event verpasst wird — ARIA arbeitet max 15min)
@@ -1138,12 +1318,127 @@
}, 120000);
}
// ── XTTS Panel ─────────────────────────────
function toggleXTTSPanel() {
const engine = document.getElementById('diag-tts-engine').value;
document.getElementById('piper-panel').style.display = engine === 'piper' ? 'block' : 'none';
document.getElementById('xtts-panel').style.display = engine === 'xtts' ? 'block' : 'none';
if (engine === 'xtts') loadXTTSVoices();
}
function loadXTTSVoices() {
send({ action: 'xtts_list_voices' });
}
function arrayBufferToBase64(buffer) {
const bytes = new Uint8Array(buffer);
let binary = '';
for (let i = 0; i < bytes.length; i += 8192) {
binary += String.fromCharCode.apply(null, bytes.subarray(i, i + 8192));
}
return btoa(binary);
}
async function uploadVoiceSamples() {
const name = document.getElementById('xtts-clone-name').value.trim();
const files = document.getElementById('xtts-clone-files').files;
if (!name) { alert('Bitte einen Namen eingeben'); return; }
if (!files || files.length === 0) { alert('Bitte Audio-Dateien auswaehlen'); return; }
if (files.length > 10) { alert('Maximal 10 Dateien'); return; }
const status = document.getElementById('xtts-clone-status');
status.textContent = `Lade ${files.length} Datei(en)...`;
status.style.color = '#FFD60A';
try {
const samples = [];
for (let i = 0; i < files.length; i++) {
status.textContent = `Lese Datei ${i + 1}/${files.length}: ${files[i].name}...`;
const buffer = await files[i].arrayBuffer();
const base64 = arrayBufferToBase64(buffer);
samples.push({ base64, name: files[i].name, size: files[i].size });
}
const totalSize = samples.reduce((s, f) => s + f.size, 0);
status.textContent = `Sende ${samples.length} Sample(s) (${(totalSize / 1024).toFixed(0)}KB)...`;
send({ action: 'voice_upload', name, samples });
status.textContent = `Gesendet — warte auf Bestaetigung vom XTTS-Server...`;
} catch (err) {
status.textContent = `Fehler: ${err.message}`;
status.style.color = '#FF3B30';
}
}
// ── Diagnostic Anhang-Handling ─────────────
let diagPendingFiles = [];
function handleDiagFileSelect(files) {
for (const file of files) {
const reader = new FileReader();
reader.onload = () => {
const base64 = reader.result.split(',')[1];
diagPendingFiles.push({ name: file.name, type: file.type, size: file.size, base64 });
renderDiagPending();
};
reader.readAsDataURL(file);
}
}
function handleDiagPaste(event) {
const items = event.clipboardData?.items;
if (!items) return;
for (const item of items) {
if (item.kind === 'file') {
event.preventDefault();
const file = item.getAsFile();
if (file) handleDiagFileSelect([file]);
}
}
}
function renderDiagPending() {
const container = document.getElementById('diag-pending-attachments');
if (diagPendingFiles.length === 0) {
container.style.display = 'none';
return;
}
container.style.display = 'flex';
container.innerHTML = diagPendingFiles.map((f, i) => {
const isImage = f.type.startsWith('image/');
const preview = isImage ? `<img src="data:${f.type};base64,${f.base64}" style="width:40px;height:40px;border-radius:4px;object-fit:cover;">` : `<span style="font-size:20px;">&#x1F4C4;</span>`;
return `<div style="position:relative;display:inline-block;">
${preview}
<span onclick="removeDiagPending(${i})" style="position:absolute;top:-4px;right:-4px;width:16px;height:16px;border-radius:8px;background:#FF3B30;color:#fff;font-size:10px;cursor:pointer;display:flex;align-items:center;justify-content:center;">X</span>
</div>`;
}).join('') + `<span style="color:#8888AA;font-size:11px;margin-left:4px;">${diagPendingFiles.length} Datei(en)</span>
<span onclick="diagPendingFiles=[];renderDiagPending();" style="color:#FF3B30;font-size:11px;cursor:pointer;margin-left:8px;">Alle X</span>`;
}
function removeDiagPending(idx) {
diagPendingFiles.splice(idx, 1);
renderDiagPending();
}
// ── Abbrechen ──────────────────────────────
function cancelRequest() {
send({ action: 'cancel_request' });
updateThinkingIndicator({ activity: 'idle' });
addChat('error', 'Anfrage abgebrochen', 'system');
}
// ── Stimmen-Config ──────────────────────────
function sendVoiceConfig() {
const defaultVoice = document.getElementById('diag-default-voice').value;
const highlightVoice = document.getElementById('diag-highlight-voice').value;
const ttsEnabled = document.getElementById('diag-tts-enabled').checked;
send({ action: 'send_voice_config', defaultVoice, highlightVoice, ttsEnabled });
const speedRamona = parseFloat(document.getElementById('diag-speed-ramona').value);
const speedThorsten = parseFloat(document.getElementById('diag-speed-thorsten').value);
const ttsEngine = document.getElementById('diag-tts-engine').value;
const xttsVoice = document.getElementById('diag-xtts-voice').value;
const whisperModel = document.getElementById('diag-whisper-model').value;
send({ action: 'send_voice_config', defaultVoice, highlightVoice, ttsEnabled, speedRamona, speedThorsten, ttsEngine, xttsVoice, whisperModel });
}
// ── Highlight-Trigger ────────────────────────
@@ -1408,32 +1703,60 @@
: '<div style="color:#555570;padding:8px;text-align:center;">Keine Sessions gefunden</div>';
return;
}
let html = '<table style="width:100%;border-collapse:collapse;">';
html += '<tr style="color:#8888AA;font-size:10px;text-align:left;border-bottom:1px solid #1E1E2E;">'
const active = data.sessions.filter(s => !s.archived);
const archives = data.sessions.filter(s => s.archived);
const headerRow = '<tr style="color:#8888AA;font-size:10px;text-align:left;border-bottom:1px solid #1E1E2E;">'
+ '<th style="padding:4px 6px;">Session</th>'
+ '<th style="padding:4px 6px;">Msgs</th>'
+ '<th style="padding:4px 6px;">Zuletzt</th>'
+ '<th style="padding:4px 6px;"></th></tr>';
for (const s of data.sessions) {
const rowFor = (s, opts) => {
const date = s.modified ? new Date(s.modified * 1000).toLocaleString('de-DE', {day:'2-digit',month:'2-digit',hour:'2-digit',minute:'2-digit'}) : '?';
const key = escapeHtml(s.sessionKey || s.path.split('/').pop());
const orphanBadge = s.orphan ? ' <span style="background:#FF3B30;color:#fff;font-size:9px;padding:1px 4px;border-radius:3px;">verwaist</span>' : '';
const archivedBadge = s.archived ? ' <span style="background:#555570;color:#fff;font-size:9px;padding:1px 4px;border-radius:3px;">archiv</span>' : '';
const modelBadge = s.model ? `<div style="font-size:9px;color:#555570;">${escapeHtml(s.model)}</div>` : '';
const isActive = (s.sessionKey === currentActiveSession);
const keyColor = isActive ? '#34C759' : (s.orphan ? '#555570' : '#E0E0F0');
const isActive = (s.sessionKey === currentActiveSession) && !s.archived;
const keyColor = isActive ? '#34C759' : (s.archived || s.orphan ? '#8888AA' : '#E0E0F0');
const activeBadge = isActive ? ' <span style="background:#34C759;color:#000;font-size:9px;padding:1px 4px;border-radius:3px;">aktiv</span>' : '';
const rowBg = isActive ? 'background:rgba(52,199,89,0.08);' : '';
html += `<tr style="border-bottom:1px solid #0D0D1A;cursor:pointer;${rowBg}" onmouseover="this.style.background='#1E1E2E'" onmouseout="this.style.background='${isActive ? 'rgba(52,199,89,0.08)' : ''}'">`
const rowBg = isActive ? 'background:rgba(52,199,89,0.08);' : (s.archived ? 'background:rgba(136,136,170,0.04);' : '');
let actions = '';
if (s.archived) {
// Archive: nur Export + Loeschen (kein Aktivieren — wuerde aktive Session ueberschreiben)
actions = `<button class="btn secondary" onclick="event.stopPropagation();deleteSession('${escapeHtml(s.path)}')" style="padding:2px 6px;font-size:10px;color:#FF6B6B;margin-right:2px;" title="Archiv endgueltig loeschen">X</button>`
+ `<button class="btn secondary" onclick="event.stopPropagation();exportSession('${escapeHtml(s.path)}','${escapeHtml(s.sessionKey)}')" style="padding:2px 6px;font-size:10px;color:#8888AA;" title="Als Markdown exportieren">&#x2B07;</button>`;
} else {
actions = (isActive ? '' : `<button class="btn secondary" onclick="event.stopPropagation();activateSession('${escapeHtml(s.sessionKey)}')" style="padding:2px 6px;font-size:10px;color:#34C759;margin-right:2px;" title="Aktivieren">&#9654;</button>`)
+ `<button class="btn secondary" onclick="event.stopPropagation();deleteSession('${escapeHtml(s.path)}')" style="padding:2px 6px;font-size:10px;color:#FF6B6B;margin-right:2px;" title="Loeschen">X</button>`
+ `<button class="btn secondary" onclick="event.stopPropagation();exportSession('${escapeHtml(s.path)}','${escapeHtml(s.sessionKey)}')" style="padding:2px 6px;font-size:10px;color:#8888AA;" title="Als Markdown exportieren">&#x2B07;</button>`;
}
return `<tr style="border-bottom:1px solid #0D0D1A;cursor:pointer;${rowBg}" onmouseover="this.style.background='#1E1E2E'" onmouseout="this.style.background='${isActive ? 'rgba(52,199,89,0.08)' : (s.archived ? 'rgba(136,136,170,0.04)' : '')}'">`
+ `<td style="padding:4px 6px;" onclick="viewSession('${escapeHtml(s.path)}')">`
+ `<div style="color:${keyColor};">${key}${activeBadge}${orphanBadge}</div>${modelBadge}</td>`
+ `<div style="color:${keyColor};">${key}${activeBadge}${orphanBadge}${archivedBadge}</div>${modelBadge}</td>`
+ `<td style="padding:4px 6px;color:#8888AA;">${s.lines}</td>`
+ `<td style="padding:4px 6px;color:#8888AA;font-size:10px;">${date}</td>`
+ `<td style="padding:4px 6px;white-space:nowrap;">`
+ (isActive ? '' : `<button class="btn secondary" onclick="event.stopPropagation();activateSession('${escapeHtml(s.sessionKey)}')" style="padding:2px 6px;font-size:10px;color:#34C759;margin-right:2px;" title="Aktivieren">&#9654;</button>`)
+ `<button class="btn secondary" onclick="event.stopPropagation();deleteSession('${escapeHtml(s.path)}')" style="padding:2px 6px;font-size:10px;color:#FF6B6B;" title="Loeschen">X</button>`
+ `</td></tr>`;
}
+ `<td style="padding:4px 6px;white-space:nowrap;">${actions}</td></tr>`;
};
let html = '<table style="width:100%;border-collapse:collapse;">' + headerRow;
for (const s of active) html += rowFor(s);
html += '</table>';
if (archives.length > 0) {
html += `<details style="margin-top:12px;" ${archives.length <= 5 ? 'open' : ''}>`
+ `<summary style="color:#8888AA;font-size:11px;cursor:pointer;padding:4px 0;">`
+ `Archivierte Versionen (${archives.length}) — von OpenClaw beim Session-Reset gesichert`
+ `</summary>`
+ `<table style="width:100%;border-collapse:collapse;margin-top:6px;">` + headerRow;
for (const s of archives) html += rowFor(s);
html += '</table></details>';
}
container.innerHTML = html;
}
@@ -1494,6 +1817,10 @@
send({ action: 'delete_session', sessionPath: path });
}
function exportSession(path, sessionKey) {
send({ action: 'export_session', sessionPath: path, sessionKey });
}
function activateSession(sessionKey) {
send({ action: 'set_active_session', sessionKey });
}
+271 -33
View File
@@ -37,15 +37,41 @@ const state = {
};
const SESSION_KEY_FILE = "/data/active-session";
// /data Verzeichnis sicherstellen (Volume Mount)
try { fs.mkdirSync("/data", { recursive: true }); } catch {}
try { fs.mkdirSync("/data", { recursive: true }); } catch (e) {
console.error(`[startup] /data mkdir fehlgeschlagen: ${e.message}`);
}
// sessionFromFile zeigt an, ob der aktive Key aus der Datei kam.
// Wenn true, darf resolveActiveSession NICHT mehr auto-picken (Wahl respektieren).
let sessionFromFile = false;
let activeSessionKey = (() => {
try {
const saved = fs.readFileSync(SESSION_KEY_FILE, "utf-8").trim();
if (saved) { console.log(`[startup] Gespeicherte Session geladen: '${saved}'`); return saved; }
} catch {}
if (saved) {
console.log(`[startup] Gespeicherte Session geladen: '${saved}'`);
sessionFromFile = true;
return saved;
}
} catch (e) {
console.error(`[startup] SESSION_KEY_FILE read: ${e.code || e.message}`);
}
console.log("[startup] Keine gespeicherte Session — Fallback 'main'");
return "main";
})();
// Atomic write: temp-file + rename, laute Logs bei Fehler.
function persistActiveSession(key) {
try {
const tmp = SESSION_KEY_FILE + ".tmp";
fs.writeFileSync(tmp, key);
fs.renameSync(tmp, SESSION_KEY_FILE);
sessionFromFile = true;
console.log(`[session] Aktive Session persistiert: '${key}'`);
return true;
} catch (e) {
console.error(`[session] FEHLER beim Persistieren von '${key}': ${e.message}`);
return false;
}
}
const logs = [];
let gatewayWs = null;
let rvsWs = null;
@@ -56,6 +82,12 @@ const browserClients = new Set();
let pipelineActive = false;
let pipelineStartTime = 0;
// Nach chat:final kommen oft noch Trailing Agent-Events. Waehrend dieses
// Fensters unterdruecken wir agent_activity-Broadcasts, damit der
// Thinking-Indicator nicht wieder anspringt.
let lastChatFinalAt = 0;
const SETTLED_WINDOW_MS = 3000;
function plog(message, level) {
const elapsed = pipelineActive ? `+${Date.now() - pipelineStartTime}ms` : "";
const entry = { ts: new Date().toISOString(), level: level || "info", source: "pipeline", message: `${elapsed ? `[${elapsed}] ` : ""}${message}` };
@@ -91,6 +123,9 @@ function pipelineEnd(ok, detail) {
}
plog(`━━━ Pipeline Ende ━━━`);
pipelineActive = false;
// Thinking-Indikator IMMER zuruecksetzen — auch bei Timeout/Fehler/Abbruch
broadcast({ type: "agent_activity", activity: "idle" });
pendingMessageTime = 0;
}
// ── Auto-Restart bei Netzwerk-Namespace-Verlust ──────
@@ -257,8 +292,10 @@ async function connectGateway() {
state.gateway.handshakeOk = false;
gatewayWs = null;
broadcastState();
// Stuck "ARIA denkt..." vermeiden, falls Gateway waehrend Pipeline abkackt
if (pipelineActive) pipelineEnd(false, `Gateway-Verbindung verloren (${code})`);
else broadcast({ type: "agent_activity", activity: "idle" });
checkGatewayHealth();
// Auto-Reconnect nach 5s
setTimeout(connectGateway, 5000);
});
@@ -325,17 +362,22 @@ function handleGatewayMessage(msg) {
broadcast({ type: "chat_delta", delta, payload });
}
// Nach chat:final trickeln noch Aufraeum-Events rein — unterdruecken,
// damit der Thinking-Indicator nicht wieder anspringt.
const settled = lastChatFinalAt && (Date.now() - lastChatFinalAt) < SETTLED_WINDOW_MS;
// Tool-Nutzung erkennen und broadcasten
if (stream === "tool_use" || data.type === "tool_use") {
const toolName = data.name || data.tool || payload.tool || "";
if (toolName) {
if (toolName && !settled) {
broadcast({ type: "agent_activity", activity: "tool", tool: toolName, data });
log("info", "gateway", `Tool: ${toolName}`);
}
}
// Genereller Activity-Heartbeat (ARIA denkt)
broadcast({ type: "agent_activity", activity: stream || "thinking" });
if (!settled) {
broadcast({ type: "agent_activity", activity: stream || "thinking" });
}
updateAgentActivity();
return;
}
@@ -350,11 +392,17 @@ function handleGatewayMessage(msg) {
if (runId && seenFinalRuns.has(runId)) return; // Duplikat
if (runId) { seenFinalRuns.add(runId); setTimeout(() => seenFinalRuns.delete(runId), 60000); }
log("info", "gateway", `ANTWORT: "${text.slice(0, 200)}"`);
lastChatFinalAt = Date.now();
if (pipelineActive) pipelineEnd(true, `"${text.slice(0, 120)}"`);
broadcast({ type: "chat_final", text, payload });
broadcast({ type: "agent_activity", activity: "idle" });
pendingMessageTime = 0; // Watchdog: Antwort erhalten
updateAgentActivity();
// Antwort in Backup-Log schreiben
try {
const entry = JSON.stringify({ ts: Date.now(), role: "assistant", text: text.slice(0, 2000), session: activeSessionKey }) + "\n";
fs.appendFileSync("/shared/config/chat_backup.jsonl", entry);
} catch {}
return;
}
@@ -367,6 +415,7 @@ function handleGatewayMessage(msg) {
const error = payload.error || text || "Unbekannt";
log("error", "gateway", `Chat-Fehler: ${error}`);
if (pipelineActive) pipelineEnd(false, error);
else broadcast({ type: "agent_activity", activity: "idle" });
broadcast({ type: "chat_error", error, payload });
return;
}
@@ -387,7 +436,9 @@ function handleGatewayMessage(msg) {
if (runId) { seenFinalRuns.add(runId); setTimeout(() => seenFinalRuns.delete(runId), 60000); }
const text = extractChatText(payload) || payload.text || "";
log("info", "gateway", `ANTWORT: "${text.slice(0, 200)}"`);
lastChatFinalAt = Date.now();
if (pipelineActive) pipelineEnd(true, `"${text.slice(0, 120)}"`);
else broadcast({ type: "agent_activity", activity: "idle" });
broadcast({ type: "chat_final", text, payload });
return;
}
@@ -395,6 +446,7 @@ function handleGatewayMessage(msg) {
const error = payload.error || payload.message || "Unbekannt";
log("error", "gateway", `Chat-Fehler: ${error}`);
if (pipelineActive) pipelineEnd(false, error);
else broadcast({ type: "agent_activity", activity: "idle" });
broadcast({ type: "chat_error", error, payload });
return;
}
@@ -428,6 +480,12 @@ function sendToGateway(text, isPipeline) {
log("debug", "gateway", `RAW >>> ${payload}`);
gatewayWs.send(payload);
pendingMessageTime = Date.now(); // Watchdog: Nachricht gesendet
// Nachricht sofort in Backup-Log schreiben (OpenClaw speichert erst nach Run-Ende)
try {
fs.mkdirSync("/shared/config", { recursive: true });
const entry = JSON.stringify({ ts: Date.now(), role: "user", text, session: activeSessionKey }) + "\n";
fs.appendFileSync("/shared/config/chat_backup.jsonl", entry);
} catch {}
log("info", "gateway", `chat.send [${reqId}]: "${text}"`);
if (isPipeline) plog(`chat.send [${reqId}] an Gateway gesendet — warte auf ACK...`);
@@ -549,6 +607,31 @@ function connectRVS(forcePlain) {
});
}
function sendToRVS_withResponse(sendType, sendPayload, expectType, clientWs) {
if (!RVS_HOST || !RVS_TOKEN) return;
const proto = RVS_TLS === "true" ? "wss" : "ws";
const url = `${proto}://${RVS_HOST}:${RVS_PORT}?token=${RVS_TOKEN}`;
const freshWs = new WebSocket(url);
const timeout = setTimeout(() => {
try { freshWs.close(); } catch (_) {}
clientWs.send(JSON.stringify({ type: expectType, payload: { voices: [], error: "Timeout" }, timestamp: Date.now() }));
}, 15000);
freshWs.on("open", () => {
freshWs.send(JSON.stringify({ type: sendType, payload: sendPayload, timestamp: Date.now() }));
});
freshWs.on("message", (raw) => {
try {
const resp = JSON.parse(raw.toString());
if (resp.type === expectType) {
clearTimeout(timeout);
clientWs.send(JSON.stringify(resp));
setTimeout(() => { try { freshWs.close(); } catch (_) {} }, 1000);
}
} catch {}
});
freshWs.on("error", () => {});
}
function sendToRVS_raw(msgObj) {
if (!RVS_HOST || !RVS_TOKEN) return;
const proto = RVS_TLS === "true" ? "wss" : "ws";
@@ -1005,6 +1088,7 @@ function waitForMessage(ws, timeoutMs) {
let lastAgentActivity = Date.now();
let watchdogWarned = false;
let watchdogFixAttempted = false;
let pendingMessageTime = 0; // Wann wurde die letzte Nachricht gesendet
function updateAgentActivity() {
@@ -1024,20 +1108,37 @@ setInterval(async () => {
broadcast({ type: "watchdog", status: "warning", waitingMs, message: "ARIA reagiert nicht — moeglicherweise stuck Run" });
}
// Nach 5min: Auto-Fix anbieten
if (waitingMs > 300000 && watchdogWarned) {
// Nach 5min: doctor --fix
if (waitingMs > 300000 && watchdogWarned && !watchdogFixAttempted) {
watchdogFixAttempted = true;
log("error", "server", "Watchdog: 5min ohne Antwort — fuehre openclaw doctor --fix aus");
broadcast({ type: "watchdog", status: "fixing", message: "Auto-Fix: openclaw doctor --fix" });
try {
await dockerExec("aria-core", "openclaw doctor --fix 2>/dev/null || true");
log("info", "server", "Watchdog: doctor --fix ausgefuehrt");
broadcast({ type: "watchdog", status: "fixed", message: "doctor --fix ausgefuehrt — sende Nachricht erneut" });
broadcast({ type: "watchdog", status: "fixed", message: "doctor --fix ausgefuehrt — warte auf Antwort..." });
} catch (err) {
log("error", "server", `Watchdog: doctor --fix fehlgeschlagen: ${err.message}`);
broadcast({ type: "watchdog", status: "error", message: `Auto-Fix fehlgeschlagen: ${err.message}` });
}
pendingMessageTime = 0; // Reset
}
// Nach 8min: Container neustarten
if (waitingMs > 480000 && watchdogFixAttempted) {
log("error", "server", "Watchdog: 8min ohne Antwort — starte aria-core + aria-proxy neu");
broadcast({ type: "watchdog", status: "restarting", message: "Container-Restart: aria-core + aria-proxy" });
try {
const { execSync } = require("child_process");
execSync("docker restart aria-core aria-proxy", { timeout: 60000 });
log("info", "server", "Watchdog: Container neugestartet");
broadcast({ type: "watchdog", status: "restarted", message: "Container neugestartet — warte auf Gateway-Reconnect..." });
// Gateway wird sich automatisch neu verbinden
} catch (err) {
log("error", "server", `Watchdog: Container-Restart fehlgeschlagen: ${err.message}`);
broadcast({ type: "watchdog", status: "error", message: `Restart fehlgeschlagen: ${err.message}` });
}
pendingMessageTime = 0;
watchdogWarned = false;
watchdogFixAttempted = false;
}
}, 30000);
@@ -1055,6 +1156,16 @@ const server = http.createServer((req, res) => {
} else if (req.url === "/api/session") {
res.writeHead(200, { "Content-Type": "application/json" });
res.end(JSON.stringify({ sessionKey: activeSessionKey }));
} else if (req.url === "/api/cancel" && req.method === "POST") {
log("warn", "server", "HTTP /api/cancel — Cancel-Request (von Bridge)");
pendingMessageTime = 0;
watchdogWarned = false;
watchdogFixAttempted = false;
if (pipelineActive) pipelineEnd(false, "Vom Benutzer abgebrochen (App)");
else broadcast({ type: "agent_activity", activity: "idle" });
dockerExec("aria-core", "openclaw doctor --fix 2>/dev/null || true").catch(() => {});
res.writeHead(200, { "Content-Type": "application/json" });
res.end(JSON.stringify({ ok: true }));
} else if (req.url.startsWith("/shared/")) {
// Dateien aus Shared Volume ausliefern (Bilder, Uploads)
const filePath = decodeURIComponent(req.url);
@@ -1127,21 +1238,54 @@ wss.on("connection", (ws) => {
if (ws._sshSock) ws._sshSock.write(msg.data);
} else if (msg.action === "live_ssh_close") {
if (ws._sshSock) { ws._sshSock.end(); ws._sshSock = null; }
} else if (msg.action === "send_file") {
// Datei von Diagnostic an Bridge via RVS senden
sendToRVS_raw({
type: "file",
payload: { name: msg.name, type: msg.type, size: msg.size, base64: msg.base64 },
timestamp: Date.now(),
});
log("info", "server", `Datei gesendet: ${msg.name} (${msg.type})`);
} else if (msg.action === "cancel_request") {
// Laufende Anfrage abbrechen — doctor --fix beendet stuck runs
log("warn", "server", "Anfrage abgebrochen — fuehre doctor --fix aus");
pendingMessageTime = 0;
watchdogWarned = false;
watchdogFixAttempted = false;
if (pipelineActive) pipelineEnd(false, "Vom Benutzer abgebrochen");
broadcast({ type: "agent_activity", activity: "idle" });
dockerExec("aria-core", "openclaw doctor --fix 2>/dev/null || true").catch(() => {});
} else if (msg.action === "voice_upload") {
// Voice-Samples an XTTS-Bridge via RVS weiterleiten, auf Bestätigung warten
log("info", "server", `Voice-Upload '${msg.name}' (${(msg.samples || []).length} Samples) sende an RVS...`);
sendToRVS_withResponse("voice_upload", { name: msg.name, samples: msg.samples }, "xtts_voice_saved", ws);
} else if (msg.action === "xtts_list_voices") {
// Frische Verbindung die auf Antwort wartet
sendToRVS_withResponse("xtts_list_voices", {}, "xtts_voices_list", ws);
} else if (msg.action === "get_voice_config") {
handleGetVoiceConfig(ws);
} else if (msg.action === "send_voice_config") {
// Stimmen-Config persistent speichern + an Bridge via RVS senden
// Bestehende Config lesen um Felder zu mergen die dieser Call nicht setzt
let existing = {};
try { existing = JSON.parse(fs.readFileSync("/shared/config/voice_config.json", "utf-8")); } catch {}
const voiceConfig = {
...existing,
defaultVoice: msg.defaultVoice || "ramona",
highlightVoice: msg.highlightVoice || "thorsten",
ttsEnabled: msg.ttsEnabled !== false,
ttsEngine: msg.ttsEngine || "piper",
xttsVoice: msg.xttsVoice || "",
speedRamona: msg.speedRamona || 1.0,
speedThorsten: msg.speedThorsten || 1.0,
};
if (msg.whisperModel !== undefined) voiceConfig.whisperModel = msg.whisperModel;
try {
fs.mkdirSync("/shared/config", { recursive: true });
fs.writeFileSync("/shared/config/voice_config.json", JSON.stringify(voiceConfig, null, 2));
} catch {}
sendToRVS_raw({ type: "config", payload: voiceConfig, timestamp: Date.now() });
log("info", "server", `Voice-Config gespeichert+gesendet: default=${voiceConfig.defaultVoice}, highlight=${voiceConfig.highlightVoice}, tts=${voiceConfig.ttsEnabled}`);
log("info", "server", `Voice-Config gespeichert+gesendet: default=${voiceConfig.defaultVoice}, whisper=${voiceConfig.whisperModel || "-"}`);
} else if (msg.action === "get_triggers") {
handleGetTriggers(ws);
} else if (msg.action === "save_triggers") {
@@ -1158,6 +1302,8 @@ wss.on("connection", (ws) => {
handleListSessions(ws);
} else if (msg.action === "read_session") {
handleReadSession(ws, msg.sessionPath);
} else if (msg.action === "export_session") {
handleExportSession(ws, msg.sessionPath, msg.sessionKey);
} else if (msg.action === "delete_session") {
handleDeleteSession(ws, msg.sessionPath);
} else if (msg.action === "set_active_session") {
@@ -1429,17 +1575,17 @@ async function handleListSessions(clientWs) {
try {
log("info", "server", "Lade Sessions aus aria-core...");
// sessions.json als Index lesen + Datei-Details holen
// sessions.json als Index lesen + Datei-Details holen (inkl. .reset.* Archive)
const raw = await dockerExec("aria-core", `
cat ${SESSIONS_DIR}/sessions.json 2>/dev/null || echo '{}' &&
echo '===FILE_DETAILS===' &&
for f in ${SESSIONS_DIR}/*.jsonl; do
for f in ${SESSIONS_DIR}/*.jsonl ${SESSIONS_DIR}/*.jsonl.reset.*; do
[ -f "$f" ] || continue
name=$(basename "$f")
lines=$(wc -l < "$f" 2>/dev/null || echo 0)
msgs=$(grep -cE '"role":"(user|assistant)"' "$f" 2>/dev/null || echo 0)
size=$(du -h "$f" 2>/dev/null | cut -f1)
modified=$(stat -c '%Y' "$f" 2>/dev/null || echo 0)
echo "FILE:$name|LINES:$lines|SIZE:$size|MODIFIED:$modified"
echo "FILE:$name|LINES:$msgs|SIZE:$size|MODIFIED:$modified"
done
`.trim());
@@ -1494,8 +1640,29 @@ async function handleListSessions(clientWs) {
delete fileDetails[filename];
}
// Dateien die nicht im Index stehen (Waisen / Reset-Files)
// Dateien die nicht im Index stehen (Waisen ODER Reset-Archive)
for (const [filename, details] of Object.entries(fileDetails)) {
// .jsonl.reset.<ISO-Timestamp>.Z → archivierte Session (OpenClaw-Reset)
const resetMatch = filename.match(/^([a-f0-9-]+)\.jsonl\.reset\.(.+)\.Z$/);
if (resetMatch) {
const id = resetMatch[1];
// Timestamp ISO-8601 parsen (in Dateinamen: : durch - ersetzt)
// z.B. 2026-04-18T09-49-44.814 → 2026-04-18T09:49:44.814Z
const tsStr = resetMatch[2].replace(/T(\d{2})-(\d{2})-(\d{2})/, "T$1:$2:$3");
const resetAt = Math.floor(new Date(tsStr + "Z").getTime() / 1000) || parseInt(details.MODIFIED) || 0;
sessions.push({
path: `${SESSIONS_DIR}/${filename}`,
sessionKey: id.slice(0, 8) + "… (archiv)",
sessionId: id,
lines: parseInt(details.LINES) || 0,
size: details.SIZE || "?",
modified: resetAt,
archived: true,
resetAt,
});
continue;
}
// Echte Waisen (UUID.jsonl ohne Eintrag in sessions.json)
const id = filename.replace(".jsonl", "");
sessions.push({
path: `${SESSIONS_DIR}/${filename}`,
@@ -1540,6 +1707,68 @@ async function handleReadSession(clientWs, sessionPath) {
}
}
async function handleExportSession(clientWs, sessionPath, sessionKey) {
if (!sessionPath || sessionPath.includes("..") || !sessionPath.startsWith(SESSIONS_DIR)) {
clientWs.send(JSON.stringify({ type: "session_export", ok: false, error: "Ungueltiger Pfad" }));
return;
}
try {
const safePath = sessionPath.replace(/'/g, "");
const raw = await dockerExec("aria-core", `cat '${safePath}'`);
const lines = raw.split("\n").filter(l => l.trim());
const blocks = [];
for (const line of lines) {
let obj;
try { obj = JSON.parse(line); } catch { continue; }
if (obj.type !== "message" || !obj.message) continue;
const role = obj.message.role;
if (role !== "user" && role !== "assistant") continue;
let text = "";
const content = obj.message.content;
if (typeof content === "string") text = content;
else if (Array.isArray(content)) text = content.filter(c => c.type === "text").map(c => c.text || "").join("\n");
if (!text) continue;
if (role === "user") {
text = text.replace(/^Sender \(untrusted metadata\):[\s\S]*?```[\s\S]*?```\s*\n*/m, "").trim();
text = text.replace(/^\[.*?\]\s*/, "").trim();
} else {
text = text.replace(/^\[\[reply_to_\w+\]\]\s*/g, "").trim();
}
if (!text) continue;
const ts = obj.message.timestamp || obj.timestamp || 0;
const when = ts ? new Date(ts).toISOString().replace("T", " ").slice(0, 19) : "";
const heading = role === "user" ? "## 🧑 User" : "## 🤖 ARIA";
blocks.push(`${heading}${when ? `${when}` : ""}\n\n${text}`);
}
const exportedAt = new Date().toISOString().replace("T", " ").slice(0, 19);
const title = sessionKey || sessionPath.split("/").pop().replace(".jsonl", "");
const markdown = [
`# Session: ${title}`,
``,
`Exportiert: ${exportedAt} `,
`Quelle: ${sessionPath}`,
``,
`---`,
``,
blocks.join("\n\n---\n\n"),
``,
].join("\n");
const safeKey = (sessionKey || "session").replace(/[^a-zA-Z0-9_-]/g, "_");
const filename = `${exportedAt.slice(0, 10)}_${safeKey}.md`;
clientWs.send(JSON.stringify({ type: "session_export", ok: true, filename, markdown }));
log("info", "server", `Session exportiert: ${filename} (${blocks.length} Nachrichten)`);
} catch (err) {
log("error", "server", `Session-Export fehlgeschlagen: ${err.message}`);
clientWs.send(JSON.stringify({ type: "session_export", ok: false, error: err.message }));
}
}
async function handleDeleteSession(clientWs, sessionPath) {
if (!sessionPath || sessionPath.includes("..") || !sessionPath.startsWith(SESSIONS_DIR)) {
clientWs.send(JSON.stringify({ type: "session_deleted", ok: false, error: "Ungueltiger Pfad" }));
@@ -1580,13 +1809,11 @@ async function handleDeleteSession(clientWs, sessionPath) {
}
// ── Session-Aufloesung: letzte aktive Session finden ────
// Wird nach Gateway-(Re-)Connect aufgerufen. Darf die explizit gewaehlte
// Session NIE ueberschreiben — nur beim absoluten Erststart auto-picken.
async function resolveActiveSession() {
// Nur bei Fallback-Key "main" automatisch aufloesen — gespeicherte Wahl respektieren
const hasSavedSession = (() => {
try { return !!fs.readFileSync(SESSION_KEY_FILE, "utf-8").trim(); } catch { return false; }
})();
if (hasSavedSession && activeSessionKey !== "main") {
log("info", "server", `Gespeicherte Session '${activeSessionKey}' wird beibehalten`);
if (sessionFromFile) {
log("info", "server", `Session '${activeSessionKey}' aus /data — keine Auto-Wahl`);
return;
}
@@ -1605,10 +1832,19 @@ async function resolveActiveSession() {
const keys = entries.map(e => (e.key || e.sessionKey || e.name || "?").replace(/^agent:main:/, ""));
log("info", "server", `Verfuegbare Sessions: [${keys.join(", ")}]`);
// Neueste Session nehmen
// Neueste Session nehmen — aber user-definierte bevorzugen.
// aria-bridge / aria-diagnostic werden von den Services auto-erstellt;
// bei erstem Start soll lieber eine "echte" Session gewaehlt werden,
// falls vorhanden.
const AUTO_KEYS = new Set(["aria-bridge", "aria-diagnostic"]);
const normalise = (e) => (e.key || e.sessionKey || e.name || "").replace(/^agent:main:/, "");
const userEntries = entries.filter(e => !AUTO_KEYS.has(normalise(e)));
const pool = userEntries.length > 0 ? userEntries : entries;
let newest = null;
let newestTime = 0;
for (const entry of entries) {
for (const entry of pool) {
const t = entry.updatedAt || entry.createdAt || 0;
if (t >= newestTime) {
newestTime = t;
@@ -1617,12 +1853,11 @@ async function resolveActiveSession() {
}
if (newest) {
const rawKey = newest.key || newest.sessionKey || newest.name || "";
const key = rawKey.replace(/^agent:main:/, "");
const key = normalise(newest);
if (key) {
activeSessionKey = key;
try { fs.writeFileSync(SESSION_KEY_FILE, activeSessionKey); } catch {}
log("info", "server", `Aktive Session auf neueste gewechselt: '${activeSessionKey}'`);
persistActiveSession(activeSessionKey);
log("info", "server", `Auto-Wahl Erststart: '${activeSessionKey}'`);
for (const c of browserClients) {
c.send(JSON.stringify({ type: "active_session", sessionKey: activeSessionKey }));
}
@@ -1711,8 +1946,11 @@ function handleSetActiveSession(clientWs, sessionKey) {
return;
}
activeSessionKey = sessionKey;
try { fs.writeFileSync(SESSION_KEY_FILE, activeSessionKey); } catch {}
log("info", "server", `Aktive Session: ${activeSessionKey}`);
const ok = persistActiveSession(activeSessionKey);
log("info", "server", `Aktive Session: ${activeSessionKey}${ok ? "" : " (WARN: nicht persistiert!)"}`);
if (!ok) {
clientWs.send(JSON.stringify({ type: "active_session", ok: false, sessionKey: activeSessionKey, error: "Persistierung fehlgeschlagen — /data Volume pruefen" }));
}
// Allen Clients mitteilen
for (const c of browserClients) {
c.send(JSON.stringify({ type: "active_session", sessionKey: activeSessionKey }));
@@ -1728,7 +1966,7 @@ async function handleCreateSession(clientWs, sessionName) {
try {
// Session wird automatisch erstellt wenn man die erste Nachricht sendet
activeSessionKey = sessionName;
try { fs.writeFileSync(SESSION_KEY_FILE, activeSessionKey); } catch {}
persistActiveSession(activeSessionKey);
log("info", "server", `Neue Session erstellt und aktiviert: ${sessionName}`);
// Allen Clients mitteilen
for (const c of browserClients) {
+1 -1
View File
@@ -18,7 +18,7 @@ services:
claude-max-api"
volumes:
- ~/.claude:/root/.claude # Claude CLI Auth (Credentials in /root/.claude/.credentials.json)
- ./aria-data/ssh:/root/.ssh:ro # SSH Keys fuer VM-Zugriff (aria-wohnung)
- ./aria-data/ssh:/root/.ssh # SSH Keys fuer VM-Zugriff (aria-wohnung, rw fuer ARIA)
- aria-shared:/shared # Shared Volume fuer Datei-Austausch (Uploads von App)
environment:
- HOST=0.0.0.0
+57 -19
View File
@@ -1,26 +1,64 @@
# erledigt bildupload ghet noch nicht.
# ende
# erledigt
sprachnachrichten werden nicht als zweite nachricht dargestellt, damit man weiß was man gesendet hat
# ende
# ARIA Issues & Features
## Erledigt
# erledigt cache leeren, bilder werden nicht neu geladen beim antippen.
autoload geht nicht
# ende
- [x] Bildupload funktioniert (Shared Volume /shared/uploads/)
- [x] Sprachnachrichten werden als Text angezeigt (STT → Chat-Bubble)
- [x] Cache leeren + Auto-Download von Anhaengen
- [x] ARIA liest Nachrichten vor (TTS via Piper)
- [x] Autoscroll zur letzten Nachricht (inverted FlatList)
- [x] Bilder im Chat groesser + Vollbild-Vorschau
- [x] Ohr-Button → Gespraechsmodus (Auto-Aufnahme nach ARIA-Antwort)
- [x] Play-Button in ARIA-Nachrichten fuer Sprachwiedergabe
- [x] Chat-Suche in der App (Lupe in Statusleiste)
- [x] Watchdog mit Container-Restart (2min Warnung → 5min doctor --fix → 8min Restart)
- [x] Abbrechen-Button im Diagnostic Chat
- [x] Nachrichten Backup on-the-fly (/shared/config/chat_backup.jsonl)
- [x] Grosse Nachrichten satzweise aufteilen fuer TTS
- [x] RVS Nachrichten vom Smartphone gehen durch
- [x] Stimmen-Einstellungen (Ramona/Thorsten, Speed pro Stimme)
- [x] Highlight-Trigger konfigurierbar in Diagnostic
- [x] XTTS v2 Integration (Gaming-PC, GPU, Voice Cloning)
- [x] XTTS Voice Cloning (Audio-Samples hochladen, eigene Stimme)
- [x] TTS Engine waehlbar (Piper/XTTS) in Diagnostic + App
- [x] Auto-Update System (APK via RVS WebSocket)
- [x] Auto-Update: APK-Installation via FileProvider
- [x] Auto-Update: "Auf Updates pruefen" Button in App-Einstellungen
- [x] Audio-Queue (sequentielle Wiedergabe, kein Ueberlappen)
- [x] Textnachrichten werden von ARIA beantwortet (Bridge chat handler fix)
- [x] Mehrere Anhaenge + Text vor dem Senden (Pending-Vorschau)
- [x] Paste-Support fuer Bilder in Diagnostic Chat
- [x] Markdown-Bereinigung fuer TTS (fett, kursiv, code, links, etc.)
- [x] SSH Volume read-write fuer Proxy (kein -F Workaround mehr)
- [x] Diagnostic: Sessions als Markdown exportieren (Download-Button)
- [x] Speech Gate: Aufnahme wird verworfen wenn keine Sprache erkannt (verhindert dass Umgebungsgeraeusche an Whisper gehen)
- [x] Session-Persistenz: Gewaehlte Session bleibt ueber Container-Restarts erhalten (sessionFromFile-Flag, atomic write)
- [x] Diagnostic: "ARIA denkt..." bleibt nicht mehr stehen (pipelineEnd broadcastet immer idle, auch bei Timeout/Fehler/Disconnect)
- [x] App: "ARIA denkt..." Indicator + Abbrechen-Button (Bridge spiegelt agent_activity via RVS)
- [x] Whisper STT: Model-Auswahl in Diagnostic (tiny/base/small/medium/large-v3), Hot-Reload in Bridge, Default auf medium
- [x] App: Audio-Aufnahme explizit 16kHz mono (spart Resample, optimal fuer Whisper)
- [x] Gespraechsmodus: Speech-Gate strenger (-28dB / 500ms) — keine Umgebungsgeraeusche mehr
- [x] Gespraechsmodus: Max-Dauer 30s pro Aufnahme, Cache-Cleanup alter Files, Messages-Array gekappt (500)
- [x] Diagnostic: Archivierte Session-Versionen (.reset.*) werden angezeigt + exportierbar — OpenClaw resettet Sessions bei erster Nutzung nach Container-Restart, Inhalt ist aber in .reset.<timestamp> Dateien gesichert
- [x] tools/export-jsonl-to-md.js: CLI-Konverter fuer beliebige Session-JSONL zu Markdown
wenn man auf das ohr zum hören klickt stürzt ab
## Offen
# erledigt aria liest die nachrichten nicht vor
#ende
### Bugs (Prioritaet)
- [ ] App: Audioausgabe hoert ab und zu einfach auf (mitten im Satz oder zwischen Chunks)
# erledigt autoscroll geht doch noch nicht zur letzten nachricht
unserer memory brain
# ende
### App Features
- [ ] Wake Word on-device (Porcupine "ARIA" Keyword, Phase 2 — passives Lauschen)
- [ ] Chat-History zuverlaessiger laden (AsyncStorage Race Condition)
- [ ] Background Audio Service (TTS auch bei minimierter App)
# erledigt bilder im chat größer darstellen
# ende
### TTS / Audio
- [ ] XTTS Audio-Streaming (PCM-Stream statt WAV-Dateien, eliminiert Stottern komplett)
- [ ] Audio-Normalisierung (Lautstaerke zwischen Chunks angleichen)
- [ ] Piper Voices Download ueber Diagnostic (neue Sprachen/Stimmen)
die viper voices downloaden über die diagnostic
# ende
### Architektur
- [ ] Bilder: Claude Vision direkt nutzen (aktuell nur Dateipfad an ARIA)
- [ ] Auto-Compacting und Memory/Brain Verwaltung (SQLite?)
- [ ] Diagnostic: System-Info Tab (Container-Status, Disk, RAM, CPU)
- [ ] RVS Zombie-Connections endgueltig loesen
+23 -1
View File
@@ -76,8 +76,11 @@ echo -e " ${GREEN}✓${NC} SettingsScreen → Version $VERSION"
echo ""
# ── APK bauen ─────────────────────────────────
echo -e "${GREEN}[2/5] APK bauen...${NC}"
echo -e "${GREEN}[2/5] APK bauen (Cache leeren + Build)...${NC}"
cd android
# Metro + Gradle Cache leeren damit neue Version sauber eingebettet wird
rm -rf node_modules/.cache 2>/dev/null
cd android && ./gradlew clean 2>/dev/null; cd ..
./build.sh release
cd ..
@@ -170,6 +173,24 @@ else
exit 1
fi
# ── Auto-Update: APK auf RVS-Server kopieren ─
RVS_UPDATE_HOST="${RVS_UPDATE_HOST:-}"
if [ -n "$RVS_UPDATE_HOST" ]; then
echo -e "${GREEN}[6/6] APK auf RVS-Server kopieren (Auto-Update)...${NC}"
# Alte APKs auf dem RVS loeschen, dann neue hochladen
ssh "$RVS_UPDATE_HOST" "rm -f ~/ARIA-AGENT/rvs/updates/ARIA-*.apk" 2>/dev/null
scp "$APK_PATH" "${RVS_UPDATE_HOST}:~/ARIA-AGENT/rvs/updates/${APK_NAME}" 2>/dev/null
if [ $? -eq 0 ]; then
echo -e " ${GREEN}${NC} APK auf RVS-Server kopiert (alte Versionen geloescht)"
else
echo -e " ${YELLOW}APK konnte nicht auf RVS kopiert werden (RVS_UPDATE_HOST=$RVS_UPDATE_HOST)${NC}"
echo -e " ${YELLOW}Manuell: scp $APK_PATH $RVS_UPDATE_HOST:~/ARIA-AGENT/rvs/updates/${APK_NAME}${NC}"
fi
else
echo -e "${YELLOW}Auto-Update uebersprungen (RVS_UPDATE_HOST nicht gesetzt)${NC}"
echo -e "${YELLOW}Setze RVS_UPDATE_HOST in .env fuer automatische Verteilung${NC}"
fi
# ── Fertig ────────────────────────────────────
echo ""
echo -e "${GREEN}╔═══════════════════════════════════════════════════╗${NC}"
@@ -177,4 +198,5 @@ echo -e "${GREEN}║ Release $TAG ist live!$(printf '%*s' $((27 - ${#TAG})) ''
echo -e "${GREEN}╠═══════════════════════════════════════════════════╣${NC}"
echo -e "${GREEN}${NC} $GITEA_URL/$GITEA_REPO/releases/tag/$TAG"
echo -e "${GREEN}${NC} APK: $APK_NAME ($APK_SIZE)"
echo -e "${GREEN}${NC} Auto-Update: ${RVS_UPDATE_HOST:-nicht konfiguriert}"
echo -e "${GREEN}╚═══════════════════════════════════════════════════╝${NC}"
+2
View File
@@ -4,5 +4,7 @@ services:
ports:
- "${RVS_PORT:-443}:3000"
restart: always
volumes:
- ./updates:/updates # APK-Dateien fuer Auto-Update
environment:
- MAX_SESSIONS=10
+114 -1
View File
@@ -1,15 +1,22 @@
"use strict";
const { WebSocketServer } = require("ws");
const fs = require("fs");
const path = require("path");
// ── Konfiguration aus Umgebungsvariablen ────────────────────────────
const PORT = parseInt(process.env.PORT || "3000", 10);
const MAX_SESSIONS = parseInt(process.env.MAX_SESSIONS || "10", 10);
const UPDATES_DIR = process.env.UPDATES_DIR || "/updates";
// Kein Polling — APK wird manuell per git pull bereitgestellt
// Erlaubte Nachrichtentypen — alles andere wird verworfen
const ALLOWED_TYPES = new Set([
"chat", "audio", "file", "location", "mode", "log", "event", "heartbeat",
"file_request", "file_response", "file_saved", "stt_result", "config",
"file_request", "file_response", "file_saved", "stt_result", "config", "tts_request",
"xtts_request", "xtts_response", "xtts_list_voices", "xtts_voices_list", "voice_upload", "xtts_voice_saved",
"update_check", "update_available", "update_download", "update_data",
"agent_activity", "cancel_request",
]);
// Token-Raum: token -> { clients: Set<ws> }
@@ -46,6 +53,9 @@ const wss = new WebSocketServer({ port: PORT });
wss.on("listening", () => {
log(`RVS läuft auf Port ${PORT} | Max Sessions: ${MAX_SESSIONS}`);
// Beim Start pruefen ob eine APK da ist
const apkInfo = getLatestAPK();
if (apkInfo) log(`APK bereit: v${apkInfo.version} (${(fs.statSync(apkInfo.path).size / 1024 / 1024).toFixed(1)}MB)`);
});
wss.on("connection", (ws, req) => {
@@ -107,6 +117,52 @@ function registerClient(ws, token) {
return;
}
// Update-Check: direkt an den anfragenden Client antworten (nicht relay'en)
if (msg.type === "update_check") {
const clientVersion = msg.payload?.version || "0.0.0.0";
const apkInfo = getLatestAPK();
if (apkInfo && compareVersions(apkInfo.version, clientVersion) > 0) {
ws.send(JSON.stringify({
type: "update_available",
payload: {
version: apkInfo.version,
downloadUrl: `/update/latest.apk`,
size: fs.statSync(apkInfo.path).size,
},
timestamp: Date.now(),
}));
}
return;
}
// Update-Download: APK als Base64 ueber WebSocket senden
if (msg.type === "update_download") {
const apkInfo = getLatestAPK();
if (!apkInfo) {
ws.send(JSON.stringify({ type: "update_data", payload: { error: "Keine APK verfuegbar" }, timestamp: Date.now() }));
return;
}
try {
const data = fs.readFileSync(apkInfo.path);
const base64 = data.toString("base64");
const sizeMB = (data.length / 1024 / 1024).toFixed(1);
log(`APK sende: v${apkInfo.version} (${sizeMB}MB) an Client`);
ws.send(JSON.stringify({
type: "update_data",
payload: {
version: apkInfo.version,
base64,
size: data.length,
fileName: `ARIA-v${apkInfo.version}.apk`,
},
timestamp: Date.now(),
}));
} catch (err) {
ws.send(JSON.stringify({ type: "update_data", payload: { error: err.message }, timestamp: Date.now() }));
}
return;
}
// An alle anderen Clients im Raum weiterleiten
for (const client of room.clients) {
if (client !== ws && client.readyState === 1) {
@@ -167,6 +223,63 @@ wss.on("close", () => {
clearInterval(cleanup);
});
// ── Auto-Update: APK-Erkennung + Push ──────────────────────────────
let latestVersion = null;
function getLatestAPK() {
try {
if (!fs.existsSync(UPDATES_DIR)) return null;
const files = fs.readdirSync(UPDATES_DIR)
.filter(f => f.endsWith(".apk"))
.map(f => {
// ARIA-v0.0.2.3.apk oder ARIA-Cockpit-release.apk
const match = f.match(/(\d+\.\d+\.\d+[\.\d]*)/);
return { file: f, path: path.join(UPDATES_DIR, f), version: match ? match[1] : null };
})
.filter(f => f.version)
.sort((a, b) => compareVersions(b.version, a.version)); // Neueste zuerst
return files[0] || null;
} catch {
return null;
}
}
function compareVersions(a, b) {
const pa = a.split(".").map(Number);
const pb = b.split(".").map(Number);
for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
const diff = (pa[i] || 0) - (pb[i] || 0);
if (diff !== 0) return diff;
}
return 0;
}
function notifyClientsAboutUpdate(apkInfo) {
const msg = JSON.stringify({
type: "update_available",
payload: {
version: apkInfo.version,
downloadUrl: `/update/latest.apk`,
size: fs.statSync(apkInfo.path).size,
},
timestamp: Date.now(),
});
// An alle Clients in allen Rooms senden
for (const [, room] of rooms) {
for (const client of room.clients) {
if (client.readyState === 1) {
client.send(msg);
}
}
}
log(`Update-Benachrichtigung gesendet: v${apkInfo.version} (${rooms.size} Raum/Raeume)`);
}
// Kein Polling — Update-Check passiert on-demand (update_check Message von App)
// ── Sauberes Herunterfahren ─────────────────────────────────────────
process.on("SIGTERM", () => {
View File
+74
View File
@@ -0,0 +1,74 @@
#!/usr/bin/env node
/**
* Exportiert ein OpenClaw Session-JSONL (auch .reset.*) als Markdown.
*
* Nutzung:
* node export-jsonl-to-md.js <input.jsonl> [output.md]
*
* Oder direkt aus dem aria-core Container:
* docker exec aria-core cat /home/node/.openclaw/agents/main/sessions/<ID>.jsonl.reset.<TS> \
* | node export-jsonl-to-md.js - > output.md
*/
const fs = require("fs");
const inputArg = process.argv[2];
const outputArg = process.argv[3];
if (!inputArg) {
console.error("Usage: export-jsonl-to-md.js <input.jsonl|-> [output.md]");
process.exit(1);
}
const raw = inputArg === "-" ? fs.readFileSync(0, "utf-8") : fs.readFileSync(inputArg, "utf-8");
const lines = raw.split("\n").filter(l => l.trim());
const blocks = [];
for (const line of lines) {
let obj;
try { obj = JSON.parse(line); } catch { continue; }
if (obj.type !== "message" || !obj.message) continue;
const role = obj.message.role;
if (role !== "user" && role !== "assistant") continue;
let text = "";
const content = obj.message.content;
if (typeof content === "string") text = content;
else if (Array.isArray(content)) text = content.filter(c => c.type === "text").map(c => c.text || "").join("\n");
if (!text) continue;
if (role === "user") {
text = text.replace(/^Sender \(untrusted metadata\):[\s\S]*?```[\s\S]*?```\s*\n*/m, "").trim();
text = text.replace(/^\[.*?\]\s*/, "").trim();
} else {
text = text.replace(/^\[\[reply_to_\w+\]\]\s*/g, "").trim();
}
if (!text) continue;
const ts = obj.message.timestamp || obj.timestamp || 0;
const when = ts ? new Date(ts).toISOString().replace("T", " ").slice(0, 19) : "";
const heading = role === "user" ? "## 🧑 User" : "## 🤖 ARIA";
blocks.push(`${heading}${when ? `${when}` : ""}\n\n${text}`);
}
const exportedAt = new Date().toISOString().replace("T", " ").slice(0, 19);
const title = inputArg === "-" ? "Session" : inputArg.split("/").pop().replace(/\.jsonl.*/, "");
const md = [
`# Session: ${title}`,
``,
`Exportiert: ${exportedAt} `,
`Quelle: ${inputArg === "-" ? "stdin" : inputArg}`,
`Nachrichten: ${blocks.length}`,
``,
`---`,
``,
blocks.join("\n\n---\n\n"),
``,
].join("\n");
if (outputArg) {
fs.writeFileSync(outputArg, md);
console.error(`OK: ${blocks.length} Nachrichten → ${outputArg}`);
} else {
process.stdout.write(md);
}
+11
View File
@@ -0,0 +1,11 @@
# ════════════════════════════════════════════════
# ARIA XTTS v2 — Konfiguration
# Kopieren nach .env und anpassen
# ════════════════════════════════════════════════
# RVS Verbindung (gleiche Daten wie auf der ARIA-VM)
RVS_HOST=mobil.hacker-net.de
RVS_PORT=444
RVS_TLS=true
RVS_TLS_FALLBACK=true
RVS_TOKEN=dein_token_hier
+5
View File
@@ -0,0 +1,5 @@
FROM node:22-alpine
WORKDIR /app
COPY bridge.js package.json ./
RUN npm install --production
CMD ["node", "bridge.js"]
+312
View File
@@ -0,0 +1,312 @@
/**
* ARIA XTTS Bridge — Verbindet XTTS v2 Server mit dem RVS
*
* Empfaengt tts_request ueber RVS → rendert Audio via XTTS API → sendet zurueck
* Empfaengt voice_upload → speichert Voice-Sample fuer Cloning
* Empfaengt xtts_list_voices → listet verfuegbare Stimmen
*/
const WebSocket = require("ws");
const http = require("http");
const https = require("https");
const fs = require("fs");
const path = require("path");
const XTTS_API_URL = process.env.XTTS_API_URL || "http://xtts:8000";
const RVS_HOST = process.env.RVS_HOST || "";
const RVS_PORT = process.env.RVS_PORT || "443";
const RVS_TLS = process.env.RVS_TLS || "true";
const RVS_TLS_FALLBACK = process.env.RVS_TLS_FALLBACK || "true";
const RVS_TOKEN = process.env.RVS_TOKEN || "";
const VOICES_DIR = "/voices";
function log(msg) {
console.log(`[${new Date().toISOString()}] ${msg}`);
}
// ── RVS Verbindung ──────────────────────────────────
let rvsWs = null;
let retryDelay = 2;
function connectRVS(forcePlain) {
if (!RVS_HOST || !RVS_TOKEN) {
log("RVS nicht konfiguriert — beende");
process.exit(1);
}
const useTls = RVS_TLS === "true" && !forcePlain;
const proto = useTls ? "wss" : "ws";
const url = `${proto}://${RVS_HOST}:${RVS_PORT}?token=${RVS_TOKEN}`;
log(`Verbinde zu RVS: ${proto}://${RVS_HOST}:${RVS_PORT}`);
const ws = new WebSocket(url);
ws.on("open", () => {
log("RVS verbunden — warte auf TTS-Requests");
rvsWs = ws;
retryDelay = 2;
// Keepalive
setInterval(() => {
if (ws.readyState === WebSocket.OPEN) {
ws.ping();
ws.send(JSON.stringify({ type: "heartbeat", timestamp: Date.now() }));
}
}, 25000);
});
ws.on("message", async (raw) => {
try {
const msg = JSON.parse(raw.toString());
if (msg.type === "xtts_request") {
await handleTTSRequest(msg.payload);
} else if (msg.type === "voice_upload") {
await handleVoiceUpload(msg.payload);
} else if (msg.type === "xtts_list_voices") {
await handleListVoices();
}
} catch (err) {
log(`Fehler: ${err.message}`);
}
});
ws.on("close", () => {
log("RVS Verbindung geschlossen");
rvsWs = null;
setTimeout(() => connectRVS(), Math.min(retryDelay * 1000, 30000));
retryDelay = Math.min(retryDelay * 2, 30);
});
ws.on("error", (err) => {
log(`RVS Fehler: ${err.message}`);
if (useTls && RVS_TLS_FALLBACK === "true") {
log("TLS fehlgeschlagen — Fallback auf ws://");
ws.removeAllListeners();
try { ws.close(); } catch (_) {}
connectRVS(true);
}
});
}
// ── TTS Request Handler ─────────────────────────────
async function handleTTSRequest(payload) {
const { text, voice, requestId, language } = payload;
if (!text) return;
// Markdown + Sonderzeichen entfernen fuer natuerliche Sprache
let cleanText = text
.replace(/\*\*([^*]+)\*\*/g, "$1") // **fett** → fett
.replace(/\*([^*]+)\*/g, "$1") // *kursiv* → kursiv
.replace(/`([^`]+)`/g, "$1") // `code` → code
.replace(/```[\s\S]*?```/g, "") // Code-Bloecke entfernen
.replace(/\[([^\]]+)\]\([^)]+\)/g, "$1") // [text](url) → text
.replace(/#{1,6}\s*/g, "") // ### Ueberschriften → entfernen
.replace(/>\s*/g, "") // > Zitate → entfernen
.replace(/[-*]\s+/g, "") // - Listen → entfernen
.replace(/\n{2,}/g, ". ") // Mehrere Newlines → Punkt
.replace(/\n/g, ", ") // Einzelne Newlines → Komma
.replace(/\s{2,}/g, " ") // Mehrfach-Leerzeichen
.replace(/["""„]/g, "") // Anfuehrungszeichen entfernen
.replace(/\(\)/g, "") // Leere Klammern
.trim();
// Text in Saetze aufteilen, dann zu Chunks von 2-3 Saetzen zusammenfassen
// (mehr Kontext = konsistentere Stimme/Lautstaerke, aber nicht zu lang fuer WebSocket)
const sentences = cleanText.split(/(?<=[.!?])\s+/)
.map(s => s.trim())
.filter(s => s.length > 0)
.map(s => s.replace(/[.]+$/, '')); // Punkt am Ende entfernen
const MAX_CHUNK_CHARS = 150; // Max ~150 Zeichen pro Chunk (schnelles Rendering, Preloading reicht)
const chunks = [];
let currentChunk = '';
for (const sentence of sentences) {
if (currentChunk && (currentChunk.length + sentence.length + 2) > MAX_CHUNK_CHARS) {
chunks.push(currentChunk);
currentChunk = sentence;
} else {
currentChunk = currentChunk ? currentChunk + ', ' + sentence : sentence;
}
}
if (currentChunk) chunks.push(currentChunk);
if (chunks.length === 0) return;
log(`TTS-Request: "${cleanText.slice(0, 60)}..." (${sentences.length} Saetze → ${chunks.length} Chunks, voice: ${voice || "default"}, lang: ${language || "de"})`);
try {
const voiceSample = voice ? path.join(VOICES_DIR, `${voice}.wav`) : null;
const hasCustomVoice = voiceSample && fs.existsSync(voiceSample);
// Streaming: Chunk rendern → sofort senden → naechster Chunk
// App spielt mit Preloading-Queue nahtlos ab
let sentCount = 0;
for (let i = 0; i < chunks.length; i++) {
const chunk = chunks[i];
try {
const audioBuffer = await callXTTSAPI(chunk, language || "de", hasCustomVoice ? voiceSample : null);
if (audioBuffer && audioBuffer.length > 100) {
log(`TTS [${i + 1}/${chunks.length}]: ${(audioBuffer.length / 1024).toFixed(0)}KB — "${chunk.slice(0, 50)}"`);
sendToRVS({
type: "xtts_response",
payload: {
requestId: `${requestId || ""}_${i}`,
base64: audioBuffer.toString("base64"),
mimeType: "audio/wav",
voice: voice || "default",
engine: "xtts",
part: i + 1,
totalParts: chunks.length,
},
timestamp: Date.now(),
});
sentCount++;
}
} catch (chunkErr) {
log(`TTS [${i + 1}/${chunks.length}] Fehler: ${chunkErr.message} — ueberspringe`);
}
}
log(`TTS komplett: ${sentCount}/${chunks.length} Chunks gestreamt`);
} catch (err) {
log(`TTS Fehler: ${err.message}`);
sendToRVS({
type: "xtts_response",
payload: { requestId, error: err.message },
timestamp: Date.now(),
});
}
}
function callXTTSAPI(text, language, speakerWav) {
return new Promise((resolve, reject) => {
const body = JSON.stringify({
text,
language,
speaker_wav: speakerWav || "",
});
const url = new URL(`${XTTS_API_URL}/tts_to_audio/`);
const options = {
hostname: url.hostname,
port: url.port,
path: url.pathname,
method: "POST",
headers: {
"Content-Type": "application/json",
"Content-Length": Buffer.byteLength(body),
},
timeout: 60000,
};
const req = http.request(options, (res) => {
const chunks = [];
res.on("data", (chunk) => chunks.push(chunk));
res.on("end", () => {
if (res.statusCode === 200) {
resolve(Buffer.concat(chunks));
} else {
reject(new Error(`XTTS API HTTP ${res.statusCode}: ${Buffer.concat(chunks).toString().slice(0, 200)}`));
}
});
});
req.on("error", reject);
req.on("timeout", () => { req.destroy(); reject(new Error("XTTS API Timeout (60s)")); });
req.write(body);
req.end();
});
}
// ── Voice Upload Handler ────────────────────────────
async function handleVoiceUpload(payload) {
const { name, samples } = payload;
if (!name || !samples || !Array.isArray(samples) || samples.length === 0) {
log("Voice Upload: Ungueltige Daten");
return;
}
log(`Voice Upload: "${name}" (${samples.length} Samples)`);
try {
// Alle Samples zusammenfuegen
const buffers = samples.map(s => Buffer.from(s.base64, "base64"));
const combined = Buffer.concat(buffers);
// Als WAV speichern
fs.mkdirSync(VOICES_DIR, { recursive: true });
const filePath = path.join(VOICES_DIR, `${name.replace(/[^a-zA-Z0-9_-]/g, "_")}.wav`);
fs.writeFileSync(filePath, combined);
log(`Voice gespeichert: ${filePath} (${(combined.length / 1024).toFixed(0)}KB)`);
sendToRVS({
type: "xtts_voice_saved",
payload: { name, size: combined.length, path: filePath },
timestamp: Date.now(),
});
} catch (err) {
log(`Voice Upload Fehler: ${err.message}`);
}
}
// ── Voice List Handler ──────────────────────────────
async function handleListVoices() {
try {
const files = fs.existsSync(VOICES_DIR)
? fs.readdirSync(VOICES_DIR).filter(f => f.endsWith(".wav"))
: [];
const voices = files.map(f => ({
name: path.basename(f, ".wav"),
file: f,
size: fs.statSync(path.join(VOICES_DIR, f)).size,
}));
log(`Stimmen: ${voices.length} verfuegbar`);
sendToRVS({
type: "xtts_voices_list",
payload: { voices },
timestamp: Date.now(),
});
} catch (err) {
log(`Stimmen-Liste Fehler: ${err.message}`);
}
}
// ── RVS senden ──────────────────────────────────────
function sendToRVS(msg) {
if (rvsWs && rvsWs.readyState === WebSocket.OPEN) {
rvsWs.send(JSON.stringify(msg));
}
}
// ── Start ───────────────────────────────────────────
log("ARIA XTTS Bridge startet...");
log(`XTTS API: ${XTTS_API_URL}`);
log(`RVS: ${RVS_HOST}:${RVS_PORT}`);
// Warten bis XTTS API erreichbar ist
function waitForXTTS(callback, attempts) {
if (attempts <= 0) { log("XTTS API nicht erreichbar — starte trotzdem"); callback(); return; }
http.get(`${XTTS_API_URL}/docs`, (res) => {
log(`XTTS API erreichbar (HTTP ${res.statusCode})`);
callback();
}).on("error", () => {
log(`XTTS API noch nicht bereit — warte (${attempts} Versuche uebrig)...`);
setTimeout(() => waitForXTTS(callback, attempts - 1), 10000); // 10s statt 5s (Model laden dauert)
});
}
waitForXTTS(() => connectRVS(), 30); // Max 5min warten
+56
View File
@@ -0,0 +1,56 @@
# ════════════════════════════════════════════════
# ARIA XTTS v2 — GPU TTS Server
# Laeuft auf dem Gaming-PC (RTX 3060)
# Verbindet sich zum RVS fuer TTS-Requests
# ════════════════════════════════════════════════
#
# Voraussetzungen:
# - Docker Desktop mit WSL2
# - NVIDIA Container Toolkit
# - .env mit RVS-Verbindungsdaten
#
# Start: docker compose up -d
# Test: curl http://localhost:8000/docs
# ════════════════════════════════════════════════
services:
# ─── XTTS v2 API Server (GPU) ─────────────────
xtts:
image: daswer123/xtts-api-server:latest
container_name: aria-xtts
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
ports:
- "8000:8020"
volumes:
- xtts-models:/app/xtts_models # Model-Cache (~2GB)
- ./voices:/voices # Custom Voice Samples
environment:
- COQUI_TOS_AGREED=1
restart: unless-stopped
# ─── XTTS Bridge (verbindet zu RVS) ───────────
xtts-bridge:
build: .
container_name: aria-xtts-bridge
depends_on:
- xtts
volumes:
- ./voices:/voices # Shared mit XTTS-Server
environment:
- XTTS_API_URL=http://xtts:8020
- RVS_HOST=${RVS_HOST}
- RVS_PORT=${RVS_PORT:-443}
- RVS_TLS=${RVS_TLS:-true}
- RVS_TLS_FALLBACK=${RVS_TLS_FALLBACK:-true}
- RVS_TOKEN=${RVS_TOKEN}
restart: unless-stopped
volumes:
xtts-models:
+8
View File
@@ -0,0 +1,8 @@
{
"name": "aria-xtts-bridge",
"version": "1.0.0",
"private": true,
"dependencies": {
"ws": "^8.16.0"
}
}