Compare commits

102 Commits

Author SHA1 Message Date
duffyduck cd390a4115 release: bump version to 0.0.3.8 2026-04-18 11:41:12 +02:00
duffyduck a65ed579d2 feat: Whisper model selector + 16kHz mono recording
- App: AudioSamplingRateAndroid 16000 + AudioChannelsAndroid 1
  → Whisper receives its target format directly, no more resampling
- Bridge: STTEngine.reload() reloads the model at runtime
  (tiny/base/small/medium/large-v3)
- Bridge: Config message triggers a hot reload when whisperModel changes
- Bridge: Default set to 'medium' (better than 'small' at similar latency)
- Diagnostic: New section "Whisper (Spracherkennung)" with dropdown,
  auto-save on selection; the saved value is applied on load
- Diagnostic/Server: send_voice_config merges whisperModel into voice_config.json
- aria.env.example: WHISPER_MODEL + WHISPER_LANGUAGE documented

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:37:27 +02:00
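The hot reload described in this commit could look roughly like the following sketch. Only the names `STTEngine`, `reload()` and `whisperModel` come from the commit message; the injected `loader`, the `handle_config` helper and the return values are assumptions made so the example runs without Whisper installed.

```python
from typing import Callable

VALID_MODELS = ("tiny", "base", "small", "medium", "large-v3")


class STTEngine:
    """Holds the loaded model. `loader` is a hypothetical stand-in for the real model loader."""

    def __init__(self, loader: Callable[[str], object], model_name: str = "medium") -> None:
        self._loader = loader
        self.model_name = model_name
        self.model = loader(model_name)

    def reload(self, model_name: str) -> bool:
        """Swap the model at runtime. Returns True only if a reload happened."""
        if model_name not in VALID_MODELS:
            raise ValueError(f"unknown Whisper model: {model_name!r}")
        if model_name == self.model_name:
            return False  # unchanged, skip the expensive load
        new_model = self._loader(model_name)  # load first so a failure keeps the old model
        self.model, self.model_name = new_model, model_name
        return True


def handle_config(engine: STTEngine, message: dict) -> bool:
    """Hot-reload only when the config message carries a different whisperModel."""
    requested = message.get("whisperModel")
    if requested is None:
        return False
    return engine.reload(requested)
```

Loading the new model before replacing the old one means a failed download leaves the engine usable; whether the real Bridge does this is not stated in the commit.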
duffyduck 2ad1f57382 feat: Thinking indicator + cancel button in the app
- Bridge: _emit_activity() mirrors OpenClaw agent events as agent_activity
  to RVS and deduplicates state changes. chat:final/error send idle.
- Bridge: New cancel_request handler calls Diagnostic /api/cancel over HTTP.
- Diagnostic: New POST /api/cancel endpoint (same logic as WS cancel).
- RVS: agent_activity + cancel_request added to ALLOWED_TYPES.
- App: Yellow indicator above the input bar with text depending on the activity,
  red cancel button. Cancel sends cancel_request via RVS.
- issue.md: Completed bug fixes + features consolidated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:22:02 +02:00
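A minimal sketch of the deduplicated state forwarding mentioned for `_emit_activity()`. The class name `ActivityEmitter`, the `send` callback and the `state` field of the payload are assumptions; only the message type `agent_activity` and the `idle` state appear in the commit.

```python
from typing import Callable, Optional


class ActivityEmitter:
    """Forwards activity states, dropping consecutive duplicates."""

    def __init__(self, send: Callable[[dict], None]) -> None:
        self._send = send  # hypothetical transport, e.g. a wrapper around the RVS socket
        self._last_state: Optional[str] = None

    def emit(self, state: str) -> bool:
        """Send the state if it differs from the last one. Returns True if sent."""
        if state == self._last_state:
            return False
        self._last_state = state
        self._send({"type": "agent_activity", "state": state})
        return True
```

With this shape, a burst of identical events from the agent results in a single message to the app, and the final `idle` is sent once.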
duffyduck 58e3cfd3e6 feat: Session export as markdown in Diagnostic
- ⬇ button per session row; also exports inactive sessions
- Server parses JSONL and extracts user/assistant messages with timestamps
- Metadata prefix is removed; Markdown is generated with a # session header
- Browser download via Blob + download attribute

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:14:15 +02:00
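The export step can be sketched as follows. The JSONL schema used here (`role`, `content`, `timestamp` in epoch seconds) is an assumption, since the commit does not show the real session format, and the metadata-prefix removal is left out because its format is not described.

```python
import json
from datetime import datetime, timezone


def export_session_markdown(session_name: str, jsonl_text: str) -> str:
    """Turn JSONL chat entries into a Markdown transcript (assumed schema)."""
    lines = [f"# Session {session_name}", ""]
    for raw in jsonl_text.splitlines():
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip blank or damaged lines
        if not isinstance(entry, dict) or entry.get("role") not in ("user", "assistant"):
            continue  # keep only user/assistant messages
        ts = entry.get("timestamp")
        if isinstance(ts, (int, float)):
            stamp = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
        else:
            stamp = "unknown time"
        lines += [f"**{entry['role']}** ({stamp})", "", str(entry.get("content", "")), ""]
    return "\n".join(lines)
```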
duffyduck 7de4ee8f5b fix: Stuck "ARIA denkt..." indicator after pipeline ends
- pipelineEnd() now broadcasts agent_activity: idle unconditionally
- chat:error and chat:final paths broadcast idle outside of active pipeline
- Gateway close event ends active pipeline + broadcasts idle
- Prevents indicator from hanging after timeout/error/disconnect

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:11:12 +02:00
duffyduck 213edac3a7 fix: Session persistence - respect user choice across container restarts
- sessionFromFile flag prevents auto-pick after first start
- Atomic write (temp + rename) with loud error logging
- Auto-pick filters out aria-bridge/aria-diagnostic when user sessions exist
- handleSetActiveSession reports persistence failures to client

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:03:26 +02:00
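The "temp + rename" write mentioned above is a standard pattern; a minimal sketch, assuming the active session is stored as a small JSON file (the file layout and the `activeSession` key are hypothetical):

```python
import json
import os
import tempfile


def save_active_session(path: str, session_id: str) -> None:
    """Write the file atomically: readers see either the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump({"activeSession": session_id}, handle)
            handle.flush()
            os.fsync(handle.fileno())  # make sure the bytes are on disk before renaming
        os.replace(tmp_path, path)  # atomic replace
    except OSError:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # do not leave half-written temp files behind
        raise  # surface the failure so the caller can report it to the client
```

Re-raising instead of swallowing the error matches the commit's intent of logging loudly and reporting persistence failures.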
duffyduck acc13aef6b fix: Speech gate - only send recording if actual speech detected
- VAD_SPEECH_THRESHOLD_DB = -35 (louder than silence threshold)
- Needs 300ms of speech before counting as real speech
- Recording discarded if only background noise detected
- Prevents sending garbage to Whisper in conversation mode

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 18:20:05 +02:00
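A sketch of the speech gate, using the threshold and duration from the commit. It assumes the recorder delivers one level reading in dB per fixed-length frame and that the 300 ms are counted cumulatively; the commit does not say whether the speech must be contiguous.

```python
from typing import Iterable

VAD_SPEECH_THRESHOLD_DB = -35.0  # from the commit message
MIN_SPEECH_MS = 300              # from the commit message


def contains_speech(
    levels_db: Iterable[float],
    frame_ms: int,
    threshold_db: float = VAD_SPEECH_THRESHOLD_DB,
    min_speech_ms: int = MIN_SPEECH_MS,
) -> bool:
    """Return True once enough frames above the threshold have accumulated."""
    speech_ms = 0
    for level in levels_db:
        if level > threshold_db:
            speech_ms += frame_ms
            if speech_ms >= min_speech_ms:
                return True
    return False  # only background noise: the recording should be discarded
```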
duffyduck 4bbc6f7787 release: bump version to 0.0.3.7 2026-04-11 13:18:17 +02:00
duffyduck 20f2ea1829 fix: Conversation mode starts recording immediately when ear button tapped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 13:15:26 +02:00
duffyduck 2d23f0668b docs: update README with conversation mode, multi-attachments, markdown cleanup
- Conversation mode (ear button) documented in App Features
- Multiple attachments + paste support
- Markdown cleanup for TTS
- Auto-Update FileProvider + check button
- Roadmap: 22 items in Phase 1 completed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:43:09 +02:00
duffyduck d6030a06b7 docs: update issue.md - move completed items, clean up open list
28 items completed, 10 remaining open

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:23:04 +02:00
duffyduck 0df76e2af6 release: bump version to 0.0.3.6 2026-04-11 12:19:00 +02:00
duffyduck f80fe1df93 fix: Inverted FlatList - newest messages always visible at bottom
- No more scrollToEnd/scrollToIndex needed
- FlatList inverted=true with reversed data
- New messages appear at bottom automatically
- User scrolls up to see history (natural chat behavior)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:17:32 +02:00
duffyduck cff421bc53 release: bump version to 0.0.3.5 2026-04-11 12:13:41 +02:00
duffyduck bca925d385 fix: Use scrollToIndex with viewPosition:1 for reliable bottom scroll
- scrollToIndex targets last message at bottom of viewport
- onScrollToIndexFailed fallback to scrollToEnd
- More reliable than scrollToEnd with dynamic heights

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:12:24 +02:00
duffyduck 9abde89805 release: bump version to 0.0.3.4 2026-04-11 12:09:23 +02:00
duffyduck ea4f639fcb fix: Auto-scroll retry with multiple delays (100, 300, 600, 1000ms)
FlatList needs time to render - single setTimeout(150) was unreliable.
Now tries 4 times on initial load, 2 times for new messages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:07:54 +02:00
duffyduck 64cd5f7d52 release: bump version to 0.0.3.3 2026-04-11 12:04:37 +02:00
duffyduck 843ebe1d8f fix: Remove duplicate closure ending in ChatScreen (build error)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:03:20 +02:00
duffyduck 764619f076 fix: Comprehensive markdown/formatting cleanup for TTS (Piper + XTTS)
- Remove **bold**, *italic*, `code`, code blocks, links, headers, quotes, lists
- Replace newlines with natural pauses (period/comma)
- Remove quotation marks, empty brackets
- Fixes text being swallowed/garbled by TTS engines

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:47:04 +02:00
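A rough sketch of such a cleanup with regular expressions. The function name and the exact rules are assumptions, not the project's code; it replaces newlines with a comma pause and handles only asterisk-based emphasis so that identifiers containing underscores stay intact.

```python
import re

FENCE = "`" * 3  # triple backtick, built this way to keep the example readable


def clean_for_tts(text: str) -> str:
    """Strip common Markdown so a TTS engine reads plain sentences."""
    text = re.sub(FENCE + r".*?" + FENCE, " ", text, flags=re.DOTALL)  # fenced code blocks
    text = re.sub(r"`([^`]*)`", r"\1", text)                           # inline code
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)               # links: keep the label
    text = re.sub(r"^[ \t]{0,3}(#{1,6}|>|[-*+]|\d+\.)[ \t]+", "", text,
                  flags=re.MULTILINE)                                  # headers, quotes, list markers
    text = re.sub(r"(\*\*|\*)(.+?)\1", r"\2", text)                    # bold / italic
    text = re.sub(r"\(\s*\)|\[\s*\]", "", text)                        # empty brackets
    text = text.replace('"', "")                                       # quotation marks
    text = re.sub(r"\s*\n+\s*", ", ", text.strip())                    # newlines become a pause
    return re.sub(r"\s{2,}", " ", text)
```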
duffyduck e3a0cfb55a docs: mark conversation mode as done, keep Porcupine as Phase 2
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:42:33 +02:00
duffyduck 2929749314 feat: Conversation mode (ear button) - auto-record after ARIA speaks
- Ear button activates conversation mode (green dot)
- After TTS playback finishes → 800ms pause → auto-start recording
- VAD stops recording on silence → sends to ARIA → ARIA answers → TTS → loop
- Like a natural conversation / walkie-talkie mode
- Audio service fires onPlaybackFinished when queue empty

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:40:55 +02:00
duffyduck 51b9512f4e docs: mark scroll bugs as fixed in issue.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:37:53 +02:00
duffyduck ffcfa44eef fix: Auto-scroll to last message on app start + new messages
- useEffect on messages array instead of onContentSizeChange
- Instant jump (no animation) when loading history
- Animated scroll for single new messages
- Scroll pauses when user scrolls up, resumes at bottom

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:37:30 +02:00
duffyduck 6363da97b1 feat: Multiple attachments + paste support (App + Diagnostic)
App:
- Multiple pending attachments (horizontal scroll preview)
- Individual remove (X) or clear all
- Send button shows when any attachment pending
- All files sent before text message

Diagnostic:
- Clip icon for file selection (multiple)
- Paste images/files from clipboard (Ctrl+V)
- Pending preview with thumbnails
- Files sent via RVS before text message

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:34:33 +02:00
duffyduck 07ed2cdcf6 docs: mark attachment text feature as done in issue.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 10:06:13 +02:00
duffyduck 5ad68b7dfc feat: Attachments not sent immediately - add text/voice before sending
- File/photo selection stores as pending (not sent immediately)
- Preview bar shows pending attachment above input field
- User can add text message before sending (e.g. "Was siehst du?")
- Send button appears when attachment is pending (even without text)
- Placeholder changes to "Text zum Anhang (optional)..."
- X button to cancel pending attachment
- File + text sent together (file first, then chat message)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 10:05:50 +02:00
duffyduck 8a6ee018ea docs: mark text message bug as fixed in issue.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 09:59:48 +02:00
duffyduck b42590ff95 docs: mark auto-update bugs as fixed in issue.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 09:54:57 +02:00
duffyduck 056b579c47 release: bump version to 0.0.3.2 2026-04-11 09:53:54 +02:00
duffyduck 576e612cd0 fix: release.sh clears Metro + Gradle cache before build (version consistency)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 09:51:41 +02:00
duffyduck c2faa06a15 release: bump version to 0.0.3.1 2026-04-10 23:19:40 +02:00
duffyduck d3ed3556eb fix: Bridge chat handler was missing send_to_core (text messages ignored)
The chat handler checked sender but never forwarded the text to aria-core.
Only voice messages worked because they went through the audio→STT→send_to_core path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 23:13:29 +02:00
duffyduck d960d125c0 release: bump version to 0.0.3.0 2026-04-10 09:07:20 +02:00
duffyduck 89d5d7ec0a release: bump version to 0.0.2.9 2026-04-10 09:01:47 +02:00
duffyduck ea0c13936b fix: release.sh deletes old APKs on RVS before uploading new one
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 09:00:45 +02:00
duffyduck 773c976822 fix: Auto-update APK install via FileProvider + dynamic version
- Native ApkInstallerModule: FileProvider content:// URI for Android 7+
- REQUEST_INSTALL_PACKAGES permission in AndroidManifest
- file_paths.xml for FileProvider cache access
- APP_VERSION reads from package.json (not hardcoded)
- "Auf Updates pruefen" button in Settings
- Version display reads from package.json dynamically

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:59:52 +02:00
duffyduck cd05ed2379 docs: add auto-update FileProvider bug + update check button to issue.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:55:30 +02:00
duffyduck 054e4057d8 release: bump version to 0.0.2.8 2026-04-10 08:49:47 +02:00
duffyduck 3943e79bb1 docs: document .env.example with detailed comments, explain both tokens in README
- ARIA_AUTH_TOKEN: Gateway auth (who can talk to ARIA)
- RVS_TOKEN: Pairing token (same room in RVS relay)
- RVS_UPDATE_HOST: SSH target for auto-update APK copy
- All variables with German comments and examples

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:45:26 +02:00
duffyduck 87f4317c15 docs: add auto-update APK not reaching RVS bug to issue.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:39:59 +02:00
duffyduck 50aa793910 fix: Proxy SSH volume read-write (ARIA can manage keys without -F workaround)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:34:35 +02:00
duffyduck 5efc9865a8 docs: add 6 new bugs/features to issue.md
- Session persistence on container restart
- App: text/image/attachment messages not working (only voice)
- App: audio stops randomly
- App: auto-scroll to last message on start + new messages
- App: add text/voice to attachments
- Prioritized bugs section

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:31:49 +02:00
duffyduck 949c573c49 fix: XTTS chunk size 150 chars (faster render, preload overlaps playback)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:52:56 +02:00
duffyduck f7f450a09d fix: XTTS streaming mode - send each chunk immediately, comma between sentences
- Back to streaming: render chunk → send immediately → next chunk
- App plays with preloading queue (no waiting for all chunks)
- Comma instead of dot between sentences in chunk (no "Punkt" read aloud)
- Sentence-ending dots already removed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:48:50 +02:00
duffyduck 81f7c38383 fix: XTTS splits concatenated audio into ~8s parts (seamless with preload)
- All chunks rendered and PCM concatenated (consistent voice)
- Split into ~8 second WAV parts (not per-sentence)
- 8s is long enough for preload overlap, small enough for WebSocket
- Parts include part/totalParts metadata

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:41:14 +02:00
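Splitting concatenated PCM into fixed-length WAV parts can be sketched with the standard `wave` module. The function name, the dictionary layout and the 16-bit mono defaults are assumptions; `part` and `totalParts` are the metadata names from the commit.

```python
import io
import math
import wave


def split_pcm_into_wav_parts(
    pcm: bytes,
    sample_rate: int,
    channels: int = 1,
    sample_width: int = 2,
    part_seconds: float = 8.0,
) -> list:
    """Cut raw PCM into ~part_seconds chunks, each wrapped in its own WAV header."""
    frame_size = channels * sample_width
    bytes_per_part = max(1, int(sample_rate * part_seconds)) * frame_size
    total_parts = max(1, math.ceil(len(pcm) / bytes_per_part))
    parts = []
    for index in range(total_parts):
        chunk = pcm[index * bytes_per_part:(index + 1) * bytes_per_part]
        buffer = io.BytesIO()
        with wave.open(buffer, "wb") as wav:
            wav.setnchannels(channels)
            wav.setsampwidth(sample_width)
            wav.setframerate(sample_rate)
            wav.writeframes(chunk)  # header is finalized when the writer closes
        parts.append({"part": index + 1, "totalParts": total_parts, "wav": buffer.getvalue()})
    return parts
```

Cutting on whole-frame boundaries keeps every part independently playable, which is what allows the app to preload the next part while the current one plays.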
duffyduck 2c785cb37a feat: XTTS concatenates chunks into seamless WAV (no stuttering)
- All chunks rendered sequentially, PCM data concatenated
- Single WAV with proper header sent back (no queue needed in app)
- If total > 800KB, split into parts (WebSocket limit)
- Eliminates stuttering between sentences

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:40:16 +02:00
duffyduck 57e65b061c docs: update issue.md with XTTS streaming as next priority
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:38:21 +02:00
duffyduck aa54765b03 release: bump version to 0.0.2.7 2026-04-10 02:24:58 +02:00
duffyduck 8929bc99bb fix: XTTS groups sentences into ~250 char chunks for consistent voice quality
- 2-3 sentences per chunk (more context = stable voice/volume)
- Max 250 chars per chunk (keeps WebSocket packets manageable)
- Dots re-added between sentences within a chunk (natural pauses)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:23:29 +02:00
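Grouping sentences into size-limited chunks might look like this sketch. The splitting regex and the function name are assumptions; the 250-character limit is from this commit, and the later commit 949c573c49 lowers it to 150.

```python
import re


def chunk_sentences(text: str, max_chars: int = 250) -> list:
    """Greedily pack whole sentences into chunks of at most max_chars characters."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            chunks.append(current)  # chunk is full, start a new one
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

A single sentence longer than `max_chars` is kept whole in its own chunk rather than cut mid-sentence.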
duffyduck 0428c06612 fix: Audio preloading to prevent stuttering, remove trailing dots for XTTS
- Preload next audio while current plays (eliminates gap between sentences)
- Remove trailing dots from sentences (XTTS reads them aloud)
- stopPlayback cleans up preloaded audio

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:21:19 +02:00
duffyduck a7eb3cf433 release: bump version to 0.0.2.6 2026-04-10 02:11:04 +02:00
duffyduck e4e0e793a8 fix: Audio queue for sequential TTS playback (no overlap/skip)
- Audio packets queued instead of stopping previous
- _playNext() plays sequentially, each sentence after the previous
- stopPlayback() clears queue
- Fixes overlapping/skipping with XTTS sentence-by-sentence rendering

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:09:35 +02:00
duffyduck b3d3b8b6bc fix: XTTS bridge splits text into sentences sequentially
- XTTS-Bridge does sentence splitting (not ARIA-Bridge)
- Sequential rendering: correct order guaranteed
- Each sentence sent as separate xtts_response
- Markdown removal before splitting
- App starts playback after first sentence (faster UX)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:03:29 +02:00
duffyduck 06bc456221 fix: XTTS splits long text into sentences before sending (WebSocket size limit)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 01:56:25 +02:00
duffyduck 3461f45207 docs: update README with XTTS v2 setup details, voice cloning guide
- Architecture diagram for XTTS flow (Gaming-PC ↔ RVS ↔ ARIA-VM)
- Port 8020 (not 8000), token must match, model caching
- Voice cloning step-by-step guide
- TTS engine switching (Piper/XTTS) with fallback
- Known limitation: RVS zombie connections

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 01:49:08 +02:00
duffyduck a17d4acc13 fix: XTTS bridge shares /voices volume with XTTS server
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 01:40:41 +02:00
duffyduck 62fd9193a1 fix: XTTS voice dropdown shows saved voice after page reload
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 01:34:00 +02:00
duffyduck 2329645df4 fix: XTTS voices list + upload use fresh RVS connection with response wait
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 01:24:55 +02:00
duffyduck 8a435ddf6c fix: voice upload uses send() via server, not client-side sendToRVS_raw
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 01:15:29 +02:00
duffyduck 25b754ba31 fix: voice upload Base64 conversion (chunked, no stack overflow)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 01:08:32 +02:00
duffyduck b734593bf2 fix: Bridge _send_to_rvs ping-check before send, force reconnect on zombie
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 00:37:22 +02:00
duffyduck 16847ce6f7 fix: TTS toggle global above engine selector, health check /docs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 00:27:55 +02:00
duffyduck 6300829317 fix: XTTS model cache volume path /app/xtts_models
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:44:29 +02:00
duffyduck a1e1ee31bd fix: XTTS bridge port 8020, longer startup wait
- XTTS API runs on port 8020 (not 8000)
- Bridge waits up to 5min for model download (30x10s)
- Health check uses / instead of /docs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:39:45 +02:00
duffyduck 7ed70b876d updated image public path 2026-04-07 23:06:26 +02:00
duffyduck 3ca85da906 release: bump version to 0.0.2.5 2026-04-05 20:12:56 +02:00
duffyduck d6a89168ef release: bump version to 0.0.2.4 2026-04-05 19:51:19 +02:00
duffyduck cb33a20694 docs: update README with XTTS, auto-update, watchdog, TTS settings
- Architecture: Added XTTS v2 (Gaming-PC) and auto-update flow
- Diagnostic: Thinking indicator, cancel button, TTS tab, voice cloning
- App: Play button, chat search, auto-update, voice speed settings
- RVS: Auto-update APK distribution over WebSocket
- Watchdog: 2min warning → 5min doctor --fix → 8min container restart
- Roadmap: Phase 1 fully completed, updated Phase 2+3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 19:46:16 +02:00
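The watchdog escalation listed in this commit (2 min warning, 5 min `doctor --fix`, 8 min container restart) reduces to a threshold lookup. The function and the action names below are hypothetical labels for illustration.

```python
def watchdog_action(stuck_seconds: float) -> str:
    """Map how long a run has been stuck to the escalation step."""
    if stuck_seconds >= 8 * 60:
        return "restart_container"
    if stuck_seconds >= 5 * 60:
        return "doctor_fix"
    if stuck_seconds >= 2 * 60:
        return "warn"
    return "ok"
```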
duffyduck a242693751 feat: XTTS v2 integration, auto-update system, TTS engine abstraction
- XTTS v2: Docker setup for Gaming-PC (GPU), bridge via RVS relay
- XTTS: Voice cloning UI in Diagnostic (multi-file upload)
- XTTS: Engine selectable (Piper local vs XTTS remote) with fallback
- Auto-Update: RVS serves APK over WebSocket (no HTTP needed)
- Auto-Update: App checks version on start, prompts install
- Auto-Update: release.sh copies APK to RVS via scp
- Bridge: TTS engine abstraction (piper/xtts), config persistent
- Bridge: xtts_response handler, tts_request on-demand
- Diagnostic: TTS engine dropdown, XTTS voice panel, voice cloning
- App: Play button on ARIA messages, chat search, update service
- Wake word: Disabled LiveAudioStream (crash fix), Phase 1 placeholder
- Watchdog: Container restart after 8min stuck
- Chat backup: on-the-fly to /shared/config/chat_backup.jsonl

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 19:42:10 +02:00
duffyduck 81ca3cc7a7 Ear-button crash fixed (LiveAudioStream removed, Phase 1 , play button in ARIA messages for voice playback
- [x] Chat search in the app (magnifier in the status bar)
- [x] Watchdog with container restart (2min warning → 5min doctor --fix → 8min restart), cancel button in the Diagnostic chat
- [x] Message backup on the fly (/shared/config/chat_backup.jsonl)
- [x] Split large messages sentence by sentence for TTS
- [x] RVS messages from the smartphone go through
2026-04-01 23:45:25 +02:00
duffyduck 1a32098c9e release: bump version to 0.0.2.3 2026-04-01 23:45:15 +02:00
duffyduck fa4c32270b STT always 2026-03-29 19:18:41 +02:00
duffyduck 9c43b875f4 release: bump version to 0.0.2.2 2026-03-29 19:04:31 +02:00
duffyduck 63560e290b two speed 2026-03-29 19:03:40 +02:00
duffyduck 1ab8a6a2fe added speed config for voice 2026-03-29 18:50:09 +02:00
duffyduck a2c0196e05 release: bump version to 0.0.2.1 2026-03-29 18:49:37 +02:00
duffyduck 680f7a64e2 split sentences 2026-03-29 18:42:24 +02:00
duffyduck 4893616a5a playback issue 2026-03-29 18:36:00 +02:00
duffyduck 04e8c0245d voice settings permanent 2026-03-29 18:23:31 +02:00
duffyduck 10cefaf1cd changed connection model 2026-03-29 18:12:26 +02:00
duffyduck adbb1fe80a changed docker file 2026-03-29 17:46:27 +02:00
duffyduck 79c50aedcc release: bump version to 0.0.2.0 2026-03-29 17:42:23 +02:00
duffyduck eb72b35e23 added voice settings in android app and diagnostic, highlight trigger in app and diagnostic
change voice
2026-03-29 17:41:28 +02:00
duffyduck bbd02d46a6 changed issue md 2026-03-29 17:28:40 +02:00
duffyduck 3d3c8ce973 fixed tts format, added trigger words settings 2026-03-29 17:27:43 +02:00
duffyduck 562f929056 added setting for states and voices in setting diagnostic, added states in diagnostic, added watchdog and debug tts to diagnostic 2026-03-29 17:12:25 +02:00
duffyduck ff03d8ce62 release: bump version to 0.0.1.9 2026-03-29 17:11:33 +02:00
duffyduck 8281131432 tts fix big pictures 2026-03-29 17:02:02 +02:00
duffyduck 8a6bd4e0e7 voice messages are sent twice to diagnostic 2026-03-29 16:50:48 +02:00
duffyduck 1b4df0565a wait at an attachment for instructions, show picture in diagnostic chat 2026-03-29 16:42:56 +02:00
duffyduck eb3692ef81 fixed aria proxy shared volume 2026-03-29 16:34:55 +02:00
duffyduck 46a9ac9f84 release: bump version to 0.0.1.8 2026-03-29 16:25:37 +02:00
duffyduck a012ec65ef filter own sender to hide own messages, these are sent from rvs twice 2026-03-29 16:15:10 +02:00
duffyduck b86c4a0d1a fixed double diagnostic message 2026-03-29 16:12:24 +02:00
duffyduck 11de9a01b9 error through loops no message received, fixed 2026-03-29 16:08:37 +02:00
duffyduck 80dec2daf9 reset connection at every sent message 2026-03-29 16:04:43 +02:00
duffyduck da591bb53c fixed fallback issue closes before sessions 2026-03-29 15:58:39 +02:00
duffyduck 7545c9c823 check still open 2026-03-29 15:53:11 +02:00
duffyduck ecc3d59a8f change rvs server 2026-03-29 15:40:17 +02:00
duffyduck b8862f025b fixed, thinking in webgui 2026-03-29 15:10:41 +02:00
duffyduck db20a07b27 fixed time out aria-core 2026-03-29 14:56:55 +02:00
31 changed files with 3187 additions and 376 deletions
+37 -7
@@ -1,20 +1,50 @@
# ARIA Environment Configuration
# Copy to .env and fill in values
# ════════════════════════════════════════════════
# ARIA: environment variables
# Copy to .env and fill in the values
# ════════════════════════════════════════════════
# Auth token for ARIA Core (generate a long random string)
# openssl rand -hex 32
# ── ARIA Auth Token ──────────────────────────────
# Authentication for the OpenClaw Gateway (aria-core).
# Used by Diagnostic, Bridge, and the app to log in to the gateway.
# Every service that communicates with aria-core needs this token.
# Generate: openssl rand -hex 32
ARIA_AUTH_TOKEN=change-me-to-a-long-random-string
# RVS: rendezvous server (Bridge + app connect through it)
# ── RVS — Rendezvous-Server ─────────────────────
# The RVS is a WebSocket relay in the data center.
# App, Bridge, Diagnostic, and XTTS-Bridge connect through it.
# All of them must use the same host, port, and token.
# Hostname of the RVS server (e.g. rvs.example.de or mobil.hacker-net.de)
RVS_HOST=rvs.example.de
# Port the RVS runs on (must match rvs/docker-compose.yml)
RVS_PORT=443
# Use TLS (wss://)? true = encrypted, false = unencrypted (ws://)
RVS_TLS=true
# On a TLS error, automatically fall back to ws:// (without TLS)?
# true = fallback allowed, false = connect with TLS only
# Useful when no TLS certificate is available (e.g. development)
RVS_TLS_FALLBACK=true
# Pairing token: everyone with the same token lands in the same RVS room.
# Generated automatically by generate-token.sh and written here.
# The Android app receives the token via QR code during pairing.
# IMPORTANT: must be identical on the ARIA-VM, the Gaming-PC (xtts/.env), and the app!
# Generate: ./generate-token.sh (writes the token automatically)
RVS_TOKEN=
# Gitea (for release.sh; the password is prompted interactively)
# ── Gitea: release management ───────────────────
# Used by release.sh to publish APKs on Gitea.
# The password is prompted interactively at release time (not stored in .env!).
GITEA_URL=https://git.hacker-net.de
GITEA_REPO=Hacker-Software/ARIA-AGENT
GITEA_USER=duffyduck
# ── Auto-update: copy the APK to the RVS server ───
# SSH target for scp: release.sh copies the APK there.
# The RVS server then serves it to the app over WebSocket.
# Format: user@host (e.g. root@aria-rvs or root@rvs.example.de)
# Leave empty = skip auto-update and copy the APK to the RVS manually.
RVS_UPDATE_HOST=
+1
@@ -36,6 +36,7 @@ android/local.properties
android/package-lock.json
*.apk
*.aab
rvs/updates/*.apk
# ── Tauri / Desktop Build ───────────────────────
desktop/src-tauri/target/
+172 -24
@@ -29,11 +29,18 @@ ARIA has two roles:
┌─────────────────────────────────────────────────────────┐
│ RVS — Rendezvous-Server │
│ Node.js WebSocket Relay (Docker, data center) │
│ Pure relay: knows no tokens, only forwards
│ Relay + auto-update (APK distribution)
│ rvs/docker-compose.yml │
└───────────────────────┬─────────────────────────────────┘
│ WebSocket Tunnel
└───────────┬───────────────────────────┬─────────────────┘
│ WebSocket Tunnel │ WebSocket Tunnel
┌───────────────────────────┐
│ Gaming-PC (optional) │
│ RTX 3060, Docker+WSL2 │
│ XTTS v2 (natural │
│ voices, voice cloning) │
│ xtts/docker-compose.yml │
└───────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ ARIA-VM (Proxmox, Debian 13): ARIA's home │
│ Base system + Docker. ARIA sets up the rest itself. │
@@ -66,13 +73,14 @@ ARIA has two roles:
└─────────────────────────────────────────────────────────┘
```
**Three separate deployments:**
**Four separate deployments:**
| What | Where | How |
|-----|----|-----|
| RVS | Data center | `cd rvs && docker compose up -d` |
| ARIA Core | Debian 13 VM | `docker compose up -d && ./aria-setup.sh` |
| Android app | Stefan's phone | Install APK, scan QR code |
| XTTS v2 (optional) | Gaming-PC (GPU) | `cd xtts && docker compose up -d` |
| Android app | Stefan's phone | Install APK (auto-update via RVS) |
---
@@ -95,16 +103,31 @@ cd ~/ARIA-AGENT
cp .env.example .env
```
Edit the `.env` file:
Edit the `.env` file (see `.env.example` for details):
```bash
# Gateway auth: every service that talks to aria-core needs this token
# Diagnostic, Bridge, and the app use it for the WebSocket handshake
ARIA_AUTH_TOKEN= # openssl rand -hex 32
# RVS connection: hostname + port of your rendezvous server
RVS_HOST= # e.g. rvs.hackersoft.de
RVS_PORT=443
RVS_TLS=true
RVS_TLS_FALLBACK=true
RVS_TOKEN= # set automatically by generate-token.sh
# Pairing token: connects app, Bridge, Diagnostic, and XTTS in the same RVS room
# MUST be identical on all devices (ARIA-VM, Gaming-PC, app)
# Generated and written automatically by generate-token.sh
RVS_TOKEN= # ./generate-token.sh
# Optional: SSH host of the RVS server for auto-update (e.g. root@aria-rvs)
RVS_UPDATE_HOST=
```
**Two tokens, two purposes:**
- **ARIA_AUTH_TOKEN**: Authentication at the OpenClaw Gateway (aria-core). Whoever holds this token can give ARIA commands.
- **RVS_TOKEN**: Pairing token for the rendezvous server. All devices with the same token land in the same "room" and can communicate. The app receives this token via QR code.
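Where `openssl` is not available, a token of the same shape as `openssl rand -hex 32` can be produced with Python's standard library. This is a sketch of an alternative, not part of the project's scripts; the function name is hypothetical.

```python
import secrets


def generate_auth_token(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically secure randomness as hex (64 chars for 32 bytes)."""
    return secrets.token_hex(num_bytes)
```

The result can be pasted into `.env` as the value of `ARIA_AUTH_TOKEN`.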
### 2. Log in to the Claude CLI (proxy auth)
The proxy container uses your Claude Max subscription. The credentials must
@@ -283,7 +306,8 @@ aria-core → response → Gateway → Diagnostic → RVS → App
### Features
- **STT**: faster-whisper (local, offline, 16kHz mono)
- **TTS**: Piper (Ramona + Thorsten, offline)
- **TTS**: Piper (Ramona + Thorsten, offline) or XTTS v2 (remote, GPU, voice cloning)
- **Markdown cleanup**: Removes **bold**, *italic*, `code`, links, lists, etc. before TTS (natural speech)
- **Wake word**: openwakeword (local microphone on the VM)
- **App audio**: Base64 audio from the app → FFmpeg → Whisper STT → text to aria-core
- **Modes**: Normal, Do not disturb, Whisper, Hangar, Gaming
@@ -314,13 +338,19 @@ Reachable at `http://<VM-IP>:3001`. Shares the network with aria-core.
### Features
- **Status cards**: Gateway (handshake), RVS (TLS fallback), Proxy (auth)
- **Chat test**: Send messages directly to ARIA (Gateway or via RVS)
- **Chat test**: Send messages directly to ARIA (Gateway or via RVS), fullscreen mode
- **"ARIA denkt..." indicator**: Shows live what ARIA is doing (thinking, tool, writing)
- **Cancel button**: Stops running requests + doctor --fix
- **Session management**: List, switch, create, and delete sessions
- **Chat history**: Shown on load and on session switch (read-only from JSONL)
- **TTS diagnostics tab**: Test voices, check status, show errors
- **Settings**: TTS engine (Piper/XTTS), voices, speed, highlight triggers, operating modes
- **XTTS voice cloning**: Upload audio samples, create your own voice
- **Claude login**: Browser terminal for logging in to the proxy
- **Core terminal**: Shell in aria-core (openclaw CLI)
- **Container logs**: Real-time logs of all containers (filtered by tab)
- **Container logs**: Real-time logs of all containers (filtered by tab + pipeline)
- **SSH terminal**: Direct SSH access to aria-wohnung
- **Watchdog**: Detects stuck runs (2min warning → 5min doctor --fix → 8min container restart)
### Session management
@@ -338,12 +368,17 @@ API endpoint for other services: `GET http://localhost:3001/api/session`
- Text chat with ARIA
- **Voice recording**: Push-to-talk (hold) or tap-to-talk (tap, auto-stop on silence)
- **Conversation mode** (ear button): After every ARIA response, recording starts automatically, like a natural back-and-forth conversation without pressing buttons
- **VAD (Voice Activity Detection)**: Detects 1.8s of silence and stops automatically
- **STT (Speech-to-Text)**: Audio is transcribed in the Bridge by Whisper; the transcribed text appears in the chat
- **Wake word**: Toggle button (ear icon) enables continuous microphone monitoring
- **TTS playback**: ARIA answers through the speaker (Ramona/Thorsten)
- **File and image upload**: Images inline in the chat, files with icon + name + size
- **Attachments**: Bridge stores files in the shared volume (`/shared/uploads/`); ARIA can access them
- **TTS playback**: ARIA answers through the speaker (Piper or XTTS v2), audio queue with preloading
- **Play button**: Any ARIA message can be read aloud again
- **Chat search**: Magnifier in the status bar filters messages live
- **Multiple attachments**: Collect images + files, add text, then send together
- **Paste support**: Paste images from the clipboard (Diagnostic)
- **Attachments**: Bridge stores them in the shared volume; ARIA can access them; re-download via RVS
- **Settings**: TTS engine, voices, speed per voice, storage location, auto-download, GPS
- **Auto-update**: Checks for a new version at startup + via button; download + installation via RVS (FileProvider)
- GPS position (optional)
- QR code scanner for token pairing
@@ -374,19 +409,31 @@ cd android
```
The script does everything in one step:
1. Prompts for the Gitea password (not stored anywhere)
2. Builds the release APK
3. Creates a Git tag + pushes
4. Creates a Gitea release
5. Uploads the APK as an asset
1. Sets version numbers (package.json, build.gradle, SettingsScreen)
2. Prompts for the Gitea password (not stored anywhere)
3. Builds the release APK
4. Git commit + tag + push
5. Creates a Gitea release + uploads the APK
6. Copies the APK to the RVS server (auto-update, optional)
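Step 1 has to derive an integer `versionCode` from the four-part version name. The script itself is not shown here, so the following is only one scheme that happens to be consistent with the pair in this diff (`versionName "0.0.3.8"`, `versionCode 308`); the function name and the two-digits-per-component weighting are assumptions.

```typescript
// Hypothetical helper: one possible mapping, not necessarily what release.sh does.
// Each of the four components is assumed to be in the range 0..99.
function versionCodeFor(versionName: string): number {
  const parts = versionName.split('.').map(Number);
  if (parts.length !== 4 || parts.some(p => !Number.isInteger(p) || p < 0 || p > 99)) {
    throw new Error(`Unexpected version format: ${versionName}`);
  }
  const [a, b, c, d] = parts;
  // Two decimal digits per component keeps the result monotonic in the version.
  return a * 1000000 + b * 10000 + c * 100 + d;
}
```

Whatever scheme is used, Android only requires that `versionCode` increases with every release that should be installable as an update.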
Prerequisite in `.env`:
```bash
GITEA_URL=https://gitea.hackersoft.de
GITEA_REPO=stefan/aria-agent
GITEA_USER=stefan
RVS_UPDATE_HOST=root@aria-rvs # Optional: for auto-update
```
### Auto-Update
The app checks at startup whether a newer version is available on the RVS.
The update flow:
1. `./release.sh 0.0.3.0` → APK is copied to the RVS (via scp)
2. Alternatively: `git pull` on the RVS server → APK in `rvs/updates/`
3. App sends `update_check` with its current version
4. RVS compares → sends `update_available`
5. App shows a dialog → download over WebSocket → installation
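The comparison in step 4 has to be numeric per component, because a plain string comparison would rank `0.0.3.10` below `0.0.3.9`. A minimal sketch follows, assuming dotted numeric versions; the function names are hypothetical and the actual RVS logic is not shown in this diff.

```typescript
// Hypothetical helpers; the real RVS comparison is not part of this diff.
// Returns 1 if a > b, -1 if a < b, 0 if both versions are equal.
function compareVersions(a: string, b: string): number {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  const len = Math.max(pa.length, pb.length);
  for (let i = 0; i < len; i++) {
    // Missing components count as 0, so "0.0.3" equals "0.0.3.0".
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return Math.sign(diff);
  }
  return 0;
}

// True when the version offered by the server is newer than the installed one.
function isUpdateAvailable(installed: string, offered: string): boolean {
  return compareVersions(offered, installed) > 0;
}
```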
### Audio pipeline (voice input)
```
@@ -454,6 +501,11 @@ aria-data/
│ ├── aria.env ← Voice Bridge config
│ └── diag-state/ ← Diagnostic persistent state
│ (in the shared volume /shared/config/):
│ ├── voice_config.json ← TTS settings (voice, speed, engine)
│ ├── highlight_triggers.json ← Highlight trigger words
│ └── chat_backup.jsonl ← Message backup (on the fly)
└── ssh/ ← SSH keys for VM access
├── id_ed25519 ← Private key (generated by aria-setup.sh)
├── id_ed25519.pub ← Public key (must be in the VM's authorized_keys!)
@@ -469,7 +521,7 @@ tar -czf aria-backup-$(date +%Y%m%d).tar.gz aria-data/
## RVS — Rendezvous-Server
Runs in the data center. Pure relay: knows no tokens, stores nothing.
Runs in the data center. WebSocket relay + auto-update server.
Everyone who connects with the same token lands in the same room.
```bash
@@ -477,10 +529,90 @@ cd rvs
docker compose up -d
```
**Features:**
- WebSocket relay (all message types: chat, audio, file, config, xtts, update, etc.)
- Auto-update: APK distribution to apps over WebSocket
- Heartbeat + cleanup of dead connections
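The heartbeat cleanup can be sketched as a table of last-seen timestamps that is swept periodically. This is an illustration under assumptions: the class name, the timeout parameter, and the explicit `now` argument are hypothetical, and the RVS implementation itself is not shown here.

```typescript
// Hypothetical sketch of heartbeat bookkeeping; not the actual RVS code.
class LivenessTracker {
  private lastSeen: Map<string, number> = new Map();
  private timeoutMs: number;

  constructor(timeoutMs: number) {
    this.timeoutMs = timeoutMs;
  }

  // Call whenever a pong (or any message) arrives from a connection.
  markAlive(id: string, now: number): void {
    this.lastSeen.set(id, now);
  }

  // Removes and returns all connections that have been silent for too long.
  sweep(now: number): string[] {
    const dead: string[] = [];
    this.lastSeen.forEach((seen, id) => {
      if (now - seen > this.timeoutMs) dead.push(id);
    });
    dead.forEach(id => this.lastSeen.delete(id));
    return dead;
  }
}
```

Passing `now` in explicitly keeps the logic independent of the wall clock; a server would call `sweep(Date.now())` from a timer and close the returned connections.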
**Providing an auto-update APK:**
```bash
# Place the APK in updates/ (manually or via release.sh)
cp ARIA-v0.0.3.0.apk ~/ARIA-AGENT/rvs/updates/
# RVS derives the version from the filename
```
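Deriving the version from a filename such as `ARIA-v0.0.3.0.apk` could look like the sketch below. The function name and the exact pattern are assumptions based on the single filename shown above; the real RVS parser may accept other forms.

```typescript
// Hypothetical helper; pattern inferred from the example name "ARIA-v0.0.3.0.apk".
// Returns the dotted version string, or null if the name does not match.
function versionFromApkName(fileName: string): string | null {
  const match = /-v(\d+(?:\.\d+)*)\.apk$/.exec(fileName);
  return match ? match[1] : null;
}
```

Returning `null` for non-matching names lets the caller skip stray files in `updates/` instead of failing on them.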
**Multi-instance:** Multiple ARIA VMs can use the same RVS, each with its own token.
---
## XTTS v2 — GPU TTS Server (optional)
Runs on a separate machine with an NVIDIA GPU (e.g. a gaming PC with an RTX 3060).
Connects to the ARIA infrastructure via RVS, so no VPN is needed and it works
across different networks.
### Architecture
```
Gaming PC (Windows, RTX 3060, Docker Desktop + WSL2)
├── aria-xtts XTTS v2 GPU Server (port 8020 internally)
└── aria-xtts-bridge RVS relay (receives requests, sends audio)
└── Both share the ./voices/ volume for voice cloning
↕ RVS (data center, WebSocket relay)
ARIA-VM
└── aria-bridge: tts_engine="xtts" → xtts_request via RVS → waits for xtts_response
```
### Prerequisites
- Docker Desktop with WSL2 (Windows) or Docker with NVIDIA runtime (Linux)
- NVIDIA Container Toolkit
- GPU with at least 4 GB VRAM (6 GB+ recommended)
- **Same RVS_TOKEN as on the ARIA VM!**
### Setup
```bash
cd xtts
cp .env.example .env
# Fill .env with the RVS connection details (same token as the ARIA VM!)
docker compose up -d
# The first start downloads a ~2 GB model (cached afterwards)
```
**Important:** The XTTS server runs internally on port **8020** (not 8000).
The model is cached in the `xtts-models` volume and only needs to be downloaded once.
### Features
- **Natural voices**: Noticeably better quality than Piper
- **Voice cloning**: Your own voice from a 6-10 s audio sample (~2 s latency on an RTX 3060)
- **16 languages**: German, English, French, etc.
- **Fallback**: If XTTS is unreachable, the Bridge automatically falls back to Piper
### Switching the TTS engine
In Diagnostic under Settings → Speech output:
- **TTS active**: Global on/off
- **TTS engine**: Piper (local, CPU, fast) or XTTS v2 (remote, GPU, natural)
- **Piper**: Default voice, highlight voice, speed per voice
- **XTTS**: Voice selection, voice cloning
### Cloning a voice
1. Set the TTS engine to "XTTS v2"
2. "Stimme klonen" (clone voice) → upload audio files (WAV/MP3, 1-10 files, at least 6-10 s in total)
3. Assign a name → "Stimme erstellen" (create voice)
4. Click "Laden" (load) → the new voice appears in the selection
5. Select the voice → the config is saved automatically
> **Tip:** For best results: a clean recording, a single voice, no background noise,
> 10-30 seconds total length. Multiple short files are concatenated.
---
## Docker Volumes
| Volume | Path in container | Purpose |
@@ -491,7 +623,7 @@ docker compose up -d
| `./aria-data/ssh` (bind) | `/root/.ssh`, `/home/node/.ssh` | SSH keys |
| `./aria-data/brain` (bind) | `/home/node/.openclaw/workspace/memory` | Memory |
| `./aria-data/skills` (bind) | `/home/node/.openclaw/workspace/skills` | Skills |
| `aria-shared` | `/shared` (Core + Bridge) | File exchange (uploads from the app) |
| `aria-shared` | `/shared` (Core + Bridge + Proxy + Diag) | File exchange, config, uploads |
| `./aria-data/config/diag-state` (bind) | `/data` (Diagnostic) | Persistent state (active session) |
---
@@ -549,6 +681,8 @@ docker exec aria-core ssh aria-wohnung hostname
- **Wake word only on the VM**: The Bridge listens for "ARIA" via the VM's local microphone.
The app has energy-based detection (Phase 1). An on-device "ARIA" keyword (Porcupine) is Phase 2.
- **Audio format**: The app records AAC/MP4; the Bridge converts it to 16 kHz PCM via FFmpeg.
- **RVS zombie connections**: WebSocket connections occasionally die without an error message.
The Bridge has a ping check (5 s); Diagnostic uses a fresh connection per request.
- **Image analysis limited**: Images are stored in `/shared/uploads/`. ARIA can
open them via the Bash/Read tool, but Claude Vision (direct image analysis) is not yet
possible via the proxy path (`claude --print`). ARIA sees the file path, not the image.
@@ -569,8 +703,20 @@ docker exec aria-core ssh aria-wohnung hostname
- [x] Android app (chat + voice + uploads)
- [x] Tool permissions (all tools enabled)
- [x] SSH access to VM (aria-wohnung)
- [x] Diagnostic web UI
- [x] Diagnostic web UI + settings
- [x] Session management + chat history
- [x] Voice settings (Ramona/Thorsten, speed, highlight triggers)
- [x] Sentence-by-sentence TTS for long texts
- [x] File/image upload with shared volume
- [x] Watchdog (stuck-run detection + auto-fix + container restart)
- [x] Auto-update system (APK via RVS)
- [x] Chat search, play button, cancel button
- [x] XTTS v2 integration (GPU, voice cloning, remote via RVS)
- [x] Conversation mode (ear button, automatic recording after ARIA's reply)
- [x] Multiple attachments + text before sending + paste support
- [x] Markdown cleanup for TTS
- [x] Auto-update with FileProvider + update-check button
- [x] Inverted FlatList (reliable scroll-to-bottom)
### Phase 2 — ARIA wird produktiv
@@ -578,7 +724,8 @@ docker exec aria-core ssh aria-wohnung hostname
- [ ] Gitea integration
- [ ] Set up VM (desktop, browser, tools)
- [ ] Heartbeat (periodic self-checks)
- [ ] Lokales LLM als Wächter (Triage vor Claude-Call)
- [ ] Lokales LLM als Waechter (Triage vor Claude-Call)
- [ ] Auto-compacting / memory management
### Phase 3 — Erweiterungen
@@ -586,3 +733,4 @@ docker exec aria-core ssh aria-wohnung hostname
- [ ] Desktop client (Tauri)
- [ ] bKVM remote IT support
- [ ] Porcupine wake word (on-device "ARIA" in the app)
- [ ] Claude Vision directly (image analysis without the file-path detour)
+2 -2
@@ -79,8 +79,8 @@ android {
applicationId "com.ariacockpit"
minSdkVersion rootProject.ext.minSdkVersion
targetSdkVersion rootProject.ext.targetSdkVersion
versionCode 1
versionName "0.0.1.7"
versionCode 308
versionName "0.0.3.8"
// Fallback for libraries with product flavors
missingDimensionStrategy 'react-native-camera', 'general'
}
@@ -3,6 +3,7 @@
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.CAMERA" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.REQUEST_INSTALL_PACKAGES" />
<application
android:name=".MainApplication"
@@ -24,5 +25,15 @@
<category android:name="android.intent.category.LAUNCHER" />
</intent-filter>
</activity>
<provider
android:name="androidx.core.content.FileProvider"
android:authorities="${applicationId}.fileprovider"
android:exported="false"
android:grantUriPermissions="true">
<meta-data
android:name="android.support.FILE_PROVIDER_PATHS"
android:resource="@xml/file_paths" />
</provider>
</application>
</manifest>
@@ -0,0 +1,44 @@
package com.ariacockpit
import android.content.Intent
import android.net.Uri
import android.os.Build
import androidx.core.content.FileProvider
import com.facebook.react.bridge.ReactApplicationContext
import com.facebook.react.bridge.ReactContextBaseJavaModule
import com.facebook.react.bridge.ReactMethod
import com.facebook.react.bridge.Promise
import java.io.File
class ApkInstallerModule(reactContext: ReactApplicationContext) : ReactContextBaseJavaModule(reactContext) {
override fun getName() = "ApkInstaller"
@ReactMethod
fun install(filePath: String, promise: Promise) {
try {
val file = File(filePath)
if (!file.exists()) {
promise.reject("FILE_NOT_FOUND", "APK nicht gefunden: $filePath")
return
}
val context = reactApplicationContext
val uri: Uri = if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.N) {
FileProvider.getUriForFile(context, "${context.packageName}.fileprovider", file)
} else {
Uri.fromFile(file)
}
val intent = Intent(Intent.ACTION_VIEW).apply {
setDataAndType(uri, "application/vnd.android.package-archive")
addFlags(Intent.FLAG_ACTIVITY_NEW_TASK)
addFlags(Intent.FLAG_GRANT_READ_URI_PERMISSION)
}
context.startActivity(intent)
promise.resolve(true)
} catch (e: Exception) {
promise.reject("INSTALL_ERROR", e.message, e)
}
}
}
@@ -0,0 +1,16 @@
package com.ariacockpit
import com.facebook.react.ReactPackage
import com.facebook.react.bridge.NativeModule
import com.facebook.react.bridge.ReactApplicationContext
import com.facebook.react.uimanager.ViewManager
class ApkInstallerPackage : ReactPackage {
override fun createNativeModules(reactContext: ReactApplicationContext): List<NativeModule> {
return listOf(ApkInstallerModule(reactContext))
}
override fun createViewManagers(reactContext: ReactApplicationContext): List<ViewManager<*, *>> {
return emptyList()
}
}
@@ -18,8 +18,7 @@ class MainApplication : Application(), ReactApplication {
object : DefaultReactNativeHost(this) {
override fun getPackages(): List<ReactPackage> =
PackageList(this).packages.apply {
// Packages that cannot be autolinked yet can be added manually here, for example:
// add(MyReactNativePackage())
add(ApkInstallerPackage())
}
override fun getJSMainModuleName(): String = "index"
@@ -0,0 +1,4 @@
<?xml version="1.0" encoding="utf-8"?>
<paths>
<cache-path name="cache" path="." />
</paths>
+2 -3
@@ -1,6 +1,6 @@
{
"name": "aria-cockpit",
"version": "0.0.1.7",
"version": "0.0.3.8",
"private": true,
"scripts": {
"android": "react-native run-android",
@@ -24,8 +24,7 @@
"react-native-camera-kit": "^13.0.0",
"@react-native-async-storage/async-storage": "^1.21.0",
"react-native-fs": "^2.20.0",
"react-native-audio-recorder-player": "^3.6.7",
"react-native-live-audio-stream": "^1.1.1"
"react-native-audio-recorder-player": "^3.6.7"
},
"devDependencies": {
"typescript": "^5.3.3",
+320 -96
@@ -5,7 +5,7 @@
* Datei- und Kamera-Upload.
*/
import React, { useState, useEffect, useRef, useCallback } from 'react';
import React, { useState, useEffect, useRef, useCallback, useMemo } from 'react';
import {
View,
Text,
@@ -16,6 +16,7 @@ import {
Platform,
StyleSheet,
Image,
ScrollView,
Modal,
} from 'react-native';
import AsyncStorage from '@react-native-async-storage/async-storage';
@@ -23,6 +24,7 @@ import RNFS from 'react-native-fs';
import rvs, { RVSMessage, ConnectionState } from '../services/rvs';
import audioService from '../services/audio';
import wakeWordService from '../services/wakeword';
import updateService from '../services/updater';
import VoiceButton from '../components/VoiceButton';
import FileUpload, { FileData } from '../components/FileUpload';
import CameraUpload, { PhotoData } from '../components/CameraUpload';
@@ -90,6 +92,11 @@ const ChatScreen: React.FC = () => {
const [showCameraUpload, setShowCameraUpload] = useState(false);
const [gpsEnabled, setGpsEnabled] = useState(false);
const [wakeWordActive, setWakeWordActive] = useState(false);
const [fullscreenImage, setFullscreenImage] = useState<string | null>(null);
const [searchQuery, setSearchQuery] = useState('');
const [searchVisible, setSearchVisible] = useState(false);
const [pendingAttachments, setPendingAttachments] = useState<{file: any, isPhoto: boolean}[]>([]);
const [agentActivity, setAgentActivity] = useState<{activity: string, tool: string}>({activity: 'idle', tool: ''});
const flatListRef = useRef<FlatList>(null);
const messageIdCounter = useRef(0);
@@ -244,6 +251,13 @@ const ChatScreen: React.FC = () => {
if (message.type === 'audio' && message.payload.base64) {
audioService.playAudio(message.payload.base64 as string);
}
// Thinking-indicator status from the Bridge
if (message.type === 'agent_activity') {
const activity = (message.payload.activity as string) || 'idle';
const tool = (message.payload.tool as string) || '';
setAgentActivity({ activity, tool });
}
});
const unsubState = rvs.onStateChange((state) => {
@@ -259,12 +273,30 @@ const ChatScreen: React.FC = () => {
};
}, []);
// Wake word: "ARIA" detection → start auto-recording
// Auto-update: check at app start
useEffect(() => {
const unsubUpdate = updateService.onUpdateAvailable((info) => {
updateService.promptUpdate(info);
});
// Check after 5 s (RVS has to be connected first)
const timer = setTimeout(() => updateService.checkForUpdate(), 5000);
return () => { unsubUpdate(); clearTimeout(timer); };
}, []);
// Conversation mode: start recording automatically after TTS playback
useEffect(() => {
const unsubPlayback = audioService.onPlaybackFinished(() => {
if (wakeWordService.isActive()) {
wakeWordService.resume();
}
});
return () => unsubPlayback();
}, []);
// Wake word / conversation mode: start auto-recording
useEffect(() => {
const unsubWake = wakeWordService.onWakeWord(async () => {
console.log('[Chat] Wake Word erkannt — starte Auto-Aufnahme');
// Stop TTS so that ARIA does not hear itself
audioService.stopPlayback();
console.log('[Chat] Gespraechsmodus — starte Auto-Aufnahme');
// Start recording with auto-stop (VAD)
const started = await audioService.startRecording(true);
if (!started) {
@@ -345,22 +377,8 @@ const ChatScreen: React.FC = () => {
return () => { if (saveTimer.current) clearTimeout(saveTimer.current); };
}, [messages]);
// Auto-scroll is driven by the FlatList's onContentSizeChange
const shouldAutoScroll = useRef(true);
const handleContentSizeChange = useCallback(() => {
if (shouldAutoScroll.current) {
flatListRef.current?.scrollToEnd({ animated: false });
}
}, []);
const handleScrollBeginDrag = useCallback(() => {
shouldAutoScroll.current = false;
}, []);
const handleScrollEndDrag = useCallback((e: any) => {
// Re-enable auto-scroll once the user is at the very bottom
const { contentOffset, contentSize, layoutMeasurement } = e.nativeEvent;
const isAtBottom = contentOffset.y + layoutMeasurement.height >= contentSize.height - 50;
shouldAutoScroll.current = isAtBottom;
}, []);
// Inverted FlatList: newest messages at the bottom, no manual scrolling needed
const invertedMessages = useMemo(() => [...messages].reverse(), [messages]);
// Get the GPS position (optional)
const getCurrentLocation = useCallback((): Promise<{ lat: number; lon: number } | null> => {
@@ -386,6 +404,13 @@ const ChatScreen: React.FC = () => {
const sendTextMessage = useCallback(async () => {
const text = inputText.trim();
// If there are pending attachments → send attachments + text together
if (pendingAttachments.length > 0) {
sendPendingAttachments(text);
return;
}
if (!text) return;
setInputText('');
@@ -405,7 +430,13 @@ const ChatScreen: React.FC = () => {
text,
...(location && { location }),
});
}, [inputText, getCurrentLocation]);
}, [inputText, getCurrentLocation, pendingAttachments, sendPendingAttachments]);
// Cancel request: clear the local indicator immediately; the Bridge triggers doctor --fix
const cancelRequest = useCallback(() => {
setAgentActivity({ activity: 'idle', tool: '' });
rvs.send('cancel_request' as any, {});
}, []);
// Voice recording finished
const handleVoiceRecording = useCallback(async (result: RecordingResult) => {
@@ -427,88 +458,91 @@ const ChatScreen: React.FC = () => {
});
}, [getCurrentLocation]);
// Send file
// Select file → add to the pending list
const handleFileSelected = useCallback(async (file: FileData) => {
setShowFileUpload(false);
const location = await getCurrentLocation();
setPendingAttachments(prev => [...prev, { file, isPhoto: false }]);
}, []);
const isImage = file.type.startsWith('image/');
const msgId = nextId();
let imageUri = isImage && file.base64 ? `data:${file.type};base64,${file.base64}` : file.uri;
const userMsg: ChatMessage = {
id: msgId,
sender: 'user',
text: 'Anhang empfangen',
timestamp: Date.now(),
attachments: [{
type: isImage ? 'image' : 'file',
name: file.name,
size: file.size,
uri: imageUri,
mimeType: file.type,
}],
};
setMessages(prev => [...prev, userMsg]);
// Save the attachment to disk for persistence
if (file.base64) {
persistAttachment(file.base64, msgId, file.name).then(filePath => {
setMessages(prev => prev.map(m =>
m.id === msgId ? { ...m, attachments: m.attachments?.map(a => ({ ...a, uri: filePath })) } : m
));
}).catch(() => {});
}
rvs.send('file', {
name: file.name,
type: file.type,
size: file.size,
base64: file.base64,
...(location && { location }),
});
}, [getCurrentLocation]);
// Send photo
// Select photo → add to the pending list
const handlePhotoSelected = useCallback(async (photo: PhotoData) => {
setShowCameraUpload(false);
setPendingAttachments(prev => [...prev, { file: photo, isPhoto: true }]);
}, []);
// Send all pending attachments + text
const sendPendingAttachments = useCallback(async (messageText: string) => {
if (pendingAttachments.length === 0) return;
const location = await getCurrentLocation();
const msgId = nextId();
const dataUri = photo.base64 ? `data:${photo.type};base64,${photo.base64}` : undefined;
// Collect all attachments for the chat message
const attachments: Attachment[] = [];
for (const { file, isPhoto } of pendingAttachments) {
const isImage = isPhoto || (file.type && file.type.startsWith('image/'));
const name = isPhoto ? file.fileName : file.name;
const base64 = file.base64 || '';
const mimeType = file.type || '';
const imageUri = isImage && base64 ? `data:${mimeType};base64,${base64}` : file.uri;
attachments.push({
type: isImage ? 'image' : 'file',
name,
size: file.size,
uri: imageUri,
mimeType,
});
}
// Chat message with all attachments
const userMsg: ChatMessage = {
id: msgId,
sender: 'user',
text: 'Anhang empfangen',
text: messageText || `${pendingAttachments.length} Anhang/Anhaenge`,
timestamp: Date.now(),
attachments: [{
type: 'image',
name: photo.fileName,
uri: dataUri,
mimeType: photo.type,
}],
attachments,
};
setMessages(prev => [...prev, userMsg]);
// Save the photo to disk for persistence
if (photo.base64) {
persistAttachment(photo.base64, msgId, photo.fileName).then(filePath => {
setMessages(prev => prev.map(m =>
m.id === msgId ? { ...m, attachments: m.attachments?.map(a => ({ ...a, uri: filePath })) } : m
));
}).catch(() => {});
// Send all files to RVS + save them to disk
for (const { file, isPhoto } of pendingAttachments) {
const name = isPhoto ? file.fileName : file.name;
const base64 = file.base64 || '';
const mimeType = file.type || '';
// Save to disk
if (base64) {
persistAttachment(base64, msgId + '_' + name, name).then(filePath => {
setMessages(prev => prev.map(m =>
m.id === msgId ? { ...m, attachments: m.attachments?.map(a =>
a.name === name && !a.uri?.startsWith('file://') ? { ...a, uri: filePath } : a
)} : m
));
}).catch(() => {});
}
// Send to RVS
rvs.send('file', {
name,
type: mimeType,
size: file.size,
base64,
...(isPhoto && file.width && { width: file.width, height: file.height }),
...(location && { location }),
});
}
rvs.send('file', {
name: photo.fileName,
type: photo.type,
base64: photo.base64,
width: photo.width,
height: photo.height,
...(location && { location }),
});
}, [getCurrentLocation]);
// Text as a separate message (so ARIA knows what to do)
if (messageText) {
rvs.send('chat', {
text: messageText,
...(location && { location }),
});
}
setPendingAttachments([]);
setInputText('');
}, [pendingAttachments, getCurrentLocation]);
// --- Rendering ---
@@ -525,12 +559,12 @@ const ChatScreen: React.FC = () => {
{item.attachments?.map((att, idx) => (
<View key={idx}>
{att.type === 'image' && att.uri ? (
<TouchableOpacity onPress={() => setFullscreenImage(att.uri || null)} activeOpacity={0.8}>
<Image
source={{ uri: att.uri }}
style={styles.attachmentImage}
resizeMode="contain"
resizeMode="cover"
onError={() => {
// Image no longer available, set a placeholder
setMessages(prev => prev.map(m =>
m.id === item.id ? { ...m, attachments: m.attachments?.map((a, i) =>
i === idx ? { ...a, uri: undefined } : a
@@ -538,6 +572,7 @@ const ChatScreen: React.FC = () => {
));
}}
/>
</TouchableOpacity>
) : att.type === 'image' && !att.uri ? (
<TouchableOpacity
style={styles.attachmentFile}
@@ -579,6 +614,18 @@ const ChatScreen: React.FC = () => {
{item.text}
</Text>
)}
{/* Play-Button fuer ARIA-Nachrichten */}
{!isUser && item.text.length > 0 && (
<TouchableOpacity
style={styles.playButton}
onPress={() => {
// Send a TTS request to the Bridge
rvs.send('tts_request' as any, { text: item.text, voice: '' });
}}
>
<Text style={styles.playButtonText}>{'\uD83D\uDD0A'}</Text>
</TouchableOpacity>
)}
<Text style={styles.timestamp}>{time}</Text>
</View>
);
@@ -601,19 +648,37 @@ const ChatScreen: React.FC = () => {
{connectionState === 'connected' ? 'Verbunden' :
connectionState === 'connecting' ? 'Verbinde...' : 'Getrennt'}
</Text>
<TouchableOpacity onPress={() => setSearchVisible(!searchVisible)} style={{marginLeft: 'auto', paddingHorizontal: 8}}>
<Text style={{fontSize: 16}}>{'\uD83D\uDD0D'}</Text>
</TouchableOpacity>
</View>
{/* Suchleiste */}
{searchVisible && (
<View style={styles.searchBar}>
<TextInput
style={styles.searchInput}
value={searchQuery}
onChangeText={setSearchQuery}
placeholder="Chat durchsuchen..."
placeholderTextColor="#555570"
autoFocus
/>
<TouchableOpacity onPress={() => { setSearchVisible(false); setSearchQuery(''); }}>
<Text style={{color: '#FF3B30', fontSize: 14, paddingHorizontal: 8}}>X</Text>
</TouchableOpacity>
</View>
)}
{/* Nachrichtenliste */}
<FlatList
ref={flatListRef}
data={messages}
inverted
data={searchQuery ? messages.filter(m => m.text.toLowerCase().includes(searchQuery.toLowerCase())).reverse() : invertedMessages}
keyExtractor={item => item.id}
renderItem={renderMessage}
contentContainerStyle={styles.messageList}
showsVerticalScrollIndicator={false}
onContentSizeChange={handleContentSizeChange}
onScrollBeginDrag={handleScrollBeginDrag}
onScrollEndDrag={handleScrollEndDrag}
ListEmptyComponent={
<View style={styles.emptyContainer}>
<Text style={styles.emptyIcon}>{'\uD83E\uDD16'}</Text>
@@ -623,6 +688,56 @@ const ChatScreen: React.FC = () => {
}
/>
{/* Thinking-Indicator */}
{agentActivity.activity !== 'idle' && (
<View style={styles.thinkingBar}>
<Text style={styles.thinkingText}>
{agentActivity.activity === 'tool' && agentActivity.tool
? `\uD83D\uDD27 ${agentActivity.tool}`
: agentActivity.activity === 'assistant'
? '\u270D\uFE0F ARIA schreibt...'
: '\uD83D\uDCAD ARIA denkt...'}
</Text>
<TouchableOpacity style={styles.thinkingCancel} onPress={cancelRequest}>
<Text style={styles.thinkingCancelText}>Abbrechen</Text>
</TouchableOpacity>
</View>
)}
{/* Pending Anhaenge Vorschau */}
{pendingAttachments.length > 0 && (
<View style={styles.pendingBar}>
<ScrollView horizontal showsHorizontalScrollIndicator={false} style={{flex: 1}}>
{pendingAttachments.map((att, idx) => (
<View key={idx} style={styles.pendingItem}>
{att.file.type?.startsWith('image/') || att.isPhoto ? (
<Image
source={{ uri: att.file.base64
? `data:${att.file.type};base64,${att.file.base64}`
: att.file.uri }}
style={styles.pendingThumb}
/>
) : (
<View style={[styles.pendingThumb, {justifyContent: 'center', alignItems: 'center'}]}>
<Text style={{fontSize: 20}}>{'\uD83D\uDCC4'}</Text>
</View>
)}
<TouchableOpacity
style={styles.pendingRemove}
onPress={() => setPendingAttachments(prev => prev.filter((_, i) => i !== idx))}
>
<Text style={{color: '#fff', fontSize: 10, fontWeight: 'bold'}}>X</Text>
</TouchableOpacity>
</View>
))}
</ScrollView>
<Text style={{color: '#8888AA', fontSize: 11, marginLeft: 8}}>{pendingAttachments.length}</Text>
<TouchableOpacity onPress={() => setPendingAttachments([])}>
<Text style={{color: '#FF3B30', fontSize: 14, paddingHorizontal: 8}}>Alle X</Text>
</TouchableOpacity>
</View>
)}
{/* Eingabebereich */}
<View style={styles.inputContainer}>
{/* Datei-Buttons */}
@@ -645,7 +760,7 @@ const ChatScreen: React.FC = () => {
style={styles.textInput}
value={inputText}
onChangeText={setInputText}
placeholder="Nachricht an ARIA..."
placeholder={pendingAttachments.length > 0 ? "Text zu den Anhaengen (optional)..." : "Nachricht an ARIA..."}
placeholderTextColor="#555570"
multiline
maxLength={4000}
@@ -654,7 +769,7 @@ const ChatScreen: React.FC = () => {
/>
{/* Senden oder Sprache */}
{inputText.trim() ? (
{inputText.trim() || pendingAttachments.length > 0 ? (
<TouchableOpacity style={styles.sendButton} onPress={sendTextMessage}>
<Text style={styles.sendIcon}>{'\u2B06\uFE0F'}</Text>
</TouchableOpacity>
@@ -675,6 +790,23 @@ const ChatScreen: React.FC = () => {
)}
</View>
{/* Bild-Vollbild Modal */}
<Modal visible={!!fullscreenImage} transparent animationType="fade" onRequestClose={() => setFullscreenImage(null)}>
<TouchableOpacity
style={styles.fullscreenOverlay}
activeOpacity={1}
onPress={() => setFullscreenImage(null)}
>
{fullscreenImage && (
<Image
source={{ uri: fullscreenImage }}
style={styles.fullscreenImage}
resizeMode="contain"
/>
)}
</TouchableOpacity>
</Modal>
{/* Datei-Upload Modal */}
<Modal visible={showFileUpload} transparent animationType="slide">
<View style={styles.modalOverlay}>
@@ -757,7 +889,8 @@ const styles = StyleSheet.create({
},
attachmentImage: {
width: '100%',
height: 200,
minHeight: 200,
maxHeight: 400,
borderRadius: 8,
marginBottom: 6,
backgroundColor: '#0D0D1A',
@@ -867,6 +1000,97 @@ const styles = StyleSheet.create({
wakeWordIcon: {
fontSize: 16,
},
thinkingBar: {
flexDirection: 'row',
alignItems: 'center',
justifyContent: 'space-between',
backgroundColor: '#1E1E2E',
paddingHorizontal: 12,
paddingVertical: 6,
borderTopWidth: 1,
borderTopColor: '#2A2A3E',
},
thinkingText: {
color: '#FFD60A',
fontSize: 12,
flex: 1,
},
thinkingCancel: {
paddingHorizontal: 10,
paddingVertical: 4,
borderWidth: 1,
borderColor: '#FF3B30',
borderRadius: 4,
},
thinkingCancelText: {
color: '#FF3B30',
fontSize: 11,
fontWeight: 'bold',
},
pendingBar: {
flexDirection: 'row',
alignItems: 'center',
backgroundColor: '#1E1E2E',
paddingHorizontal: 12,
paddingVertical: 8,
borderTopWidth: 1,
borderTopColor: '#2A2A3E',
},
pendingItem: {
position: 'relative',
marginRight: 8,
},
pendingThumb: {
width: 50,
height: 50,
borderRadius: 6,
backgroundColor: '#0D0D1A',
},
pendingRemove: {
position: 'absolute',
top: -4,
right: -4,
width: 18,
height: 18,
borderRadius: 9,
backgroundColor: '#FF3B30',
justifyContent: 'center',
alignItems: 'center',
},
searchBar: {
flexDirection: 'row',
alignItems: 'center',
backgroundColor: '#12122A',
paddingHorizontal: 12,
paddingVertical: 6,
borderBottomWidth: 1,
borderBottomColor: '#1E1E2E',
},
searchInput: {
flex: 1,
color: '#FFFFFF',
fontSize: 14,
paddingVertical: 4,
},
playButton: {
alignSelf: 'flex-end',
paddingHorizontal: 8,
paddingVertical: 2,
marginTop: 4,
},
playButtonText: {
fontSize: 16,
},
fullscreenOverlay: {
flex: 1,
backgroundColor: 'rgba(0,0,0,0.95)',
justifyContent: 'center',
alignItems: 'center',
},
fullscreenImage: {
width: '100%',
height: '100%',
},
modalOverlay: {
flex: 1,
backgroundColor: 'rgba(0,0,0,0.6)',
+190 -1
@@ -71,6 +71,11 @@ const SettingsScreen: React.FC = () => {
const [storagePath, setStoragePath] = useState(DEFAULT_STORAGE_PATH);
const [autoDownload, setAutoDownload] = useState(true);
const [storageSize, setStorageSize] = useState('...');
const [ttsEnabled, setTtsEnabled] = useState(true);
const [defaultVoice, setDefaultVoice] = useState('ramona');
const [highlightVoice, setHighlightVoice] = useState('thorsten');
const [speedRamona, setSpeedRamona] = useState(1.0);
const [speedThorsten, setSpeedThorsten] = useState(1.0);
const [editingPath, setEditingPath] = useState(false);
const [tempPath, setTempPath] = useState('');
@@ -91,6 +96,21 @@ const SettingsScreen: React.FC = () => {
AsyncStorage.getItem('aria_auto_download').then(saved => {
if (saved !== null) setAutoDownload(saved === 'true');
});
AsyncStorage.getItem('aria_tts_enabled').then(saved => {
if (saved !== null) setTtsEnabled(saved === 'true');
});
AsyncStorage.getItem('aria_default_voice').then(saved => {
if (saved) setDefaultVoice(saved);
});
AsyncStorage.getItem('aria_highlight_voice').then(saved => {
if (saved) setHighlightVoice(saved);
});
AsyncStorage.getItem('aria_speed_ramona').then(saved => {
if (saved) setSpeedRamona(parseFloat(saved));
});
AsyncStorage.getItem('aria_speed_thorsten').then(saved => {
if (saved) setSpeedThorsten(parseFloat(saved));
});
}, []);
// Calculate storage size
@@ -442,6 +462,133 @@ const SettingsScreen: React.FC = () => {
</View>
</View>
{/* === Sprachausgabe === */}
<Text style={styles.sectionTitle}>Sprachausgabe</Text>
<View style={styles.card}>
{/* TTS An/Aus */}
<View style={styles.toggleRow}>
<View style={styles.toggleInfo}>
<Text style={styles.toggleLabel}>Sprachausgabe</Text>
<Text style={styles.toggleHint}>ARIA antwortet per Sprache (TTS)</Text>
</View>
<Switch
value={ttsEnabled}
onValueChange={(val) => {
setTtsEnabled(val);
AsyncStorage.setItem('aria_tts_enabled', String(val));
rvs.send('config' as any, { ttsEnabled: val });
}}
trackColor={{ false: '#2A2A3E', true: '#0096FF' }}
thumbColor={ttsEnabled ? '#FFFFFF' : '#666680'}
/>
</View>
{/* Standard-Stimme */}
<View style={{marginTop: 16}}>
<Text style={styles.toggleLabel}>Standard-Stimme</Text>
<Text style={styles.toggleHint}>Fuer normale Antworten und Gespraeche</Text>
<View style={{flexDirection: 'row', gap: 8, marginTop: 8}}>
<TouchableOpacity
style={[styles.voiceBtn, defaultVoice === 'ramona' && styles.voiceBtnActive]}
onPress={() => { setDefaultVoice('ramona'); AsyncStorage.setItem('aria_default_voice', 'ramona'); rvs.send('config' as any, { defaultVoice: 'ramona' }); }}
>
<Text style={styles.voiceBtnIcon}>{'\uD83D\uDE4E\u200D\u2640\uFE0F'}</Text>
<Text style={[styles.voiceBtnText, defaultVoice === 'ramona' && styles.voiceBtnTextActive]}>Ramona</Text>
<Text style={styles.voiceBtnHint}>Weiblich, warm</Text>
</TouchableOpacity>
<TouchableOpacity
style={[styles.voiceBtn, defaultVoice === 'thorsten' && styles.voiceBtnActive]}
onPress={() => { setDefaultVoice('thorsten'); AsyncStorage.setItem('aria_default_voice', 'thorsten'); rvs.send('config' as any, { defaultVoice: 'thorsten' }); }}
>
<Text style={styles.voiceBtnIcon}>{'\uD83E\uDDD4'}</Text>
<Text style={[styles.voiceBtnText, defaultVoice === 'thorsten' && styles.voiceBtnTextActive]}>Thorsten</Text>
<Text style={styles.voiceBtnHint}>Maennlich, tief</Text>
</TouchableOpacity>
</View>
</View>
{/* Highlight-Stimme */}
<View style={{marginTop: 16}}>
<Text style={styles.toggleLabel}>Highlight-Stimme</Text>
<Text style={styles.toggleHint}>Fuer besondere Ereignisse (Deploy, Alarm, Erfolg)</Text>
<View style={{flexDirection: 'row', gap: 8, marginTop: 8}}>
<TouchableOpacity
style={[styles.voiceBtn, highlightVoice === 'thorsten' && styles.voiceBtnActive]}
onPress={() => { setHighlightVoice('thorsten'); AsyncStorage.setItem('aria_highlight_voice', 'thorsten'); rvs.send('config' as any, { highlightVoice: 'thorsten' }); }}
>
<Text style={styles.voiceBtnIcon}>{'\uD83E\uDDD4'}</Text>
<Text style={[styles.voiceBtnText, highlightVoice === 'thorsten' && styles.voiceBtnTextActive]}>Thorsten</Text>
</TouchableOpacity>
<TouchableOpacity
style={[styles.voiceBtn, highlightVoice === 'ramona' && styles.voiceBtnActive]}
onPress={() => { setHighlightVoice('ramona'); AsyncStorage.setItem('aria_highlight_voice', 'ramona'); rvs.send('config' as any, { highlightVoice: 'ramona' }); }}
>
<Text style={styles.voiceBtnIcon}>{'\uD83D\uDE4E\u200D\u2640\uFE0F'}</Text>
<Text style={[styles.voiceBtnText, highlightVoice === 'ramona' && styles.voiceBtnTextActive]}>Ramona</Text>
</TouchableOpacity>
</View>
</View>
{/* Sprechgeschwindigkeit Ramona */}
<View style={{marginTop: 16}}>
<Text style={styles.toggleLabel}>Ramona Speed: {speedRamona.toFixed(1)}x</Text>
<View style={{flexDirection: 'row', justifyContent: 'space-around', marginTop: 8}}>
{[0.5, 0.75, 1.0, 1.25, 1.5, 2.0].map(speed => (
<TouchableOpacity
key={speed}
onPress={() => {
setSpeedRamona(speed);
AsyncStorage.setItem('aria_speed_ramona', String(speed));
rvs.send('config' as any, { speedRamona: speed });
}}
style={{
paddingHorizontal: 10, paddingVertical: 6, borderRadius: 6,
backgroundColor: speedRamona === speed ? '#0096FF' : '#1E1E2E',
}}
>
<Text style={{color: speedRamona === speed ? '#fff' : '#8888AA', fontSize: 12, fontWeight: '600'}}>
{speed}x
</Text>
</TouchableOpacity>
))}
</View>
</View>
{/* Sprechgeschwindigkeit Thorsten */}
<View style={{marginTop: 16}}>
<Text style={styles.toggleLabel}>Thorsten Speed: {speedThorsten.toFixed(1)}x</Text>
<View style={{flexDirection: 'row', justifyContent: 'space-around', marginTop: 8}}>
{[0.5, 0.75, 1.0, 1.25, 1.5, 2.0].map(speed => (
<TouchableOpacity
key={speed}
onPress={() => {
setSpeedThorsten(speed);
AsyncStorage.setItem('aria_speed_thorsten', String(speed));
rvs.send('config' as any, { speedThorsten: speed });
}}
style={{
paddingHorizontal: 10, paddingVertical: 6, borderRadius: 6,
backgroundColor: speedThorsten === speed ? '#0096FF' : '#1E1E2E',
}}
>
<Text style={{color: speedThorsten === speed ? '#fff' : '#8888AA', fontSize: 12, fontWeight: '600'}}>
{speed}x
</Text>
</TouchableOpacity>
))}
</View>
</View>
{/* Highlight-Trigger Info */}
<View style={{marginTop: 16, padding: 10, backgroundColor: '#1E1E2E', borderRadius: 8}}>
<Text style={styles.toggleLabel}>{'\u26A1'} Highlight-Trigger</Text>
<Text style={[styles.toggleHint, {marginTop: 4}]}>
Die Highlight-Stimme wird automatisch bei diesen Woertern verwendet:{'\n'}
deploy, erfolgreich, alarm, so soll es sein, kritisch, server down, sicherheitswarnung, ticket geloest, aufgabe abgeschlossen
</Text>
</View>
</View>
{/* === Speicher === */}
<Text style={styles.sectionTitle}>Anhang-Speicher</Text>
<View style={styles.card}>
@@ -601,11 +748,21 @@ const SettingsScreen: React.FC = () => {
<Text style={styles.sectionTitle}>{'\u00DC'}ber</Text>
<View style={styles.card}>
<Text style={styles.aboutTitle}>ARIA Cockpit</Text>
<Text style={styles.aboutVersion}>Version 0.0.1.6 </Text>
<Text style={styles.aboutVersion}>Version {require('../../package.json').version}</Text>
<Text style={styles.aboutInfo}>
Stefans Kommandozentrale f{'\u00FC'}r ARIA.{'\n'}
Gebaut mit React Native + TypeScript.
</Text>
<TouchableOpacity
style={[styles.connectButton, {marginTop: 12}]}
onPress={() => {
const updateService = require('../services/updater').default;
updateService.checkForUpdate();
Alert.alert('Update-Check', 'Pruefe auf neue Version...');
}}
>
<Text style={styles.connectButtonText}>Auf Updates pr{'\u00FC'}fen</Text>
</TouchableOpacity>
</View>
{/* Platz am Ende */}
@@ -744,6 +901,38 @@ const styles = StyleSheet.create({
marginTop: 2,
},
// Stimmen
voiceBtn: {
flex: 1,
padding: 12,
borderRadius: 10,
backgroundColor: '#1E1E2E',
alignItems: 'center',
borderWidth: 2,
borderColor: 'transparent',
},
voiceBtnActive: {
borderColor: '#0096FF',
backgroundColor: '#0D1A2E',
},
voiceBtnIcon: {
fontSize: 28,
marginBottom: 4,
},
voiceBtnText: {
color: '#8888AA',
fontSize: 14,
fontWeight: '600',
},
voiceBtnTextActive: {
color: '#FFFFFF',
},
voiceBtnHint: {
color: '#555570',
fontSize: 11,
marginTop: 2,
},
// Speicher
storagePathText: {
color: '#0096FF',
+133 -31
@@ -42,6 +42,8 @@ const AUDIO_ENCODING = 'audio/wav';
// VAD (Voice Activity Detection): silence detection
const VAD_SILENCE_THRESHOLD_DB = -45; // dB level below which input counts as silence
const VAD_SILENCE_DURATION_MS = 1800; // ms of silence before auto-stop
const VAD_SPEECH_THRESHOLD_DB = -35; // dB level above which input counts as speech (speech gate)
const VAD_SPEECH_MIN_MS = 300; // ms of continuous speech before the recording counts
// --- Audio-Service ---
@@ -55,6 +57,16 @@ class AudioService {
private recorder: AudioRecorderPlayer;
private recordingPath: string = '';
// Audio queue for sequential TTS playback
private audioQueue: string[] = [];
private isPlaying: boolean = false;
private preloadedSound: Sound | null = null;
private preloadedPath: string = '';
// Speech gate: only send the recording if speech was actually detected
private speechDetected: boolean = false;
private speechStartTime: number = 0;
// VAD State
private vadEnabled: boolean = false;
private lastSpeechTime: number = 0;
@@ -115,6 +127,8 @@ class AudioService {
AudioEncoderAndroid: AudioEncoderAndroidType.AAC,
AudioSourceAndroid: AudioSourceAndroidType.MIC,
OutputFormatAndroid: OutputFormatAndroidType.MPEG_4,
AudioSamplingRateAndroid: 16000,
AudioChannelsAndroid: 1,
}, true); // meteringEnabled = true
// Metering-Callback
@@ -122,7 +136,21 @@ class AudioService {
const db = e.currentMetering ?? -160;
this.meterListeners.forEach(cb => cb(db));
// VAD: detect silence
// Speech gate: detect whether someone is actually speaking
if (db > VAD_SPEECH_THRESHOLD_DB) {
if (!this.speechDetected && this.speechStartTime === 0) {
this.speechStartTime = Date.now();
}
if (this.speechStartTime > 0 && Date.now() - this.speechStartTime >= VAD_SPEECH_MIN_MS) {
this.speechDetected = true;
}
} else {
if (!this.speechDetected) {
this.speechStartTime = 0; // reset while the level has not yet counted as speech
}
}
// VAD: detect silence (only once speech has been detected)
if (this.vadEnabled) {
if (db > VAD_SILENCE_THRESHOLD_DB) {
this.lastSpeechTime = Date.now();
@@ -132,6 +160,8 @@ class AudioService {
this.recordingStartTime = Date.now();
this.lastSpeechTime = Date.now();
this.speechDetected = false;
this.speechStartTime = 0;
this.setState('recording');
// VAD aktivieren
@@ -174,6 +204,15 @@ class AudioService {
this.recorder.removeRecordBackListener();
const durationMs = Date.now() - this.recordingStartTime;
const hadSpeech = this.speechDetected;
// Speech gate: if no speech was detected, discard the recording
if (!hadSpeech) {
RNFS.unlink(this.recordingPath).catch(() => {});
this.setState('idle');
console.log('[Audio] Aufnahme verworfen — keine Sprache erkannt (nur Umgebungsgeraeusche)');
return null;
}
// Read the audio file as Base64
const base64Data = await RNFS.readFile(this.recordingPath, 'base64');
@@ -182,7 +221,7 @@ class AudioService {
RNFS.unlink(this.recordingPath).catch(() => {});
this.setState('idle');
console.log(`[Audio] Aufnahme beendet (${durationMs}ms, ${Math.round(base64Data.length / 1024)}KB)`);
console.log(`[Audio] Aufnahme beendet (${durationMs}ms, ${Math.round(base64Data.length / 1024)}KB, Sprache erkannt)`);
return {
base64: base64Data,
@@ -198,47 +237,110 @@ class AudioService {
// --- Wiedergabe ---
/** Base64-kodiertes Audio abspielen (z.B. TTS-Antwort von ARIA) */
/** Base64-kodiertes Audio in die Queue stellen und abspielen */
async playAudio(base64Data: string): Promise<void> {
if (!base64Data) return;
// Laufende Wiedergabe stoppen
this.stopPlayback();
try {
// Base64 -> temporaere WAV-Datei -> Sound abspielen
const tmpPath = `${RNFS.CachesDirectoryPath}/aria_tts_${Date.now()}.wav`;
await RNFS.writeFile(tmpPath, base64Data, 'base64');
this.currentSound = new Sound(tmpPath, '', (error) => {
if (error) {
console.error('[Audio] Fehler beim Laden:', error);
RNFS.unlink(tmpPath).catch(() => {});
return;
}
this.currentSound?.play((success) => {
if (success) {
console.log('[Audio] Wiedergabe abgeschlossen');
} else {
console.warn('[Audio] Wiedergabe fehlgeschlagen');
}
this.currentSound?.release();
this.currentSound = null;
RNFS.unlink(tmpPath).catch(() => {});
});
});
} catch (err) {
console.error('[Audio] Wiedergabefehler:', err);
this.audioQueue.push(base64Data);
if (!this.isPlaying) {
this._playNext();
}
}
/** Laufende Wiedergabe stoppen */
// Callbacks invoked once all queued audio parts have been played
private playbackFinishedListeners: (() => void)[] = [];
onPlaybackFinished(callback: () => void): () => void {
this.playbackFinishedListeners.push(callback);
return () => {
this.playbackFinishedListeners = this.playbackFinishedListeners.filter(cb => cb !== callback);
};
}
/** Play the next audio item from the queue */
private async _playNext(): Promise<void> {
if (this.audioQueue.length === 0) {
this.isPlaying = false;
// All audio parts have been played: notify the listeners
this.playbackFinishedListeners.forEach(cb => cb());
return;
}
this.isPlaying = true;
// Use the preloaded sound if available, otherwise load it now
let sound: Sound;
let soundPath: string;
if (this.preloadedSound) {
sound = this.preloadedSound;
soundPath = this.preloadedPath;
this.preloadedSound = null;
this.preloadedPath = '';
// Remove the data from the queue (it was already preloaded)
this.audioQueue.shift();
} else {
const base64Data = this.audioQueue.shift()!;
try {
soundPath = `${RNFS.CachesDirectoryPath}/aria_tts_${Date.now()}.wav`;
await RNFS.writeFile(soundPath, base64Data, 'base64');
sound = await new Promise<Sound>((resolve, reject) => {
const s = new Sound(soundPath, '', (err) => err ? reject(err) : resolve(s));
});
} catch (err) {
console.error('[Audio] Laden fehlgeschlagen:', err);
this._playNext();
return;
}
}
this.currentSound = sound;
// Prepare the next audio item while this one is playing
this._preloadNext();
sound.play((success) => {
if (!success) console.warn('[Audio] Wiedergabe fehlgeschlagen');
sound.release();
this.currentSound = null;
RNFS.unlink(soundPath).catch(() => {});
this._playNext();
});
}
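/** Preload the next audio item in the background (prevents stuttering) */
private async _preloadNext(): Promise<void> {
if (this.audioQueue.length === 0 || this.preloadedSound) return;
const base64Data = this.audioQueue[0]; // peek only: the item stays in the queue
let tmpPath = '';
try {
tmpPath = `${RNFS.CachesDirectoryPath}/aria_tts_pre_${Date.now()}.wav`;
await RNFS.writeFile(tmpPath, base64Data, 'base64');
const sound = await new Promise<Sound>((resolve, reject) => {
const s = new Sound(tmpPath, '', (err) => err ? reject(err) : resolve(s));
});
// The queue may have changed while loading (item already played, or playback stopped):
// discard the stale preload instead of playing it in place of another item
if (this.audioQueue[0] !== base64Data || this.preloadedSound) {
sound.release();
RNFS.unlink(tmpPath).catch(() => {});
return;
}
this.preloadedSound = sound;
this.preloadedPath = tmpPath;
} catch {
if (tmpPath) RNFS.unlink(tmpPath).catch(() => {});
}
}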
/** Stop the current playback and clear the queue */
stopPlayback(): void {
this.audioQueue = [];
this.isPlaying = false;
if (this.currentSound) {
this.currentSound.stop();
this.currentSound.release();
this.currentSound = null;
}
if (this.preloadedSound) {
this.preloadedSound.release();
this.preloadedSound = null;
if (this.preloadedPath) RNFS.unlink(this.preloadedPath).catch(() => {});
this.preloadedPath = '';
}
}
// --- Status & Callbacks ---
+1 -1
@@ -12,7 +12,7 @@ import AsyncStorage from '@react-native-async-storage/async-storage';
export type ConnectionState = 'connecting' | 'connected' | 'disconnected';
export type MessageType = 'chat' | 'audio' | 'file' | 'location' | 'mode' | 'log' | 'event';
export type MessageType = 'chat' | 'audio' | 'file' | 'location' | 'mode' | 'log' | 'event' | 'update_available' | string;
export interface RVSMessage {
type: MessageType;
+158
@@ -0,0 +1,158 @@
/**
* Auto-update service: checks for and installs app updates via RVS
*
* Flow:
* 1. The app sends "update_check" with its current version to RVS
* 2. RVS compares versions and replies with "update_available" plus a download URL
* 3. The app shows a prompt; once the user confirms, it downloads and installs
*/
import { Alert, Linking, Platform, NativeModules } from 'react-native';
import RNFS from 'react-native-fs';
import rvs, { RVSMessage } from './rvs';
// Version from package.json (embedded at build time)
const packageJson = require('../../package.json');
const APP_VERSION = packageJson.version || '0.0.0.0';
type UpdateCallback = (info: UpdateInfo) => void;
export interface UpdateInfo {
version: string;
downloadUrl: string;
size: number;
}
class UpdateService {
private listeners: UpdateCallback[] = [];
private checking = false;
private downloading = false;
constructor() {
// Auf update_available Nachrichten lauschen
rvs.onMessage((msg: RVSMessage) => {
if (msg.type === 'update_available' as any) {
const info: UpdateInfo = {
version: (msg.payload.version as string) || '',
downloadUrl: (msg.payload.downloadUrl as string) || '',
size: (msg.payload.size as number) || 0,
};
if (info.version && this.isNewer(info.version)) {
console.log(`[Update] Neue Version verfuegbar: ${info.version} (aktuell: ${APP_VERSION})`);
this.listeners.forEach(cb => cb(info));
}
}
});
}
/** Ask RVS whether an update is available (at app start or on demand) */
checkForUpdate(): void {
if (this.checking) return;
this.checking = true;
console.log(`[Update] Pruefe auf Updates (aktuell: ${APP_VERSION})`);
rvs.send('update_check' as any, { version: APP_VERSION });
setTimeout(() => { this.checking = false; }, 10000);
}
/** Callback registrieren */
onUpdateAvailable(callback: UpdateCallback): () => void {
this.listeners.push(callback);
return () => {
this.listeners = this.listeners.filter(cb => cb !== callback);
};
}
/** Update-Dialog anzeigen */
promptUpdate(info: UpdateInfo): void {
const sizeMB = (info.size / 1024 / 1024).toFixed(1);
Alert.alert(
'ARIA Update verfuegbar',
`Version ${info.version} (${sizeMB} MB)\n\nAktuell: ${APP_VERSION}\n\nJetzt herunterladen und installieren?`,
[
{ text: 'Spaeter', style: 'cancel' },
{
text: 'Installieren',
onPress: () => this.downloadAndInstall(info),
},
],
);
}
/** APK ueber WebSocket herunterladen und installieren */
async downloadAndInstall(info: UpdateInfo): Promise<void> {
if (this.downloading) return;
this.downloading = true;
try {
console.log(`[Update] Fordere APK v${info.version} an...`);
Alert.alert('Download gestartet', `Version ${info.version} wird ueber RVS heruntergeladen...`);
// Request the APK over the WebSocket
rvs.send('update_download' as any, {});
// Wait for update_data (one-shot)
const apkData = await new Promise<{base64: string, fileName: string}>((resolve, reject) => {
const timeout = setTimeout(() => {
unsub(); // drop the listener so it does not outlive the timed-out request
reject(new Error('Download-Timeout (60s)'));
}, 60000);
const unsub = rvs.onMessage((msg: RVSMessage) => {
if ((msg.type as string) === 'update_data') {
clearTimeout(timeout);
unsub();
if (msg.payload.error) {
reject(new Error(msg.payload.error as string));
} else {
resolve({
base64: msg.payload.base64 as string,
fileName: msg.payload.fileName as string || `ARIA-${info.version}.apk`,
});
}
}
});
});
// Base64 als APK-Datei speichern
const destPath = `${RNFS.CachesDirectoryPath}/${apkData.fileName}`;
await RNFS.writeFile(destPath, apkData.base64, 'base64');
const fileSize = await RNFS.stat(destPath);
console.log(`[Update] APK gespeichert: ${destPath} (${(parseInt(fileSize.size) / 1024 / 1024).toFixed(1)}MB)`);
// APK installieren via natives ApkInstaller Module (FileProvider + Intent)
if (Platform.OS === 'android') {
try {
const { ApkInstaller } = NativeModules;
await ApkInstaller.install(destPath);
} catch (installErr: any) {
Alert.alert(
'APK heruntergeladen',
`Version ${info.version} gespeichert.\n\nBitte manuell installieren:\nDateimanager → ${apkData.fileName} antippen.\n\n(${installErr.message})`,
);
}
}
} catch (err: any) {
console.error(`[Update] Fehler: ${err.message}`);
Alert.alert('Update fehlgeschlagen', err.message);
} finally {
this.downloading = false;
}
}
/** Versionsvergleich */
private isNewer(remote: string): boolean {
const r = remote.split('.').map(Number);
const l = APP_VERSION.split('.').map(Number);
for (let i = 0; i < Math.max(r.length, l.length); i++) {
const diff = (r[i] || 0) - (l[i] || 0);
if (diff > 0) return true;
if (diff < 0) return false;
}
return false;
}
getCurrentVersion(): string {
return APP_VERSION;
}
}
const updateService = new UpdateService();
export default updateService;
+30 -90
@@ -1,21 +1,13 @@
/**
* Wake Word Service: "ARIA" detection
* Conversation mode: "ear button"
*
* Uses react-native-live-audio-stream for continuous microphone monitoring.
* Detects speech via an energy threshold and sends short audio clips
* for server-side wake-word checking (openwakeword in the bridge).
* When active: recording starts automatically after every ARIA reply (TTS finished).
* Like a walkie-talkie / natural conversation:
* ARIA speaks → recording starts → user speaks → VAD stops → ARIA replies → ...
*
* Architecture:
* App (microphone) → energy detection → audio buffer
* → RVS "wake_check" → bridge → openwakeword → confirmation
* → app starts recording
*
* Current (phase 1): simple tap-to-talk + auto-stop.
* Later (phase 2): Porcupine on-device "ARIA" keyword.
* Phase 2 (planned): Porcupine "ARIA" wake word for passive listening.
*/
import LiveAudioStream from 'react-native-live-audio-stream';
type WakeWordCallback = () => void;
type StateCallback = (state: WakeWordState) => void;
@@ -25,72 +17,40 @@ class WakeWordService {
private state: WakeWordState = 'off';
private wakeCallbacks: WakeWordCallback[] = [];
private stateCallbacks: StateCallback[] = [];
private isInitialized = false;
/** Wake Word Erkennung starten */
/** Gespraechsmodus starten */
async start(): Promise<boolean> {
if (this.state === 'listening') return true;
try {
if (!this.isInitialized) {
LiveAudioStream.init({
sampleRate: 16000,
channels: 1,
bitsPerSample: 16,
audioSource: 6, // VOICE_RECOGNITION
bufferSize: 4096,
});
this.isInitialized = true;
console.log('[WakeWord] Gespraechsmodus aktiviert — starte sofort Aufnahme');
this.setState('listening');
// Sofort erste Aufnahme starten
setTimeout(() => {
if (this.state === 'listening') {
this.wakeCallbacks.forEach(cb => cb());
}
}, 500);
return true;
}
// Audio-Stream starten und auf Energie pruefen
LiveAudioStream.start();
/** Gespraechsmodus stoppen */
stop(): void {
console.log('[WakeWord] Gespraechsmodus deaktiviert');
this.setState('off');
}
LiveAudioStream.on('data', (base64Chunk: string) => {
if (this.state !== 'listening') return;
// Base64 → Int16 Array → RMS berechnen
const raw = this._base64ToInt16(base64Chunk);
const rms = this._calculateRMS(raw);
// Schwellwert: wenn laut genug → Wake Word erkannt
// Phase 1: Einfache Energie-Erkennung (jemand spricht)
// Phase 2: Porcupine "ARIA" Keyword
if (rms > 2000) {
this.setState('detected');
this.wakeCallbacks.forEach(cb => cb());
// Nach Detection kurz pausieren, Aufnahme uebernimmt das Mikrofon
this.stop();
}
});
this.setState('listening');
console.log('[WakeWord] Listening gestartet');
return true;
} catch (err) {
console.error('[WakeWord] Start fehlgeschlagen:', err);
return false;
/** Nach ARIA-Antwort (TTS fertig): Aufnahme automatisch starten */
async resume(): Promise<void> {
if (this.state !== 'listening') return;
// Kurze Pause damit TTS-Audio nicht ins Mikrofon geht
await new Promise(resolve => setTimeout(resolve, 800));
if (this.state === 'listening') {
console.log('[WakeWord] TTS fertig — starte automatisch Aufnahme');
this.wakeCallbacks.forEach(cb => cb());
}
}
/** Wake Word Erkennung stoppen */
stop(): void {
if (this.state === 'off') return;
try {
LiveAudioStream.stop();
} catch {}
this.setState('off');
console.log('[WakeWord] Gestoppt');
}
/** Nach Aufnahme erneut starten */
async resume(): Promise<void> {
// Kurze Pause damit Aufnahme das Mikrofon freigeben kann
setTimeout(() => {
if (this.state === 'off') {
this.start();
}
}, 500);
isActive(): boolean {
return this.state === 'listening';
}
// --- Callbacks ---
@@ -113,32 +73,12 @@ class WakeWordService {
return this.state;
}
// --- Hilfsfunktionen ---
private setState(state: WakeWordState): void {
if (this.state !== state) {
this.state = state;
this.stateCallbacks.forEach(cb => cb(state));
}
}
private _base64ToInt16(base64: string): Int16Array {
const binary = atob(base64);
const bytes = new Uint8Array(binary.length);
for (let i = 0; i < binary.length; i++) {
bytes[i] = binary.charCodeAt(i);
}
return new Int16Array(bytes.buffer);
}
private _calculateRMS(samples: Int16Array): number {
if (samples.length === 0) return 0;
let sum = 0;
for (let i = 0; i < samples.length; i++) {
sum += samples[i] * samples[i];
}
return Math.sqrt(sum / samples.length);
}
}
const wakeWordService = new WakeWordService();
+7
@@ -9,3 +9,10 @@ PIPER_THORSTEN=/voices/de_DE-thorsten-high.onnx
# Wake-Word
WAKE_WORD=aria
# Whisper STT: switched at runtime in the Diagnostic UI (section "Whisper")
# and saved to /shared/config/voice_config.json. The value here is only the
# initial default on first start.
# Options: tiny | base | small | medium | large-v3
WHISPER_MODEL=medium
WHISPER_LANGUAGE=de
+373 -40
@@ -38,6 +38,7 @@ import websockets
from faster_whisper import WhisperModel
from openwakeword.model import Model as WakeWordModel
from piper import PiperVoice
from piper.config import SynthesisConfig
from modes import Mode, detect_mode_switch, should_speak
@@ -62,7 +63,7 @@ RVS_TLS = os.getenv("RVS_TLS", "true") # true = wss://, false = ws://
RVS_TLS_FALLBACK = os.getenv("RVS_TLS_FALLBACK", "true") # Bei TLS-Fehler ws:// versuchen
RVS_TOKEN = os.getenv("RVS_TOKEN", "") # Pairing-Token (gleich wie in der App)
DIAGNOSTIC_URL = os.getenv("DIAGNOSTIC_URL", "http://127.0.0.1:3001") # Diagnostic API
WHISPER_MODEL = os.getenv("WHISPER_MODEL", "small")
WHISPER_MODEL = os.getenv("WHISPER_MODEL", "medium")
WHISPER_LANGUAGE = os.getenv("WHISPER_LANGUAGE", "de")
# Audio-Parameter
@@ -72,7 +73,7 @@ BLOCK_SIZE = 1280 # 80ms bei 16kHz — gut fuer Wake-Word-Erkennung
RECORD_SECONDS = 8 # Max. Aufnahmedauer nach Wake-Word
# Epische Trigger — bei diesen Woertern spricht Thorsten
EPIC_TRIGGERS = [
EPIC_TRIGGERS_DEFAULT = [
"deploy",
"erfolgreich",
"alarm",
@@ -84,6 +85,24 @@ EPIC_TRIGGERS = [
"aufgabe abgeschlossen",
]
# Load triggers from the shared config (saved by Diagnostic)
TRIGGERS_FILE = "/shared/config/highlight_triggers.json"
def load_epic_triggers():
"""Load highlight triggers from the shared config, falling back to the defaults."""
try:
if os.path.exists(TRIGGERS_FILE):
with open(TRIGGERS_FILE) as f:
triggers = json.load(f)
if isinstance(triggers, list) and len(triggers) > 0:
logger.info("Highlight-Trigger geladen: %d aus %s", len(triggers), TRIGGERS_FILE)
return triggers
except Exception as e:
logger.warning("Highlight-Trigger laden fehlgeschlagen: %s — nutze Defaults", e)
return EPIC_TRIGGERS_DEFAULT
EPIC_TRIGGERS = load_epic_triggers()
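The loader above accepts only a non-empty JSON list and otherwise falls back to the built-in defaults. A minimal sketch of that contract follows, assuming a hypothetical `load_triggers(path, defaults)` helper that takes the path as a parameter (the original reads the module-level `TRIGGERS_FILE`) so it can be exercised against a temporary file. `DEFAULTS` and `write_json` are illustration-only names.

```python
import json
import os
import tempfile

DEFAULTS = ["deploy", "alarm"]  # shortened stand-in for EPIC_TRIGGERS_DEFAULT


def load_triggers(path, defaults):
    """Return the trigger list stored at `path`, or `defaults` if unusable.

    Hypothetical helper for illustration. It mirrors the fallback rules of
    load_epic_triggers(): a missing file, invalid JSON, a non-list value or
    an empty list all yield the defaults.
    """
    try:
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                triggers = json.load(f)
            if isinstance(triggers, list) and len(triggers) > 0:
                return triggers
    except (OSError, ValueError):  # JSONDecodeError is a ValueError
        pass
    return defaults


def write_json(value):
    """Write `value` as JSON to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(value, f)
    return path
```

Because the triggers are read once at import time, a changed file presumably only takes effect after the bridge restarts, unless another code path reloads it.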
def load_config() -> dict[str, str]:
"""Laedt Konfiguration aus /config/aria.env."""
@@ -111,6 +130,9 @@ class VoiceEngine:
def __init__(self, voices_dir: Path) -> None:
self.voices_dir = voices_dir
self.voices: dict[str, PiperVoice] = {}
self.default_voice = "ramona"
self.highlight_voice = "thorsten"
self.speech_speed = {"ramona": 1.0, "thorsten": 1.0}
def initialize(self) -> None:
"""Laedt die Piper-Stimmen aus dem Voices-Verzeichnis."""
@@ -154,14 +176,14 @@ class VoiceEngine:
if requested_voice and requested_voice in self.voices:
return requested_voice
# Check epic triggers
# Check highlight triggers
text_lower = text.lower()
for trigger in EPIC_TRIGGERS:
if trigger in text_lower:
logger.info("Epischer Trigger erkannt: '%s' -> Thorsten spricht", trigger)
return "thorsten"
logger.info("Highlight-Trigger erkannt: '%s' -> %s spricht", trigger, self.highlight_voice)
return self.highlight_voice
return "ramona"
return self.default_voice
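The selection order in this hunk is: an explicitly requested voice, then the highlight voice on a trigger match, then the default voice. A standalone sketch of that precedence, assuming a hypothetical free function `select_voice` with the engine state passed in as arguments:

```python
def select_voice(text, requested_voice, loaded_voices, triggers,
                 default_voice="ramona", highlight_voice="thorsten"):
    """Pick a voice name for `text`.

    Precedence, as in the method above:
    1. an explicitly requested voice, if it is loaded
    2. the highlight voice, if any trigger occurs in the lower-cased text
    3. the default voice
    """
    if requested_voice and requested_voice in loaded_voices:
        return requested_voice
    text_lower = text.lower()
    for trigger in triggers:
        if trigger in text_lower:  # plain substring match
            return highlight_voice
    return default_voice
```

The match is a substring test on lower-cased text, so the triggers need to be stored in lower case, and a trigger such as `deploy` also fires on `deployment`.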
def synthesize(self, text: str, voice_name: str = "ramona") -> Optional[bytes]:
"""Erzeugt Audio-Daten aus Text mit der gewaehlten Stimme.
@@ -179,20 +201,62 @@ class VoiceEngine:
return None
try:
# Piper returns PCM samples, which we write out as WAV
# Strip Markdown and special characters so the speech sounds natural
import re
clean = text.strip()
clean = re.sub(r'```[\s\S]*?```', '', clean) # fenced code blocks (must run before inline code)
clean = re.sub(r'`[^`]+`', '', clean) # `code`
clean = re.sub(r'\*\*([^*]+)\*\*', r'\1', clean) # **bold**
clean = re.sub(r'\*([^*]+)\*', r'\1', clean) # *italic*
clean = re.sub(r'\[([^\]]+)\]\([^)]+\)', r'\1', clean) # [text](url)
clean = re.sub(r'#{1,6}\s*', '', clean) # ### headings
clean = re.sub(r'>\s*', '', clean) # > quotes
clean = re.sub(r'[-*]\s+', '', clean) # list bullets
clean = re.sub(r'\n{2,}', '. ', clean) # paragraph breaks
clean = re.sub(r'\n', ', ', clean) # line breaks
clean = re.sub(r'\s{2,}', ' ', clean) # repeated whitespace
clean = re.sub(r'["“”„]', '', clean) # quotation marks
sentences = re.split(r'(?<=[.!?])\s+', clean)
sentences = [s.strip() for s in sentences if s.strip()]
if not sentences:
return None
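The clean-up and sentence split can be tried in isolation. The following is a reduced sketch with only a subset of the patterns; `clean_for_tts` and `split_sentences` are hypothetical names. As a deliberate choice in this sketch, fenced blocks are removed before inline code, so the inline pattern cannot consume part of a fence and leave stray backticks behind.

```python
import re


def clean_for_tts(text):
    """Strip common Markdown so a TTS voice does not read markup aloud."""
    clean = text.strip()
    clean = re.sub(r"```[\s\S]*?```", "", clean)             # fenced code blocks
    clean = re.sub(r"`[^`]+`", "", clean)                    # inline code
    clean = re.sub(r"\*\*([^*]+)\*\*", r"\1", clean)         # **bold**
    clean = re.sub(r"\*([^*]+)\*", r"\1", clean)             # *italic*
    clean = re.sub(r"\[([^\]]+)\]\([^)]+\)", r"\1", clean)   # [text](url)
    clean = re.sub(r"(?m)^#{1,6}\s*", "", clean)             # headings
    clean = re.sub(r"\n{2,}", ". ", clean)                   # paragraph breaks
    clean = re.sub(r"\n", ", ", clean)                       # line breaks
    clean = re.sub(r"\s{2,}", " ", clean)                    # repeated whitespace
    return clean.strip()


def split_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]
```

Splitting on sentence-final punctuation also breaks after abbreviations such as `z.B. `, so a production splitter may want an exception list.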
# Jeden Satz einzeln synthetisieren und WAVs zusammenfuegen
all_audio = b""
sample_rate = None
for sentence in sentences:
if not sentence:
continue
with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
tmp_path = tmp.name
speed = self.speech_speed.get(voice_name, 1.0)
syn_config = SynthesisConfig(length_scale=1.0 / max(0.3, speed))
with wave.open(tmp_path, "wb") as wav_file:
voice.synthesize_wav(sentence, wav_file, syn_config=syn_config)
with wave.open(tmp_path, "rb") as wav_file:
if sample_rate is None:
sample_rate = wav_file.getframerate()
all_audio += wav_file.readframes(wav_file.getnframes())
Path(tmp_path).unlink(missing_ok=True)
# Zusammengefuegtes WAV erstellen
with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
tmp_path = tmp.name
final_path = tmp.name
with wave.open(final_path, "wb") as wav_file:
wav_file.setnchannels(1)
wav_file.setsampwidth(2)
wav_file.setframerate(sample_rate or 22050)
wav_file.writeframes(all_audio)
with wave.open(tmp_path, "wb") as wav_file:
voice.synthesize(text, wav_file)
audio_data = Path(tmp_path).read_bytes()
Path(tmp_path).unlink(missing_ok=True)
audio_data = Path(final_path).read_bytes()
Path(final_path).unlink(missing_ok=True)
logger.info(
"TTS: %d bytes erzeugt mit %s: '%s'",
"TTS: %d bytes erzeugt mit %s (%d Saetze): '%s'",
len(audio_data),
voice_name,
len(sentences),
text[:60],
)
return audio_data
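Two details of the per-sentence synthesis can be checked without Piper: the speed-to-`length_scale` mapping and the concatenation of several WAVs into one. The sketch below uses only the standard library. `make_wav` produces silence as a stand-in for a synthesized sentence, and all three function names are hypothetical. It assumes mono 16-bit output at a single sample rate, as the code above does.

```python
import io
import wave


def length_scale_for(speed):
    """Map a speed factor to a length_scale value.

    length_scale is used as the inverse of the speed (2.0x speed gives 0.5);
    the 0.3 floor keeps the divisor away from zero.
    """
    return 1.0 / max(0.3, speed)


def make_wav(num_frames, sample_rate=22050):
    """Stand-in for one synthesized sentence: mono 16-bit silence."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * num_frames)
    return buf.getvalue()


def concat_wavs(wav_blobs, fallback_rate=22050):
    """Join several mono 16-bit WAV blobs into a single WAV blob."""
    frames = bytearray()
    sample_rate = None
    for blob in wav_blobs:
        with wave.open(io.BytesIO(blob), "rb") as r:
            if sample_rate is None:
                sample_rate = r.getframerate()  # the first sentence sets the rate
            frames += r.readframes(r.getnframes())
    out = io.BytesIO()
    with wave.open(out, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate or fallback_rate)
        w.writeframes(bytes(frames))
    return out.getvalue()
```

Working on in-memory buffers avoids the temporary files; whether Piper's `synthesize_wav` accepts a `wave` writer opened on a `BytesIO` is not confirmed here.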
@@ -266,6 +330,25 @@ class STTEngine:
self.model = WhisperModel(self.model_size, device="cpu", compute_type="int8")
logger.info("Whisper-Modell geladen")
def reload(self, model_size: str) -> bool:
"""Load a different Whisper model (on config change)."""
if model_size == self.model_size and self.model is not None:
return False
allowed = {"tiny", "base", "small", "medium", "large-v3"}
if model_size not in allowed:
logger.warning("Ungueltiges Whisper-Modell: %s (erlaubt: %s)", model_size, allowed)
return False
logger.info("Lade Whisper-Modell neu: %s -> %s", self.model_size, model_size)
self.model_size = model_size
self.model = None
try:
self.model = WhisperModel(model_size, device="cpu", compute_type="int8")
logger.info("Whisper-Modell '%s' geladen", model_size)
return True
except Exception:
logger.exception("Whisper-Modell '%s' konnte nicht geladen werden", model_size)
return False
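The reload logic can be exercised without downloading a model by injecting the loader. A sketch assuming a hypothetical `ReloadableModel` class, where `loader` stands in for the `WhisperModel(...)` constructor call:

```python
ALLOWED_MODELS = {"tiny", "base", "small", "medium", "large-v3"}


class ReloadableModel:
    """Holds one model and swaps it when a different, allowed size is requested."""

    def __init__(self, model_size, loader):
        self.model_size = model_size
        self.loader = loader   # callable: size -> model object (assumption)
        self.model = None

    def reload(self, model_size):
        """Return True only if a new model was actually loaded."""
        if model_size == self.model_size and self.model is not None:
            return False       # already loaded, nothing to do
        if model_size not in ALLOWED_MODELS:
            return False       # reject unknown names before touching state
        self.model_size = model_size
        self.model = None      # drop the old reference before loading
        try:
            self.model = self.loader(model_size)
            return True
        except Exception:
            return False       # model stays None, so a later reload retries
```

A failed load leaves the engine without a model. Callers of `transcribe` therefore need to handle `model is None`, or the reload could keep the old model until the new one has loaded, at the cost of holding both in memory for a moment.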
def transcribe(self, audio_data: np.ndarray) -> str:
"""Transkribiert Audio-Daten zu Text.
@@ -437,8 +520,30 @@ class ARIABridge:
# Komponenten
self.voice_engine = VoiceEngine(VOICES_DIR)
self.tts_enabled = True
vc: dict = {}
# Gespeicherte Voice-Config laden
try:
vc_path = "/shared/config/voice_config.json"
if os.path.exists(vc_path):
with open(vc_path) as f:
vc = json.load(f)
self.voice_engine.default_voice = vc.get("defaultVoice", "ramona")
self.voice_engine.highlight_voice = vc.get("highlightVoice", "thorsten")
self.voice_engine.speech_speed = {
"ramona": vc.get("speedRamona", 1.0),
"thorsten": vc.get("speedThorsten", 1.0),
}
self.tts_enabled = vc.get("ttsEnabled", True)
self.tts_engine_type = vc.get("ttsEngine", "piper")
self.xtts_voice = vc.get("xttsVoice", "")
logger.info("Voice-Config geladen: %s", vc)
except Exception as e:
logger.warning("Voice-Config laden fehlgeschlagen: %s", e)
# Whisper model: the saved voice config takes precedence, then env/default (medium)
whisper_model = vc.get("whisperModel") or self.config.get("WHISPER_MODEL", WHISPER_MODEL)
self.stt_engine = STTEngine(
model_size=self.config.get("WHISPER_MODEL", WHISPER_MODEL),
model_size=whisper_model,
language=self.config.get("WHISPER_LANGUAGE", WHISPER_LANGUAGE),
)
self.wake_word = WakeWordDetector()
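The model lookup above uses a three-level precedence: the saved voice config, then the env file, then the built-in default. A small sketch of that expression, with `resolve_whisper_model` as a hypothetical helper name:

```python
def resolve_whisper_model(voice_config, env_config, default="medium"):
    """The saved voice config wins, then the env file, then the default.

    `or` (rather than a .get() default) makes a missing or empty
    "whisperModel" entry fall through to the next level.
    """
    return voice_config.get("whisperModel") or env_config.get("WHISPER_MODEL", default)
```

For example, `resolve_whisper_model({}, {"WHISPER_MODEL": "base"})` picks the env value, while a stored `whisperModel` overrides it.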
@@ -447,6 +552,9 @@ class ARIABridge:
self.ws_core: Optional[websockets.WebSocketClientProtocol] = None
self.ws_rvs: Optional[websockets.WebSocketClientProtocol] = None
# Last agent_activity state sent (used for deduplication)
self._last_activity_state: Optional[tuple] = None
def initialize(self) -> None:
"""Initialisiert alle Komponenten.
@@ -461,17 +569,20 @@ class ARIABridge:
# Voice-Engine IMMER laden — rendert Audio fuer die App (auch ohne Soundkarte)
self.voice_engine.initialize()
# STT IMMER laden — verarbeitet Audio von der App (braucht kein Sounddevice)
self.stt_engine.initialize()
# Audio-Hardware pruefen (fuer lokales Mikro/Lautsprecher)
self.audio_available = False
try:
sd.query_devices()
devices = sd.query_devices()
sd.query_devices(kind='output')
self.audio_available = True
logger.info("Audio-Geraet gefunden — Wake-Word und lokale TTS aktiv")
self.stt_engine.initialize()
self.wake_word.initialize()
except (sd.PortAudioError, Exception):
logger.warning("Kein Audio-Geraet — Wake-Word und lokale TTS deaktiviert")
logger.info("Piper TTS rendert Audio fuer die App (via RVS)")
logger.warning("Kein Audio-Geraet — Wake-Word und lokale Wiedergabe deaktiviert")
logger.info("TTS rendert fuer App (via RVS), STT verarbeitet App-Audio")
logger.info("Alle Komponenten initialisiert")
logger.info("aria-core: %s", self.ws_url)
@@ -648,8 +759,18 @@ class ARIABridge:
if event_name == "agent":
data = payload.get("data", {})
delta = data.get("delta", "")
if delta and payload.get("stream") == "assistant":
stream = payload.get("stream", "")
if delta and stream == "assistant":
logger.debug("[core] Delta: '%s'", delta[:40])
# Activity-Signal zur App (entdupliziert)
tool_name = data.get("name") or data.get("tool") or payload.get("tool") or ""
if stream == "tool_use" or data.get("type") == "tool_use":
activity = "tool"
elif stream == "assistant":
activity = "assistant"
else:
activity = "thinking"
await self._emit_activity(activity, tool_name)
return
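The body of `_emit_activity` is not part of this hunk. The sketch below shows one way the classification above could be combined with deduplication on a `(activity, tool)` tuple. `classify_activity`, `ActivityEmitter` and the payload field names `activity` and `tool` are assumptions for illustration, not the bridge's actual wire format.

```python
import asyncio


def classify_activity(payload):
    """Map an agent event payload to an (activity, tool_name) pair."""
    data = payload.get("data", {})
    stream = payload.get("stream", "")
    tool_name = data.get("name") or data.get("tool") or payload.get("tool") or ""
    if stream == "tool_use" or data.get("type") == "tool_use":
        return "tool", tool_name
    if stream == "assistant":
        return "assistant", tool_name
    return "thinking", tool_name


class ActivityEmitter:
    """Forwards an activity only when it differs from the last one sent."""

    def __init__(self, send):
        self._send = send   # async callable taking one message dict
        self._last = None   # last (activity, tool) pair that was sent

    async def emit(self, activity, tool=""):
        state = (activity, tool)
        if state == self._last:
            return False    # unchanged: skip to avoid one message per delta
        self._last = state
        await self._send({
            "type": "agent_activity",
            "payload": {"activity": activity, "tool": tool},
        })
        return True
```

Since the assistant stream produces one event per text delta, dropping repeats keeps the app's indicator traffic to one message per state change.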
# ── chat Events: Snapshots mit state=delta|final|error ──
@@ -658,6 +779,7 @@ class ARIABridge:
if state == "final":
text = self._extract_chat_text(payload)
await self._emit_activity("idle", "")
if not text:
logger.warning("[core] chat final ohne Text: %s", json.dumps(payload)[:200])
return
@@ -668,6 +790,7 @@ class ARIABridge:
if state == "error":
error = payload.get("error", "Unbekannt")
logger.error("[core] Chat-Fehler: %s", error)
await self._emit_activity("idle", "")
await self._send_to_rvs({
"type": "chat",
"payload": {
@@ -773,18 +896,48 @@ class ARIABridge:
})
# TTS-Audio rendern und an die App senden (wenn Modus es erlaubt)
if should_speak(self.current_mode, is_critical):
audio_data = self.voice_engine.synthesize(text, voice_name)
if audio_data:
audio_b64 = base64.b64encode(audio_data).decode("ascii")
await self._send_to_rvs({
"type": "audio",
"payload": {
"base64": audio_b64,
"mimeType": "audio/wav",
"voice": voice_name,
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
if getattr(self, 'tts_enabled', True) and should_speak(self.current_mode, is_critical):
tts_engine = getattr(self, 'tts_engine_type', 'piper')
if tts_engine == "xtts":
# XTTS: send the whole text, the XTTS bridge splits it into sentences
xtts_voice = getattr(self, 'xtts_voice', '')
try:
await self._send_to_rvs({
"type": "xtts_request",
"payload": {
"text": text,
"voice": xtts_voice,
"language": "de",
"requestId": str(uuid.uuid4()),
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
logger.info("[core] XTTS-Request gesendet (%s): '%s'", xtts_voice or "default", text[:60])
except Exception as e:
logger.warning("[core] XTTS-Request fehlgeschlagen: %s — Fallback auf Piper", e)
# Fall back to Piper
audio_data = self.voice_engine.synthesize(text, voice_name)
if audio_data:
audio_b64 = base64.b64encode(audio_data).decode("ascii")
await self._send_to_rvs({
"type": "audio",
"payload": {"base64": audio_b64, "mimeType": "audio/wav", "voice": voice_name},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
else:
# Piper: render locally
audio_data = self.voice_engine.synthesize(text, voice_name)
if audio_data:
audio_b64 = base64.b64encode(audio_data).decode("ascii")
await self._send_to_rvs({
"type": "audio",
"payload": {
"base64": audio_b64,
"mimeType": "audio/wav",
"voice": voice_name,
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
logger.info("[core] TTS-Audio gesendet: %d bytes (%s)", len(audio_data), voice_name)
@@ -893,10 +1046,22 @@ class ARIABridge:
retry_delay = min(retry_delay * 2, 30)
async def _rvs_heartbeat(self) -> None:
"""Sends heartbeats to the RVS to keep the connection open."""
"""Sends heartbeats and WebSocket pings to the RVS to keep the connection open."""
while True:
await asyncio.sleep(25)
await asyncio.sleep(15)
if self.ws_rvs:
try:
# WebSocket protocol-level ping (keeps the TCP connection alive)
pong = await self.ws_rvs.ping()
await asyncio.wait_for(pong, timeout=10)
except Exception:
logger.warning("[rvs] Ping fehlgeschlagen — Verbindung tot, erzwinge Reconnect")
try:
await self.ws_rvs.close()
except Exception:
pass
self.ws_rvs = None
break
try:
await self.ws_rvs.send(json.dumps({
"type": "heartbeat",
@@ -927,12 +1092,132 @@ class ARIABridge:
if msg_type == "chat":
# Forward user messages only; ignore ARIA/Diagnostic replies (otherwise they loop!)
sender = payload.get("sender", "")
if sender in ("aria", "diagnostic", "stt"):
if sender in ("aria", "stt"):
return
text = payload.get("text", "")
if text:
logger.info("[rvs] App-Chat: '%s'", text[:80])
await self.send_to_core(text, source="app")
return
if msg_type == "cancel_request":
logger.info("[rvs] Cancel-Request von App — rufe Diagnostic /api/cancel auf")
await self._cancel_via_diagnostic()
await self._emit_activity("idle", "")
return
elif msg_type == "xtts_response":
# XTTS audio received from the gaming PC → forward to the app
audio_b64 = payload.get("base64", "")
error = payload.get("error", "")
if error:
logger.warning("[rvs] XTTS Fehler: %s", error)
return
if audio_b64:
logger.info("[rvs] XTTS-Audio empfangen: %dKB", len(audio_b64) // 1365)
await self._send_to_rvs({
"type": "audio",
"payload": {
"base64": audio_b64,
"mimeType": payload.get("mimeType", "audio/wav"),
"voice": payload.get("voice", "xtts"),
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
return
elif msg_type == "tts_request":
# App requests TTS audio for a text (play button)
text = payload.get("text", "")
requested_voice = payload.get("voice", "")
if text:
voice_name = requested_voice or self.voice_engine.select_voice(text)
audio_data = self.voice_engine.synthesize(text, voice_name)
if audio_data:
audio_b64 = base64.b64encode(audio_data).decode("ascii")
try:
await self._send_to_rvs({
"type": "audio",
"payload": {
"base64": audio_b64,
"mimeType": "audio/wav",
"voice": voice_name,
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
logger.info("[rvs] TTS on-demand: %d bytes (%s)", len(audio_data), voice_name)
except Exception as e:
logger.warning("[rvs] TTS on-demand senden fehlgeschlagen: %s", e)
return
elif msg_type == "config":
# Receive config from app/Diagnostic and persist it
changed = False
if "defaultVoice" in payload:
new_voice = payload["defaultVoice"]
if new_voice in self.voice_engine.voices:
self.voice_engine.default_voice = new_voice
logger.info("[rvs] Standard-Stimme gewechselt: %s", new_voice)
changed = True
if "highlightVoice" in payload:
new_voice = payload["highlightVoice"]
if new_voice in self.voice_engine.voices:
self.voice_engine.highlight_voice = new_voice
logger.info("[rvs] Highlight-Stimme gewechselt: %s", new_voice)
changed = True
if "ttsEnabled" in payload:
self.tts_enabled = bool(payload["ttsEnabled"])
logger.info("[rvs] TTS %s", "aktiviert" if self.tts_enabled else "deaktiviert")
changed = True
if "ttsEngine" in payload:
self.tts_engine_type = payload["ttsEngine"]
logger.info("[rvs] TTS-Engine: %s", self.tts_engine_type)
changed = True
if "xttsVoice" in payload:
self.xtts_voice = payload["xttsVoice"]
logger.info("[rvs] XTTS-Stimme: %s", self.xtts_voice)
changed = True
if "speedRamona" in payload:
self.voice_engine.speech_speed["ramona"] = max(0.3, min(2.0, float(payload["speedRamona"])))
logger.info("[rvs] Speed Ramona: %.1f", self.voice_engine.speech_speed["ramona"])
changed = True
if "speedThorsten" in payload:
self.voice_engine.speech_speed["thorsten"] = max(0.3, min(2.0, float(payload["speedThorsten"])))
logger.info("[rvs] Speed Thorsten: %.1f", self.voice_engine.speech_speed["thorsten"])
changed = True
whisper_reloaded = False
if "whisperModel" in payload:
new_model = payload["whisperModel"]
if new_model and new_model != self.stt_engine.model_size:
logger.info("[rvs] Whisper-Modell Wechsel: %s -> %s (laedt...)", self.stt_engine.model_size, new_model)
loop = asyncio.get_event_loop()
whisper_reloaded = await loop.run_in_executor(None, self.stt_engine.reload, new_model)
if whisper_reloaded:
changed = True
# Persist to the shared volume
if changed:
try:
os.makedirs("/shared/config", exist_ok=True)
config_data = {
"defaultVoice": self.voice_engine.default_voice,
"highlightVoice": self.voice_engine.highlight_voice,
"ttsEnabled": getattr(self, "tts_enabled", True),
"ttsEngine": getattr(self, "tts_engine_type", "piper"),
"xttsVoice": getattr(self, "xtts_voice", ""),
"speedRamona": self.voice_engine.speech_speed.get("ramona", 1.0),
"speedThorsten": self.voice_engine.speech_speed.get("thorsten", 1.0),
"whisperModel": self.stt_engine.model_size,
}
with open("/shared/config/voice_config.json", "w") as f:
json.dump(config_data, f, indent=2)
logger.info("[rvs] Voice-Config gespeichert: %s", config_data)
except Exception as e:
logger.warning("[rvs] Config speichern fehlgeschlagen: %s", e)
return
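The config handler above clamps the two speed values into a fixed range and then writes the merged settings to a JSON file. A minimal sketch of those two steps, assuming illustrative helper names `clamp_speed` and `save_config` (the bridge does this inline, not through helpers):

```python
import json
import os
import tempfile


def clamp_speed(value, low: float = 0.3, high: float = 2.0) -> float:
    """Coerce to float and clamp into [low, high], mirroring the speed handlers."""
    return max(low, min(high, float(value)))


def save_config(path: str, config: dict) -> None:
    """Write the config as indented JSON, creating the parent directory if needed."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)


def load_config(path: str) -> dict:
    """Read the config back, e.g. on startup."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

`float()` raises `ValueError` on non-numeric input; the sketch leaves handling that to the caller.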
text = payload.get("text", "")
if text:
logger.info("[rvs] App-Chat: '%s'", text[:80])
await self.send_to_core(text, source="app")
elif msg_type == "mode":
# Mode change from the app
@@ -984,7 +1269,8 @@ class ARIABridge:
text = (f"Stefan hat dir ein Bild geschickt: {file_name}"
f"{f' ({width}x{height}px)' if width else ''}"
f", {size_kb}KB."
f" Das Bild liegt unter: {file_path}")
f" Das Bild liegt unter: {file_path}"
f" Warte auf Stefans Anweisung was du damit tun sollst.")
await self.send_to_core(text, source="app-file")
# Then notify the app (optional, must not crash)
try:
@@ -1006,7 +1292,8 @@ class ARIABridge:
# FIRST send to aria-core
text = (f"Stefan hat dir eine Datei geschickt: {file_name}"
f" ({file_type}, {size_kb}KB)."
f" Die Datei liegt unter: {file_path}")
f" Die Datei liegt unter: {file_path}"
f" Warte auf Stefans Anweisung was du damit tun sollst.")
await self.send_to_core(text, source="app-file")
try:
await self._send_to_rvs({
@@ -1137,10 +1424,24 @@ class ARIABridge:
pass
async def _send_to_rvs(self, message: dict) -> None:
"""Sends a message to the app (via RVS)."""
"""Sends a message to the app (via RVS) with a connection check."""
if self.ws_rvs is None:
return
# Ping check: is the connection really alive?
try:
pong = await self.ws_rvs.ping()
await asyncio.wait_for(pong, timeout=5)
except Exception:
logger.warning("[rvs] Ping fehlgeschlagen — Verbindung tot, erzwinge Reconnect")
try:
await self.ws_rvs.close()
except Exception:
pass
self.ws_rvs = None
# The connect_to_rvs loop handles the reconnect
return
try:
await self.ws_rvs.send(json.dumps(message))
except Exception:
@@ -1148,6 +1449,36 @@ class ARIABridge:
# ── Log streaming to the app ─────────────────────────────
async def _cancel_via_diagnostic(self) -> None:
"""Calls the Diagnostic /api/cancel endpoint, which runs the full cancel logic
(openclaw doctor --fix with Docker socket)."""
def _do_request():
try:
req = urllib.request.Request(
f"{self._diagnostic_url}/api/cancel",
method="POST",
data=b"",
)
with urllib.request.urlopen(req, timeout=5) as resp:
return resp.status
except Exception as e:
return f"error: {e}"
status = await asyncio.get_event_loop().run_in_executor(None, _do_request)
logger.info("[cancel] Diagnostic /api/cancel: %s", status)
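`_cancel_via_diagnostic` keeps the event loop responsive by pushing the blocking `urllib` call into the default thread pool. A minimal sketch of that pattern, with a hypothetical `slow_status` function standing in for the HTTP request:

```python
import asyncio
import time


def slow_status() -> int:
    """Hypothetical blocking call standing in for the urllib request."""
    time.sleep(0.05)
    return 200


async def call_without_blocking() -> int:
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor
    return await loop.run_in_executor(None, slow_status)
```

While `slow_status` sleeps in a worker thread, other coroutines on the same loop keep running.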
async def _emit_activity(self, activity: str, tool: str = "") -> None:
"""Sends agent_activity to the app, but only if the state has changed."""
state = (activity, tool)
if state == self._last_activity_state:
return
self._last_activity_state = state
await self._send_to_rvs({
"type": "agent_activity",
"payload": {"activity": activity, "tool": tool},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
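`_emit_activity` suppresses repeated states by comparing the `(activity, tool)` tuple with the last one sent. A minimal standalone sketch of that deduplication, assuming a hypothetical `ActivityDeduper` class whose `sent` list stands in for the real RVS send:

```python
from typing import List, Optional, Tuple


class ActivityDeduper:
    """Forwards an (activity, tool) state only when it differs from the last one."""

    def __init__(self) -> None:
        self._last: Optional[Tuple[str, str]] = None
        self.sent: List[Tuple[str, str]] = []  # stands in for the real send call

    def emit(self, activity: str, tool: str = "") -> bool:
        state = (activity, tool)
        if state == self._last:
            return False  # unchanged state: suppress
        self._last = state
        self.sent.append(state)
        return True
```

Because the tool name is part of the key, two consecutive tool calls with different names both get through, while a stream of identical `assistant` deltas collapses into one message.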
async def send_log_to_app(self, source: str, message: str, level: str = "info") -> None:
"""Sends a log entry to the app (shown in the log viewer)."""
await self._send_to_rvs({
@@ -1216,8 +1547,10 @@ class ARIABridge:
logger.info("Keine Sprache erkannt — ignoriert")
except sd.PortAudioError:
logger.error("Audio-Geraet nicht verfuegbar — warte 5 Sekunden")
await asyncio.sleep(5)
if not hasattr(self, '_audio_warned'):
logger.warning("Audio-Geraet nicht verfuegbar — lokales Mikrofon deaktiviert (kein Spam mehr)")
self._audio_warned = True
await asyncio.sleep(60) # 60s instead of 5s to cut log spam
except Exception:
logger.exception("Fehler in der Audio-Schleife")
await asyncio.sleep(1)
+575 -8
@@ -201,8 +201,18 @@
<button class="btn secondary" onclick="toggleChatFullscreen()" id="btn-chat-fs" style="padding:4px 10px;font-size:11px;">Vollbild</button>
</div>
<div class="chat-box" id="chat-box"></div>
<div id="thinking-indicator" style="display:none;padding:6px 10px;font-size:12px;color:#FFD60A;background:#1E1E2E;border-radius:0 0 6px 6px;margin-top:-8px;margin-bottom:8px;align-items:center;justify-content:space-between;">
<span><span style="animation:pulse 1s infinite;">&#x1F4AD;</span> <span id="thinking-text">ARIA denkt...</span></span>
<button class="btn secondary" onclick="cancelRequest()" style="padding:2px 10px;font-size:11px;color:#FF3B30;border-color:#FF3B30;">Abbrechen</button>
</div>
<div id="diag-pending-attachments" style="display:none;padding:6px 10px;background:#1E1E2E;border-radius:6px 6px 0 0;margin-bottom:-4px;gap:6px;flex-wrap:wrap;align-items:center;">
</div>
<div class="input-row">
<input type="text" id="chat-input" placeholder="Nachricht an ARIA...">
<label class="btn secondary" style="padding:6px 10px;cursor:pointer;font-size:14px;" title="Datei anhaengen">
&#x1F4CE;
<input type="file" id="diag-file-input" multiple accept="image/*,application/pdf,.doc,.docx,.txt" style="display:none;" onchange="handleDiagFileSelect(this.files)">
</label>
<input type="text" id="chat-input" placeholder="Nachricht an ARIA..." onpaste="handleDiagPaste(event)">
<button class="btn" id="btn-gw" onclick="testGateway()">Gateway senden</button>
<button class="btn" id="btn-rvs" onclick="testRVS()">Via RVS senden</button>
</div>
@@ -216,6 +226,9 @@
<button class="btn secondary" onclick="toggleChatFullscreen()" style="padding:6px 14px;">Schliessen</button>
</div>
<div id="chat-box-fs" class="chat-box" style="flex:1;max-height:none;min-height:0;overflow-y:auto;"></div>
<div id="thinking-indicator-fs" style="display:none;padding:6px 10px;font-size:12px;color:#FFD60A;background:#1E1E2E;border-radius:6px;margin-top:4px;">
<span style="animation:pulse 1s infinite;">&#x1F4AD;</span> <span id="thinking-text-fs">ARIA denkt...</span>
</div>
<div class="input-row" style="margin-top:8px;">
<input type="text" id="chat-input-fs" placeholder="Nachricht an ARIA..." onkeydown="if(event.key==='Enter'){testRVSFS();event.preventDefault();}">
<button class="btn" onclick="testGatewayFS()">Gateway senden</button>
@@ -277,6 +290,7 @@
<button class="tab-btn" data-tab="bridge" onclick="switchTab('bridge')">Bridge <span class="tab-count" id="count-bridge">0</span></button>
<button class="tab-btn" data-tab="server" onclick="switchTab('server')">Server <span class="tab-count" id="count-server">0</span></button>
<button class="tab-btn" data-tab="pipeline" onclick="switchTab('pipeline')" style="margin-left:auto;border-color:#0096FF44;color:#0096FF">Pipeline <span class="tab-count" id="count-pipeline">0</span></button>
<button class="tab-btn" data-tab="tts" onclick="switchTab('tts')" style="border-color:#34C75944;color:#34C759">TTS</button>
</div>
</div>
<div class="log-panel">
@@ -296,6 +310,36 @@
<div class="log-box hidden" id="log-bridge"></div>
<div class="log-box hidden" id="log-server"></div>
<div class="log-box hidden" id="log-pipeline"></div>
<div class="log-box hidden" id="log-tts" style="padding:12px;">
<h3 style="color:#34C759;margin:0 0 12px;">TTS Diagnose</h3>
<div style="display:grid;grid-template-columns:1fr 1fr;gap:8px;margin-bottom:12px;">
<div style="background:#1E1E2E;padding:8px;border-radius:6px;">
<div style="color:#8888AA;font-size:10px;text-transform:uppercase;">Standard-Stimme</div>
<div style="color:#fff;font-size:14px;margin-top:4px;" id="tts-default-voice">Ramona</div>
</div>
<div style="background:#1E1E2E;padding:8px;border-radius:6px;">
<div style="color:#8888AA;font-size:10px;text-transform:uppercase;">Highlight-Stimme</div>
<div style="color:#fff;font-size:14px;margin-top:4px;" id="tts-highlight-voice">Thorsten</div>
</div>
<div style="background:#1E1E2E;padding:8px;border-radius:6px;">
<div style="color:#8888AA;font-size:10px;text-transform:uppercase;">Status</div>
<div style="font-size:14px;margin-top:4px;" id="tts-status">Unbekannt</div>
</div>
<div style="background:#1E1E2E;padding:8px;border-radius:6px;">
<div style="color:#8888AA;font-size:10px;text-transform:uppercase;">Letzter Fehler</div>
<div style="color:#FF6B6B;font-size:12px;margin-top:4px;word-break:break-all;" id="tts-last-error">-</div>
</div>
</div>
<div style="margin-bottom:8px;">
<input type="text" id="tts-test-text" value="Hallo Stefan, ich bin ARIA." placeholder="Test-Text..." style="background:#1E1E2E;border:1px solid #2A2A3E;border-radius:6px;padding:8px;color:#fff;font-size:13px;width:100%;box-sizing:border-box;">
</div>
<div style="display:flex;gap:8px;">
<button class="btn" onclick="testTTS('ramona')" style="flex:1;">Ramona testen</button>
<button class="btn" onclick="testTTS('thorsten')" style="flex:1;">Thorsten testen</button>
<button class="btn secondary" onclick="checkTTSStatus()" style="flex:1;">Status pruefen</button>
</div>
<div id="tts-log" style="margin-top:12px;max-height:200px;overflow-y:auto;font-size:11px;font-family:monospace;color:#8888AA;"></div>
</div>
</div>
</div>
@@ -334,6 +378,169 @@
<!-- ══════ TAB: Settings ══════ -->
<div id="tab-settings" class="main-tab">
<!-- Operating mode -->
<div class="settings-section">
<h2>Betriebsmodus</h2>
<div class="card" style="max-width:500px;">
<div id="mode-selector" style="display:grid;grid-template-columns:1fr 1fr;gap:8px;">
<button class="btn mode-btn" data-mode="normal" onclick="setMode('normal')" style="background:#1E1E2E;border:2px solid transparent;">
<span style="font-size:18px;">&#x1F7E2;</span> Normal<br><span style="font-size:10px;color:#8888AA;">Hoert zu, antwortet, spricht</span>
</button>
<button class="btn mode-btn" data-mode="dnd" onclick="setMode('dnd')" style="background:#1E1E2E;border:2px solid transparent;">
<span style="font-size:18px;">&#x1F534;</span> Nicht stoeren<br><span style="font-size:10px;color:#8888AA;">Nur Kritikalarme</span>
</button>
<button class="btn mode-btn" data-mode="whisper" onclick="setMode('whisper')" style="background:#1E1E2E;border:2px solid transparent;">
<span style="font-size:18px;">&#x1F7E1;</span> Fluestern<br><span style="font-size:10px;color:#8888AA;">Nur Text, keine Sprache</span>
</button>
<button class="btn mode-btn" data-mode="hangar" onclick="setMode('hangar')" style="background:#1E1E2E;border:2px solid transparent;">
<span style="font-size:18px;">&#x2708;&#xFE0F;</span> Hangar<br><span style="font-size:10px;color:#8888AA;">Nur wichtige Meldungen</span>
</button>
<button class="btn mode-btn" data-mode="gaming" onclick="setMode('gaming')" style="background:#1E1E2E;border:2px solid transparent;grid-column:1/-1;">
<span style="font-size:18px;">&#x1F3AE;</span> Gaming<br><span style="font-size:10px;color:#8888AA;">Nur direkte Fragen</span>
</button>
</div>
<div style="margin-top:8px;font-size:11px;color:#555570;" id="mode-status">Aktueller Modus: Normal</div>
</div>
</div>
<!-- Voices -->
<div class="settings-section">
<h2>Sprachausgabe</h2>
<div class="card" style="max-width:500px;">
<!-- TTS enabled (global for all engines) -->
<div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">TTS aktiv:</label>
<label class="toggle"><input type="checkbox" id="diag-tts-enabled" checked onchange="sendVoiceConfig()"><span class="slider"></span></label>
</div>
<!-- TTS engine selection -->
<div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">TTS Engine:</label>
<select id="diag-tts-engine" onchange="sendVoiceConfig();toggleXTTSPanel()" style="background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;">
<option value="piper">Piper (lokal, CPU, schnell)</option>
<option value="xtts">XTTS v2 (remote, GPU, natuerlich)</option>
</select>
</div>
<!-- Piper voices (only when engine=piper) -->
<div id="piper-panel">
<div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">Standard-Stimme:</label>
<select id="diag-default-voice" onchange="sendVoiceConfig()" style="background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;">
<option value="ramona">Ramona (weiblich)</option>
<option value="thorsten">Thorsten (maennlich)</option>
</select>
</div>
<div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">Highlight-Stimme:</label>
<select id="diag-highlight-voice" onchange="sendVoiceConfig()" style="background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;">
<option value="thorsten">Thorsten (maennlich)</option>
<option value="ramona">Ramona (weiblich)</option>
</select>
</div>
<div style="margin-bottom:4px;">
<label style="color:#8888AA;font-size:12px;">Ramona Speed: <span id="speed-ramona-label">1.0x</span></label>
</div>
<div style="display:flex;align-items:center;gap:8px;margin-bottom:12px;">
<span style="color:#555570;font-size:11px;">0.5x</span>
<input type="range" id="diag-speed-ramona" min="0.5" max="2.0" step="0.1" value="1.0"
oninput="document.getElementById('speed-ramona-label').textContent=this.value+'x'"
onchange="sendVoiceConfig()"
style="flex:1;accent-color:#0096FF;">
<span style="color:#555570;font-size:11px;">2.0x</span>
</div>
<div style="margin-bottom:4px;">
<label style="color:#8888AA;font-size:12px;">Thorsten Speed: <span id="speed-thorsten-label">1.0x</span></label>
</div>
<div style="display:flex;align-items:center;gap:8px;">
<span style="color:#555570;font-size:11px;">0.5x</span>
<input type="range" id="diag-speed-thorsten" min="0.5" max="2.0" step="0.1" value="1.0"
oninput="document.getElementById('speed-thorsten-label').textContent=this.value+'x'"
onchange="sendVoiceConfig()"
style="flex:1;accent-color:#0096FF;">
<span style="color:#555570;font-size:11px;">2.0x</span>
</div>
</div><!-- /piper-panel -->
<!-- XTTS panel (only when engine=xtts) -->
<div id="xtts-panel" style="display:none;">
<div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">XTTS Stimme:</label>
<select id="diag-xtts-voice" onchange="sendVoiceConfig()" style="background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;">
<option value="">Standard (XTTS Default)</option>
</select>
<button class="btn secondary" onclick="loadXTTSVoices()" style="padding:4px 10px;font-size:11px;">Laden</button>
</div>
<!-- Voice Cloning -->
<div style="background:#1E1E2E;border-radius:8px;padding:12px;margin-top:8px;">
<div style="color:#0096FF;font-size:13px;font-weight:600;margin-bottom:8px;">Stimme klonen</div>
<div style="color:#8888AA;font-size:11px;margin-bottom:8px;">
Lade ein oder mehrere Audio-Samples hoch (WAV/MP3, min. 6-10 Sekunden).
Mehrere Dateien werden automatisch zusammengefuegt.
</div>
<div style="margin-bottom:8px;">
<input type="text" id="xtts-clone-name" placeholder="Name fuer die Stimme..." style="background:#0D0D1A;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;color:#fff;font-size:13px;width:100%;box-sizing:border-box;">
</div>
<div style="margin-bottom:8px;">
<input type="file" id="xtts-clone-files" accept="audio/*" multiple style="color:#8888AA;font-size:12px;">
</div>
<div style="display:flex;gap:8px;">
<button class="btn" onclick="uploadVoiceSamples()" style="flex:1;">Stimme erstellen</button>
</div>
<div id="xtts-clone-status" style="font-size:11px;color:#555570;margin-top:6px;"></div>
</div>
<!-- XTTS Status -->
<div style="margin-top:8px;font-size:11px;color:#555570;" id="xtts-status">
XTTS-Server: Nicht verbunden (starte xtts/ auf dem Gaming-PC)
</div>
</div>
</div>
</div>
<!-- Whisper (STT) -->
<div class="settings-section">
<h2>Whisper (Spracherkennung)</h2>
<div style="font-size:11px;color:#8888AA;margin-bottom:8px;">
Aenderungen werden sofort an die Bridge gesendet und das Modell neu geladen
(kann bei medium/large 10-30s dauern — waehrend dieser Zeit ist STT kurz pausiert).
</div>
<div class="card" style="max-width:500px;">
<div style="display:flex;align-items:center;gap:12px;margin-bottom:8px;">
<label style="color:#8888AA;font-size:12px;min-width:80px;">Modell:</label>
<select id="diag-whisper-model" onchange="sendVoiceConfig()" style="flex:1;background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;">
<option value="tiny">tiny (39MB, schnell, niedrige Qualitaet)</option>
<option value="base">base (74MB, schnell, ok)</option>
<option value="small">small (244MB, mittel)</option>
<option value="medium" selected>medium (769MB, gut — Empfehlung)</option>
<option value="large-v3">large-v3 (1.5GB, beste Qualitaet, langsam auf CPU)</option>
</select>
</div>
<div style="font-size:10px;color:#555570;">
Tipp: <code>medium</code> ist der beste Kompromiss fuer CPU. <code>large-v3</code> nur bei GPU sinnvoll.
</div>
</div>
</div>
<!-- Highlight-Trigger -->
<div class="settings-section">
<h2>Highlight-Trigger</h2>
<div style="font-size:11px;color:#8888AA;margin-bottom:8px;">
Woerter die automatisch die Highlight-Stimme (Thorsten) ausloesen.
Eines pro Zeile. Aenderungen werden in der Bridge gespeichert.
</div>
<div class="card" style="max-width:500px;">
<textarea id="highlight-triggers" rows="8" style="width:100%;box-sizing:border-box;background:#1E1E2E;border:1px solid #2A2A3E;border-radius:6px;padding:8px;color:#fff;font-size:13px;font-family:monospace;resize:vertical;"
placeholder="Lade..."></textarea>
<div style="display:flex;gap:8px;margin-top:8px;">
<button class="btn" onclick="saveHighlightTriggers()" style="flex:1;">Speichern</button>
<button class="btn secondary" onclick="loadHighlightTriggers()" style="flex:1;">Neu laden</button>
</div>
<div id="trigger-status" style="font-size:11px;color:#555570;margin-top:6px;"></div>
</div>
</div>
<!-- Tool permissions -->
<div class="settings-section">
<h2>Tool-Berechtigungen</h2>
@@ -414,6 +621,7 @@
bridge: document.getElementById('log-bridge'),
server: document.getElementById('log-server'),
pipeline: document.getElementById('log-pipeline'),
tts: document.getElementById('log-tts'),
};
// Scroll pause per active tab
@@ -507,6 +715,102 @@
if (msg.type === 'state') { updateState(msg.state); return; }
if (msg.type === 'log') { addLog(msg.entry.level, msg.entry.source, msg.entry.message, msg.entry.ts); return; }
if (msg.type === 'tts_result') {
if (msg.ok) {
ttsLog(`\u2705 ${msg.voice}: ${msg.duration}ms, ${msg.size} bytes`);
document.getElementById('tts-status').textContent = 'OK';
document.getElementById('tts-status').style.color = '#34C759';
} else {
ttsLog(`\u274C Fehler: ${msg.error}`);
document.getElementById('tts-status').textContent = 'Fehler';
document.getElementById('tts-status').style.color = '#FF3B30';
document.getElementById('tts-last-error').textContent = msg.error;
}
return;
}
if (msg.type === 'tts_status') {
document.getElementById('tts-default-voice').textContent = msg.defaultVoice || '?';
document.getElementById('tts-highlight-voice').textContent = msg.highlightVoice || '?';
document.getElementById('tts-status').textContent = msg.ok ? 'OK' : 'Fehler';
document.getElementById('tts-status').style.color = msg.ok ? '#34C759' : '#FF3B30';
if (msg.voices) ttsLog(`Stimmen: ${msg.voices.join(', ')}`);
if (msg.error) { document.getElementById('tts-last-error').textContent = msg.error; ttsLog(`Fehler: ${msg.error}`); }
else { document.getElementById('tts-last-error').textContent = '-'; ttsLog('TTS OK'); }
return;
}
if (msg.type === 'agent_activity') {
updateThinkingIndicator(msg);
return;
}
if (msg.type === 'xtts_voices_list') {
const select = document.getElementById('diag-xtts-voice');
// Keep the first option (default)
while (select.options.length > 1) select.remove(1);
for (const v of (msg.payload?.voices || [])) {
const opt = document.createElement('option');
opt.value = v.name;
opt.textContent = `${v.name} (${(v.size / 1024).toFixed(0)}KB)`;
select.appendChild(opt);
}
document.getElementById('xtts-status').textContent = `XTTS: ${msg.payload?.voices?.length || 0} Stimme(n) verfuegbar`;
document.getElementById('xtts-status').style.color = '#34C759';
return;
}
if (msg.type === 'xtts_voice_saved') {
document.getElementById('xtts-clone-status').textContent = `Stimme "${msg.payload?.name}" gespeichert!`;
document.getElementById('xtts-clone-status').style.color = '#34C759';
loadXTTSVoices(); // reload the list
return;
}
if (msg.type === 'voice_config') {
document.getElementById('diag-default-voice').value = msg.defaultVoice || 'ramona';
document.getElementById('diag-highlight-voice').value = msg.highlightVoice || 'thorsten';
document.getElementById('diag-tts-enabled').checked = msg.ttsEnabled !== false;
const sr = msg.speedRamona || 1.0;
const st = msg.speedThorsten || 1.0;
document.getElementById('diag-speed-ramona').value = sr;
document.getElementById('speed-ramona-label').textContent = sr + 'x';
document.getElementById('diag-speed-thorsten').value = st;
document.getElementById('speed-thorsten-label').textContent = st + 'x';
document.getElementById('diag-tts-engine').value = msg.ttsEngine || 'piper';
// Set the XTTS voice; add the option if it is missing
const xttsSelect = document.getElementById('diag-xtts-voice');
const xttsVoice = msg.xttsVoice || '';
if (xttsVoice && !Array.from(xttsSelect.options).some(o => o.value === xttsVoice)) {
const opt = document.createElement('option');
opt.value = xttsVoice;
opt.textContent = xttsVoice;
xttsSelect.appendChild(opt);
}
xttsSelect.value = xttsVoice;
toggleXTTSPanel();
// Restore the Whisper model (if set)
if (msg.whisperModel) {
const wSel = document.getElementById('diag-whisper-model');
if (wSel) wSel.value = msg.whisperModel;
}
return;
}
if (msg.type === 'trigger_list') {
const textarea = document.getElementById('highlight-triggers');
textarea.value = (msg.triggers || []).join('\n');
document.getElementById('trigger-status').textContent = (msg.triggers || []).length + ' Trigger geladen';
document.getElementById('trigger-status').style.color = '#8888AA';
return;
}
if (msg.type === 'watchdog') {
const colors = { warning: '#FFD60A', fixing: '#FF9500', fixed: '#34C759', error: '#FF3B30' };
const color = colors[msg.status] || '#FFD60A';
addChat('error', `\u26A0\uFE0F Watchdog: ${msg.message}`, `system — ${msg.status}`);
addLog('warn', 'server', `Watchdog: ${msg.message}`);
return;
}
if (msg.type === 'chat_final') {
addChat('received', msg.text, 'chat:final');
return;
@@ -616,6 +920,18 @@
else alert('Loeschen fehlgeschlagen: ' + (msg.error || '?'));
return;
}
if (msg.type === 'session_export') {
if (!msg.ok) { alert('Export fehlgeschlagen: ' + (msg.error || '?')); return; }
const blob = new Blob([msg.markdown], { type: 'text/markdown;charset=utf-8' });
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = msg.filename;
document.body.appendChild(a);
a.click();
setTimeout(() => { URL.revokeObjectURL(url); a.remove(); }, 100);
return;
}
if (msg.type === 'active_session') {
updateActiveSessionBar(msg.sessionKey);
loadSessions(); // re-render the table
@@ -670,21 +986,39 @@
}
}
function sendDiagAttachments() {
// Send all pending files to RVS
for (const f of diagPendingFiles) {
send({ action: 'send_file', name: f.name, type: f.type, size: f.size, base64: f.base64 });
}
if (diagPendingFiles.length > 0) {
addChat('sent', `${diagPendingFiles.length} Anhang/Anhaenge`, 'Datei');
}
diagPendingFiles = [];
renderDiagPending();
}
function testGateway() {
const input = document.getElementById('chat-input');
const text = input.value.trim();
if (!text) return;
addChat('sent', text, 'Gateway direkt');
send({ action: 'test_gateway', text });
if (!text && diagPendingFiles.length === 0) return;
if (diagPendingFiles.length > 0) sendDiagAttachments();
if (text) {
addChat('sent', text, 'Gateway direkt');
send({ action: 'test_gateway', text });
}
input.value = '';
}
function testRVS() {
const input = document.getElementById('chat-input');
const text = input.value.trim();
if (!text) return;
addChat('sent', text, 'via RVS');
send({ action: 'test_rvs', text });
if (!text && diagPendingFiles.length === 0) return;
if (diagPendingFiles.length > 0) sendDiagAttachments();
if (text) {
addChat('sent', text, 'via RVS');
send({ action: 'test_rvs', text });
}
input.value = '';
}
@@ -883,6 +1217,10 @@
return `<a href="${match}" target="_blank">${match}</a><img src="${match}" class="chat-media" onclick="openLightbox('image','${match}')" onerror="this.style.display='none'">`;
});
const html = `${linked}<div class="meta">${escapeHtml(meta)}${new Date().toLocaleTimeString('de-DE')}</div>`;
// Hide the thinking indicator on a new message
updateThinkingIndicator({ activity: 'idle' });
// Write to both chat boxes (normal + fullscreen)
for (const box of [chatBox, document.getElementById('chat-box-fs')]) {
if (!box) continue;
@@ -930,6 +1268,225 @@
if (e.key === 'Escape' && chatFullscreen) toggleChatFullscreen();
});
// ── Thinking indicator ─────────────────────────────
let thinkingTimeout = null;
const TOOL_LABELS = {
'Bash': '\uD83D\uDDA5\uFE0F Shell-Befehl',
'WebFetch': '\uD83C\uDF10 Webseite abrufen',
'WebSearch': '\uD83D\uDD0D Suche',
'Read': '\uD83D\uDCC4 Datei lesen',
'Write': '\u270D\uFE0F Datei schreiben',
'Edit': '\u270D\uFE0F Datei bearbeiten',
'Grep': '\uD83D\uDD0D Code durchsuchen',
'Glob': '\uD83D\uDCC1 Dateien suchen',
'Agent': '\uD83E\uDD16 Sub-Agent',
};
function updateThinkingIndicator(msg) {
const indicators = [
document.getElementById('thinking-indicator'),
document.getElementById('thinking-indicator-fs'),
];
const texts = [
document.getElementById('thinking-text'),
document.getElementById('thinking-text-fs'),
];
if (msg.activity === 'idle') {
indicators.forEach(el => { if (el) el.style.display = 'none'; });
if (thinkingTimeout) { clearTimeout(thinkingTimeout); thinkingTimeout = null; }
return;
}
let label = 'ARIA denkt...';
if (msg.activity === 'tool' && msg.tool) {
label = TOOL_LABELS[msg.tool] || `\uD83D\uDD27 ${msg.tool}`;
} else if (msg.activity === 'assistant') {
label = 'ARIA schreibt...';
}
indicators.forEach(el => { if (el) el.style.display = 'block'; });
texts.forEach(el => { if (el) el.textContent = label; });
// Auto-hide after 2 min in case the idle event is missed (ARIA works for at most 15 min)
if (thinkingTimeout) clearTimeout(thinkingTimeout);
thinkingTimeout = setTimeout(() => {
indicators.forEach(el => { if (el) el.style.display = 'none'; });
thinkingTimeout = null; // the timer has fired, so drop the stale handle
}, 120000);
}
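The label selection in `updateThinkingIndicator` can be isolated from the DOM for a quick check. The following is a minimal sketch, not the shipped code: `labelFor` is a hypothetical helper, and the label table is abbreviated to two entries without icons. It shows that any activity other than `tool` (with a tool name) or `assistant` falls back to the generic thinking label.

```javascript
// Abbreviated copy of the label table (icons omitted for brevity).
const TOOL_LABELS = { Bash: 'Shell-Befehl', Read: 'Datei lesen' };

// Hypothetical helper: returns the indicator text, or null when the indicator should hide.
function labelFor(msg) {
  if (msg.activity === 'idle') return null;
  if (msg.activity === 'tool' && msg.tool) {
    // Unknown tools fall back to a wrench icon plus the raw tool name.
    return TOOL_LABELS[msg.tool] || `\uD83D\uDD27 ${msg.tool}`;
  }
  if (msg.activity === 'assistant') return 'ARIA schreibt...';
  return 'ARIA denkt...';
}

const labels = [
  labelFor({ activity: 'tool', tool: 'Bash' }),
  labelFor({ activity: 'tool', tool: 'Foo' }),
  labelFor({ activity: 'tool_use' }),
  labelFor({ activity: 'idle' }),
];
console.log(labels);
```

Note that an activity value such as `tool_use` is not treated as a tool event here; only `tool` together with a tool name selects a tool label.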
// ── XTTS Panel ─────────────────────────────
function toggleXTTSPanel() {
const engine = document.getElementById('diag-tts-engine').value;
document.getElementById('piper-panel').style.display = engine === 'piper' ? 'block' : 'none';
document.getElementById('xtts-panel').style.display = engine === 'xtts' ? 'block' : 'none';
if (engine === 'xtts') loadXTTSVoices();
}
function loadXTTSVoices() {
send({ action: 'xtts_list_voices' });
}
function arrayBufferToBase64(buffer) {
const bytes = new Uint8Array(buffer);
let binary = '';
for (let i = 0; i < bytes.length; i += 8192) {
binary += String.fromCharCode.apply(null, bytes.subarray(i, i + 8192));
}
return btoa(binary);
}
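The 8192-byte chunking in `arrayBufferToBase64` keeps each `String.fromCharCode.apply` call below the engine's argument limit, which a single call over a large upload could exceed. A minimal sketch for Node.js follows, assuming `btoa` is available as a global (true in browsers and recent Node versions); `toBase64Chunked` and its `chunkSize` parameter are illustrative names, not part of the diff. It compares the chunked result against Node's `Buffer` encoding.

```javascript
// Chunked ArrayBuffer -> base64, same approach as arrayBufferToBase64.
// chunkSize is an illustrative parameter, not part of the original function.
function toBase64Chunked(buffer, chunkSize = 8192) {
  const bytes = new Uint8Array(buffer);
  let binary = '';
  for (let i = 0; i < bytes.length; i += chunkSize) {
    // Each call receives at most chunkSize arguments.
    binary += String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize));
  }
  return btoa(binary);
}

// 20,000 bytes span more than two chunks.
const sample = new Uint8Array(20000).map((_, i) => i % 256);
const encoded = toBase64Chunked(sample.buffer);
const reference = Buffer.from(sample).toString('base64');
console.log(encoded === reference);
```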
async function uploadVoiceSamples() {
const name = document.getElementById('xtts-clone-name').value.trim();
const files = document.getElementById('xtts-clone-files').files;
if (!name) { alert('Bitte einen Namen eingeben'); return; }
if (!files || files.length === 0) { alert('Bitte Audio-Dateien auswaehlen'); return; }
if (files.length > 10) { alert('Maximal 10 Dateien'); return; }
const status = document.getElementById('xtts-clone-status');
status.textContent = `Lade ${files.length} Datei(en)...`;
status.style.color = '#FFD60A';
try {
const samples = [];
for (let i = 0; i < files.length; i++) {
status.textContent = `Lese Datei ${i + 1}/${files.length}: ${files[i].name}...`;
const buffer = await files[i].arrayBuffer();
const base64 = arrayBufferToBase64(buffer);
samples.push({ base64, name: files[i].name, size: files[i].size });
}
const totalSize = samples.reduce((s, f) => s + f.size, 0);
status.textContent = `Sende ${samples.length} Sample(s) (${(totalSize / 1024).toFixed(0)}KB)...`;
send({ action: 'voice_upload', name, samples });
status.textContent = `Gesendet — warte auf Bestaetigung vom XTTS-Server...`;
} catch (err) {
status.textContent = `Fehler: ${err.message}`;
status.style.color = '#FF3B30';
}
}
// ── Diagnostic attachment handling ─────────────
let diagPendingFiles = [];
function handleDiagFileSelect(files) {
for (const file of files) {
const reader = new FileReader();
reader.onload = () => {
const base64 = reader.result.split(',')[1];
diagPendingFiles.push({ name: file.name, type: file.type, size: file.size, base64 });
renderDiagPending();
};
reader.readAsDataURL(file);
}
}
function handleDiagPaste(event) {
const items = event.clipboardData?.items;
if (!items) return;
for (const item of items) {
if (item.kind === 'file') {
event.preventDefault();
const file = item.getAsFile();
if (file) handleDiagFileSelect([file]);
}
}
}
function renderDiagPending() {
const container = document.getElementById('diag-pending-attachments');
if (diagPendingFiles.length === 0) {
container.style.display = 'none';
return;
}
container.style.display = 'flex';
container.innerHTML = diagPendingFiles.map((f, i) => {
const isImage = f.type.startsWith('image/');
const preview = isImage ? `<img src="data:${f.type};base64,${f.base64}" style="width:40px;height:40px;border-radius:4px;object-fit:cover;">` : `<span style="font-size:20px;">&#x1F4C4;</span>`;
return `<div style="position:relative;display:inline-block;">
${preview}
<span onclick="removeDiagPending(${i})" style="position:absolute;top:-4px;right:-4px;width:16px;height:16px;border-radius:8px;background:#FF3B30;color:#fff;font-size:10px;cursor:pointer;display:flex;align-items:center;justify-content:center;">X</span>
</div>`;
}).join('') + `<span style="color:#8888AA;font-size:11px;margin-left:4px;">${diagPendingFiles.length} Datei(en)</span>
<span onclick="diagPendingFiles=[];renderDiagPending();" style="color:#FF3B30;font-size:11px;cursor:pointer;margin-left:8px;">Alle X</span>`;
}
function removeDiagPending(idx) {
diagPendingFiles.splice(idx, 1);
renderDiagPending();
}
// ── Cancel ──────────────────────────────
function cancelRequest() {
send({ action: 'cancel_request' });
updateThinkingIndicator({ activity: 'idle' });
addChat('error', 'Anfrage abgebrochen', 'system');
}
// ── Voice config ──────────────────────────
function sendVoiceConfig() {
const defaultVoice = document.getElementById('diag-default-voice').value;
const highlightVoice = document.getElementById('diag-highlight-voice').value;
const ttsEnabled = document.getElementById('diag-tts-enabled').checked;
const speedRamona = parseFloat(document.getElementById('diag-speed-ramona').value);
const speedThorsten = parseFloat(document.getElementById('diag-speed-thorsten').value);
const ttsEngine = document.getElementById('diag-tts-engine').value;
const xttsVoice = document.getElementById('diag-xtts-voice').value;
const whisperModel = document.getElementById('diag-whisper-model').value;
send({ action: 'send_voice_config', defaultVoice, highlightVoice, ttsEnabled, speedRamona, speedThorsten, ttsEngine, xttsVoice, whisperModel });
}
// ── Highlight-Trigger ────────────────────────
function loadHighlightTriggers() {
send({ action: 'get_triggers' });
}
function saveHighlightTriggers() {
const text = document.getElementById('highlight-triggers').value;
const triggers = text.split('\n').map(t => t.trim()).filter(t => t.length > 0);
send({ action: 'save_triggers', triggers });
document.getElementById('trigger-status').textContent = 'Gespeichert (' + triggers.length + ' Trigger)';
document.getElementById('trigger-status').style.color = '#34C759';
}
// When switching to the settings tab: load triggers
const origSwitchMainTab = typeof switchMainTab === 'function' ? switchMainTab : null;
// ── Mode switch ────────────────────────────
let currentMode = 'normal';
const MODE_LABELS = { normal: 'Normal', dnd: 'Nicht stoeren', whisper: 'Fluestern', hangar: 'Hangar', gaming: 'Gaming' };
function setMode(mode) {
currentMode = mode;
// Visual feedback
document.querySelectorAll('.mode-btn').forEach(btn => {
btn.style.borderColor = btn.dataset.mode === mode ? '#0096FF' : 'transparent';
});
document.getElementById('mode-status').textContent = `Aktueller Modus: ${MODE_LABELS[mode] || mode}`;
// Send to the bridge via RVS
sendToRVS(`ARIA, ${MODE_LABELS[mode]}-Modus`, false);
log("info", "server", `Modus gewechselt: ${mode}`);
}
// ── TTS diagnostics ─────────────────────────────
function ttsLog(msg) {
const el = document.getElementById('tts-log');
const time = new Date().toLocaleTimeString('de-DE');
el.innerHTML += `<div>[${time}] ${escapeHtml(msg)}</div>`;
el.scrollTop = el.scrollHeight;
}
function testTTS(voice) {
const text = document.getElementById('tts-test-text').value.trim();
if (!text) return;
ttsLog(`Teste ${voice}: "${text}"...`);
send({ action: 'test_tts', voice, text });
}
function checkTTSStatus() {
ttsLog('Pruefe TTS-Status...');
send({ action: 'check_tts' });
}
function openLightbox(mediaType, url) {
const lb = document.getElementById('lightbox');
if (mediaType === 'video') {
@@ -1164,7 +1721,8 @@
+ `<td style="padding:4px 6px;color:#8888AA;font-size:10px;">${date}</td>`
+ `<td style="padding:4px 6px;white-space:nowrap;">`
+ (isActive ? '' : `<button class="btn secondary" onclick="event.stopPropagation();activateSession('${escapeHtml(s.sessionKey)}')" style="padding:2px 6px;font-size:10px;color:#34C759;margin-right:2px;" title="Aktivieren">&#9654;</button>`)
+ `<button class="btn secondary" onclick="event.stopPropagation();deleteSession('${escapeHtml(s.path)}')" style="padding:2px 6px;font-size:10px;color:#FF6B6B;" title="Loeschen">X</button>`
+ `<button class="btn secondary" onclick="event.stopPropagation();deleteSession('${escapeHtml(s.path)}')" style="padding:2px 6px;font-size:10px;color:#FF6B6B;margin-right:2px;" title="Loeschen">X</button>`
+ `<button class="btn secondary" onclick="event.stopPropagation();exportSession('${escapeHtml(s.path)}','${escapeHtml(s.sessionKey)}')" style="padding:2px 6px;font-size:10px;color:#8888AA;" title="Als Markdown exportieren">&#x2B07;</button>`
+ `</td></tr>`;
}
html += '</table>';
@@ -1228,6 +1786,10 @@
send({ action: 'delete_session', sessionPath: path });
}
function exportSession(path, sessionKey) {
send({ action: 'export_session', sessionPath: path, sessionKey });
}
function activateSession(sessionKey) {
send({ action: 'set_active_session', sessionKey });
}
@@ -1328,6 +1890,11 @@
document.querySelectorAll('.main-nav-btn').forEach(b => {
if (b.textContent.trim().toLowerCase().includes(tab === 'main' ? 'main' : 'einstellung')) b.classList.add('active');
});
// Settings: load config + triggers
if (tab === 'settings') {
loadHighlightTriggers();
send({ action: 'get_voice_config' });
}
}
// ── Settings: tool permissions ──────────────────
+507 -54
@@ -37,15 +37,41 @@ const state = {
};
const SESSION_KEY_FILE = "/data/active-session";
// Ensure the /data directory exists (volume mount)
try { fs.mkdirSync("/data", { recursive: true }); } catch {}
try { fs.mkdirSync("/data", { recursive: true }); } catch (e) {
console.error(`[startup] /data mkdir fehlgeschlagen: ${e.message}`);
}
// sessionFromFile indicates whether the active key was loaded from the file.
// If true, resolveActiveSession must NOT auto-pick anymore (respect the user's choice).
let sessionFromFile = false;
let activeSessionKey = (() => {
try {
const saved = fs.readFileSync(SESSION_KEY_FILE, "utf-8").trim();
if (saved) { console.log(`[startup] Gespeicherte Session geladen: '${saved}'`); return saved; }
} catch {}
if (saved) {
console.log(`[startup] Gespeicherte Session geladen: '${saved}'`);
sessionFromFile = true;
return saved;
}
} catch (e) {
console.error(`[startup] SESSION_KEY_FILE read: ${e.code || e.message}`);
}
console.log("[startup] Keine gespeicherte Session — Fallback 'main'");
return "main";
})();
// Atomic write: temp file + rename, with loud logs on failure.
function persistActiveSession(key) {
try {
const tmp = SESSION_KEY_FILE + ".tmp";
fs.writeFileSync(tmp, key);
fs.renameSync(tmp, SESSION_KEY_FILE);
sessionFromFile = true;
console.log(`[session] Aktive Session persistiert: '${key}'`);
return true;
} catch (e) {
console.error(`[session] FEHLER beim Persistieren von '${key}': ${e.message}`);
return false;
}
}
const logs = [];
let gatewayWs = null;
let rvsWs = null;
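The temp-file-plus-rename pattern in `persistActiveSession` means a reader never observes a half-written key file, because the rename replaces the target in one step when both paths are on the same filesystem. A minimal sketch against a throwaway temp directory follows; `writeAtomic` is a hypothetical helper and the paths are illustrative, not the `/data/active-session` file itself.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: write data to target via a sibling temp file.
function writeAtomic(target, data) {
  const tmp = target + ".tmp";
  fs.writeFileSync(tmp, data);
  // The rename replaces the target in a single step on the same filesystem.
  fs.renameSync(tmp, target);
}

// Illustrative usage in a scratch directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "session-"));
const keyFile = path.join(dir, "active-session");
writeAtomic(keyFile, "main");
writeAtomic(keyFile, "session-42");
const stored = fs.readFileSync(keyFile, "utf-8").trim();
console.log(stored);
```

The temp file sits next to the target on purpose: a rename across filesystems would fail or fall back to a copy, which loses the guarantee.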
@@ -74,8 +100,8 @@ function pipelineStart(method, text) {
pipelineStartTime = Date.now();
if (pipelineTimeout) clearTimeout(pipelineTimeout);
pipelineTimeout = setTimeout(() => {
if (pipelineActive) pipelineEnd(false, "Timeout — keine Antwort nach 60s");
}, 60000);
if (pipelineActive) pipelineEnd(false, "Timeout — keine Antwort nach 10min");
}, 600000);
plog(`━━━ Pipeline Start: ${method} ━━━`);
plog(`Nachricht: "${text}"`);
}
@@ -91,6 +117,9 @@ function pipelineEnd(ok, detail) {
}
plog(`━━━ Pipeline Ende ━━━`);
pipelineActive = false;
// ALWAYS reset the thinking indicator, including on timeout, error, or cancel
broadcast({ type: "agent_activity", activity: "idle" });
pendingMessageTime = 0;
}
// ── Auto-restart on network namespace loss ──────
@@ -257,8 +286,10 @@ async function connectGateway() {
state.gateway.handshakeOk = false;
gatewayWs = null;
broadcastState();
// Avoid a stuck "ARIA denkt..." indicator if the gateway dies mid-pipeline
if (pipelineActive) pipelineEnd(false, `Gateway-Verbindung verloren (${code})`);
else broadcast({ type: "agent_activity", activity: "idle" });
checkGatewayHealth();
// Auto-reconnect after 5s
setTimeout(connectGateway, 5000);
});
@@ -319,10 +350,24 @@ function handleGatewayMessage(msg) {
if (event === "agent") {
const data = payload.data || {};
const delta = data.delta || "";
if (delta && payload.stream === "assistant") {
const stream = payload.stream || "";
if (delta && stream === "assistant") {
broadcast({ type: "chat_delta", delta, payload });
}
// agent Events nicht einzeln loggen (zu viele)
// Detect tool usage and broadcast it
if (stream === "tool_use" || data.type === "tool_use") {
const toolName = data.name || data.tool || payload.tool || "";
if (toolName) {
broadcast({ type: "agent_activity", activity: "tool", tool: toolName, data });
log("info", "gateway", `Tool: ${toolName}`);
}
}
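// General activity heartbeat (ARIA is thinking). Skipped for tool events so it does not overwrite the tool label broadcast above.
if (stream !== "tool_use" && data.type !== "tool_use") {
broadcast({ type: "agent_activity", activity: stream || "thinking" });
}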
updateAgentActivity();
return;
}
@@ -338,6 +383,14 @@ function handleGatewayMessage(msg) {
log("info", "gateway", `ANTWORT: "${text.slice(0, 200)}"`);
if (pipelineActive) pipelineEnd(true, `"${text.slice(0, 120)}"`);
broadcast({ type: "chat_final", text, payload });
broadcast({ type: "agent_activity", activity: "idle" });
pendingMessageTime = 0; // Watchdog: response received
updateAgentActivity();
// Write the response to the backup log
try {
const entry = JSON.stringify({ ts: Date.now(), role: "assistant", text: text.slice(0, 2000), session: activeSessionKey }) + "\n";
fs.appendFileSync("/shared/config/chat_backup.jsonl", entry);
} catch {}
return;
}
@@ -350,6 +403,7 @@ function handleGatewayMessage(msg) {
const error = payload.error || text || "Unbekannt";
log("error", "gateway", `Chat-Fehler: ${error}`);
if (pipelineActive) pipelineEnd(false, error);
else broadcast({ type: "agent_activity", activity: "idle" });
broadcast({ type: "chat_error", error, payload });
return;
}
@@ -371,6 +425,7 @@ function handleGatewayMessage(msg) {
const text = extractChatText(payload) || payload.text || "";
log("info", "gateway", `ANTWORT: "${text.slice(0, 200)}"`);
if (pipelineActive) pipelineEnd(true, `"${text.slice(0, 120)}"`);
else broadcast({ type: "agent_activity", activity: "idle" });
broadcast({ type: "chat_final", text, payload });
return;
}
@@ -378,6 +433,7 @@ function handleGatewayMessage(msg) {
const error = payload.error || payload.message || "Unbekannt";
log("error", "gateway", `Chat-Fehler: ${error}`);
if (pipelineActive) pipelineEnd(false, error);
else broadcast({ type: "agent_activity", activity: "idle" });
broadcast({ type: "chat_error", error, payload });
return;
}
@@ -410,17 +466,17 @@ function sendToGateway(text, isPipeline) {
const payload = JSON.stringify(msg);
log("debug", "gateway", `RAW >>> ${payload}`);
gatewayWs.send(payload);
pendingMessageTime = Date.now(); // Watchdog: message sent
// Write the message to the backup log immediately (OpenClaw only saves after the run ends)
try {
fs.mkdirSync("/shared/config", { recursive: true });
const entry = JSON.stringify({ ts: Date.now(), role: "user", text, session: activeSessionKey }) + "\n";
fs.appendFileSync("/shared/config/chat_backup.jsonl", entry);
} catch {}
log("info", "gateway", `chat.send [${reqId}]: "${text}"`);
if (isPipeline) plog(`chat.send [${reqId}] an Gateway gesendet — warte auf ACK...`);
// Nachricht auch an RVS senden damit die App sie sieht
if (rvsWs && rvsWs.readyState === WebSocket.OPEN) {
rvsWs.send(JSON.stringify({
type: "chat",
payload: { text, sender: "diagnostic" },
timestamp: Date.now(),
}));
}
// Do NOT forward gateway messages to RVS (otherwise the bridge triggers a duplicate ARIA request)
return true;
}
@@ -434,7 +490,13 @@ function connectRVS(forcePlain) {
return;
}
// TLS-Logik: wss zuerst, bei Fehler Fallback auf ws (wenn erlaubt)
// Close the old connection cleanly
if (rvsWs) {
try { rvsWs.removeAllListeners(); rvsWs.close(); } catch (_) {}
rvsWs = null;
}
// TLS logic: try wss first, fall back to ws on error
const useTls = RVS_TLS === "true" && !forcePlain;
const proto = useTls ? "wss" : "ws";
const url = `${proto}://${RVS_HOST}:${RVS_PORT}?token=${RVS_TOKEN}`;
@@ -443,7 +505,18 @@ function connectRVS(forcePlain) {
broadcastState();
log("info", "rvs", `Verbinde: ${proto}://${RVS_HOST}:${RVS_PORT}`);
const ws = new WebSocket(url);
let ws;
try {
ws = new WebSocket(url);
} catch (err) {
log("error", "rvs", `WebSocket erstellen fehlgeschlagen: ${err.message}`);
if (useTls && RVS_TLS_FALLBACK === "true") {
connectRVS(true);
}
return;
}
let fallbackTriggered = false;
ws.on("open", () => {
log("info", "rvs", `Verbunden (${proto})`);
@@ -451,6 +524,16 @@ function connectRVS(forcePlain) {
state.rvs.lastError = null;
rvsWs = ws;
broadcastState();
// Keepalive: send a ping every 25s so the connection does not die
const keepalive = setInterval(() => {
if (ws.readyState === WebSocket.OPEN) {
try { ws.ping(); } catch (_) {}
} else {
clearInterval(keepalive);
}
}, 25000);
ws._keepalive = keepalive;
});
ws.on("message", (raw) => {
@@ -458,11 +541,24 @@ function connectRVS(forcePlain) {
const msg = JSON.parse(raw.toString());
if (msg.type === "chat" && msg.payload) {
const sender = msg.payload.sender || "?";
// Ignore our own messages (echo)
if (sender === "diagnostic") return;
log("info", "rvs", `Chat von ${sender}: "${(msg.payload.text || "").slice(0, 100)}"`);
if (pipelineActive && sender !== "diagnostic") {
if (pipelineActive) {
pipelineEnd(true, `Antwort via RVS von ${sender}: "${(msg.payload.text || "").slice(0, 120)}"`);
}
broadcast({ type: "rvs_chat", msg });
} else if (msg.type === "file_saved" && msg.payload) {
// Image/file upload from the app: show it in the chat
const name = msg.payload.name || "?";
const serverPath = msg.payload.serverPath || "";
const mimeType = msg.payload.mimeType || "";
log("info", "rvs", `Datei empfangen: ${name} (${serverPath})`);
// Broadcast as a user message with the path (Diagnostic renders images inline)
broadcast({ type: "rvs_chat", msg: {
type: "chat",
payload: { text: `Anhang: ${name}\n${serverPath}`, sender: "user" }
}});
} else if (msg.type === "heartbeat") {
// ignore
} else {
@@ -473,10 +569,13 @@ function connectRVS(forcePlain) {
ws.on("close", () => {
log("warn", "rvs", "Verbindung geschlossen");
if (ws._keepalive) clearInterval(ws._keepalive);
state.rvs.status = "disconnected";
rvsWs = null;
if (rvsWs === ws) rvsWs = null;
broadcastState();
setTimeout(() => connectRVS(), 5000);
if (!fallbackTriggered) {
setTimeout(() => connectRVS(), 5000);
}
});
ws.on("error", (err) => {
@@ -484,31 +583,71 @@ function connectRVS(forcePlain) {
state.rvs.lastError = err.message;
broadcastState();
// TLS Fallback: wenn wss fehlschlaegt und Fallback erlaubt → ws versuchen
if (useTls && RVS_TLS_FALLBACK === "true") {
// TLS Fallback
if (useTls && RVS_TLS_FALLBACK === "true" && !fallbackTriggered) {
fallbackTriggered = true;
log("warn", "rvs", "TLS fehlgeschlagen — Fallback auf ws://");
ws.removeAllListeners();
try { ws.close(); } catch (_) {}
try { ws.removeAllListeners(); ws.close(); } catch (_) {}
if (rvsWs === ws) rvsWs = null;
connectRVS(true);
}
});
}
function sendToRVS(text, isPipeline) {
if (!rvsWs || rvsWs.readyState !== WebSocket.OPEN) {
log("error", "rvs", "Nicht verbunden");
if (isPipeline) pipelineEnd(false, "RVS nicht verbunden");
return false;
}
function sendToRVS_withResponse(sendType, sendPayload, expectType, clientWs) {
if (!RVS_HOST || !RVS_TOKEN) return;
const proto = RVS_TLS === "true" ? "wss" : "ws";
const url = `${proto}://${RVS_HOST}:${RVS_PORT}?token=${RVS_TOKEN}`;
const freshWs = new WebSocket(url);
const timeout = setTimeout(() => {
try { freshWs.close(); } catch (_) {}
clientWs.send(JSON.stringify({ type: expectType, payload: { voices: [], error: "Timeout" }, timestamp: Date.now() }));
}, 15000);
freshWs.on("open", () => {
freshWs.send(JSON.stringify({ type: sendType, payload: sendPayload, timestamp: Date.now() }));
});
freshWs.on("message", (raw) => {
try {
const resp = JSON.parse(raw.toString());
if (resp.type === expectType) {
clearTimeout(timeout);
clientWs.send(JSON.stringify(resp));
setTimeout(() => { try { freshWs.close(); } catch (_) {} }, 1000);
}
} catch {}
});
freshWs.on("error", () => {});
}
rvsWs.send(JSON.stringify({
function sendToRVS_raw(msgObj) {
if (!RVS_HOST || !RVS_TOKEN) return;
const proto = RVS_TLS === "true" ? "wss" : "ws";
const url = `${proto}://${RVS_HOST}:${RVS_PORT}?token=${RVS_TOKEN}`;
const freshWs = new WebSocket(url);
freshWs.on("open", () => {
freshWs.send(JSON.stringify(msgObj));
setTimeout(() => { try { freshWs.close(); } catch (_) {} }, 5000);
});
freshWs.on("error", () => {});
}
function sendToRVS(text, isPipeline) {
// Send via the gateway (reliable) AND to RVS for visibility in the app.
// The bridge receives RVS messages from the app reliably,
// but the Diagnostic→RVS→Bridge route suffers from zombie connections.
// Therefore: gateway for ARIA, RVS only for display in the app.
// 1. Send to the gateway (so ARIA responds)
const gatewayOk = sendToGateway(text, isPipeline);
// 2. Send to RVS (so the app sees the message)
sendToRVS_raw({
type: "chat",
payload: { text, sender: "diagnostic" },
timestamp: Date.now(),
}));
log("info", "rvs", `Gesendet via RVS: "${text}"`);
if (isPipeline) plog(`Nachricht an RVS gesendet — warte auf Antwort via RVS...`);
return true;
});
return gatewayOk;
}
// ── Claude Proxy Test ────────────────────────────────────
@@ -526,7 +665,7 @@ async function testProxy(prompt) {
const modelsRes = await fetch(healthUrl, {
headers: { "Authorization": "Bearer not-needed" },
signal: AbortSignal.timeout(10000),
signal: AbortSignal.timeout(30000),
});
if (!modelsRes.ok) {
@@ -553,7 +692,7 @@ async function testProxy(prompt) {
}
// Schritt 2: Chat Completion testen (kurzer Prompt)
const testPrompt = prompt || "Antworte mit genau einem Wort: Ping";
const testPrompt = prompt || "Antworte in einem Satz: Wer bist du und funktionierst du?";
log("info", "proxy", `Sende Test-Prompt: "${testPrompt}"`);
const chatRes = await fetch(`${PROXY_URL}/v1/chat/completions`, {
@@ -567,7 +706,7 @@ async function testProxy(prompt) {
messages: [{ role: "user", content: testPrompt }],
max_tokens: 200,
}),
signal: AbortSignal.timeout(30000),
signal: AbortSignal.timeout(120000), // 2 min: a cold start takes time
});
if (!chatRes.ok) {
@@ -932,6 +1071,64 @@ function waitForMessage(ws, timeoutMs) {
});
}
// ── Watchdog: stuck run detection ────────────────────────
let lastAgentActivity = Date.now();
let watchdogWarned = false;
let watchdogFixAttempted = false;
let pendingMessageTime = 0; // Timestamp of the last sent message (0 = nothing pending)
function updateAgentActivity() {
lastAgentActivity = Date.now();
watchdogWarned = false;
}
// Watchdog checks every 30s whether ARIA has responded to a sent message
setInterval(async () => {
if (pendingMessageTime === 0) return; // No message pending
const waitingMs = Date.now() - pendingMessageTime;
// After 2 min without agent activity: warning
if (waitingMs > 120000 && !watchdogWarned) {
watchdogWarned = true;
log("warn", "server", `Watchdog: Keine ARIA-Aktivitaet seit ${Math.round(waitingMs / 1000)}s — moeglicherweise stuck`);
broadcast({ type: "watchdog", status: "warning", waitingMs, message: "ARIA reagiert nicht — moeglicherweise stuck Run" });
}
// After 5 min: doctor --fix
if (waitingMs > 300000 && watchdogWarned && !watchdogFixAttempted) {
watchdogFixAttempted = true;
log("error", "server", "Watchdog: 5min ohne Antwort — fuehre openclaw doctor --fix aus");
broadcast({ type: "watchdog", status: "fixing", message: "Auto-Fix: openclaw doctor --fix" });
try {
await dockerExec("aria-core", "openclaw doctor --fix 2>/dev/null || true");
log("info", "server", "Watchdog: doctor --fix ausgefuehrt");
broadcast({ type: "watchdog", status: "fixed", message: "doctor --fix ausgefuehrt — warte auf Antwort..." });
} catch (err) {
log("error", "server", `Watchdog: doctor --fix fehlgeschlagen: ${err.message}`);
}
}
// After 8 min: restart the containers
if (waitingMs > 480000 && watchdogFixAttempted) {
log("error", "server", "Watchdog: 8min ohne Antwort — starte aria-core + aria-proxy neu");
broadcast({ type: "watchdog", status: "restarting", message: "Container-Restart: aria-core + aria-proxy" });
try {
const { execSync } = require("child_process");
execSync("docker restart aria-core aria-proxy", { timeout: 60000 });
log("info", "server", "Watchdog: Container neugestartet");
broadcast({ type: "watchdog", status: "restarted", message: "Container neugestartet — warte auf Gateway-Reconnect..." });
// The gateway reconnects automatically
} catch (err) {
log("error", "server", `Watchdog: Container-Restart fehlgeschlagen: ${err.message}`);
broadcast({ type: "watchdog", status: "error", message: `Restart fehlgeschlagen: ${err.message}` });
}
pendingMessageTime = 0;
watchdogWarned = false;
watchdogFixAttempted = false;
}
}, 30000);
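The watchdog escalates in three stages: a warning after 2 min, `openclaw doctor --fix` after 5 min, and a container restart after 8 min. The ladder can be sketched as a pure function that is easy to test without Docker. This is an illustration under assumptions, not the shipped code: `watchdogStep` and the `flags` object are hypothetical names, and unlike the interval callback above it returns at most one action per tick.

```javascript
// Thresholds taken from the watchdog (milliseconds).
const WARN_MS = 120000;
const FIX_MS = 300000;
const RESTART_MS = 480000;

// Hypothetical pure helper: returns the next action and the updated flags.
function watchdogStep(flags, waitingMs) {
  if (waitingMs > RESTART_MS && flags.fixAttempted) {
    // A restart resets the whole ladder.
    return { action: "restart", flags: { warned: false, fixAttempted: false } };
  }
  if (waitingMs > FIX_MS && flags.warned && !flags.fixAttempted) {
    return { action: "fix", flags: { ...flags, fixAttempted: true } };
  }
  if (waitingMs > WARN_MS && !flags.warned) {
    return { action: "warn", flags: { ...flags, warned: true } };
  }
  return { action: "none", flags };
}

// Simulate 30s ticks while no response arrives.
let flags = { warned: false, fixAttempted: false };
const actions = [];
for (let t = 30000; t <= 510000; t += 30000) {
  const step = watchdogStep(flags, t);
  flags = step.flags;
  if (step.action !== "none") actions.push(step.action);
}
console.log(actions);
```

Keeping the decision separate from the side effects (logging, `dockerExec`, `docker restart`) would let the thresholds be verified in isolation.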
// ── HTTP server + WebSocket for the browser ────────────────
const htmlPath = path.join(__dirname, "index.html");
@@ -946,6 +1143,16 @@ const server = http.createServer((req, res) => {
} else if (req.url === "/api/session") {
res.writeHead(200, { "Content-Type": "application/json" });
res.end(JSON.stringify({ sessionKey: activeSessionKey }));
} else if (req.url === "/api/cancel" && req.method === "POST") {
log("warn", "server", "HTTP /api/cancel — Cancel-Request (von Bridge)");
pendingMessageTime = 0;
watchdogWarned = false;
watchdogFixAttempted = false;
if (pipelineActive) pipelineEnd(false, "Vom Benutzer abgebrochen (App)");
else broadcast({ type: "agent_activity", activity: "idle" });
dockerExec("aria-core", "openclaw doctor --fix 2>/dev/null || true").catch(() => {});
res.writeHead(200, { "Content-Type": "application/json" });
res.end(JSON.stringify({ ok: true }));
} else if (req.url.startsWith("/shared/")) {
// Serve files from the shared volume (images, uploads)
const filePath = decodeURIComponent(req.url);
@@ -1018,6 +1225,62 @@ wss.on("connection", (ws) => {
if (ws._sshSock) ws._sshSock.write(msg.data);
} else if (msg.action === "live_ssh_close") {
if (ws._sshSock) { ws._sshSock.end(); ws._sshSock = null; }
} else if (msg.action === "send_file") {
// Send a file from Diagnostic to the bridge via RVS
sendToRVS_raw({
type: "file",
payload: { name: msg.name, type: msg.type, size: msg.size, base64: msg.base64 },
timestamp: Date.now(),
});
log("info", "server", `Datei gesendet: ${msg.name} (${msg.type})`);
} else if (msg.action === "cancel_request") {
// Cancel the running request: doctor --fix terminates stuck runs
log("warn", "server", "Anfrage abgebrochen — fuehre doctor --fix aus");
pendingMessageTime = 0;
watchdogWarned = false;
watchdogFixAttempted = false;
if (pipelineActive) pipelineEnd(false, "Vom Benutzer abgebrochen");
broadcast({ type: "agent_activity", activity: "idle" });
dockerExec("aria-core", "openclaw doctor --fix 2>/dev/null || true").catch(() => {});
} else if (msg.action === "voice_upload") {
// Forward voice samples to the XTTS bridge via RVS and wait for confirmation
log("info", "server", `Voice-Upload '${msg.name}' (${(msg.samples || []).length} Samples) sende an RVS...`);
sendToRVS_withResponse("voice_upload", { name: msg.name, samples: msg.samples }, "xtts_voice_saved", ws);
} else if (msg.action === "xtts_list_voices") {
// Fresh connection that waits for the response
sendToRVS_withResponse("xtts_list_voices", {}, "xtts_voices_list", ws);
} else if (msg.action === "get_voice_config") {
handleGetVoiceConfig(ws);
} else if (msg.action === "send_voice_config") {
// Persist the voice config and send it to the bridge via RVS
// Read the existing config first so fields not set by this call are merged in
let existing = {};
try { existing = JSON.parse(fs.readFileSync("/shared/config/voice_config.json", "utf-8")); } catch {}
const voiceConfig = {
...existing,
defaultVoice: msg.defaultVoice || "ramona",
highlightVoice: msg.highlightVoice || "thorsten",
ttsEnabled: msg.ttsEnabled !== false,
ttsEngine: msg.ttsEngine || "piper",
xttsVoice: msg.xttsVoice || "",
speedRamona: msg.speedRamona || 1.0,
speedThorsten: msg.speedThorsten || 1.0,
};
if (msg.whisperModel !== undefined) voiceConfig.whisperModel = msg.whisperModel;
try {
fs.mkdirSync("/shared/config", { recursive: true });
fs.writeFileSync("/shared/config/voice_config.json", JSON.stringify(voiceConfig, null, 2));
} catch {}
sendToRVS_raw({ type: "config", payload: voiceConfig, timestamp: Date.now() });
log("info", "server", `Voice-Config gespeichert+gesendet: default=${voiceConfig.defaultVoice}, whisper=${voiceConfig.whisperModel || "-"}`);
} else if (msg.action === "get_triggers") {
handleGetTriggers(ws);
} else if (msg.action === "save_triggers") {
handleSaveTriggers(ws, msg.triggers || []);
} else if (msg.action === "test_tts") {
handleTestTTS(ws, msg.voice || "ramona", msg.text || "Test");
} else if (msg.action === "check_tts") {
handleCheckTTS(ws);
} else if (msg.action === "check_desktop") {
checkDesktopAvailable(ws);
} else if (msg.action === "load_chat_history") {
@@ -1026,6 +1289,8 @@ wss.on("connection", (ws) => {
handleListSessions(ws);
} else if (msg.action === "read_session") {
handleReadSession(ws, msg.sessionPath);
} else if (msg.action === "export_session") {
handleExportSession(ws, msg.sessionPath, msg.sessionKey);
} else if (msg.action === "delete_session") {
handleDeleteSession(ws, msg.sessionPath);
} else if (msg.action === "set_active_session") {
@@ -1144,6 +1409,123 @@ function startLiveSSH(clientWs) {
createReq.end(createBody);
}
// ── Load voice config ────────────────────────────────
function handleGetVoiceConfig(clientWs) {
try {
const configPath = "/shared/config/voice_config.json";
if (fs.existsSync(configPath)) {
const config = JSON.parse(fs.readFileSync(configPath, "utf-8"));
clientWs.send(JSON.stringify({ type: "voice_config", ...config }));
} else {
clientWs.send(JSON.stringify({ type: "voice_config", defaultVoice: "ramona", highlightVoice: "thorsten", ttsEnabled: true }));
}
} catch (err) {
clientWs.send(JSON.stringify({ type: "voice_config", defaultVoice: "ramona", highlightVoice: "thorsten", ttsEnabled: true }));
}
}
// ── Highlight-Trigger ─────────────────────────────────
const TRIGGERS_FILE = "/shared/config/highlight_triggers.json";
async function handleGetTriggers(clientWs) {
try {
// Read from the shared volume first, then fall back to the bridge defaults
let triggers;
if (fs.existsSync(TRIGGERS_FILE)) {
triggers = JSON.parse(fs.readFileSync(TRIGGERS_FILE, "utf-8"));
} else {
// Read the defaults from the bridge
const result = await dockerExec("aria-bridge", `python3 -c "
import sys; sys.path.insert(0,'/app')
from aria_bridge import EPIC_TRIGGERS
print('\\n'.join(EPIC_TRIGGERS))
"`);
triggers = result.trim().split("\n").filter(t => t);
}
clientWs.send(JSON.stringify({ type: "trigger_list", triggers }));
} catch (err) {
clientWs.send(JSON.stringify({ type: "trigger_list", triggers: [], error: err.message }));
}
}
async function handleSaveTriggers(clientWs, triggers) {
try {
// Save to the shared volume (readable by the bridge)
fs.mkdirSync("/shared/config", { recursive: true });
fs.writeFileSync(TRIGGERS_FILE, JSON.stringify(triggers, null, 2));
log("info", "server", `${triggers.length} Highlight-Trigger gespeichert`);
// Inform the bridge (triggers are loaded on its next start)
clientWs.send(JSON.stringify({ type: "trigger_list", triggers }));
} catch (err) {
log("error", "server", `Trigger speichern fehlgeschlagen: ${err.message}`);
}
}
// ── TTS diagnostics ──────────────────────────────────────
async function handleTestTTS(clientWs, voice, text) {
try {
log("info", "server", `TTS-Test: ${voice} — "${text}"`);
const result = await dockerExec("aria-bridge", `python3 -c "
import time, sys
sys.path.insert(0, '/app')
from piper import PiperVoice
import wave, tempfile, os, base64
voices = {'ramona': '/voices/de_DE-ramona-low.onnx', 'thorsten': '/voices/de_DE-thorsten-high.onnx'}
path = voices.get('${String(voice).replace(/[^A-Za-z0-9_-]/g, "")}')
if not path or not os.path.exists(path):
    print('FEHLER: Stimme nicht gefunden')
    sys.exit(1)
v = PiperVoice.load(path)
start = time.time()
tmp = tempfile.NamedTemporaryFile(suffix='.wav', delete=False)
with wave.open(tmp.name, 'wb') as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(v.config.sample_rate)
    # Text arrives base64-encoded so quotes, backslashes and shell metacharacters cannot break out of the script
    v.synthesize(base64.b64decode('${Buffer.from(String(text), "utf-8").toString("base64")}').decode('utf-8'), wf)
size = os.path.getsize(tmp.name)
dur = int((time.time() - start) * 1000)
os.unlink(tmp.name)
print(f'OK:{dur}:{size}')
"`);
const parts = result.trim().split(":");
if (parts[0] === "OK") {
clientWs.send(JSON.stringify({ type: "tts_result", ok: true, voice, duration: parts[1], size: parts[2] }));
} else {
clientWs.send(JSON.stringify({ type: "tts_result", ok: false, voice, error: result.trim() }));
}
} catch (err) {
clientWs.send(JSON.stringify({ type: "tts_result", ok: false, voice, error: err.message }));
}
}
async function handleCheckTTS(clientWs) {
try {
const result = await dockerExec("aria-bridge", `python3 -c "
import os, json
voices = {}
for name, path in [('ramona', '/voices/de_DE-ramona-low.onnx'), ('thorsten', '/voices/de_DE-thorsten-high.onnx')]:
    voices[name] = os.path.exists(path)
print(json.dumps(voices))
"`);
const voices = JSON.parse(result.trim());
const available = Object.entries(voices).filter(([,v]) => v).map(([k]) => k);
const missing = Object.entries(voices).filter(([,v]) => !v).map(([k]) => k);
clientWs.send(JSON.stringify({
type: "tts_status",
ok: missing.length === 0,
voices: available,
defaultVoice: "ramona",
highlightVoice: "thorsten",
error: missing.length > 0 ? `Fehlend: ${missing.join(", ")}` : null,
}));
} catch (err) {
clientWs.send(JSON.stringify({ type: "tts_status", ok: false, error: err.message }));
}
}
function checkDesktopAvailable(clientWs) {
// Check whether VNC is running on the VM (port 5900/5901)
const checkSock = net.connect({ host: "host.docker.internal", port: 5901 }, () => {
@@ -1291,6 +1673,68 @@ async function handleReadSession(clientWs, sessionPath) {
}
}
async function handleExportSession(clientWs, sessionPath, sessionKey) {
if (!sessionPath || sessionPath.includes("..") || !sessionPath.startsWith(SESSIONS_DIR)) {
clientWs.send(JSON.stringify({ type: "session_export", ok: false, error: "Ungueltiger Pfad" }));
return;
}
try {
const safePath = sessionPath.replace(/'/g, "");
const raw = await dockerExec("aria-core", `cat '${safePath}'`);
const lines = raw.split("\n").filter(l => l.trim());
const blocks = [];
for (const line of lines) {
let obj;
try { obj = JSON.parse(line); } catch { continue; }
if (obj.type !== "message" || !obj.message) continue;
const role = obj.message.role;
if (role !== "user" && role !== "assistant") continue;
let text = "";
const content = obj.message.content;
if (typeof content === "string") text = content;
else if (Array.isArray(content)) text = content.filter(c => c.type === "text").map(c => c.text || "").join("\n");
if (!text) continue;
if (role === "user") {
text = text.replace(/^Sender \(untrusted metadata\):[\s\S]*?```[\s\S]*?```\s*\n*/m, "").trim();
text = text.replace(/^\[.*?\]\s*/, "").trim();
} else {
text = text.replace(/^\[\[reply_to_\w+\]\]\s*/g, "").trim();
}
if (!text) continue;
const ts = obj.message.timestamp || obj.timestamp || 0;
const when = ts ? new Date(ts).toISOString().replace("T", " ").slice(0, 19) : "";
const heading = role === "user" ? "## 🧑 User" : "## 🤖 ARIA";
blocks.push(`${heading}${when ? ` (${when})` : ""}\n\n${text}`);
}
const exportedAt = new Date().toISOString().replace("T", " ").slice(0, 19);
const title = sessionKey || sessionPath.split("/").pop().replace(".jsonl", "");
const markdown = [
`# Session: ${title}`,
``,
`Exportiert: ${exportedAt} `,
`Quelle: ${sessionPath}`,
``,
`---`,
``,
blocks.join("\n\n---\n\n"),
``,
].join("\n");
const safeKey = (sessionKey || "session").replace(/[^a-zA-Z0-9_-]/g, "_");
const filename = `${exportedAt.slice(0, 10)}_${safeKey}.md`;
clientWs.send(JSON.stringify({ type: "session_export", ok: true, filename, markdown }));
log("info", "server", `Session exportiert: ${filename} (${blocks.length} Nachrichten)`);
} catch (err) {
log("error", "server", `Session-Export fehlgeschlagen: ${err.message}`);
clientWs.send(JSON.stringify({ type: "session_export", ok: false, error: err.message }));
}
}
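The content handling in `handleExportSession` accepts either a plain string or an array of typed parts. A minimal sketch of that step in isolation, assuming the JSONL shape the handler reads; the helper name `extractText` and the sample lines are made up for illustration and are not part of the diff:

```javascript
// Hypothetical helper: the content-normalisation step of handleExportSession in isolation.
function extractText(content) {
  if (typeof content === "string") return content;
  if (Array.isArray(content)) {
    // Keep only parts of type "text" and join them with newlines.
    return content.filter(c => c.type === "text").map(c => c.text || "").join("\n");
  }
  return "";
}

// Made-up JSONL lines in the shape the handler expects.
const lines = [
  JSON.stringify({ type: "message", message: { role: "user", content: "Hi" } }),
  JSON.stringify({ type: "message", message: { role: "assistant", content: [
    { type: "text", text: "Hello" },
    { type: "tool_use", name: "x" },
    { type: "text", text: "there" },
  ] } }),
  "not json",
  JSON.stringify({ type: "summary" }),
];

const texts = [];
for (const line of lines) {
  let obj;
  try { obj = JSON.parse(line); } catch { continue; } // skip malformed lines
  if (obj.type !== "message" || !obj.message) continue; // skip non-message records
  texts.push(extractText(obj.message.content));
}
console.log(texts);
```

Malformed lines and non-message records are skipped silently, so a partially written session file still exports whatever can be parsed.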
async function handleDeleteSession(clientWs, sessionPath) {
if (!sessionPath || sessionPath.includes("..") || !sessionPath.startsWith(SESSIONS_DIR)) {
clientWs.send(JSON.stringify({ type: "session_deleted", ok: false, error: "Ungueltiger Pfad" }));
@@ -1331,13 +1775,11 @@ async function handleDeleteSession(clientWs, sessionPath) {
}
// ── Session resolution: find the last active session ────
// Called after every gateway (re)connect. Must NEVER overwrite the explicitly
// chosen session; auto-pick only on the very first start.
async function resolveActiveSession() {
// Only auto-resolve for the fallback key "main"; respect a saved choice
const hasSavedSession = (() => {
try { return !!fs.readFileSync(SESSION_KEY_FILE, "utf-8").trim(); } catch { return false; }
})();
if (hasSavedSession && activeSessionKey !== "main") {
log("info", "server", `Gespeicherte Session '${activeSessionKey}' wird beibehalten`);
if (sessionFromFile) {
log("info", "server", `Session '${activeSessionKey}' aus /data — keine Auto-Wahl`);
return;
}
@@ -1356,10 +1798,19 @@ async function resolveActiveSession() {
const keys = entries.map(e => (e.key || e.sessionKey || e.name || "?").replace(/^agent:main:/, ""));
log("info", "server", `Verfuegbare Sessions: [${keys.join(", ")}]`);
// Take the newest session
// Take the newest session, but prefer user-defined ones.
// aria-bridge / aria-diagnostic are auto-created by the services;
// on first start a "real" session should be chosen instead,
// if one exists.
const AUTO_KEYS = new Set(["aria-bridge", "aria-diagnostic"]);
const normalise = (e) => (e.key || e.sessionKey || e.name || "").replace(/^agent:main:/, "");
const userEntries = entries.filter(e => !AUTO_KEYS.has(normalise(e)));
const pool = userEntries.length > 0 ? userEntries : entries;
let newest = null;
let newestTime = 0;
for (const entry of entries) {
for (const entry of pool) {
const t = entry.updatedAt || entry.createdAt || 0;
if (t >= newestTime) {
newestTime = t;
@@ -1368,12 +1819,11 @@ async function resolveActiveSession() {
}
if (newest) {
const rawKey = newest.key || newest.sessionKey || newest.name || "";
const key = rawKey.replace(/^agent:main:/, "");
const key = normalise(newest);
if (key) {
activeSessionKey = key;
try { fs.writeFileSync(SESSION_KEY_FILE, activeSessionKey); } catch {}
log("info", "server", `Aktive Session auf neueste gewechselt: '${activeSessionKey}'`);
persistActiveSession(activeSessionKey);
log("info", "server", `Auto-Wahl Erststart: '${activeSessionKey}'`);
for (const c of browserClients) {
c.send(JSON.stringify({ type: "active_session", sessionKey: activeSessionKey }));
}
@@ -1462,8 +1912,11 @@ function handleSetActiveSession(clientWs, sessionKey) {
return;
}
activeSessionKey = sessionKey;
try { fs.writeFileSync(SESSION_KEY_FILE, activeSessionKey); } catch {}
log("info", "server", `Aktive Session: ${activeSessionKey}`);
const ok = persistActiveSession(activeSessionKey);
log("info", "server", `Aktive Session: ${activeSessionKey}${ok ? "" : " (WARN: nicht persistiert!)"}`);
if (!ok) {
clientWs.send(JSON.stringify({ type: "active_session", ok: false, sessionKey: activeSessionKey, error: "Persistierung fehlgeschlagen — /data Volume pruefen" }));
}
// Notify all clients
for (const c of browserClients) {
c.send(JSON.stringify({ type: "active_session", sessionKey: activeSessionKey }));
@@ -1479,7 +1932,7 @@ async function handleCreateSession(clientWs, sessionName) {
try {
// The session is created automatically when the first message is sent
activeSessionKey = sessionName;
try { fs.writeFileSync(SESSION_KEY_FILE, activeSessionKey); } catch {}
persistActiveSession(activeSessionKey);
log("info", "server", `Neue Session erstellt und aktiviert: ${sessionName}`);
// Notify all clients
for (const c of browserClients) {
+3 -2
@@ -18,7 +18,8 @@ services:
claude-max-api"
volumes:
- ~/.claude:/root/.claude # Claude CLI Auth (Credentials in /root/.claude/.credentials.json)
- ./aria-data/ssh:/root/.ssh:ro # SSH keys for VM access (aria-wohnung)
- ./aria-data/ssh:/root/.ssh # SSH keys for VM access (aria-wohnung, rw for ARIA)
- aria-shared:/shared # Shared volume for file exchange (uploads from the app)
environment:
- HOST=0.0.0.0
- SHELL=/bin/bash # The Claude Code Bash tool needs bash (not just sh/ash)
@@ -99,7 +100,7 @@ services:
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
- ./aria-data/config/diag-state:/data # Persistent state (active session etc.)
- aria-shared:/shared:ro # Shared volume (read uploads for preview)
- aria-shared:/shared # Shared volume (uploads + config)
environment:
- ARIA_AUTH_TOKEN=${ARIA_AUTH_TOKEN:-}
- PROXY_URL=http://proxy:3456
+60 -6
@@ -1,6 +1,60 @@
Image upload does not work yet.
Voice messages are not shown as a second message, so you cannot see what you sent.
Clear cache: images are not reloaded when tapped.
Autoload does not work.
Tapping the ear icon to listen crashes the app.
ARIA does not read messages aloud.
# ARIA Issues & Features
## Done
- [x] Image upload works (shared volume /shared/uploads/)
- [x] Voice messages are shown as text (STT → chat bubble)
- [x] Clear cache + auto-download of attachments
- [x] ARIA reads messages aloud (TTS via Piper)
- [x] Auto-scroll to the latest message (inverted FlatList)
- [x] Larger images in chat + full-screen preview
- [x] Ear button → conversation mode (auto-recording after ARIA's reply)
- [x] Play button in ARIA messages for voice playback
- [x] Chat search in the app (magnifier in the status bar)
- [x] Watchdog with container restart (2 min warning → 5 min doctor --fix → 8 min restart)
- [x] Cancel button in Diagnostic chat
- [x] On-the-fly message backup (/shared/config/chat_backup.jsonl)
- [x] Split long messages sentence by sentence for TTS
- [x] RVS messages from the smartphone get through
- [x] Voice settings (Ramona/Thorsten, speed per voice)
- [x] Highlight triggers configurable in Diagnostic
- [x] XTTS v2 integration (gaming PC, GPU, voice cloning)
- [x] XTTS voice cloning (upload audio samples, custom voice)
- [x] TTS engine selectable (Piper/XTTS) in Diagnostic + app
- [x] Auto-update system (APK via RVS WebSocket)
- [x] Auto-update: APK installation via FileProvider
- [x] Auto-update: "Check for updates" button in app settings
- [x] Audio queue (sequential playback, no overlap)
- [x] Text messages are answered by ARIA (bridge chat handler fix)
- [x] Multiple attachments + text before sending (pending preview)
- [x] Paste support for images in Diagnostic chat
- [x] Markdown cleanup for TTS (bold, italic, code, links, etc.)
- [x] SSH volume read-write for proxy (no more -F workaround)
- [x] Diagnostic: export sessions as Markdown (download button)
- [x] Speech gate: recording is discarded when no speech is detected (keeps ambient noise from reaching Whisper)
- [x] Session persistence: the chosen session survives container restarts (sessionFromFile flag, atomic write)
- [x] Diagnostic: the "ARIA denkt..." indicator no longer gets stuck (pipelineEnd always broadcasts idle, also on timeout/error/disconnect)
- [x] App: "ARIA denkt..." indicator + cancel button (bridge mirrors agent_activity via RVS)
- [x] Whisper STT: model selection in Diagnostic (tiny/base/small/medium/large-v3), hot reload in bridge, default set to medium
- [x] App: audio recording explicitly 16 kHz mono (saves resampling, optimal for Whisper)
## Open
### Bugs (priority)
- [ ] App: audio output occasionally just stops (mid-sentence or between chunks)
### App Features
- [ ] On-device wake word (Porcupine "ARIA" keyword, phase 2: passive listening)
- [ ] Load chat history more reliably (AsyncStorage race condition)
- [ ] Background audio service (TTS even when the app is minimized)
### TTS / Audio
- [ ] XTTS audio streaming (PCM stream instead of WAV files, eliminates stuttering completely)
- [ ] Audio normalization (even out volume between chunks)
- [ ] Piper voices download via Diagnostic (new languages/voices)
### Architecture
- [ ] Images: use Claude Vision directly (currently only the file path is passed to ARIA)
- [ ] Auto-compacting and memory/brain management (SQLite?)
- [ ] Diagnostic: system info tab (container status, disk, RAM, CPU)
- [ ] Resolve RVS zombie connections for good
+32 -8
@@ -58,24 +58,29 @@ echo -e "${GREEN}[1/5] Versionsnummern auf $VERSION setzen...${NC}"
sed -i "s/\"version\": \"[^\"]*\"/\"version\": \"$VERSION\"/" android/package.json
echo -e " ${GREEN}${NC} package.json → $VERSION"
# build.gradle: versionName + versionCode (computed from Major.Minor.Patch)
MAJOR=$(echo "$VERSION" | cut -d. -f1)
MINOR=$(echo "$VERSION" | cut -d. -f2)
PATCH=$(echo "$VERSION" | cut -d. -f3)
VERSION_CODE=$((MAJOR * 10000 + MINOR * 100 + PATCH))
# build.gradle: versionName + versionCode (computed from the version)
# Supports 3-part (1.2.3) and 4-part (0.0.1.7) versions
IFS='.' read -ra VER_PARTS <<< "$VERSION"
V1=${VER_PARTS[0]:-0}; V2=${VER_PARTS[1]:-0}; V3=${VER_PARTS[2]:-0}; V4=${VER_PARTS[3]:-0}
VERSION_CODE=$((V1 * 1000000 + V2 * 10000 + V3 * 100 + V4))
# At least 1 (Android requires versionCode >= 1)
[ "$VERSION_CODE" -lt 1 ] && VERSION_CODE=1
sed -i "s/versionName \"[^\"]*\"/versionName \"$VERSION\"/" android/android/app/build.gradle
sed -i "s/versionCode [0-9]*/versionCode $VERSION_CODE/" android/android/app/build.gradle
echo -e " ${GREEN}${NC} build.gradle → versionName $VERSION, versionCode $VERSION_CODE"
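The versionCode arithmetic can be checked outside the shell script. A minimal JavaScript sketch, assuming the same weights as the new `VERSION_CODE` line; the function name `versionCode` is hypothetical and not part of the release script:

```javascript
// Hypothetical helper mirroring the VERSION_CODE arithmetic of the release script.
// Missing parts count as 0, so 3-part and 4-part versions both work.
function versionCode(version) {
  const [v1 = 0, v2 = 0, v3 = 0, v4 = 0] = version.split(".").map(Number);
  const code = v1 * 1000000 + v2 * 10000 + v3 * 100 + v4;
  // Android requires versionCode >= 1
  return Math.max(code, 1);
}

console.log(versionCode("0.0.3.8")); // 308
console.log(versionCode("1.2.3"));   // 1020300
console.log(versionCode("0.0.0.0")); // 1
```

Because neighbouring parts are weighted a factor of 100 apart, the mapping only preserves version ordering while the second to fourth parts stay below 100.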
# SettingsScreen: displayed version
sed -i "s/Version [0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]* [^<]*/Version $VERSION /" android/src/screens/SettingsScreen.tsx
# SettingsScreen: displayed version (any version format)
sed -i "s/Version [0-9][0-9.]*[^<]*/Version $VERSION /" android/src/screens/SettingsScreen.tsx
echo -e " ${GREEN}${NC} SettingsScreen → Version $VERSION"
echo ""
# ── Build APK ─────────────────────────────────
echo -e "${GREEN}[2/5] APK bauen...${NC}"
echo -e "${GREEN}[2/5] APK bauen (Cache leeren + Build)...${NC}"
cd android
# Clear the Metro + Gradle caches so the new version is embedded cleanly
rm -rf node_modules/.cache 2>/dev/null
# Subshell: the working directory stays unchanged even if "cd android" fails
(cd android && ./gradlew clean 2>/dev/null)
./build.sh release
cd ..
@@ -168,6 +173,24 @@ else
exit 1
fi
# ── Auto-update: copy APK to the RVS server ─
RVS_UPDATE_HOST="${RVS_UPDATE_HOST:-}"
if [ -n "$RVS_UPDATE_HOST" ]; then
echo -e "${GREEN}[6/6] APK auf RVS-Server kopieren (Auto-Update)...${NC}"
# Delete old APKs on the RVS, then upload the new one
ssh "$RVS_UPDATE_HOST" "rm -f ~/ARIA-AGENT/rvs/updates/ARIA-*.apk" 2>/dev/null
scp "$APK_PATH" "${RVS_UPDATE_HOST}:~/ARIA-AGENT/rvs/updates/${APK_NAME}" 2>/dev/null
if [ $? -eq 0 ]; then
echo -e " ${GREEN}${NC} APK auf RVS-Server kopiert (alte Versionen geloescht)"
else
echo -e " ${YELLOW}APK konnte nicht auf RVS kopiert werden (RVS_UPDATE_HOST=$RVS_UPDATE_HOST)${NC}"
echo -e " ${YELLOW}Manuell: scp $APK_PATH $RVS_UPDATE_HOST:~/ARIA-AGENT/rvs/updates/${APK_NAME}${NC}"
fi
else
echo -e "${YELLOW}Auto-Update uebersprungen (RVS_UPDATE_HOST nicht gesetzt)${NC}"
echo -e "${YELLOW}Setze RVS_UPDATE_HOST in .env fuer automatische Verteilung${NC}"
fi
# ── Done ──────────────────────────────────────
echo ""
echo -e "${GREEN}╔═══════════════════════════════════════════════════╗${NC}"
@@ -175,4 +198,5 @@ echo -e "${GREEN}║ Release $TAG ist live!$(printf '%*s' $((27 - ${#TAG})) ''
echo -e "${GREEN}╠═══════════════════════════════════════════════════╣${NC}"
echo -e "${GREEN}${NC} $GITEA_URL/$GITEA_REPO/releases/tag/$TAG"
echo -e "${GREEN}${NC} APK: $APK_NAME ($APK_SIZE)"
echo -e "${GREEN}${NC} Auto-Update: ${RVS_UPDATE_HOST:-nicht konfiguriert}"
echo -e "${GREEN}╚═══════════════════════════════════════════════════╝${NC}"
+2
@@ -4,5 +4,7 @@ services:
ports:
- "${RVS_PORT:-443}:3000"
restart: always
volumes:
- ./updates:/updates # APK files for auto-update
environment:
- MAX_SESSIONS=10
+114 -1
@@ -1,15 +1,22 @@
"use strict";
const { WebSocketServer } = require("ws");
const fs = require("fs");
const path = require("path");
// ── Configuration from environment variables ────────────────────────
const PORT = parseInt(process.env.PORT || "3000", 10);
const MAX_SESSIONS = parseInt(process.env.MAX_SESSIONS || "10", 10);
const UPDATES_DIR = process.env.UPDATES_DIR || "/updates";
// No polling: the APK is provisioned manually via git pull
// Allowed message types; everything else is dropped
const ALLOWED_TYPES = new Set([
"chat", "audio", "file", "location", "mode", "log", "event", "heartbeat",
"file_request", "file_response", "file_saved", "stt_result",
"file_request", "file_response", "file_saved", "stt_result", "config", "tts_request",
"xtts_request", "xtts_response", "xtts_list_voices", "xtts_voices_list", "voice_upload", "xtts_voice_saved",
"update_check", "update_available", "update_download", "update_data",
"agent_activity", "cancel_request",
]);
// Token room: token -> { clients: Set<ws> }
@@ -46,6 +53,9 @@ const wss = new WebSocketServer({ port: PORT });
wss.on("listening", () => {
log(`RVS läuft auf Port ${PORT} | Max Sessions: ${MAX_SESSIONS}`);
// On startup, check whether an APK is present
const apkInfo = getLatestAPK();
if (apkInfo) log(`APK bereit: v${apkInfo.version} (${(fs.statSync(apkInfo.path).size / 1024 / 1024).toFixed(1)}MB)`);
});
wss.on("connection", (ws, req) => {
@@ -107,6 +117,52 @@ function registerClient(ws, token) {
return;
}
// Update check: reply directly to the requesting client (do not relay)
if (msg.type === "update_check") {
const clientVersion = msg.payload?.version || "0.0.0.0";
const apkInfo = getLatestAPK();
if (apkInfo && compareVersions(apkInfo.version, clientVersion) > 0) {
ws.send(JSON.stringify({
type: "update_available",
payload: {
version: apkInfo.version,
downloadUrl: `/update/latest.apk`,
size: fs.statSync(apkInfo.path).size,
},
timestamp: Date.now(),
}));
}
return;
}
// Update download: send the APK as Base64 over the WebSocket
if (msg.type === "update_download") {
const apkInfo = getLatestAPK();
if (!apkInfo) {
ws.send(JSON.stringify({ type: "update_data", payload: { error: "Keine APK verfuegbar" }, timestamp: Date.now() }));
return;
}
try {
const data = fs.readFileSync(apkInfo.path);
const base64 = data.toString("base64");
const sizeMB = (data.length / 1024 / 1024).toFixed(1);
log(`APK sende: v${apkInfo.version} (${sizeMB}MB) an Client`);
ws.send(JSON.stringify({
type: "update_data",
payload: {
version: apkInfo.version,
base64,
size: data.length,
fileName: `ARIA-v${apkInfo.version}.apk`,
},
timestamp: Date.now(),
}));
} catch (err) {
ws.send(JSON.stringify({ type: "update_data", payload: { error: err.message }, timestamp: Date.now() }));
}
return;
}
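On the receiving side, the `update_data` payload has to be decoded and checked before the file is written. A minimal Node.js sketch of that counterpart, assuming the payload fields sent above (`base64`, `size`, `fileName`, `error`); the function `decodeUpdatePayload` is hypothetical, and the real app may handle this differently:

```javascript
// Hypothetical client-side counterpart: decode an update_data payload and verify its size.
function decodeUpdatePayload(payload) {
  if (payload.error) throw new Error(payload.error);
  const data = Buffer.from(payload.base64, "base64");
  if (data.length !== payload.size) {
    throw new Error(`Size mismatch: expected ${payload.size}, got ${data.length}`);
  }
  return { fileName: payload.fileName, data };
}

// Stand-in bytes instead of a real APK.
const fake = Buffer.from([0x50, 0x4b, 0x03, 0x04]);
const payload = {
  version: "0.0.3.8",
  base64: fake.toString("base64"),
  size: fake.length,
  fileName: "ARIA-v0.0.3.8.apk",
};
console.log(decodeUpdatePayload(payload).data.length); // 4
```

Base64 inflates the data by roughly one third, so the whole APK travels as one large WebSocket message; comparing the decoded length with `size` catches truncated transfers.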
// Relay to all other clients in the room
for (const client of room.clients) {
if (client !== ws && client.readyState === 1) {
@@ -167,6 +223,63 @@ wss.on("close", () => {
clearInterval(cleanup);
});
// ── Auto-update: APK detection + push ──────────────────────────────
let latestVersion = null;
function getLatestAPK() {
try {
if (!fs.existsSync(UPDATES_DIR)) return null;
const files = fs.readdirSync(UPDATES_DIR)
.filter(f => f.endsWith(".apk"))
.map(f => {
// ARIA-v0.0.2.3.apk or ARIA-Cockpit-release.apk
// Only complete ".<digits>" groups are captured, so the "." of ".apk" does not end up in the version
const match = f.match(/(\d+(?:\.\d+){2,})/);
return { file: f, path: path.join(UPDATES_DIR, f), version: match ? match[1] : null };
})
.filter(f => f.version)
.sort((a, b) => compareVersions(b.version, a.version)); // newest first
return files[0] || null;
} catch {
return null;
}
}
function compareVersions(a, b) {
const pa = a.split(".").map(Number);
const pb = b.split(".").map(Number);
for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
const diff = (pa[i] || 0) - (pb[i] || 0);
if (diff !== 0) return diff;
}
return 0;
}
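How the comparison behaves for mixed 3-part and 4-part versions can be seen in a short sketch. It repeats `compareVersions` from the diff so that it runs on its own; the version list is made up for illustration:

```javascript
// Copy of compareVersions from the diff, repeated so the sketch is self-contained.
function compareVersions(a, b) {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Hypothetical version list, as it might be parsed from APK file names.
const versions = ["0.0.2.3", "0.0.3.8", "0.0.3", "0.0.10.1"];

// Parts are compared numerically (10 > 3), and a missing part counts as 0.
const sorted = [...versions].sort((a, b) => compareVersions(b, a));
console.log(sorted); // newest first: 0.0.10.1, 0.0.3.8, 0.0.3, 0.0.2.3
console.log(compareVersions("1.2.3", "1.2.3.0")); // 0
```

A plain string sort would place `0.0.10.1` before `0.0.2.3` in ascending order, which is why the parts are converted to numbers first.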
function notifyClientsAboutUpdate(apkInfo) {
const msg = JSON.stringify({
type: "update_available",
payload: {
version: apkInfo.version,
downloadUrl: `/update/latest.apk`,
size: fs.statSync(apkInfo.path).size,
},
timestamp: Date.now(),
});
// Send to all clients in all rooms
for (const [, room] of rooms) {
for (const client of room.clients) {
if (client.readyState === 1) {
client.send(msg);
}
}
}
log(`Update-Benachrichtigung gesendet: v${apkInfo.version} (${rooms.size} Raum/Raeume)`);
}
// No polling: the update check happens on demand (update_check message from the app)
// ── Graceful shutdown ───────────────────────────────────────────────
process.on("SIGTERM", () => {
View File
+11
View File
@@ -0,0 +1,11 @@
# ════════════════════════════════════════════════
# ARIA XTTS v2 - Configuration
# Copy to .env and adjust
# ════════════════════════════════════════════════
# RVS connection (same values as on the ARIA VM)
RVS_HOST=mobil.hacker-net.de
RVS_PORT=444
RVS_TLS=true
RVS_TLS_FALLBACK=true
RVS_TOKEN=your_token_here
+5
@@ -0,0 +1,5 @@
FROM node:22-alpine
WORKDIR /app
COPY bridge.js package.json ./
RUN npm install --production
CMD ["node", "bridge.js"]
+312
@@ -0,0 +1,312 @@
/**
* ARIA XTTS Bridge: connects the XTTS v2 server to the RVS
*
* Receives xtts_request via RVS → renders audio via the XTTS API → sends it back
* Receives voice_upload → stores the voice sample for cloning
* Receives xtts_list_voices → lists the available voices
*/
const WebSocket = require("ws");
const http = require("http");
const https = require("https");
const fs = require("fs");
const path = require("path");
const XTTS_API_URL = process.env.XTTS_API_URL || "http://xtts:8020"; // container port of the xtts service in docker-compose
const RVS_HOST = process.env.RVS_HOST || "";
const RVS_PORT = process.env.RVS_PORT || "443";
const RVS_TLS = process.env.RVS_TLS || "true";
const RVS_TLS_FALLBACK = process.env.RVS_TLS_FALLBACK || "true";
const RVS_TOKEN = process.env.RVS_TOKEN || "";
const VOICES_DIR = "/voices";
function log(msg) {
console.log(`[${new Date().toISOString()}] ${msg}`);
}
// ── RVS connection ──────────────────────────────────
let rvsWs = null;
let retryDelay = 2;
function connectRVS(forcePlain) {
if (!RVS_HOST || !RVS_TOKEN) {
log("RVS nicht konfiguriert — beende");
process.exit(1);
}
const useTls = RVS_TLS === "true" && !forcePlain;
const proto = useTls ? "wss" : "ws";
const url = `${proto}://${RVS_HOST}:${RVS_PORT}?token=${RVS_TOKEN}`;
log(`Verbinde zu RVS: ${proto}://${RVS_HOST}:${RVS_PORT}`);
const ws = new WebSocket(url);
ws.on("open", () => {
log("RVS verbunden — warte auf TTS-Requests");
rvsWs = ws;
retryDelay = 2;
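// Keepalive; the timer is cleared on close so that reconnects do not stack intervals
const keepalive = setInterval(() => {
if (ws.readyState === WebSocket.OPEN) {
ws.ping();
ws.send(JSON.stringify({ type: "heartbeat", timestamp: Date.now() }));
}
}, 25000);
ws.once("close", () => clearInterval(keepalive));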
});
ws.on("message", async (raw) => {
try {
const msg = JSON.parse(raw.toString());
if (msg.type === "xtts_request") {
await handleTTSRequest(msg.payload);
} else if (msg.type === "voice_upload") {
await handleVoiceUpload(msg.payload);
} else if (msg.type === "xtts_list_voices") {
await handleListVoices();
}
} catch (err) {
log(`Fehler: ${err.message}`);
}
});
ws.on("close", () => {
log("RVS Verbindung geschlossen");
rvsWs = null;
setTimeout(() => connectRVS(), Math.min(retryDelay * 1000, 30000));
retryDelay = Math.min(retryDelay * 2, 30);
});
ws.on("error", (err) => {
log(`RVS Fehler: ${err.message}`);
if (useTls && RVS_TLS_FALLBACK === "true") {
log("TLS fehlgeschlagen — Fallback auf ws://");
ws.removeAllListeners();
try { ws.close(); } catch (_) {}
connectRVS(true);
}
});
}
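The reconnect timing of the `close` handler is easier to see as a list of delays. A minimal sketch that reproduces the schedule, assuming the start value of 2 s and the 30 s cap used above; the function `backoffDelays` is hypothetical:

```javascript
// Reproduces the delay schedule of the close handler:
// wait min(retryDelay * 1000, 30000) ms, then double retryDelay with a cap of 30 s.
function backoffDelays(attempts, startSeconds = 2, capSeconds = 30) {
  const delays = [];
  let retryDelay = startSeconds;
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.min(retryDelay * 1000, capSeconds * 1000));
    retryDelay = Math.min(retryDelay * 2, capSeconds);
  }
  return delays;
}

console.log(backoffDelays(6)); // 2 s, 4 s, 8 s, 16 s, 30 s, 30 s (in milliseconds)
```

After a successful `open`, the bridge resets `retryDelay` to 2, so the schedule starts over with the next disconnect.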
// ── TTS Request Handler ─────────────────────────────
async function handleTTSRequest(payload) {
const { text, voice, requestId, language } = payload;
if (!text) return;
// Strip Markdown and special characters so the text is spoken naturally
let cleanText = text
.replace(/```[\s\S]*?```/g, "") // remove fenced code blocks first, before the inline-code rule can split their fences
.replace(/^[ \t]*[-*]\s+/gm, "") // - list items → strip the bullet (line start only)
.replace(/\*\*([^*]+)\*\*/g, "$1") // **bold** → bold
.replace(/\*([^*]+)\*/g, "$1") // *italic* → italic
.replace(/`([^`]+)`/g, "$1") // `code` → code
.replace(/\[([^\]]+)\]\([^)]+\)/g, "$1") // [text](url) → text
.replace(/^#{1,6}\s*/gm, "") // ### headings → strip the marker (line start only)
.replace(/^>\s*/gm, "") // > quotes → strip the marker (line start only)
.replace(/\n{2,}/g, ". ") // multiple newlines → period
.replace(/\n/g, ", ") // single newlines → comma
.replace(/\s{2,}/g, " ") // collapse repeated whitespace
.replace(/["“”„]/g, "") // remove quotation marks
.replace(/\(\)/g, "") // empty parentheses
.trim();
// Split the text into sentences, then merge them into chunks of 2-3 sentences
// (more context = more consistent voice/volume, but not too long for the WebSocket)
const sentences = cleanText.split(/(?<=[.!?])\s+/)
.map(s => s.trim())
.filter(s => s.length > 0)
.map(s => s.replace(/[.]+$/, '')); // strip the trailing period
const MAX_CHUNK_CHARS = 150; // max ~150 characters per chunk (fast rendering, preloading is enough)
const chunks = [];
let currentChunk = '';
for (const sentence of sentences) {
if (currentChunk && (currentChunk.length + sentence.length + 2) > MAX_CHUNK_CHARS) {
chunks.push(currentChunk);
currentChunk = sentence;
} else {
currentChunk = currentChunk ? currentChunk + ', ' + sentence : sentence;
}
}
if (currentChunk) chunks.push(currentChunk);
if (chunks.length === 0) return;
log(`TTS-Request: "${cleanText.slice(0, 60)}..." (${sentences.length} Saetze → ${chunks.length} Chunks, voice: ${voice || "default"}, lang: ${language || "de"})`);
try {
const voiceSample = voice ? path.join(VOICES_DIR, `${voice}.wav`) : null;
const hasCustomVoice = voiceSample && fs.existsSync(voiceSample);
// Streaming: render a chunk → send it immediately → next chunk
// The app plays the chunks back seamlessly via its preloading queue
let sentCount = 0;
for (let i = 0; i < chunks.length; i++) {
const chunk = chunks[i];
try {
const audioBuffer = await callXTTSAPI(chunk, language || "de", hasCustomVoice ? voiceSample : null);
if (audioBuffer && audioBuffer.length > 100) {
log(`TTS [${i + 1}/${chunks.length}]: ${(audioBuffer.length / 1024).toFixed(0)}KB — "${chunk.slice(0, 50)}"`);
sendToRVS({
type: "xtts_response",
payload: {
requestId: `${requestId || ""}_${i}`,
base64: audioBuffer.toString("base64"),
mimeType: "audio/wav",
voice: voice || "default",
engine: "xtts",
part: i + 1,
totalParts: chunks.length,
},
timestamp: Date.now(),
});
sentCount++;
}
} catch (chunkErr) {
log(`TTS [${i + 1}/${chunks.length}] Fehler: ${chunkErr.message} — ueberspringe`);
}
}
log(`TTS komplett: ${sentCount}/${chunks.length} Chunks gestreamt`);
} catch (err) {
log(`TTS Fehler: ${err.message}`);
sendToRVS({
type: "xtts_response",
payload: { requestId, error: err.message },
timestamp: Date.now(),
});
}
}
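The chunking step of `handleTTSRequest` can be tried in isolation. A minimal sketch, assuming the same split and merge rules as above; the function name `chunkSentences` and the sample text are made up for illustration:

```javascript
// Hypothetical standalone version of the chunking step in handleTTSRequest.
function chunkSentences(text, maxChars = 150) {
  // Split after ., ! or ? followed by whitespace; drop trailing periods.
  const sentences = text.split(/(?<=[.!?])\s+/)
    .map(s => s.trim())
    .filter(s => s.length > 0)
    .map(s => s.replace(/[.]+$/, ""));

  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    // Start a new chunk when appending the sentence (plus ", ") would exceed the limit.
    if (current && current.length + sentence.length + 2 > maxChars) {
      chunks.push(current);
      current = sentence;
    } else {
      current = current ? current + ", " + sentence : sentence;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

console.log(chunkSentences("Hello. How are you? Fine.", 15));
console.log(chunkSentences("Hello. How are you? Fine.", 150));
```

With a limit of 15 each sentence becomes its own chunk; with 150 all three are merged into one. A single sentence longer than the limit is not split further, so the limit is a target rather than a hard bound.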
function callXTTSAPI(text, language, speakerWav) {
return new Promise((resolve, reject) => {
const body = JSON.stringify({
text,
language,
speaker_wav: speakerWav || "",
});
const url = new URL(`${XTTS_API_URL}/tts_to_audio/`);
const options = {
hostname: url.hostname,
port: url.port,
path: url.pathname,
method: "POST",
headers: {
"Content-Type": "application/json",
"Content-Length": Buffer.byteLength(body),
},
timeout: 60000,
};
const req = http.request(options, (res) => {
const chunks = [];
res.on("data", (chunk) => chunks.push(chunk));
res.on("end", () => {
if (res.statusCode === 200) {
resolve(Buffer.concat(chunks));
} else {
reject(new Error(`XTTS API HTTP ${res.statusCode}: ${Buffer.concat(chunks).toString().slice(0, 200)}`));
}
});
});
req.on("error", reject);
req.on("timeout", () => { req.destroy(); reject(new Error("XTTS API Timeout (60s)")); });
req.write(body);
req.end();
});
}
// ── Voice Upload Handler ────────────────────────────
async function handleVoiceUpload(payload) {
const { name, samples } = payload;
if (!name || !samples || !Array.isArray(samples) || samples.length === 0) {
log("Voice Upload: Ungueltige Daten");
return;
}
log(`Voice Upload: "${name}" (${samples.length} Samples)`);
try {
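// Concatenate all samples byte-wise. Note: if every sample is a complete WAV file,
// the headers of later samples end up inside the data, so this is only a clean WAV for a single sample.
const buffers = samples.map(s => Buffer.from(s.base64, "base64"));
const combined = Buffer.concat(buffers);
// Save as WAV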
fs.mkdirSync(VOICES_DIR, { recursive: true });
const filePath = path.join(VOICES_DIR, `${name.replace(/[^a-zA-Z0-9_-]/g, "_")}.wav`);
fs.writeFileSync(filePath, combined);
log(`Voice gespeichert: ${filePath} (${(combined.length / 1024).toFixed(0)}KB)`);
sendToRVS({
type: "xtts_voice_saved",
payload: { name, size: combined.length, path: filePath },
timestamp: Date.now(),
});
} catch (err) {
log(`Voice Upload Fehler: ${err.message}`);
}
}
// ── Voice List Handler ──────────────────────────────
async function handleListVoices() {
try {
const files = fs.existsSync(VOICES_DIR)
? fs.readdirSync(VOICES_DIR).filter(f => f.endsWith(".wav"))
: [];
const voices = files.map(f => ({
name: path.basename(f, ".wav"),
file: f,
size: fs.statSync(path.join(VOICES_DIR, f)).size,
}));
log(`Stimmen: ${voices.length} verfuegbar`);
sendToRVS({
type: "xtts_voices_list",
payload: { voices },
timestamp: Date.now(),
});
} catch (err) {
log(`Stimmen-Liste Fehler: ${err.message}`);
}
}
// ── Send to RVS ─────────────────────────────────────
function sendToRVS(msg) {
if (rvsWs && rvsWs.readyState === WebSocket.OPEN) {
rvsWs.send(JSON.stringify(msg));
}
}
// ── Start ───────────────────────────────────────────
log("ARIA XTTS Bridge startet...");
log(`XTTS API: ${XTTS_API_URL}`);
log(`RVS: ${RVS_HOST}:${RVS_PORT}`);
// Wait until the XTTS API is reachable
function waitForXTTS(callback, attempts) {
if (attempts <= 0) { log("XTTS API nicht erreichbar — starte trotzdem"); callback(); return; }
http.get(`${XTTS_API_URL}/docs`, (res) => {
log(`XTTS API erreichbar (HTTP ${res.statusCode})`);
callback();
}).on("error", () => {
log(`XTTS API noch nicht bereit — warte (${attempts} Versuche uebrig)...`);
setTimeout(() => waitForXTTS(callback, attempts - 1), 10000); // 10 s instead of 5 s (loading the model takes time)
});
}
waitForXTTS(() => connectRVS(), 30); // wait at most 5 min
+56
@@ -0,0 +1,56 @@
# ════════════════════════════════════════════════
# ARIA XTTS v2 — GPU TTS Server
# Runs on the gaming PC (RTX 3060)
# Connects to the RVS for TTS requests
# ════════════════════════════════════════════════
#
# Prerequisites:
# - Docker Desktop with WSL2
# - NVIDIA Container Toolkit
# - .env with the RVS connection settings
#
# Start: docker compose up -d
# Test: curl http://localhost:8000/docs
# ════════════════════════════════════════════════
services:
# ─── XTTS v2 API Server (GPU) ─────────────────
xtts:
image: daswer123/xtts-api-server:latest
container_name: aria-xtts
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
ports:
- "8000:8020"
volumes:
- xtts-models:/app/xtts_models # Model-Cache (~2GB)
- ./voices:/voices # Custom Voice Samples
environment:
- COQUI_TOS_AGREED=1
restart: unless-stopped
# ─── XTTS Bridge (connects to RVS) ────────────
xtts-bridge:
build: .
container_name: aria-xtts-bridge
depends_on:
- xtts
volumes:
- ./voices:/voices # Shared with the XTTS server
environment:
- XTTS_API_URL=http://xtts:8020
- RVS_HOST=${RVS_HOST}
- RVS_PORT=${RVS_PORT:-443}
- RVS_TLS=${RVS_TLS:-true}
- RVS_TLS_FALLBACK=${RVS_TLS_FALLBACK:-true}
- RVS_TOKEN=${RVS_TOKEN}
restart: unless-stopped
volumes:
xtts-models:
+8
@@ -0,0 +1,8 @@
{
"name": "aria-xtts-bridge",
"version": "1.0.0",
"private": true,
"dependencies": {
"ws": "^8.16.0"
}
}