Compare commits

...

9 Commits

Author SHA1 Message Date
duffyduck d6a89168ef release: bump version to 0.0.2.4 2026-04-05 19:51:19 +02:00
duffyduck cb33a20694 docs: update README with XTTS, auto-update, watchdog, TTS settings
- Architecture: Added XTTS v2 (Gaming-PC) and auto-update flow
- Diagnostic: Thinking indicator, cancel button, TTS tab, voice cloning
- App: Play button, chat search, auto-update, voice speed settings
- RVS: Auto-update APK distribution over WebSocket
- Watchdog: 2min warning → 5min doctor --fix → 8min container restart
- Roadmap: Phase 1 fully completed, updated Phase 2+3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 19:46:16 +02:00
duffyduck a242693751 feat: XTTS v2 integration, auto-update system, TTS engine abstraction
- XTTS v2: Docker setup for Gaming-PC (GPU), bridge via RVS relay
- XTTS: Voice cloning UI in Diagnostic (multi-file upload)
- XTTS: Engine selectable (Piper local vs XTTS remote) with fallback
- Auto-Update: RVS serves APK over WebSocket (no HTTP needed)
- Auto-Update: App checks version on start, prompts install
- Auto-Update: release.sh copies APK to RVS via scp
- Bridge: TTS engine abstraction (piper/xtts), config persistent
- Bridge: xtts_response handler, tts_request on-demand
- Diagnostic: TTS engine dropdown, XTTS voice panel, voice cloning
- App: Play button on ARIA messages, chat search, update service
- Wake word: Disabled LiveAudioStream (crash fix), Phase 1 placeholder
- Watchdog: Container restart after 8min stuck
- Chat backup: on-the-fly to /shared/config/chat_backup.jsonl

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 19:42:10 +02:00
duffyduck 81ca3cc7a7 Ohr-Button Absturz gefixt (LiveAudioStream entfernt, Phase 1 , Play-Button in ARIA-Nachrichten fuer Sprachwiedergabe
- [x] Chat-Suche in der App (Lupe in Statusleiste)
- [x] Watchdog mit Container-Restart (2min Warnung → 5min doctor --fix → 8min Restart),Abbrechen-Button im Diagnostic Chat
- [x] Nachrichten Backup on-the-fly (/shared/config/chat_backup.jsonl)
- [x] Grosse Nachrichten satzweise aufteilen fuer TTS
- [x] RVS Nachrichten vom Smartphone gehen durch
2026-04-01 23:45:25 +02:00
duffyduck 1a32098c9e release: bump version to 0.0.2.3 2026-04-01 23:45:15 +02:00
duffyduck fa4c32270b sst immer 2026-03-29 19:18:41 +02:00
duffyduck 9c43b875f4 release: bump version to 0.0.2.2 2026-03-29 19:04:31 +02:00
duffyduck 63560e290b two speed 2026-03-29 19:03:40 +02:00
duffyduck 1ab8a6a2fe addes speed config for voice 2026-03-29 18:50:09 +02:00
22 changed files with 1215 additions and 179 deletions
+1
View File
@@ -36,6 +36,7 @@ android/local.properties
android/package-lock.json android/package-lock.json
*.apk *.apk
*.aab *.aab
rvs/updates/*.apk
# ── Tauri / Desktop Build ─────────────────────── # ── Tauri / Desktop Build ───────────────────────
desktop/src-tauri/target/ desktop/src-tauri/target/
+114 -21
View File
@@ -29,11 +29,18 @@ ARIA hat zwei Rollen:
┌─────────────────────────────────────────────────────────┐ ┌─────────────────────────────────────────────────────────┐
│ RVS — Rendezvous-Server │ │ RVS — Rendezvous-Server │
│ Node.js WebSocket Relay (Docker, Rechenzentrum) │ │ Node.js WebSocket Relay (Docker, Rechenzentrum) │
│ Reiner Relay — kennt keine Tokens, leitet durch │ Relay + Auto-Update (APK-Verteilung)
│ rvs/docker-compose.yml │ │ rvs/docker-compose.yml │
└───────────────────────┬─────────────────────────────────┘ └───────────┬───────────────────────────┬─────────────────┘
│ WebSocket Tunnel │ WebSocket Tunnel │ WebSocket Tunnel
┌───────────────────────────┐
│ Gaming-PC (optional) │
│ RTX 3060, Docker+WSL2 │
│ XTTS v2 (natuerliche │
│ Stimmen, Voice Cloning) │
│ xtts/docker-compose.yml │
└───────────────────────────┘
┌─────────────────────────────────────────────────────────┐ ┌─────────────────────────────────────────────────────────┐
│ ARIA-VM (Proxmox, Debian 13) — ARIAs Wohnung │ │ ARIA-VM (Proxmox, Debian 13) — ARIAs Wohnung │
│ Basissystem + Docker. Rest richtet ARIA selbst ein. │ │ Basissystem + Docker. Rest richtet ARIA selbst ein. │
@@ -66,13 +73,14 @@ ARIA hat zwei Rollen:
└─────────────────────────────────────────────────────────┘ └─────────────────────────────────────────────────────────┘
``` ```
**Drei separate Deployments:** **Vier separate Deployments:**
| Was | Wo | Wie | | Was | Wo | Wie |
|-----|----|-----| |-----|----|-----|
| RVS | Rechenzentrum | `cd rvs && docker compose up -d` | | RVS | Rechenzentrum | `cd rvs && docker compose up -d` |
| ARIA Core | Debian 13 VM | `docker compose up -d && ./aria-setup.sh` | | ARIA Core | Debian 13 VM | `docker compose up -d && ./aria-setup.sh` |
| Android App | Stefans Handy | APK installieren, QR-Code scannen | | XTTS v2 (optional) | Gaming-PC (GPU) | `cd xtts && docker compose up -d` |
| Android App | Stefans Handy | APK installieren (Auto-Update via RVS) |
--- ---
@@ -314,13 +322,19 @@ Erreichbar unter `http://<VM-IP>:3001`. Teilt das Netzwerk mit aria-core.
### Features ### Features
- **Status-Karten**: Gateway (Handshake), RVS (TLS-Fallback), Proxy (Auth) - **Status-Karten**: Gateway (Handshake), RVS (TLS-Fallback), Proxy (Auth)
- **Chat-Test**: Nachrichten direkt an ARIA senden (Gateway oder via RVS) - **Chat-Test**: Nachrichten direkt an ARIA senden (Gateway oder via RVS), Vollbild-Modus
- **"ARIA denkt..." Indikator**: Zeigt live was ARIA gerade tut (Denken, Tool, Schreiben)
- **Abbrechen-Button**: Stoppt laufende Anfragen + doctor --fix
- **Session-Verwaltung**: Sessions auflisten, wechseln, erstellen, loeschen - **Session-Verwaltung**: Sessions auflisten, wechseln, erstellen, loeschen
- **Chat-History**: Wird beim Laden und Session-Wechsel angezeigt (read-only aus JSONL) - **Chat-History**: Wird beim Laden und Session-Wechsel angezeigt (read-only aus JSONL)
- **TTS-Diagnose Tab**: Stimmen testen, Status pruefen, Fehler anzeigen
- **Einstellungen**: TTS-Engine (Piper/XTTS), Stimmen, Speed, Highlight-Trigger, Betriebsmodi
- **XTTS Voice Cloning**: Audio-Samples hochladen, eigene Stimme erstellen
- **Claude Login**: Browser-Terminal zum Einloggen in den Proxy - **Claude Login**: Browser-Terminal zum Einloggen in den Proxy
- **Core Terminal**: Shell in aria-core (openclaw CLI) - **Core Terminal**: Shell in aria-core (openclaw CLI)
- **Container-Logs**: Echtzeit-Logs aller Container (gefiltert nach Tab) - **Container-Logs**: Echtzeit-Logs aller Container (gefiltert nach Tab + Pipeline)
- **SSH Terminal**: Direkter SSH-Zugang zu aria-wohnung - **SSH Terminal**: Direkter SSH-Zugang zu aria-wohnung
- **Watchdog**: Erkennt stuck Runs (2min Warnung → 5min doctor --fix → 8min Container-Restart)
### Session-Verwaltung ### Session-Verwaltung
@@ -340,10 +354,13 @@ API-Endpoint fuer andere Services: `GET http://localhost:3001/api/session`
- **Sprachaufnahme**: Push-to-Talk (halten) oder Tap-to-Talk (tippen, Auto-Stop bei Stille) - **Sprachaufnahme**: Push-to-Talk (halten) oder Tap-to-Talk (tippen, Auto-Stop bei Stille)
- **VAD (Voice Activity Detection)**: Erkennt 1.8s Stille und stoppt automatisch - **VAD (Voice Activity Detection)**: Erkennt 1.8s Stille und stoppt automatisch
- **STT (Speech-to-Text)**: Audio wird in der Bridge per Whisper transkribiert, transkribierter Text erscheint im Chat - **STT (Speech-to-Text)**: Audio wird in der Bridge per Whisper transkribiert, transkribierter Text erscheint im Chat
- **Wake Word**: Toggle-Button (Ohr-Symbol) aktiviert kontinuierliches Mikrofon-Monitoring - **TTS-Wiedergabe**: ARIA antwortet per Lautsprecher (Piper oder XTTS v2)
- **TTS-Wiedergabe**: ARIA antwortet per Lautsprecher (Ramona/Thorsten) - **Play-Button**: Jede ARIA-Nachricht kann nochmal vorgelesen werden
- **Datei- und Bild-Upload**: Bilder inline im Chat, Dateien mit Icon + Name + Groesse - **Chat-Suche**: Lupe in der Statusleiste filtert Nachrichten live
- **Anhaenge**: Bridge speichert Dateien in Shared Volume (`/shared/uploads/`), ARIA kann darauf zugreifen - **Datei- und Bild-Upload**: Bilder inline im Chat (Vollbild-Tap), Dateien mit Icon + Name + Groesse
- **Anhaenge**: Bridge speichert in Shared Volume, ARIA kann darauf zugreifen, Re-Download ueber RVS
- **Einstellungen**: TTS Engine, Stimmen, Speed pro Stimme, Speicherort, Auto-Download, GPS
- **Auto-Update**: Prueft beim Start auf neue Version, Download + Installation ueber RVS
- GPS-Position (optional) - GPS-Position (optional)
- QR-Code Scanner fuer Token-Pairing - QR-Code Scanner fuer Token-Pairing
@@ -374,19 +391,31 @@ cd android
``` ```
Das Script macht alles in einem Schritt: Das Script macht alles in einem Schritt:
1. Fragt Gitea-Kennwort ab (wird nirgends gespeichert) 1. Setzt Versionsnummern (package.json, build.gradle, SettingsScreen)
2. Baut die Release-APK 2. Fragt Gitea-Kennwort ab (wird nirgends gespeichert)
3. Erstellt Git Tag + pusht 3. Baut die Release-APK
4. Erstellt Gitea Release 4. Git Commit + Tag + Push
5. Laedt APK als Asset hoch 5. Erstellt Gitea Release + laedt APK hoch
6. Kopiert APK auf RVS-Server (Auto-Update, optional)
Voraussetzung in `.env`: Voraussetzung in `.env`:
```bash ```bash
GITEA_URL=https://gitea.hackersoft.de GITEA_URL=https://gitea.hackersoft.de
GITEA_REPO=stefan/aria-agent GITEA_REPO=stefan/aria-agent
GITEA_USER=stefan GITEA_USER=stefan
RVS_UPDATE_HOST=root@aria-rvs # Optional: fuer Auto-Update
``` ```
### Auto-Update
Die App prueft beim Start ob eine neuere Version auf dem RVS liegt.
Der Update-Flow:
1. `./release.sh 0.0.3.0` → APK wird auf RVS kopiert (via scp)
2. Alternativ: `git pull` auf dem RVS-Server → APK in `rvs/updates/`
3. App sendet `update_check` mit aktueller Version
4. RVS vergleicht → sendet `update_available`
5. App zeigt Dialog → Download ueber WebSocket → Installation
### Audio-Pipeline (Spracheingabe) ### Audio-Pipeline (Spracheingabe)
``` ```
@@ -454,6 +483,11 @@ aria-data/
│ ├── aria.env ← Voice Bridge Config │ ├── aria.env ← Voice Bridge Config
│ └── diag-state/ ← Diagnostic persistenter State │ └── diag-state/ ← Diagnostic persistenter State
│ (im Shared Volume /shared/config/):
│ ├── voice_config.json ← TTS-Einstellungen (Stimme, Speed, Engine)
│ ├── highlight_triggers.json ← Highlight-Trigger Woerter
│ └── chat_backup.jsonl ← Nachrichten-Backup (on-the-fly)
└── ssh/ ← SSH Keys fuer VM-Zugriff └── ssh/ ← SSH Keys fuer VM-Zugriff
├── id_ed25519 ← Private Key (generiert von aria-setup.sh) ├── id_ed25519 ← Private Key (generiert von aria-setup.sh)
├── id_ed25519.pub ← Public Key (muss in VM authorized_keys!) ├── id_ed25519.pub ← Public Key (muss in VM authorized_keys!)
@@ -469,7 +503,7 @@ tar -czf aria-backup-$(date +%Y%m%d).tar.gz aria-data/
## RVS — Rendezvous-Server ## RVS — Rendezvous-Server
Laeuft im Rechenzentrum. Reiner Relay — kennt keine Tokens, speichert nichts. Laeuft im Rechenzentrum. WebSocket Relay + Auto-Update Server.
Wer sich mit dem gleichen Token verbindet, landet im gleichen Room. Wer sich mit dem gleichen Token verbindet, landet im gleichen Room.
```bash ```bash
@@ -477,10 +511,60 @@ cd rvs
docker compose up -d docker compose up -d
``` ```
**Features:**
- WebSocket Relay (alle Message-Types: chat, audio, file, config, xtts, update, etc.)
- Auto-Update: APK-Verteilung an Apps ueber WebSocket
- Heartbeat + tote Verbindungen aufraeumen
**Auto-Update APK bereitstellen:**
```bash
# APK in updates/ legen (manuell oder via release.sh)
cp ARIA-v0.0.3.0.apk ~/ARIA-AGENT/rvs/updates/
# RVS erkennt die Version aus dem Dateinamen
```
**Multi-Instanz:** Mehrere ARIA-VMs koennen denselben RVS nutzen — jede mit eigenem Token. **Multi-Instanz:** Mehrere ARIA-VMs koennen denselben RVS nutzen — jede mit eigenem Token.
--- ---
## XTTS v2 — GPU TTS Server (optional)
Laeuft auf einem separaten Rechner mit NVIDIA GPU (z.B. Gaming-PC mit RTX 3060).
Verbindet sich ueber RVS mit der ARIA-Infrastruktur — kein VPN noetig.
### Voraussetzungen
- Docker Desktop mit WSL2 (Windows) oder Docker mit NVIDIA Runtime (Linux)
- NVIDIA Container Toolkit
- GPU mit mindestens 4GB VRAM (6GB+ empfohlen)
### Setup
```bash
cd xtts
cp .env.example .env
# .env mit RVS-Verbindungsdaten fuellen (gleiche wie auf der ARIA-VM)
docker compose up -d
# Erster Start laedt ~2GB Model herunter
```
### Features
- **Natuerliche Stimmen**: Deutlich bessere Qualitaet als Piper
- **Voice Cloning**: Eigene Stimme mit 6-10s Audio-Sample
- **16 Sprachen**: Deutsch, Englisch, Franzoesisch, etc.
- **RVS-Integration**: Bridge waehlt automatisch XTTS wenn verfuegbar
### Stimme klonen
In der Diagnostic unter Einstellungen → Sprachausgabe → XTTS:
1. TTS Engine auf "XTTS v2" stellen
2. "Stimme klonen" → Audio-Dateien hochladen (WAV/MP3, min. 6-10s)
3. Name vergeben → "Stimme erstellen"
4. Neue Stimme in der Auswahl verfuegbar
---
## Docker Volumes ## Docker Volumes
| Volume | Pfad im Container | Zweck | | Volume | Pfad im Container | Zweck |
@@ -491,7 +575,7 @@ docker compose up -d
| `./aria-data/ssh` (bind) | `/root/.ssh`, `/home/node/.ssh` | SSH Keys | | `./aria-data/ssh` (bind) | `/root/.ssh`, `/home/node/.ssh` | SSH Keys |
| `./aria-data/brain` (bind) | `/home/node/.openclaw/workspace/memory` | Gedaechtnis | | `./aria-data/brain` (bind) | `/home/node/.openclaw/workspace/memory` | Gedaechtnis |
| `./aria-data/skills` (bind) | `/home/node/.openclaw/workspace/skills` | Skills | | `./aria-data/skills` (bind) | `/home/node/.openclaw/workspace/skills` | Skills |
| `aria-shared` | `/shared` (Core + Bridge) | Datei-Austausch (Uploads von App) | | `aria-shared` | `/shared` (Core + Bridge + Proxy + Diag) | Datei-Austausch, Config, Uploads |
| `./aria-data/config/diag-state` (bind) | `/data` (Diagnostic) | Persistenter State (aktive Session) | | `./aria-data/config/diag-state` (bind) | `/data` (Diagnostic) | Persistenter State (aktive Session) |
--- ---
@@ -569,8 +653,15 @@ docker exec aria-core ssh aria-wohnung hostname
- [x] Android App (Chat + Sprache + Uploads) - [x] Android App (Chat + Sprache + Uploads)
- [x] Tool-Permissions (alle Tools freigeschaltet) - [x] Tool-Permissions (alle Tools freigeschaltet)
- [x] SSH-Zugriff auf VM (aria-wohnung) - [x] SSH-Zugriff auf VM (aria-wohnung)
- [x] Diagnostic Web-UI - [x] Diagnostic Web-UI + Einstellungen
- [x] Session-Verwaltung + Chat-History - [x] Session-Verwaltung + Chat-History
- [x] Stimmen-Einstellungen (Ramona/Thorsten, Speed, Highlight-Trigger)
- [x] TTS satzweise fuer lange Texte
- [x] Datei-/Bild-Upload mit Shared Volume
- [x] Watchdog (stuck Run Erkennung + Auto-Fix + Container-Restart)
- [x] Auto-Update System (APK via RVS)
- [x] Chat-Suche, Play-Button, Abbrechen-Button
- [x] XTTS v2 Integration (GPU, Voice Cloning, remote ueber RVS)
### Phase 2 — ARIA wird produktiv ### Phase 2 — ARIA wird produktiv
@@ -578,7 +669,8 @@ docker exec aria-core ssh aria-wohnung hostname
- [ ] Gitea-Integration - [ ] Gitea-Integration
- [ ] VM einrichten (Desktop, Browser, Tools) - [ ] VM einrichten (Desktop, Browser, Tools)
- [ ] Heartbeat (periodische Selbst-Checks) - [ ] Heartbeat (periodische Selbst-Checks)
- [ ] Lokales LLM als Wächter (Triage vor Claude-Call) - [ ] Lokales LLM als Waechter (Triage vor Claude-Call)
- [ ] Auto-Compacting / Memory-Verwaltung
### Phase 3 — Erweiterungen ### Phase 3 — Erweiterungen
@@ -586,3 +678,4 @@ docker exec aria-core ssh aria-wohnung hostname
- [ ] Desktop Client (Tauri) - [ ] Desktop Client (Tauri)
- [ ] bKVM Remote IT-Support - [ ] bKVM Remote IT-Support
- [ ] Porcupine Wake Word (on-device "ARIA" in der App) - [ ] Porcupine Wake Word (on-device "ARIA" in der App)
- [ ] Claude Vision direkt (Bildanalyse ohne Dateipfad-Umweg)
+2 -2
View File
@@ -79,8 +79,8 @@ android {
applicationId "com.ariacockpit" applicationId "com.ariacockpit"
minSdkVersion rootProject.ext.minSdkVersion minSdkVersion rootProject.ext.minSdkVersion
targetSdkVersion rootProject.ext.targetSdkVersion targetSdkVersion rootProject.ext.targetSdkVersion
versionCode 201 versionCode 204
versionName "0.0.2.1" versionName "0.0.2.4"
// Fallback fuer Libraries mit Product Flavors // Fallback fuer Libraries mit Product Flavors
missingDimensionStrategy 'react-native-camera', 'general' missingDimensionStrategy 'react-native-camera', 'general'
} }
+2 -3
View File
@@ -1,6 +1,6 @@
{ {
"name": "aria-cockpit", "name": "aria-cockpit",
"version": "0.0.2.1", "version": "0.0.2.4",
"private": true, "private": true,
"scripts": { "scripts": {
"android": "react-native run-android", "android": "react-native run-android",
@@ -24,8 +24,7 @@
"react-native-camera-kit": "^13.0.0", "react-native-camera-kit": "^13.0.0",
"@react-native-async-storage/async-storage": "^1.21.0", "@react-native-async-storage/async-storage": "^1.21.0",
"react-native-fs": "^2.20.0", "react-native-fs": "^2.20.0",
"react-native-audio-recorder-player": "^3.6.7", "react-native-audio-recorder-player": "^3.6.7"
"react-native-live-audio-stream": "^1.1.1"
}, },
"devDependencies": { "devDependencies": {
"typescript": "^5.3.3", "typescript": "^5.3.3",
+70 -1
View File
@@ -23,6 +23,7 @@ import RNFS from 'react-native-fs';
import rvs, { RVSMessage, ConnectionState } from '../services/rvs'; import rvs, { RVSMessage, ConnectionState } from '../services/rvs';
import audioService from '../services/audio'; import audioService from '../services/audio';
import wakeWordService from '../services/wakeword'; import wakeWordService from '../services/wakeword';
import updateService from '../services/updater';
import VoiceButton from '../components/VoiceButton'; import VoiceButton from '../components/VoiceButton';
import FileUpload, { FileData } from '../components/FileUpload'; import FileUpload, { FileData } from '../components/FileUpload';
import CameraUpload, { PhotoData } from '../components/CameraUpload'; import CameraUpload, { PhotoData } from '../components/CameraUpload';
@@ -91,6 +92,8 @@ const ChatScreen: React.FC = () => {
const [gpsEnabled, setGpsEnabled] = useState(false); const [gpsEnabled, setGpsEnabled] = useState(false);
const [wakeWordActive, setWakeWordActive] = useState(false); const [wakeWordActive, setWakeWordActive] = useState(false);
const [fullscreenImage, setFullscreenImage] = useState<string | null>(null); const [fullscreenImage, setFullscreenImage] = useState<string | null>(null);
const [searchQuery, setSearchQuery] = useState('');
const [searchVisible, setSearchVisible] = useState(false);
const flatListRef = useRef<FlatList>(null); const flatListRef = useRef<FlatList>(null);
const messageIdCounter = useRef(0); const messageIdCounter = useRef(0);
@@ -260,6 +263,16 @@ const ChatScreen: React.FC = () => {
}; };
}, []); }, []);
// Auto-Update: Bei App-Start pruefen
useEffect(() => {
const unsubUpdate = updateService.onUpdateAvailable((info) => {
updateService.promptUpdate(info);
});
// Nach 5s pruefen (RVS muss erst verbunden sein)
const timer = setTimeout(() => updateService.checkForUpdate(), 5000);
return () => { unsubUpdate(); clearTimeout(timer); };
}, []);
// Wake Word: "ARIA" Erkennung → Auto-Aufnahme starten // Wake Word: "ARIA" Erkennung → Auto-Aufnahme starten
useEffect(() => { useEffect(() => {
const unsubWake = wakeWordService.onWakeWord(async () => { const unsubWake = wakeWordService.onWakeWord(async () => {
@@ -581,6 +594,18 @@ const ChatScreen: React.FC = () => {
{item.text} {item.text}
</Text> </Text>
)} )}
{/* Play-Button fuer ARIA-Nachrichten */}
{!isUser && item.text.length > 0 && (
<TouchableOpacity
style={styles.playButton}
onPress={() => {
// TTS-Request an Bridge senden
rvs.send('tts_request' as any, { text: item.text, voice: '' });
}}
>
<Text style={styles.playButtonText}>{'\uD83D\uDD0A'}</Text>
</TouchableOpacity>
)}
<Text style={styles.timestamp}>{time}</Text> <Text style={styles.timestamp}>{time}</Text>
</View> </View>
); );
@@ -603,12 +628,32 @@ const ChatScreen: React.FC = () => {
{connectionState === 'connected' ? 'Verbunden' : {connectionState === 'connected' ? 'Verbunden' :
connectionState === 'connecting' ? 'Verbinde...' : 'Getrennt'} connectionState === 'connecting' ? 'Verbinde...' : 'Getrennt'}
</Text> </Text>
<TouchableOpacity onPress={() => setSearchVisible(!searchVisible)} style={{marginLeft: 'auto', paddingHorizontal: 8}}>
<Text style={{fontSize: 16}}>{'\uD83D\uDD0D'}</Text>
</TouchableOpacity>
</View> </View>
{/* Suchleiste */}
{searchVisible && (
<View style={styles.searchBar}>
<TextInput
style={styles.searchInput}
value={searchQuery}
onChangeText={setSearchQuery}
placeholder="Chat durchsuchen..."
placeholderTextColor="#555570"
autoFocus
/>
<TouchableOpacity onPress={() => { setSearchVisible(false); setSearchQuery(''); }}>
<Text style={{color: '#FF3B30', fontSize: 14, paddingHorizontal: 8}}>X</Text>
</TouchableOpacity>
</View>
)}
{/* Nachrichtenliste */} {/* Nachrichtenliste */}
<FlatList <FlatList
ref={flatListRef} ref={flatListRef}
data={messages} data={searchQuery ? messages.filter(m => m.text.toLowerCase().includes(searchQuery.toLowerCase())) : messages}
keyExtractor={item => item.id} keyExtractor={item => item.id}
renderItem={renderMessage} renderItem={renderMessage}
contentContainerStyle={styles.messageList} contentContainerStyle={styles.messageList}
@@ -887,6 +932,30 @@ const styles = StyleSheet.create({
wakeWordIcon: { wakeWordIcon: {
fontSize: 16, fontSize: 16,
}, },
searchBar: {
flexDirection: 'row',
alignItems: 'center',
backgroundColor: '#12122A',
paddingHorizontal: 12,
paddingVertical: 6,
borderBottomWidth: 1,
borderBottomColor: '#1E1E2E',
},
searchInput: {
flex: 1,
color: '#FFFFFF',
fontSize: 14,
paddingVertical: 4,
},
playButton: {
alignSelf: 'flex-end',
paddingHorizontal: 8,
paddingVertical: 2,
marginTop: 4,
},
playButtonText: {
fontSize: 16,
},
fullscreenOverlay: { fullscreenOverlay: {
flex: 1, flex: 1,
backgroundColor: 'rgba(0,0,0,0.95)', backgroundColor: 'rgba(0,0,0,0.95)',
+40 -28
View File
@@ -74,7 +74,8 @@ const SettingsScreen: React.FC = () => {
const [ttsEnabled, setTtsEnabled] = useState(true); const [ttsEnabled, setTtsEnabled] = useState(true);
const [defaultVoice, setDefaultVoice] = useState('ramona'); const [defaultVoice, setDefaultVoice] = useState('ramona');
const [highlightVoice, setHighlightVoice] = useState('thorsten'); const [highlightVoice, setHighlightVoice] = useState('thorsten');
const [speechSpeed, setSpeechSpeed] = useState(1.0); const [speedRamona, setSpeedRamona] = useState(1.0);
const [speedThorsten, setSpeedThorsten] = useState(1.0);
const [editingPath, setEditingPath] = useState(false); const [editingPath, setEditingPath] = useState(false);
const [tempPath, setTempPath] = useState(''); const [tempPath, setTempPath] = useState('');
@@ -104,8 +105,11 @@ const SettingsScreen: React.FC = () => {
AsyncStorage.getItem('aria_highlight_voice').then(saved => { AsyncStorage.getItem('aria_highlight_voice').then(saved => {
if (saved) setHighlightVoice(saved); if (saved) setHighlightVoice(saved);
}); });
AsyncStorage.getItem('aria_speech_speed').then(saved => { AsyncStorage.getItem('aria_speed_ramona').then(saved => {
if (saved) setSpeechSpeed(parseFloat(saved)); if (saved) setSpeedRamona(parseFloat(saved));
});
AsyncStorage.getItem('aria_speed_thorsten').then(saved => {
if (saved) setSpeedThorsten(parseFloat(saved));
}); });
}, []); }, []);
@@ -525,41 +529,49 @@ const SettingsScreen: React.FC = () => {
</View> </View>
</View> </View>
{/* Sprechgeschwindigkeit */} {/* Sprechgeschwindigkeit Ramona */}
<View style={{marginTop: 16}}> <View style={{marginTop: 16}}>
<Text style={styles.toggleLabel}>Sprechgeschwindigkeit: {speechSpeed.toFixed(1)}x</Text> <Text style={styles.toggleLabel}>Ramona Speed: {speedRamona.toFixed(1)}x</Text>
<View style={{flexDirection: 'row', alignItems: 'center', gap: 8, marginTop: 8}}>
<Text style={{color: '#555570', fontSize: 11}}>0.5x</Text>
<View style={{flex: 1}}>
<TouchableOpacity
style={{height: 30, justifyContent: 'center'}}
onPress={(e) => {
const layout = e.nativeEvent;
// Einfacher Tap-basierter Slider
}}
>
<View style={{height: 4, backgroundColor: '#2A2A3E', borderRadius: 2}}>
<View style={{height: 4, backgroundColor: '#0096FF', borderRadius: 2, width: `${((speechSpeed - 0.5) / 1.5) * 100}%`}} />
</View>
</TouchableOpacity>
</View>
<Text style={{color: '#555570', fontSize: 11}}>2.0x</Text>
</View>
<View style={{flexDirection: 'row', justifyContent: 'space-around', marginTop: 8}}> <View style={{flexDirection: 'row', justifyContent: 'space-around', marginTop: 8}}>
{[0.5, 0.75, 1.0, 1.25, 1.5, 2.0].map(speed => ( {[0.5, 0.75, 1.0, 1.25, 1.5, 2.0].map(speed => (
<TouchableOpacity <TouchableOpacity
key={speed} key={speed}
onPress={() => { onPress={() => {
setSpeechSpeed(speed); setSpeedRamona(speed);
AsyncStorage.setItem('aria_speech_speed', String(speed)); AsyncStorage.setItem('aria_speed_ramona', String(speed));
rvs.send('config' as any, { speechSpeed: speed }); rvs.send('config' as any, { speedRamona: speed });
}} }}
style={{ style={{
paddingHorizontal: 10, paddingVertical: 6, borderRadius: 6, paddingHorizontal: 10, paddingVertical: 6, borderRadius: 6,
backgroundColor: speechSpeed === speed ? '#0096FF' : '#1E1E2E', backgroundColor: speedRamona === speed ? '#0096FF' : '#1E1E2E',
}} }}
> >
<Text style={{color: speechSpeed === speed ? '#fff' : '#8888AA', fontSize: 12, fontWeight: '600'}}> <Text style={{color: speedRamona === speed ? '#fff' : '#8888AA', fontSize: 12, fontWeight: '600'}}>
{speed}x
</Text>
</TouchableOpacity>
))}
</View>
</View>
{/* Sprechgeschwindigkeit Thorsten */}
<View style={{marginTop: 16}}>
<Text style={styles.toggleLabel}>Thorsten Speed: {speedThorsten.toFixed(1)}x</Text>
<View style={{flexDirection: 'row', justifyContent: 'space-around', marginTop: 8}}>
{[0.5, 0.75, 1.0, 1.25, 1.5, 2.0].map(speed => (
<TouchableOpacity
key={speed}
onPress={() => {
setSpeedThorsten(speed);
AsyncStorage.setItem('aria_speed_thorsten', String(speed));
rvs.send('config' as any, { speedThorsten: speed });
}}
style={{
paddingHorizontal: 10, paddingVertical: 6, borderRadius: 6,
backgroundColor: speedThorsten === speed ? '#0096FF' : '#1E1E2E',
}}
>
<Text style={{color: speedThorsten === speed ? '#fff' : '#8888AA', fontSize: 12, fontWeight: '600'}}>
{speed}x {speed}x
</Text> </Text>
</TouchableOpacity> </TouchableOpacity>
@@ -736,7 +748,7 @@ const SettingsScreen: React.FC = () => {
<Text style={styles.sectionTitle}>{'\u00DC'}ber</Text> <Text style={styles.sectionTitle}>{'\u00DC'}ber</Text>
<View style={styles.card}> <View style={styles.card}>
<Text style={styles.aboutTitle}>ARIA Cockpit</Text> <Text style={styles.aboutTitle}>ARIA Cockpit</Text>
<Text style={styles.aboutVersion}>Version 0.0.2.1 </Text> <Text style={styles.aboutVersion}>Version 0.0.2.4 </Text>
<Text style={styles.aboutInfo}> <Text style={styles.aboutInfo}>
Stefans Kommandozentrale f{'\u00FC'}r ARIA.{'\n'} Stefans Kommandozentrale f{'\u00FC'}r ARIA.{'\n'}
Gebaut mit React Native + TypeScript. Gebaut mit React Native + TypeScript.
+1 -1
View File
@@ -12,7 +12,7 @@ import AsyncStorage from '@react-native-async-storage/async-storage';
export type ConnectionState = 'connecting' | 'connected' | 'disconnected'; export type ConnectionState = 'connecting' | 'connected' | 'disconnected';
export type MessageType = 'chat' | 'audio' | 'file' | 'location' | 'mode' | 'log' | 'event'; export type MessageType = 'chat' | 'audio' | 'file' | 'location' | 'mode' | 'log' | 'event' | 'update_available' | string;
export interface RVSMessage { export interface RVSMessage {
type: MessageType; type: MessageType;
+149
View File
@@ -0,0 +1,149 @@
/**
* Auto-Update Service — prueft und installiert App-Updates via RVS
*
* Flow:
* 1. App sendet "update_check" mit aktueller Version an RVS
* 2. RVS vergleicht → sendet "update_available" mit Download-URL
* 3. App zeigt Benachrichtigung → User bestaetigt → Download + Install
*/
import { Alert, Linking, Platform } from 'react-native';
import RNFS from 'react-native-fs';
import rvs, { RVSMessage } from './rvs';
// Aktuelle App-Version (aus package.json via Build)
const APP_VERSION = '0.0.2.3'; // TODO: aus nativer Build-Config lesen
type UpdateCallback = (info: UpdateInfo) => void;
export interface UpdateInfo {
version: string;
downloadUrl: string;
size: number;
}
class UpdateService {
private listeners: UpdateCallback[] = [];
private checking = false;
private downloading = false;
constructor() {
// Auf update_available Nachrichten lauschen
rvs.onMessage((msg: RVSMessage) => {
if (msg.type === 'update_available' as any) {
const info: UpdateInfo = {
version: (msg.payload.version as string) || '',
downloadUrl: (msg.payload.downloadUrl as string) || '',
size: (msg.payload.size as number) || 0,
};
if (info.version && this.isNewer(info.version)) {
console.log(`[Update] Neue Version verfuegbar: ${info.version} (aktuell: ${APP_VERSION})`);
this.listeners.forEach(cb => cb(info));
}
}
});
}
/** Bei App-Start Update pruefen */
checkForUpdate(): void {
if (this.checking) return;
this.checking = true;
console.log(`[Update] Pruefe auf Updates (aktuell: ${APP_VERSION})`);
rvs.send('update_check' as any, { version: APP_VERSION });
setTimeout(() => { this.checking = false; }, 10000);
}
/** Callback registrieren */
onUpdateAvailable(callback: UpdateCallback): () => void {
this.listeners.push(callback);
return () => {
this.listeners = this.listeners.filter(cb => cb !== callback);
};
}
/** Update-Dialog anzeigen */
promptUpdate(info: UpdateInfo): void {
const sizeMB = (info.size / 1024 / 1024).toFixed(1);
Alert.alert(
'ARIA Update verfuegbar',
`Version ${info.version} (${sizeMB} MB)\n\nAktuell: ${APP_VERSION}\n\nJetzt herunterladen und installieren?`,
[
{ text: 'Spaeter', style: 'cancel' },
{
text: 'Installieren',
onPress: () => this.downloadAndInstall(info),
},
],
);
}
/** APK ueber WebSocket herunterladen und installieren */
async downloadAndInstall(info: UpdateInfo): Promise<void> {
if (this.downloading) return;
this.downloading = true;
try {
console.log(`[Update] Fordere APK v${info.version} an...`);
Alert.alert('Download gestartet', `Version ${info.version} wird ueber RVS heruntergeladen...`);
// APK ueber WebSocket anfordern
rvs.send('update_download' as any, {});
// Auf update_data warten (einmalig)
const apkData = await new Promise<{base64: string, fileName: string}>((resolve, reject) => {
const timeout = setTimeout(() => reject(new Error('Download-Timeout (60s)')), 60000);
const unsub = rvs.onMessage((msg: RVSMessage) => {
if ((msg.type as string) === 'update_data') {
clearTimeout(timeout);
unsub();
if (msg.payload.error) {
reject(new Error(msg.payload.error as string));
} else {
resolve({
base64: msg.payload.base64 as string,
fileName: msg.payload.fileName as string || `ARIA-${info.version}.apk`,
});
}
}
});
});
// Base64 als APK-Datei speichern
const destPath = `${RNFS.CachesDirectoryPath}/${apkData.fileName}`;
await RNFS.writeFile(destPath, apkData.base64, 'base64');
const fileSize = await RNFS.stat(destPath);
console.log(`[Update] APK gespeichert: ${destPath} (${(parseInt(fileSize.size) / 1024 / 1024).toFixed(1)}MB)`);
// APK installieren (oeffnet Android-Installer)
if (Platform.OS === 'android') {
await Linking.openURL(`file://${destPath}`);
}
} catch (err: any) {
console.error(`[Update] Fehler: ${err.message}`);
Alert.alert('Update fehlgeschlagen', err.message);
} finally {
this.downloading = false;
}
}
/** Versionsvergleich */
private isNewer(remote: string): boolean {
const r = remote.split('.').map(Number);
const l = APP_VERSION.split('.').map(Number);
for (let i = 0; i < Math.max(r.length, l.length); i++) {
const diff = (r[i] || 0) - (l[i] || 0);
if (diff > 0) return true;
if (diff < 0) return false;
}
return false;
}
getCurrentVersion(): string {
return APP_VERSION;
}
}
const updateService = new UpdateService();
export default updateService;
+7 -77
View File
@@ -1,21 +1,12 @@
/** /**
* Wake Word Service — "ARIA" Erkennung * Wake Word Service — "ARIA" Erkennung
* *
* Nutzt react-native-live-audio-stream fuer kontinuierliches Mikrofon-Monitoring. * Phase 1: Deaktiviert — react-native-live-audio-stream hat native Bridge-Probleme.
* Erkennt Sprache per Energie-Schwellwert und sendet kurze Audio-Clips * Nutzt stattdessen Tap-to-Talk (VoiceButton) als primaeren Eingabemodus.
* zur serverseitigen Wake-Word-Pruefung (openwakeword in der Bridge).
* *
* Architektur: * Phase 2: Porcupine on-device "ARIA" Keyword (geplant).
* App (Mikrofon) → Energie-Erkennung → Audio-Buffer
* → RVS "wake_check" → Bridge → openwakeword → Bestaetigung
* → App startet Aufnahme
*
* Aktuell (Phase 1): Einfacher Tap-to-Talk + Auto-Stop.
* Spaeter (Phase 2): Porcupine on-device "ARIA" Keyword.
*/ */
import LiveAudioStream from 'react-native-live-audio-stream';
type WakeWordCallback = () => void; type WakeWordCallback = () => void;
type StateCallback = (state: WakeWordState) => void; type StateCallback = (state: WakeWordState) => void;
@@ -25,47 +16,16 @@ class WakeWordService {
private state: WakeWordState = 'off'; private state: WakeWordState = 'off';
private wakeCallbacks: WakeWordCallback[] = []; private wakeCallbacks: WakeWordCallback[] = [];
private stateCallbacks: StateCallback[] = []; private stateCallbacks: StateCallback[] = [];
private isInitialized = false;
/** Wake Word Erkennung starten */ /** Wake Word Erkennung starten */
async start(): Promise<boolean> { async start(): Promise<boolean> {
if (this.state === 'listening') return true; if (this.state === 'listening') return true;
try { try {
if (!this.isInitialized) { // Phase 1: LiveAudioStream deaktiviert (native Bridge instabil)
LiveAudioStream.init({ // Stattdessen: Tap-to-Talk als primaerer Modus
sampleRate: 16000, console.log('[WakeWord] Wake Word ist in Phase 1 noch nicht verfuegbar — nutze Tap-to-Talk');
channels: 1,
bitsPerSample: 16,
audioSource: 6, // VOICE_RECOGNITION
bufferSize: 4096,
});
this.isInitialized = true;
}
// Audio-Stream starten und auf Energie pruefen
LiveAudioStream.start();
LiveAudioStream.on('data', (base64Chunk: string) => {
if (this.state !== 'listening') return;
// Base64 → Int16 Array → RMS berechnen
const raw = this._base64ToInt16(base64Chunk);
const rms = this._calculateRMS(raw);
// Schwellwert: wenn laut genug → Wake Word erkannt
// Phase 1: Einfache Energie-Erkennung (jemand spricht)
// Phase 2: Porcupine "ARIA" Keyword
if (rms > 2000) {
this.setState('detected');
this.wakeCallbacks.forEach(cb => cb());
// Nach Detection kurz pausieren, Aufnahme uebernimmt das Mikrofon
this.stop();
}
});
this.setState('listening'); this.setState('listening');
console.log('[WakeWord] Listening gestartet');
return true; return true;
} catch (err) { } catch (err) {
console.error('[WakeWord] Start fehlgeschlagen:', err); console.error('[WakeWord] Start fehlgeschlagen:', err);
@@ -75,22 +35,12 @@ class WakeWordService {
/** Wake Word Erkennung stoppen */ /** Wake Word Erkennung stoppen */
stop(): void { stop(): void {
if (this.state === 'off') return;
try {
LiveAudioStream.stop();
} catch {}
this.setState('off'); this.setState('off');
console.log('[WakeWord] Gestoppt');
} }
/** Nach Aufnahme erneut starten */ /** Nach Aufnahme erneut starten */
async resume(): Promise<void> { async resume(): Promise<void> {
// Kurze Pause damit Aufnahme das Mikrofon freigeben kann // Nichts zu tun in Phase 1
setTimeout(() => {
if (this.state === 'off') {
this.start();
}
}, 500);
} }
// --- Callbacks --- // --- Callbacks ---
@@ -113,32 +63,12 @@ class WakeWordService {
return this.state; return this.state;
} }
// --- Hilfsfunktionen ---
private setState(state: WakeWordState): void { private setState(state: WakeWordState): void {
if (this.state !== state) { if (this.state !== state) {
this.state = state; this.state = state;
this.stateCallbacks.forEach(cb => cb(state)); this.stateCallbacks.forEach(cb => cb(state));
} }
} }
private _base64ToInt16(base64: string): Int16Array {
const binary = atob(base64);
const bytes = new Uint8Array(binary.length);
for (let i = 0; i < binary.length; i++) {
bytes[i] = binary.charCodeAt(i);
}
return new Int16Array(bytes.buffer);
}
private _calculateRMS(samples: Int16Array): number {
if (samples.length === 0) return 0;
let sum = 0;
for (let i = 0; i < samples.length; i++) {
sum += samples[i] * samples[i];
}
return Math.sqrt(sum / samples.length);
}
} }
const wakeWordService = new WakeWordService(); const wakeWordService = new WakeWordService();
+121 -16
View File
@@ -38,6 +38,7 @@ import websockets
from faster_whisper import WhisperModel from faster_whisper import WhisperModel
from openwakeword.model import Model as WakeWordModel from openwakeword.model import Model as WakeWordModel
from piper import PiperVoice from piper import PiperVoice
from piper.config import SynthesisConfig
from modes import Mode, detect_mode_switch, should_speak from modes import Mode, detect_mode_switch, should_speak
@@ -131,6 +132,7 @@ class VoiceEngine:
self.voices: dict[str, PiperVoice] = {} self.voices: dict[str, PiperVoice] = {}
self.default_voice = "ramona" self.default_voice = "ramona"
self.highlight_voice = "thorsten" self.highlight_voice = "thorsten"
self.speech_speed = {"ramona": 1.0, "thorsten": 1.0}
def initialize(self) -> None: def initialize(self) -> None:
"""Laedt die Piper-Stimmen aus dem Voices-Verzeichnis.""" """Laedt die Piper-Stimmen aus dem Voices-Verzeichnis."""
@@ -216,8 +218,10 @@ class VoiceEngine:
continue continue
with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp: with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
tmp_path = tmp.name tmp_path = tmp.name
speed = self.speech_speed.get(voice_name, 1.0)
syn_config = SynthesisConfig(length_scale=1.0 / max(0.3, speed))
with wave.open(tmp_path, "wb") as wav_file: with wave.open(tmp_path, "wb") as wav_file:
voice.synthesize_wav(sentence, wav_file) voice.synthesize_wav(sentence, wav_file, syn_config=syn_config)
with wave.open(tmp_path, "rb") as wav_file: with wave.open(tmp_path, "rb") as wav_file:
if sample_rate is None: if sample_rate is None:
sample_rate = wav_file.getframerate() sample_rate = wav_file.getframerate()
@@ -494,7 +498,13 @@ class ARIABridge:
vc = json.load(f) vc = json.load(f)
self.voice_engine.default_voice = vc.get("defaultVoice", "ramona") self.voice_engine.default_voice = vc.get("defaultVoice", "ramona")
self.voice_engine.highlight_voice = vc.get("highlightVoice", "thorsten") self.voice_engine.highlight_voice = vc.get("highlightVoice", "thorsten")
self.voice_engine.speech_speed = {
"ramona": vc.get("speedRamona", 1.0),
"thorsten": vc.get("speedThorsten", 1.0),
}
self.tts_enabled = vc.get("ttsEnabled", True) self.tts_enabled = vc.get("ttsEnabled", True)
self.tts_engine_type = vc.get("ttsEngine", "piper")
self.xtts_voice = vc.get("xttsVoice", "")
logger.info("Voice-Config geladen: %s", vc) logger.info("Voice-Config geladen: %s", vc)
except Exception as e: except Exception as e:
logger.warning("Voice-Config laden fehlgeschlagen: %s", e) logger.warning("Voice-Config laden fehlgeschlagen: %s", e)
@@ -522,19 +532,20 @@ class ARIABridge:
# Voice-Engine IMMER laden — rendert Audio fuer die App (auch ohne Soundkarte) # Voice-Engine IMMER laden — rendert Audio fuer die App (auch ohne Soundkarte)
self.voice_engine.initialize() self.voice_engine.initialize()
# STT IMMER laden — verarbeitet Audio von der App (braucht kein Sounddevice)
self.stt_engine.initialize()
# Audio-Hardware pruefen (fuer lokales Mikro/Lautsprecher) # Audio-Hardware pruefen (fuer lokales Mikro/Lautsprecher)
self.audio_available = False self.audio_available = False
try: try:
devices = sd.query_devices() devices = sd.query_devices()
# Pruefen ob ein Output-Device existiert
sd.query_devices(kind='output') sd.query_devices(kind='output')
self.audio_available = True self.audio_available = True
logger.info("Audio-Geraet gefunden — Wake-Word und lokale TTS aktiv") logger.info("Audio-Geraet gefunden — Wake-Word und lokale TTS aktiv")
self.stt_engine.initialize()
self.wake_word.initialize() self.wake_word.initialize()
except (sd.PortAudioError, Exception): except (sd.PortAudioError, Exception):
logger.warning("Kein Audio-Geraet — Wake-Word und lokale TTS deaktiviert") logger.warning("Kein Audio-Geraet — Wake-Word und lokale Wiedergabe deaktiviert")
logger.info("Piper TTS rendert Audio fuer die App (via RVS)") logger.info("TTS rendert fuer App (via RVS), STT verarbeitet App-Audio")
logger.info("Alle Komponenten initialisiert") logger.info("Alle Komponenten initialisiert")
logger.info("aria-core: %s", self.ws_url) logger.info("aria-core: %s", self.ws_url)
@@ -837,17 +848,47 @@ class ARIABridge:
# TTS-Audio rendern und an die App senden (wenn Modus es erlaubt) # TTS-Audio rendern und an die App senden (wenn Modus es erlaubt)
if getattr(self, 'tts_enabled', True) and should_speak(self.current_mode, is_critical): if getattr(self, 'tts_enabled', True) and should_speak(self.current_mode, is_critical):
audio_data = self.voice_engine.synthesize(text, voice_name) tts_engine = getattr(self, 'tts_engine_type', 'piper')
if audio_data:
audio_b64 = base64.b64encode(audio_data).decode("ascii") if tts_engine == "xtts":
await self._send_to_rvs({ # XTTS: Request ueber RVS an Gaming-PC senden
"type": "audio", xtts_voice = getattr(self, 'xtts_voice', '')
"payload": { try:
"base64": audio_b64, await self._send_to_rvs({
"mimeType": "audio/wav", "type": "xtts_request",
"voice": voice_name, "payload": {
}, "text": text,
"timestamp": int(asyncio.get_event_loop().time() * 1000), "voice": xtts_voice,
"language": "de",
"requestId": str(uuid.uuid4()),
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
logger.info("[core] XTTS-Request gesendet (%s): '%s'", xtts_voice or "default", text[:60])
except Exception as e:
logger.warning("[core] XTTS-Request fehlgeschlagen: %s — Fallback auf Piper", e)
# Fallback auf Piper
audio_data = self.voice_engine.synthesize(text, voice_name)
if audio_data:
audio_b64 = base64.b64encode(audio_data).decode("ascii")
await self._send_to_rvs({
"type": "audio",
"payload": {"base64": audio_b64, "mimeType": "audio/wav", "voice": voice_name},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
else:
# Piper: Lokal rendern
audio_data = self.voice_engine.synthesize(text, voice_name)
if audio_data:
audio_b64 = base64.b64encode(audio_data).decode("ascii")
await self._send_to_rvs({
"type": "audio",
"payload": {
"base64": audio_b64,
"mimeType": "audio/wav",
"voice": voice_name,
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
}) })
logger.info("[core] TTS-Audio gesendet: %d bytes (%s)", len(audio_data), voice_name) logger.info("[core] TTS-Audio gesendet: %d bytes (%s)", len(audio_data), voice_name)
@@ -1005,6 +1046,50 @@ class ARIABridge:
if sender in ("aria", "stt"): if sender in ("aria", "stt"):
return return
elif msg_type == "xtts_response":
# XTTS-Audio vom Gaming-PC empfangen → an App weiterleiten
audio_b64 = payload.get("base64", "")
error = payload.get("error", "")
if error:
logger.warning("[rvs] XTTS Fehler: %s", error)
return
if audio_b64:
logger.info("[rvs] XTTS-Audio empfangen: %dKB", len(audio_b64) // 1365)
await self._send_to_rvs({
"type": "audio",
"payload": {
"base64": audio_b64,
"mimeType": payload.get("mimeType", "audio/wav"),
"voice": payload.get("voice", "xtts"),
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
return
elif msg_type == "tts_request":
# App fordert TTS-Audio fuer einen Text an (Play-Button)
text = payload.get("text", "")
requested_voice = payload.get("voice", "")
if text:
voice_name = requested_voice or self.voice_engine.select_voice(text)
audio_data = self.voice_engine.synthesize(text, voice_name)
if audio_data:
audio_b64 = base64.b64encode(audio_data).decode("ascii")
try:
await self._send_to_rvs({
"type": "audio",
"payload": {
"base64": audio_b64,
"mimeType": "audio/wav",
"voice": voice_name,
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
logger.info("[rvs] TTS on-demand: %d bytes (%s)", len(audio_data), voice_name)
except Exception as e:
logger.warning("[rvs] TTS on-demand senden fehlgeschlagen: %s", e)
return
elif msg_type == "config": elif msg_type == "config":
# Konfiguration von App/Diagnostic empfangen + persistent speichern # Konfiguration von App/Diagnostic empfangen + persistent speichern
changed = False changed = False
@@ -1024,6 +1109,22 @@ class ARIABridge:
self.tts_enabled = bool(payload["ttsEnabled"]) self.tts_enabled = bool(payload["ttsEnabled"])
logger.info("[rvs] TTS %s", "aktiviert" if self.tts_enabled else "deaktiviert") logger.info("[rvs] TTS %s", "aktiviert" if self.tts_enabled else "deaktiviert")
changed = True changed = True
if "ttsEngine" in payload:
self.tts_engine_type = payload["ttsEngine"]
logger.info("[rvs] TTS-Engine: %s", self.tts_engine_type)
changed = True
if "xttsVoice" in payload:
self.xtts_voice = payload["xttsVoice"]
logger.info("[rvs] XTTS-Stimme: %s", self.xtts_voice)
changed = True
if "speedRamona" in payload:
self.voice_engine.speech_speed["ramona"] = max(0.3, min(2.0, float(payload["speedRamona"])))
logger.info("[rvs] Speed Ramona: %.1f", self.voice_engine.speech_speed["ramona"])
changed = True
if "speedThorsten" in payload:
self.voice_engine.speech_speed["thorsten"] = max(0.3, min(2.0, float(payload["speedThorsten"])))
logger.info("[rvs] Speed Thorsten: %.1f", self.voice_engine.speech_speed["thorsten"])
changed = True
# Persistent speichern in Shared Volume # Persistent speichern in Shared Volume
if changed: if changed:
try: try:
@@ -1032,6 +1133,10 @@ class ARIABridge:
"defaultVoice": self.voice_engine.default_voice, "defaultVoice": self.voice_engine.default_voice,
"highlightVoice": self.voice_engine.highlight_voice, "highlightVoice": self.voice_engine.highlight_voice,
"ttsEnabled": getattr(self, "tts_enabled", True), "ttsEnabled": getattr(self, "tts_enabled", True),
"ttsEngine": getattr(self, "tts_engine_type", "piper"),
"xttsVoice": getattr(self, "xtts_voice", ""),
"speedRamona": self.voice_engine.speech_speed.get("ramona", 1.0),
"speedThorsten": self.voice_engine.speech_speed.get("thorsten", 1.0),
} }
with open("/shared/config/voice_config.json", "w") as f: with open("/shared/config/voice_config.json", "w") as f:
json.dump(config_data, f, indent=2) json.dump(config_data, f, indent=2)
+153 -4
View File
@@ -201,8 +201,9 @@
<button class="btn secondary" onclick="toggleChatFullscreen()" id="btn-chat-fs" style="padding:4px 10px;font-size:11px;">Vollbild</button> <button class="btn secondary" onclick="toggleChatFullscreen()" id="btn-chat-fs" style="padding:4px 10px;font-size:11px;">Vollbild</button>
</div> </div>
<div class="chat-box" id="chat-box"></div> <div class="chat-box" id="chat-box"></div>
<div id="thinking-indicator" style="display:none;padding:6px 10px;font-size:12px;color:#FFD60A;background:#1E1E2E;border-radius:0 0 6px 6px;margin-top:-8px;margin-bottom:8px;"> <div id="thinking-indicator" style="display:none;padding:6px 10px;font-size:12px;color:#FFD60A;background:#1E1E2E;border-radius:0 0 6px 6px;margin-top:-8px;margin-bottom:8px;display:flex;align-items:center;justify-content:space-between;">
<span style="animation:pulse 1s infinite;">&#x1F4AD;</span> <span id="thinking-text">ARIA denkt...</span> <span><span style="animation:pulse 1s infinite;">&#x1F4AD;</span> <span id="thinking-text">ARIA denkt...</span></span>
<button class="btn secondary" onclick="cancelRequest()" style="padding:2px 10px;font-size:11px;color:#FF3B30;border-color:#FF3B30;">Abbrechen</button>
</div> </div>
<div class="input-row"> <div class="input-row">
<input type="text" id="chat-input" placeholder="Nachricht an ARIA..."> <input type="text" id="chat-input" placeholder="Nachricht an ARIA...">
@@ -400,6 +401,17 @@
<div class="settings-section"> <div class="settings-section">
<h2>Sprachausgabe</h2> <h2>Sprachausgabe</h2>
<div class="card" style="max-width:500px;"> <div class="card" style="max-width:500px;">
<!-- TTS Engine Auswahl -->
<div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">TTS Engine:</label>
<select id="diag-tts-engine" onchange="sendVoiceConfig();toggleXTTSPanel()" style="background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;">
<option value="piper">Piper (lokal, CPU, schnell)</option>
<option value="xtts">XTTS v2 (remote, GPU, natuerlich)</option>
</select>
</div>
<!-- Piper Stimmen (nur bei Engine=piper) -->
<div id="piper-panel">
<div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;"> <div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">Standard-Stimme:</label> <label style="color:#8888AA;font-size:12px;">Standard-Stimme:</label>
<select id="diag-default-voice" onchange="sendVoiceConfig()" style="background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;"> <select id="diag-default-voice" onchange="sendVoiceConfig()" style="background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;">
@@ -414,10 +426,68 @@
<option value="ramona">Ramona (weiblich)</option> <option value="ramona">Ramona (weiblich)</option>
</select> </select>
</div> </div>
<div style="display:flex;align-items:center;gap:12px;"> <div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">TTS aktiv:</label> <label style="color:#8888AA;font-size:12px;">TTS aktiv:</label>
<label class="toggle"><input type="checkbox" id="diag-tts-enabled" checked onchange="sendVoiceConfig()"><span class="slider"></span></label> <label class="toggle"><input type="checkbox" id="diag-tts-enabled" checked onchange="sendVoiceConfig()"><span class="slider"></span></label>
</div> </div>
<div style="margin-bottom:4px;">
<label style="color:#8888AA;font-size:12px;">Ramona Speed: <span id="speed-ramona-label">1.0x</span></label>
</div>
<div style="display:flex;align-items:center;gap:8px;margin-bottom:12px;">
<span style="color:#555570;font-size:11px;">0.5x</span>
<input type="range" id="diag-speed-ramona" min="0.5" max="2.0" step="0.1" value="1.0"
oninput="document.getElementById('speed-ramona-label').textContent=this.value+'x'"
onchange="sendVoiceConfig()"
style="flex:1;accent-color:#0096FF;">
<span style="color:#555570;font-size:11px;">2.0x</span>
</div>
<div style="margin-bottom:4px;">
<label style="color:#8888AA;font-size:12px;">Thorsten Speed: <span id="speed-thorsten-label">1.0x</span></label>
</div>
<div style="display:flex;align-items:center;gap:8px;">
<span style="color:#555570;font-size:11px;">0.5x</span>
<input type="range" id="diag-speed-thorsten" min="0.5" max="2.0" step="0.1" value="1.0"
oninput="document.getElementById('speed-thorsten-label').textContent=this.value+'x'"
onchange="sendVoiceConfig()"
style="flex:1;accent-color:#0096FF;">
<span style="color:#555570;font-size:11px;">2.0x</span>
</div>
</div><!-- /piper-panel -->
<!-- XTTS Panel (nur bei Engine=xtts) -->
<div id="xtts-panel" style="display:none;">
<div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">XTTS Stimme:</label>
<select id="diag-xtts-voice" onchange="sendVoiceConfig()" style="background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;">
<option value="">Standard (XTTS Default)</option>
</select>
<button class="btn secondary" onclick="loadXTTSVoices()" style="padding:4px 10px;font-size:11px;">Laden</button>
</div>
<!-- Voice Cloning -->
<div style="background:#1E1E2E;border-radius:8px;padding:12px;margin-top:8px;">
<div style="color:#0096FF;font-size:13px;font-weight:600;margin-bottom:8px;">Stimme klonen</div>
<div style="color:#8888AA;font-size:11px;margin-bottom:8px;">
Lade ein oder mehrere Audio-Samples hoch (WAV/MP3, min. 6-10 Sekunden).
Mehrere Dateien werden automatisch zusammengefuegt.
</div>
<div style="margin-bottom:8px;">
<input type="text" id="xtts-clone-name" placeholder="Name fuer die Stimme..." style="background:#0D0D1A;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;color:#fff;font-size:13px;width:100%;box-sizing:border-box;">
</div>
<div style="margin-bottom:8px;">
<input type="file" id="xtts-clone-files" accept="audio/*" multiple style="color:#8888AA;font-size:12px;">
</div>
<div style="display:flex;gap:8px;">
<button class="btn" onclick="uploadVoiceSamples()" style="flex:1;">Stimme erstellen</button>
</div>
<div id="xtts-clone-status" style="font-size:11px;color:#555570;margin-top:6px;"></div>
</div>
<!-- XTTS Status -->
<div style="margin-top:8px;font-size:11px;color:#555570;" id="xtts-status">
XTTS-Server: Nicht verbunden (starte xtts/ auf dem Gaming-PC)
</div>
</div>
</div> </div>
</div> </div>
@@ -642,10 +712,40 @@
return; return;
} }
if (msg.type === 'xtts_voices_list') {
const select = document.getElementById('diag-xtts-voice');
// Behalte erste Option (Default)
while (select.options.length > 1) select.remove(1);
for (const v of (msg.payload?.voices || [])) {
const opt = document.createElement('option');
opt.value = v.name;
opt.textContent = `${v.name} (${(v.size / 1024).toFixed(0)}KB)`;
select.appendChild(opt);
}
document.getElementById('xtts-status').textContent = `XTTS: ${msg.payload?.voices?.length || 0} Stimme(n) verfuegbar`;
document.getElementById('xtts-status').style.color = '#34C759';
return;
}
if (msg.type === 'xtts_voice_saved') {
document.getElementById('xtts-clone-status').textContent = `Stimme "${msg.payload?.name}" gespeichert!`;
document.getElementById('xtts-clone-status').style.color = '#34C759';
loadXTTSVoices(); // Liste neu laden
return;
}
if (msg.type === 'voice_config') { if (msg.type === 'voice_config') {
document.getElementById('diag-default-voice').value = msg.defaultVoice || 'ramona'; document.getElementById('diag-default-voice').value = msg.defaultVoice || 'ramona';
document.getElementById('diag-highlight-voice').value = msg.highlightVoice || 'thorsten'; document.getElementById('diag-highlight-voice').value = msg.highlightVoice || 'thorsten';
document.getElementById('diag-tts-enabled').checked = msg.ttsEnabled !== false; document.getElementById('diag-tts-enabled').checked = msg.ttsEnabled !== false;
const sr = msg.speedRamona || 1.0;
const st = msg.speedThorsten || 1.0;
document.getElementById('diag-speed-ramona').value = sr;
document.getElementById('speed-ramona-label').textContent = sr + 'x';
document.getElementById('diag-speed-thorsten').value = st;
document.getElementById('speed-thorsten-label').textContent = st + 'x';
document.getElementById('diag-tts-engine').value = msg.ttsEngine || 'piper';
document.getElementById('diag-xtts-voice').value = msg.xttsVoice || '';
toggleXTTSPanel();
return; return;
} }
@@ -1138,12 +1238,61 @@
}, 120000); }, 120000);
} }
// ── XTTS Panel ─────────────────────────────
function toggleXTTSPanel() {
const engine = document.getElementById('diag-tts-engine').value;
document.getElementById('piper-panel').style.display = engine === 'piper' ? 'block' : 'none';
document.getElementById('xtts-panel').style.display = engine === 'xtts' ? 'block' : 'none';
if (engine === 'xtts') loadXTTSVoices();
}
function loadXTTSVoices() {
sendToRVS_raw({ type: 'xtts_list_voices', payload: {}, timestamp: Date.now() });
}
async function uploadVoiceSamples() {
const name = document.getElementById('xtts-clone-name').value.trim();
const files = document.getElementById('xtts-clone-files').files;
if (!name) { alert('Bitte einen Namen eingeben'); return; }
if (!files || files.length === 0) { alert('Bitte Audio-Dateien auswaehlen'); return; }
document.getElementById('xtts-clone-status').textContent = `Lade ${files.length} Datei(en) hoch...`;
const samples = [];
for (const file of files) {
const buffer = await file.arrayBuffer();
const base64 = btoa(String.fromCharCode(...new Uint8Array(buffer)));
samples.push({ base64, name: file.name, size: file.size });
}
const totalSize = samples.reduce((s, f) => s + f.size, 0);
document.getElementById('xtts-clone-status').textContent =
`Sende ${samples.length} Sample(s) (${(totalSize / 1024).toFixed(0)}KB) an XTTS-Server...`;
sendToRVS_raw({
type: 'voice_upload',
payload: { name, samples },
timestamp: Date.now(),
});
}
// ── Abbrechen ──────────────────────────────
function cancelRequest() {
send({ action: 'cancel_request' });
updateThinkingIndicator({ activity: 'idle' });
addChat('error', 'Anfrage abgebrochen', 'system');
}
// ── Stimmen-Config ────────────────────────── // ── Stimmen-Config ──────────────────────────
function sendVoiceConfig() { function sendVoiceConfig() {
const defaultVoice = document.getElementById('diag-default-voice').value; const defaultVoice = document.getElementById('diag-default-voice').value;
const highlightVoice = document.getElementById('diag-highlight-voice').value; const highlightVoice = document.getElementById('diag-highlight-voice').value;
const ttsEnabled = document.getElementById('diag-tts-enabled').checked; const ttsEnabled = document.getElementById('diag-tts-enabled').checked;
send({ action: 'send_voice_config', defaultVoice, highlightVoice, ttsEnabled }); const speedRamona = parseFloat(document.getElementById('diag-speed-ramona').value);
const speedThorsten = parseFloat(document.getElementById('diag-speed-thorsten').value);
const ttsEngine = document.getElementById('diag-tts-engine').value;
const xttsVoice = document.getElementById('diag-xtts-voice').value;
send({ action: 'send_voice_config', defaultVoice, highlightVoice, ttsEnabled, speedRamona, speedThorsten, ttsEngine, xttsVoice });
} }
// ── Highlight-Trigger ──────────────────────── // ── Highlight-Trigger ────────────────────────
+47 -5
View File
@@ -355,6 +355,11 @@ function handleGatewayMessage(msg) {
broadcast({ type: "agent_activity", activity: "idle" }); broadcast({ type: "agent_activity", activity: "idle" });
pendingMessageTime = 0; // Watchdog: Antwort erhalten pendingMessageTime = 0; // Watchdog: Antwort erhalten
updateAgentActivity(); updateAgentActivity();
// Antwort in Backup-Log schreiben
try {
const entry = JSON.stringify({ ts: Date.now(), role: "assistant", text: text.slice(0, 2000), session: activeSessionKey }) + "\n";
fs.appendFileSync("/shared/config/chat_backup.jsonl", entry);
} catch {}
return; return;
} }
@@ -428,6 +433,12 @@ function sendToGateway(text, isPipeline) {
log("debug", "gateway", `RAW >>> ${payload}`); log("debug", "gateway", `RAW >>> ${payload}`);
gatewayWs.send(payload); gatewayWs.send(payload);
pendingMessageTime = Date.now(); // Watchdog: Nachricht gesendet pendingMessageTime = Date.now(); // Watchdog: Nachricht gesendet
// Nachricht sofort in Backup-Log schreiben (OpenClaw speichert erst nach Run-Ende)
try {
fs.mkdirSync("/shared/config", { recursive: true });
const entry = JSON.stringify({ ts: Date.now(), role: "user", text, session: activeSessionKey }) + "\n";
fs.appendFileSync("/shared/config/chat_backup.jsonl", entry);
} catch {}
log("info", "gateway", `chat.send [${reqId}]: "${text}"`); log("info", "gateway", `chat.send [${reqId}]: "${text}"`);
if (isPipeline) plog(`chat.send [${reqId}] an Gateway gesendet — warte auf ACK...`); if (isPipeline) plog(`chat.send [${reqId}] an Gateway gesendet — warte auf ACK...`);
@@ -1005,6 +1016,7 @@ function waitForMessage(ws, timeoutMs) {
let lastAgentActivity = Date.now(); let lastAgentActivity = Date.now();
let watchdogWarned = false; let watchdogWarned = false;
let watchdogFixAttempted = false;
let pendingMessageTime = 0; // Wann wurde die letzte Nachricht gesendet let pendingMessageTime = 0; // Wann wurde die letzte Nachricht gesendet
function updateAgentActivity() { function updateAgentActivity() {
@@ -1024,20 +1036,37 @@ setInterval(async () => {
broadcast({ type: "watchdog", status: "warning", waitingMs, message: "ARIA reagiert nicht — moeglicherweise stuck Run" }); broadcast({ type: "watchdog", status: "warning", waitingMs, message: "ARIA reagiert nicht — moeglicherweise stuck Run" });
} }
// Nach 5min: Auto-Fix anbieten // Nach 5min: doctor --fix
if (waitingMs > 300000 && watchdogWarned) { if (waitingMs > 300000 && watchdogWarned && !watchdogFixAttempted) {
watchdogFixAttempted = true;
log("error", "server", "Watchdog: 5min ohne Antwort — fuehre openclaw doctor --fix aus"); log("error", "server", "Watchdog: 5min ohne Antwort — fuehre openclaw doctor --fix aus");
broadcast({ type: "watchdog", status: "fixing", message: "Auto-Fix: openclaw doctor --fix" }); broadcast({ type: "watchdog", status: "fixing", message: "Auto-Fix: openclaw doctor --fix" });
try { try {
await dockerExec("aria-core", "openclaw doctor --fix 2>/dev/null || true"); await dockerExec("aria-core", "openclaw doctor --fix 2>/dev/null || true");
log("info", "server", "Watchdog: doctor --fix ausgefuehrt"); log("info", "server", "Watchdog: doctor --fix ausgefuehrt");
broadcast({ type: "watchdog", status: "fixed", message: "doctor --fix ausgefuehrt — sende Nachricht erneut" }); broadcast({ type: "watchdog", status: "fixed", message: "doctor --fix ausgefuehrt — warte auf Antwort..." });
} catch (err) { } catch (err) {
log("error", "server", `Watchdog: doctor --fix fehlgeschlagen: ${err.message}`); log("error", "server", `Watchdog: doctor --fix fehlgeschlagen: ${err.message}`);
broadcast({ type: "watchdog", status: "error", message: `Auto-Fix fehlgeschlagen: ${err.message}` });
} }
pendingMessageTime = 0; // Reset }
// Nach 8min: Container neustarten
if (waitingMs > 480000 && watchdogFixAttempted) {
log("error", "server", "Watchdog: 8min ohne Antwort — starte aria-core + aria-proxy neu");
broadcast({ type: "watchdog", status: "restarting", message: "Container-Restart: aria-core + aria-proxy" });
try {
const { execSync } = require("child_process");
execSync("docker restart aria-core aria-proxy", { timeout: 60000 });
log("info", "server", "Watchdog: Container neugestartet");
broadcast({ type: "watchdog", status: "restarted", message: "Container neugestartet — warte auf Gateway-Reconnect..." });
// Gateway wird sich automatisch neu verbinden
} catch (err) {
log("error", "server", `Watchdog: Container-Restart fehlgeschlagen: ${err.message}`);
broadcast({ type: "watchdog", status: "error", message: `Restart fehlgeschlagen: ${err.message}` });
}
pendingMessageTime = 0;
watchdogWarned = false; watchdogWarned = false;
watchdogFixAttempted = false;
} }
}, 30000); }, 30000);
@@ -1127,6 +1156,15 @@ wss.on("connection", (ws) => {
if (ws._sshSock) ws._sshSock.write(msg.data); if (ws._sshSock) ws._sshSock.write(msg.data);
} else if (msg.action === "live_ssh_close") { } else if (msg.action === "live_ssh_close") {
if (ws._sshSock) { ws._sshSock.end(); ws._sshSock = null; } if (ws._sshSock) { ws._sshSock.end(); ws._sshSock = null; }
} else if (msg.action === "cancel_request") {
// Laufende Anfrage abbrechen — doctor --fix beendet stuck runs
log("warn", "server", "Anfrage abgebrochen — fuehre doctor --fix aus");
pendingMessageTime = 0;
watchdogWarned = false;
watchdogFixAttempted = false;
if (pipelineActive) pipelineEnd(false, "Vom Benutzer abgebrochen");
broadcast({ type: "agent_activity", activity: "idle" });
dockerExec("aria-core", "openclaw doctor --fix 2>/dev/null || true").catch(() => {});
} else if (msg.action === "get_voice_config") { } else if (msg.action === "get_voice_config") {
handleGetVoiceConfig(ws); handleGetVoiceConfig(ws);
} else if (msg.action === "send_voice_config") { } else if (msg.action === "send_voice_config") {
@@ -1135,6 +1173,10 @@ wss.on("connection", (ws) => {
defaultVoice: msg.defaultVoice || "ramona", defaultVoice: msg.defaultVoice || "ramona",
highlightVoice: msg.highlightVoice || "thorsten", highlightVoice: msg.highlightVoice || "thorsten",
ttsEnabled: msg.ttsEnabled !== false, ttsEnabled: msg.ttsEnabled !== false,
ttsEngine: msg.ttsEngine || "piper",
xttsVoice: msg.xttsVoice || "",
speedRamona: msg.speedRamona || 1.0,
speedThorsten: msg.speedThorsten || 1.0,
}; };
try { try {
fs.mkdirSync("/shared/config", { recursive: true }); fs.mkdirSync("/shared/config", { recursive: true });
+30 -20
View File
@@ -1,26 +1,36 @@
# erledigt bildupload ghet noch nicht. # ARIA Issues & Features
# ende
# erledigt
sprachnachrichten werden nicht als zweite nachricht dargestellt, damit man weiß was man gesendet hat
# ende
## Erledigt
# erledigt cache leeren, bilder werden nicht neu geladen beim antippen. - [x] Bildupload funktioniert (Shared Volume /shared/uploads/)
autoload geht nicht - [x] Sprachnachrichten werden als Text angezeigt (STT → Chat-Bubble)
# ende - [x] Cache leeren + Auto-Download von Anhaengen
- [x] ARIA liest Nachrichten vor (TTS via Piper)
- [x] Autoscroll zur letzten Nachricht
- [x] Bilder im Chat groesser + Vollbild-Vorschau
- [x] Ohr-Button Absturz gefixt (LiveAudioStream entfernt, Phase 1 Placeholder)
- [x] Play-Button in ARIA-Nachrichten fuer Sprachwiedergabe
- [x] Chat-Suche in der App (Lupe in Statusleiste)
- [x] Watchdog mit Container-Restart (2min Warnung → 5min doctor --fix → 8min Restart)
- [x] Abbrechen-Button im Diagnostic Chat
- [x] Nachrichten Backup on-the-fly (/shared/config/chat_backup.jsonl)
- [x] Grosse Nachrichten satzweise aufteilen fuer TTS
- [x] RVS Nachrichten vom Smartphone gehen durch
- [x] Stimmen-Einstellungen (Ramona/Thorsten, Speed pro Stimme)
- [x] Highlight-Trigger konfigurierbar in Diagnostic
wenn man auf das ohr zum hören klickt stürzt ab ## Offen
# erledigt aria liest die nachrichten nicht vor ### TTS / Stimmen
#ende - [ ] TTS Engine waehlbar: Piper (CPU, schnell) oder Coqui XTTS v2 (GPU, natuerlicher)
- [ ] Piper Voices Download ueber Diagnostic (neue Sprachen/Stimmen)
- [ ] Coqui XTTS v2 Integration (braucht GPU, bessere deutsche Stimme)
# erledigt autoscroll geht doch noch nicht zur letzten nachricht ### App
unserer memory brain - [ ] Wake Word on-device (Porcupine "ARIA" Keyword, Phase 2)
# ende - [ ] Chat-History zuverlaessiger laden (AsyncStorage Race Condition)
# erledigt bilder im chat größer darstellen ### Architektur
# ende - [ ] Bilder: Claude Vision direkt nutzen (aktuell nur Dateipfad an ARIA)
- [ ] Auto-Compacting und Memory/Brain Verwaltung (SQLite?)
- [ ] Diagnostic: System-Info Tab (Container-Status, Disk, RAM, CPU)
die viper voices downloaden über die diagnostic
# ende
+17
View File
@@ -170,6 +170,22 @@ else
exit 1 exit 1
fi fi
# ── Auto-Update: APK auf RVS-Server kopieren ─
RVS_UPDATE_HOST="${RVS_UPDATE_HOST:-}"
if [ -n "$RVS_UPDATE_HOST" ]; then
echo -e "${GREEN}[6/6] APK auf RVS-Server kopieren (Auto-Update)...${NC}"
scp "$APK_PATH" "${RVS_UPDATE_HOST}:~/ARIA-AGENT/rvs/updates/${APK_NAME}" 2>/dev/null
if [ $? -eq 0 ]; then
echo -e " ${GREEN}${NC} APK auf RVS-Server kopiert — Apps werden benachrichtigt"
else
echo -e " ${YELLOW}APK konnte nicht auf RVS kopiert werden (RVS_UPDATE_HOST=$RVS_UPDATE_HOST)${NC}"
echo -e " ${YELLOW}Manuell: scp $APK_PATH $RVS_UPDATE_HOST:~/ARIA-AGENT/rvs/updates/${APK_NAME}${NC}"
fi
else
echo -e "${YELLOW}Auto-Update uebersprungen (RVS_UPDATE_HOST nicht gesetzt)${NC}"
echo -e "${YELLOW}Setze RVS_UPDATE_HOST in .env fuer automatische Verteilung${NC}"
fi
# ── Fertig ──────────────────────────────────── # ── Fertig ────────────────────────────────────
echo "" echo ""
echo -e "${GREEN}╔═══════════════════════════════════════════════════╗${NC}" echo -e "${GREEN}╔═══════════════════════════════════════════════════╗${NC}"
@@ -177,4 +193,5 @@ echo -e "${GREEN}║ Release $TAG ist live!$(printf '%*s' $((27 - ${#TAG})) ''
echo -e "${GREEN}╠═══════════════════════════════════════════════════╣${NC}" echo -e "${GREEN}╠═══════════════════════════════════════════════════╣${NC}"
echo -e "${GREEN}${NC} $GITEA_URL/$GITEA_REPO/releases/tag/$TAG" echo -e "${GREEN}${NC} $GITEA_URL/$GITEA_REPO/releases/tag/$TAG"
echo -e "${GREEN}${NC} APK: $APK_NAME ($APK_SIZE)" echo -e "${GREEN}${NC} APK: $APK_NAME ($APK_SIZE)"
echo -e "${GREEN}${NC} Auto-Update: ${RVS_UPDATE_HOST:-nicht konfiguriert}"
echo -e "${GREEN}╚═══════════════════════════════════════════════════╝${NC}" echo -e "${GREEN}╚═══════════════════════════════════════════════════╝${NC}"
+2
View File
@@ -4,5 +4,7 @@ services:
ports: ports:
- "${RVS_PORT:-443}:3000" - "${RVS_PORT:-443}:3000"
restart: always restart: always
volumes:
- ./updates:/updates # APK-Dateien fuer Auto-Update
environment: environment:
- MAX_SESSIONS=10 - MAX_SESSIONS=10
+113 -1
View File
@@ -1,15 +1,21 @@
"use strict"; "use strict";
const { WebSocketServer } = require("ws"); const { WebSocketServer } = require("ws");
const fs = require("fs");
const path = require("path");
// ── Konfiguration aus Umgebungsvariablen ──────────────────────────── // ── Konfiguration aus Umgebungsvariablen ────────────────────────────
const PORT = parseInt(process.env.PORT || "3000", 10); const PORT = parseInt(process.env.PORT || "3000", 10);
const MAX_SESSIONS = parseInt(process.env.MAX_SESSIONS || "10", 10); const MAX_SESSIONS = parseInt(process.env.MAX_SESSIONS || "10", 10);
const UPDATES_DIR = process.env.UPDATES_DIR || "/updates";
// Kein Polling — APK wird manuell per git pull bereitgestellt
// Erlaubte Nachrichtentypen — alles andere wird verworfen // Erlaubte Nachrichtentypen — alles andere wird verworfen
const ALLOWED_TYPES = new Set([ const ALLOWED_TYPES = new Set([
"chat", "audio", "file", "location", "mode", "log", "event", "heartbeat", "chat", "audio", "file", "location", "mode", "log", "event", "heartbeat",
"file_request", "file_response", "file_saved", "stt_result", "config", "file_request", "file_response", "file_saved", "stt_result", "config", "tts_request",
"xtts_request", "xtts_response", "xtts_list_voices", "xtts_voices_list", "voice_upload", "xtts_voice_saved",
"update_check", "update_available", "update_download", "update_data",
]); ]);
// Token-Raum: token -> { clients: Set<ws> } // Token-Raum: token -> { clients: Set<ws> }
@@ -46,6 +52,9 @@ const wss = new WebSocketServer({ port: PORT });
wss.on("listening", () => { wss.on("listening", () => {
log(`RVS läuft auf Port ${PORT} | Max Sessions: ${MAX_SESSIONS}`); log(`RVS läuft auf Port ${PORT} | Max Sessions: ${MAX_SESSIONS}`);
// Beim Start pruefen ob eine APK da ist
const apkInfo = getLatestAPK();
if (apkInfo) log(`APK bereit: v${apkInfo.version} (${(fs.statSync(apkInfo.path).size / 1024 / 1024).toFixed(1)}MB)`);
}); });
wss.on("connection", (ws, req) => { wss.on("connection", (ws, req) => {
@@ -107,6 +116,52 @@ function registerClient(ws, token) {
return; return;
} }
// Update-Check: direkt an den anfragenden Client antworten (nicht relay'en)
if (msg.type === "update_check") {
const clientVersion = msg.payload?.version || "0.0.0.0";
const apkInfo = getLatestAPK();
if (apkInfo && compareVersions(apkInfo.version, clientVersion) > 0) {
ws.send(JSON.stringify({
type: "update_available",
payload: {
version: apkInfo.version,
downloadUrl: `/update/latest.apk`,
size: fs.statSync(apkInfo.path).size,
},
timestamp: Date.now(),
}));
}
return;
}
// Update-Download: APK als Base64 ueber WebSocket senden
if (msg.type === "update_download") {
const apkInfo = getLatestAPK();
if (!apkInfo) {
ws.send(JSON.stringify({ type: "update_data", payload: { error: "Keine APK verfuegbar" }, timestamp: Date.now() }));
return;
}
try {
const data = fs.readFileSync(apkInfo.path);
const base64 = data.toString("base64");
const sizeMB = (data.length / 1024 / 1024).toFixed(1);
log(`APK sende: v${apkInfo.version} (${sizeMB}MB) an Client`);
ws.send(JSON.stringify({
type: "update_data",
payload: {
version: apkInfo.version,
base64,
size: data.length,
fileName: `ARIA-v${apkInfo.version}.apk`,
},
timestamp: Date.now(),
}));
} catch (err) {
ws.send(JSON.stringify({ type: "update_data", payload: { error: err.message }, timestamp: Date.now() }));
}
return;
}
// An alle anderen Clients im Raum weiterleiten // An alle anderen Clients im Raum weiterleiten
for (const client of room.clients) { for (const client of room.clients) {
if (client !== ws && client.readyState === 1) { if (client !== ws && client.readyState === 1) {
@@ -167,6 +222,63 @@ wss.on("close", () => {
clearInterval(cleanup); clearInterval(cleanup);
}); });
// ── Auto-Update: APK-Erkennung + Push ──────────────────────────────
let latestVersion = null;
function getLatestAPK() {
try {
if (!fs.existsSync(UPDATES_DIR)) return null;
const files = fs.readdirSync(UPDATES_DIR)
.filter(f => f.endsWith(".apk"))
.map(f => {
// ARIA-v0.0.2.3.apk oder ARIA-Cockpit-release.apk
const match = f.match(/(\d+\.\d+\.\d+[\.\d]*)/);
return { file: f, path: path.join(UPDATES_DIR, f), version: match ? match[1] : null };
})
.filter(f => f.version)
.sort((a, b) => compareVersions(b.version, a.version)); // Neueste zuerst
return files[0] || null;
} catch {
return null;
}
}
function compareVersions(a, b) {
const pa = a.split(".").map(Number);
const pb = b.split(".").map(Number);
for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
const diff = (pa[i] || 0) - (pb[i] || 0);
if (diff !== 0) return diff;
}
return 0;
}
function notifyClientsAboutUpdate(apkInfo) {
const msg = JSON.stringify({
type: "update_available",
payload: {
version: apkInfo.version,
downloadUrl: `/update/latest.apk`,
size: fs.statSync(apkInfo.path).size,
},
timestamp: Date.now(),
});
// An alle Clients in allen Rooms senden
for (const [, room] of rooms) {
for (const client of room.clients) {
if (client.readyState === 1) {
client.send(msg);
}
}
}
log(`Update-Benachrichtigung gesendet: v${apkInfo.version} (${rooms.size} Raum/Raeume)`);
}
// Kein Polling — Update-Check passiert on-demand (update_check Message von App)
// ── Sauberes Herunterfahren ───────────────────────────────────────── // ── Sauberes Herunterfahren ─────────────────────────────────────────
process.on("SIGTERM", () => { process.on("SIGTERM", () => {
View File
+11
View File
@@ -0,0 +1,11 @@
# ════════════════════════════════════════════════
# ARIA XTTS v2 — Konfiguration
# Kopieren nach .env und anpassen
# ════════════════════════════════════════════════
# RVS Verbindung (gleiche Daten wie auf der ARIA-VM)
RVS_HOST=mobil.hacker-net.de
RVS_PORT=444
RVS_TLS=true
RVS_TLS_FALLBACK=true
RVS_TOKEN=dein_token_hier
+5
View File
@@ -0,0 +1,5 @@
FROM node:22-alpine
WORKDIR /app
COPY bridge.js package.json ./
RUN npm install --production
CMD ["node", "bridge.js"]
+268
View File
@@ -0,0 +1,268 @@
/**
* ARIA XTTS Bridge — Verbindet XTTS v2 Server mit dem RVS
*
* Empfaengt tts_request ueber RVS → rendert Audio via XTTS API → sendet zurueck
* Empfaengt voice_upload → speichert Voice-Sample fuer Cloning
* Empfaengt xtts_list_voices → listet verfuegbare Stimmen
*/
const WebSocket = require("ws");
const http = require("http");
const https = require("https");
const fs = require("fs");
const path = require("path");
const XTTS_API_URL = process.env.XTTS_API_URL || "http://xtts:8000";
const RVS_HOST = process.env.RVS_HOST || "";
const RVS_PORT = process.env.RVS_PORT || "443";
const RVS_TLS = process.env.RVS_TLS || "true";
const RVS_TLS_FALLBACK = process.env.RVS_TLS_FALLBACK || "true";
const RVS_TOKEN = process.env.RVS_TOKEN || "";
const VOICES_DIR = "/voices";
function log(msg) {
console.log(`[${new Date().toISOString()}] ${msg}`);
}
// ── RVS Verbindung ──────────────────────────────────
let rvsWs = null;
let retryDelay = 2;
function connectRVS(forcePlain) {
if (!RVS_HOST || !RVS_TOKEN) {
log("RVS nicht konfiguriert — beende");
process.exit(1);
}
const useTls = RVS_TLS === "true" && !forcePlain;
const proto = useTls ? "wss" : "ws";
const url = `${proto}://${RVS_HOST}:${RVS_PORT}?token=${RVS_TOKEN}`;
log(`Verbinde zu RVS: ${proto}://${RVS_HOST}:${RVS_PORT}`);
const ws = new WebSocket(url);
ws.on("open", () => {
log("RVS verbunden — warte auf TTS-Requests");
rvsWs = ws;
retryDelay = 2;
// Keepalive
setInterval(() => {
if (ws.readyState === WebSocket.OPEN) {
ws.ping();
ws.send(JSON.stringify({ type: "heartbeat", timestamp: Date.now() }));
}
}, 25000);
});
ws.on("message", async (raw) => {
try {
const msg = JSON.parse(raw.toString());
if (msg.type === "xtts_request") {
await handleTTSRequest(msg.payload);
} else if (msg.type === "voice_upload") {
await handleVoiceUpload(msg.payload);
} else if (msg.type === "xtts_list_voices") {
await handleListVoices();
}
} catch (err) {
log(`Fehler: ${err.message}`);
}
});
ws.on("close", () => {
log("RVS Verbindung geschlossen");
rvsWs = null;
setTimeout(() => connectRVS(), Math.min(retryDelay * 1000, 30000));
retryDelay = Math.min(retryDelay * 2, 30);
});
ws.on("error", (err) => {
log(`RVS Fehler: ${err.message}`);
if (useTls && RVS_TLS_FALLBACK === "true") {
log("TLS fehlgeschlagen — Fallback auf ws://");
ws.removeAllListeners();
try { ws.close(); } catch (_) {}
connectRVS(true);
}
});
}
// ── TTS Request Handler ─────────────────────────────
async function handleTTSRequest(payload) {
const { text, voice, requestId, language } = payload;
if (!text) return;
log(`TTS-Request: "${text.slice(0, 60)}..." (voice: ${voice || "default"}, lang: ${language || "de"})`);
try {
// Voice-Sample Pfad bestimmen
const voiceSample = voice ? path.join(VOICES_DIR, `${voice}.wav`) : null;
const hasCustomVoice = voiceSample && fs.existsSync(voiceSample);
// XTTS API aufrufen
const audioBuffer = await callXTTSAPI(text, language || "de", hasCustomVoice ? voiceSample : null);
if (audioBuffer && audioBuffer.length > 100) {
const base64 = audioBuffer.toString("base64");
log(`TTS fertig: ${audioBuffer.length} bytes (${(audioBuffer.length / 1024).toFixed(0)}KB)`);
sendToRVS({
type: "xtts_response",
payload: {
requestId: requestId || "",
base64,
mimeType: "audio/wav",
voice: voice || "default",
engine: "xtts",
},
timestamp: Date.now(),
});
} else {
log("TTS: Leeres Audio erhalten");
sendToRVS({
type: "xtts_response",
payload: { requestId, error: "Leeres Audio" },
timestamp: Date.now(),
});
}
} catch (err) {
log(`TTS Fehler: ${err.message}`);
sendToRVS({
type: "xtts_response",
payload: { requestId, error: err.message },
timestamp: Date.now(),
});
}
}
function callXTTSAPI(text, language, speakerWav) {
return new Promise((resolve, reject) => {
const body = JSON.stringify({
text,
language,
speaker_wav: speakerWav || "",
});
const url = new URL(`${XTTS_API_URL}/tts_to_audio/`);
const options = {
hostname: url.hostname,
port: url.port,
path: url.pathname,
method: "POST",
headers: {
"Content-Type": "application/json",
"Content-Length": Buffer.byteLength(body),
},
timeout: 60000,
};
const req = http.request(options, (res) => {
const chunks = [];
res.on("data", (chunk) => chunks.push(chunk));
res.on("end", () => {
if (res.statusCode === 200) {
resolve(Buffer.concat(chunks));
} else {
reject(new Error(`XTTS API HTTP ${res.statusCode}: ${Buffer.concat(chunks).toString().slice(0, 200)}`));
}
});
});
req.on("error", reject);
req.on("timeout", () => { req.destroy(); reject(new Error("XTTS API Timeout (60s)")); });
req.write(body);
req.end();
});
}
// ── Voice Upload Handler ────────────────────────────
async function handleVoiceUpload(payload) {
const { name, samples } = payload;
if (!name || !samples || !Array.isArray(samples) || samples.length === 0) {
log("Voice Upload: Ungueltige Daten");
return;
}
log(`Voice Upload: "${name}" (${samples.length} Samples)`);
try {
// Alle Samples zusammenfuegen
const buffers = samples.map(s => Buffer.from(s.base64, "base64"));
const combined = Buffer.concat(buffers);
// Als WAV speichern
fs.mkdirSync(VOICES_DIR, { recursive: true });
const filePath = path.join(VOICES_DIR, `${name.replace(/[^a-zA-Z0-9_-]/g, "_")}.wav`);
fs.writeFileSync(filePath, combined);
log(`Voice gespeichert: ${filePath} (${(combined.length / 1024).toFixed(0)}KB)`);
sendToRVS({
type: "xtts_voice_saved",
payload: { name, size: combined.length, path: filePath },
timestamp: Date.now(),
});
} catch (err) {
log(`Voice Upload Fehler: ${err.message}`);
}
}
// ── Voice List Handler ──────────────────────────────
async function handleListVoices() {
try {
const files = fs.existsSync(VOICES_DIR)
? fs.readdirSync(VOICES_DIR).filter(f => f.endsWith(".wav"))
: [];
const voices = files.map(f => ({
name: path.basename(f, ".wav"),
file: f,
size: fs.statSync(path.join(VOICES_DIR, f)).size,
}));
log(`Stimmen: ${voices.length} verfuegbar`);
sendToRVS({
type: "xtts_voices_list",
payload: { voices },
timestamp: Date.now(),
});
} catch (err) {
log(`Stimmen-Liste Fehler: ${err.message}`);
}
}
// ── RVS senden ──────────────────────────────────────
function sendToRVS(msg) {
if (rvsWs && rvsWs.readyState === WebSocket.OPEN) {
rvsWs.send(JSON.stringify(msg));
}
}
// ── Start ───────────────────────────────────────────
log("ARIA XTTS Bridge startet...");
log(`XTTS API: ${XTTS_API_URL}`);
log(`RVS: ${RVS_HOST}:${RVS_PORT}`);
// Warten bis XTTS API erreichbar ist
function waitForXTTS(callback, attempts) {
if (attempts <= 0) { log("XTTS API nicht erreichbar — starte trotzdem"); callback(); return; }
http.get(`${XTTS_API_URL}/docs`, (res) => {
log("XTTS API erreichbar");
callback();
}).on("error", () => {
log(`XTTS API noch nicht bereit — warte (${attempts} Versuche uebrig)...`);
setTimeout(() => waitForXTTS(callback, attempts - 1), 5000);
});
}
waitForXTTS(() => connectRVS(), 24); // Max 2min warten
+54
View File
@@ -0,0 +1,54 @@
# ════════════════════════════════════════════════
# ARIA XTTS v2 — GPU TTS Server
# Laeuft auf dem Gaming-PC (RTX 3060)
# Verbindet sich zum RVS fuer TTS-Requests
# ════════════════════════════════════════════════
#
# Voraussetzungen:
# - Docker Desktop mit WSL2
# - NVIDIA Container Toolkit
# - .env mit RVS-Verbindungsdaten
#
# Start: docker compose up -d
# Test: curl http://localhost:8000/docs
# ════════════════════════════════════════════════
services:
# ─── XTTS v2 API Server (GPU) ─────────────────
xtts:
image: ghcr.io/daswer123/xtts-api-server:latest
container_name: aria-xtts
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
ports:
- "8000:8000"
volumes:
- xtts-models:/root/.local/share/tts # Model-Cache (~2GB)
- ./voices:/voices # Custom Voice Samples
environment:
- COQUI_TOS_AGREED=1
restart: unless-stopped
# ─── XTTS Bridge (verbindet zu RVS) ───────────
xtts-bridge:
build: .
container_name: aria-xtts-bridge
depends_on:
- xtts
environment:
- XTTS_API_URL=http://xtts:8000
- RVS_HOST=${RVS_HOST}
- RVS_PORT=${RVS_PORT:-443}
- RVS_TLS=${RVS_TLS:-true}
- RVS_TLS_FALLBACK=${RVS_TLS_FALLBACK:-true}
- RVS_TOKEN=${RVS_TOKEN}
restart: unless-stopped
volumes:
xtts-models:
+8
View File
@@ -0,0 +1,8 @@
{
"name": "aria-xtts-bridge",
"version": "1.0.0",
"private": true,
"dependencies": {
"ws": "^8.16.0"
}
}