Compare commits

..

20 Commits

Author SHA1 Message Date
duffyduck d6a89168ef release: bump version to 0.0.2.4 2026-04-05 19:51:19 +02:00
duffyduck cb33a20694 docs: update README with XTTS, auto-update, watchdog, TTS settings
- Architecture: Added XTTS v2 (Gaming-PC) and auto-update flow
- Diagnostic: Thinking indicator, cancel button, TTS tab, voice cloning
- App: Play button, chat search, auto-update, voice speed settings
- RVS: Auto-update APK distribution over WebSocket
- Watchdog: 2min warning → 5min doctor --fix → 8min container restart
- Roadmap: Phase 1 fully completed, updated Phase 2+3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 19:46:16 +02:00
duffyduck a242693751 feat: XTTS v2 integration, auto-update system, TTS engine abstraction
- XTTS v2: Docker setup for Gaming-PC (GPU), bridge via RVS relay
- XTTS: Voice cloning UI in Diagnostic (multi-file upload)
- XTTS: Engine selectable (Piper local vs XTTS remote) with fallback
- Auto-Update: RVS serves APK over WebSocket (no HTTP needed)
- Auto-Update: App checks version on start, prompts install
- Auto-Update: release.sh copies APK to RVS via scp
- Bridge: TTS engine abstraction (piper/xtts), config persistent
- Bridge: xtts_response handler, tts_request on-demand
- Diagnostic: TTS engine dropdown, XTTS voice panel, voice cloning
- App: Play button on ARIA messages, chat search, update service
- Wake word: Disabled LiveAudioStream (crash fix), Phase 1 placeholder
- Watchdog: Container restart after 8min stuck
- Chat backup: on-the-fly to /shared/config/chat_backup.jsonl

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 19:42:10 +02:00
duffyduck 81ca3cc7a7 Ohr-Button Absturz gefixt (LiveAudioStream entfernt, Phase 1 , Play-Button in ARIA-Nachrichten fuer Sprachwiedergabe
- [x] Chat-Suche in der App (Lupe in Statusleiste)
- [x] Watchdog mit Container-Restart (2min Warnung → 5min doctor --fix → 8min Restart),Abbrechen-Button im Diagnostic Chat
- [x] Nachrichten Backup on-the-fly (/shared/config/chat_backup.jsonl)
- [x] Grosse Nachrichten satzweise aufteilen fuer TTS
- [x] RVS Nachrichten vom Smartphone gehen durch
2026-04-01 23:45:25 +02:00
duffyduck 1a32098c9e release: bump version to 0.0.2.3 2026-04-01 23:45:15 +02:00
duffyduck fa4c32270b sst immer 2026-03-29 19:18:41 +02:00
duffyduck 9c43b875f4 release: bump version to 0.0.2.2 2026-03-29 19:04:31 +02:00
duffyduck 63560e290b two speed 2026-03-29 19:03:40 +02:00
duffyduck 1ab8a6a2fe addes speed config for voice 2026-03-29 18:50:09 +02:00
duffyduck a2c0196e05 release: bump version to 0.0.2.1 2026-03-29 18:49:37 +02:00
duffyduck 680f7a64e2 slpit setnteces 2026-03-29 18:42:24 +02:00
duffyduck 4893616a5a playback issue 2026-03-29 18:36:00 +02:00
duffyduck 04e8c0245d voiice settings permanent 2026-03-29 18:23:31 +02:00
duffyduck 10cefaf1cd changed connection model 2026-03-29 18:12:26 +02:00
duffyduck adbb1fe80a changed docker file 2026-03-29 17:46:27 +02:00
duffyduck 79c50aedcc release: bump version to 0.0.2.0 2026-03-29 17:42:23 +02:00
duffyduck eb72b35e23 added voice settings in adroid app and diagnostic, higlight trigger in app und diagnostic
change voicec
2026-03-29 17:41:28 +02:00
duffyduck bbd02d46a6 changed issue md 2026-03-29 17:28:40 +02:00
duffyduck 3d3c8ce973 fixed tts format, added trigger words settings 2026-03-29 17:27:43 +02:00
duffyduck 562f929056 added setting for states and voices in setting diagnostic, added states in diagnostic, added watchdog and debug tts do diagnostic 2026-03-29 17:12:25 +02:00
23 changed files with 1772 additions and 208 deletions
+1
View File
@@ -36,6 +36,7 @@ android/local.properties
android/package-lock.json
*.apk
*.aab
rvs/updates/*.apk
# ── Tauri / Desktop Build ───────────────────────
desktop/src-tauri/target/
+114 -21
View File
@@ -29,11 +29,18 @@ ARIA hat zwei Rollen:
┌─────────────────────────────────────────────────────────┐
│ RVS — Rendezvous-Server │
│ Node.js WebSocket Relay (Docker, Rechenzentrum) │
│ Reiner Relay — kennt keine Tokens, leitet durch
│ Relay + Auto-Update (APK-Verteilung)
│ rvs/docker-compose.yml │
└───────────────────────┬─────────────────────────────────┘
│ WebSocket Tunnel
└───────────┬───────────────────────────┬─────────────────┘
│ WebSocket Tunnel │ WebSocket Tunnel
┌───────────────────────────┐
│ Gaming-PC (optional) │
│ RTX 3060, Docker+WSL2 │
│ XTTS v2 (natuerliche │
│ Stimmen, Voice Cloning) │
│ xtts/docker-compose.yml │
└───────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ ARIA-VM (Proxmox, Debian 13) — ARIAs Wohnung │
│ Basissystem + Docker. Rest richtet ARIA selbst ein. │
@@ -66,13 +73,14 @@ ARIA hat zwei Rollen:
└─────────────────────────────────────────────────────────┘
```
**Drei separate Deployments:**
**Vier separate Deployments:**
| Was | Wo | Wie |
|-----|----|-----|
| RVS | Rechenzentrum | `cd rvs && docker compose up -d` |
| ARIA Core | Debian 13 VM | `docker compose up -d && ./aria-setup.sh` |
| Android App | Stefans Handy | APK installieren, QR-Code scannen |
| XTTS v2 (optional) | Gaming-PC (GPU) | `cd xtts && docker compose up -d` |
| Android App | Stefans Handy | APK installieren (Auto-Update via RVS) |
---
@@ -314,13 +322,19 @@ Erreichbar unter `http://<VM-IP>:3001`. Teilt das Netzwerk mit aria-core.
### Features
- **Status-Karten**: Gateway (Handshake), RVS (TLS-Fallback), Proxy (Auth)
- **Chat-Test**: Nachrichten direkt an ARIA senden (Gateway oder via RVS)
- **Chat-Test**: Nachrichten direkt an ARIA senden (Gateway oder via RVS), Vollbild-Modus
- **"ARIA denkt..." Indikator**: Zeigt live was ARIA gerade tut (Denken, Tool, Schreiben)
- **Abbrechen-Button**: Stoppt laufende Anfragen + doctor --fix
- **Session-Verwaltung**: Sessions auflisten, wechseln, erstellen, loeschen
- **Chat-History**: Wird beim Laden und Session-Wechsel angezeigt (read-only aus JSONL)
- **TTS-Diagnose Tab**: Stimmen testen, Status pruefen, Fehler anzeigen
- **Einstellungen**: TTS-Engine (Piper/XTTS), Stimmen, Speed, Highlight-Trigger, Betriebsmodi
- **XTTS Voice Cloning**: Audio-Samples hochladen, eigene Stimme erstellen
- **Claude Login**: Browser-Terminal zum Einloggen in den Proxy
- **Core Terminal**: Shell in aria-core (openclaw CLI)
- **Container-Logs**: Echtzeit-Logs aller Container (gefiltert nach Tab)
- **Container-Logs**: Echtzeit-Logs aller Container (gefiltert nach Tab + Pipeline)
- **SSH Terminal**: Direkter SSH-Zugang zu aria-wohnung
- **Watchdog**: Erkennt stuck Runs (2min Warnung → 5min doctor --fix → 8min Container-Restart)
### Session-Verwaltung
@@ -340,10 +354,13 @@ API-Endpoint fuer andere Services: `GET http://localhost:3001/api/session`
- **Sprachaufnahme**: Push-to-Talk (halten) oder Tap-to-Talk (tippen, Auto-Stop bei Stille)
- **VAD (Voice Activity Detection)**: Erkennt 1.8s Stille und stoppt automatisch
- **STT (Speech-to-Text)**: Audio wird in der Bridge per Whisper transkribiert, transkribierter Text erscheint im Chat
- **Wake Word**: Toggle-Button (Ohr-Symbol) aktiviert kontinuierliches Mikrofon-Monitoring
- **TTS-Wiedergabe**: ARIA antwortet per Lautsprecher (Ramona/Thorsten)
- **Datei- und Bild-Upload**: Bilder inline im Chat, Dateien mit Icon + Name + Groesse
- **Anhaenge**: Bridge speichert Dateien in Shared Volume (`/shared/uploads/`), ARIA kann darauf zugreifen
- **TTS-Wiedergabe**: ARIA antwortet per Lautsprecher (Piper oder XTTS v2)
- **Play-Button**: Jede ARIA-Nachricht kann nochmal vorgelesen werden
- **Chat-Suche**: Lupe in der Statusleiste filtert Nachrichten live
- **Datei- und Bild-Upload**: Bilder inline im Chat (Vollbild-Tap), Dateien mit Icon + Name + Groesse
- **Anhaenge**: Bridge speichert in Shared Volume, ARIA kann darauf zugreifen, Re-Download ueber RVS
- **Einstellungen**: TTS Engine, Stimmen, Speed pro Stimme, Speicherort, Auto-Download, GPS
- **Auto-Update**: Prueft beim Start auf neue Version, Download + Installation ueber RVS
- GPS-Position (optional)
- QR-Code Scanner fuer Token-Pairing
@@ -374,19 +391,31 @@ cd android
```
Das Script macht alles in einem Schritt:
1. Fragt Gitea-Kennwort ab (wird nirgends gespeichert)
2. Baut die Release-APK
3. Erstellt Git Tag + pusht
4. Erstellt Gitea Release
5. Laedt APK als Asset hoch
1. Setzt Versionsnummern (package.json, build.gradle, SettingsScreen)
2. Fragt Gitea-Kennwort ab (wird nirgends gespeichert)
3. Baut die Release-APK
4. Git Commit + Tag + Push
5. Erstellt Gitea Release + laedt APK hoch
6. Kopiert APK auf RVS-Server (Auto-Update, optional)
Voraussetzung in `.env`:
```bash
GITEA_URL=https://gitea.hackersoft.de
GITEA_REPO=stefan/aria-agent
GITEA_USER=stefan
RVS_UPDATE_HOST=root@aria-rvs # Optional: fuer Auto-Update
```
### Auto-Update
Die App prueft beim Start ob eine neuere Version auf dem RVS liegt.
Der Update-Flow:
1. `./release.sh 0.0.3.0` → APK wird auf RVS kopiert (via scp)
2. Alternativ: `git pull` auf dem RVS-Server → APK in `rvs/updates/`
3. App sendet `update_check` mit aktueller Version
4. RVS vergleicht → sendet `update_available`
5. App zeigt Dialog → Download ueber WebSocket → Installation
### Audio-Pipeline (Spracheingabe)
```
@@ -454,6 +483,11 @@ aria-data/
│ ├── aria.env ← Voice Bridge Config
│ └── diag-state/ ← Diagnostic persistenter State
│ (im Shared Volume /shared/config/):
│ ├── voice_config.json ← TTS-Einstellungen (Stimme, Speed, Engine)
│ ├── highlight_triggers.json ← Highlight-Trigger Woerter
│ └── chat_backup.jsonl ← Nachrichten-Backup (on-the-fly)
└── ssh/ ← SSH Keys fuer VM-Zugriff
├── id_ed25519 ← Private Key (generiert von aria-setup.sh)
├── id_ed25519.pub ← Public Key (muss in VM authorized_keys!)
@@ -469,7 +503,7 @@ tar -czf aria-backup-$(date +%Y%m%d).tar.gz aria-data/
## RVS — Rendezvous-Server
Laeuft im Rechenzentrum. Reiner Relay — kennt keine Tokens, speichert nichts.
Laeuft im Rechenzentrum. WebSocket Relay + Auto-Update Server.
Wer sich mit dem gleichen Token verbindet, landet im gleichen Room.
```bash
@@ -477,10 +511,60 @@ cd rvs
docker compose up -d
```
**Features:**
- WebSocket Relay (alle Message-Types: chat, audio, file, config, xtts, update, etc.)
- Auto-Update: APK-Verteilung an Apps ueber WebSocket
- Heartbeat + tote Verbindungen aufraeumen
**Auto-Update APK bereitstellen:**
```bash
# APK in updates/ legen (manuell oder via release.sh)
cp ARIA-v0.0.3.0.apk ~/ARIA-AGENT/rvs/updates/
# RVS erkennt die Version aus dem Dateinamen
```
**Multi-Instanz:** Mehrere ARIA-VMs koennen denselben RVS nutzen — jede mit eigenem Token.
---
## XTTS v2 — GPU TTS Server (optional)
Laeuft auf einem separaten Rechner mit NVIDIA GPU (z.B. Gaming-PC mit RTX 3060).
Verbindet sich ueber RVS mit der ARIA-Infrastruktur — kein VPN noetig.
### Voraussetzungen
- Docker Desktop mit WSL2 (Windows) oder Docker mit NVIDIA Runtime (Linux)
- NVIDIA Container Toolkit
- GPU mit mindestens 4GB VRAM (6GB+ empfohlen)
### Setup
```bash
cd xtts
cp .env.example .env
# .env mit RVS-Verbindungsdaten fuellen (gleiche wie auf der ARIA-VM)
docker compose up -d
# Erster Start laedt ~2GB Model herunter
```
### Features
- **Natuerliche Stimmen**: Deutlich bessere Qualitaet als Piper
- **Voice Cloning**: Eigene Stimme mit 6-10s Audio-Sample
- **16 Sprachen**: Deutsch, Englisch, Franzoesisch, etc.
- **RVS-Integration**: Bridge waehlt automatisch XTTS wenn verfuegbar
### Stimme klonen
In der Diagnostic unter Einstellungen → Sprachausgabe → XTTS:
1. TTS Engine auf "XTTS v2" stellen
2. "Stimme klonen" → Audio-Dateien hochladen (WAV/MP3, min. 6-10s)
3. Name vergeben → "Stimme erstellen"
4. Neue Stimme in der Auswahl verfuegbar
---
## Docker Volumes
| Volume | Pfad im Container | Zweck |
@@ -491,7 +575,7 @@ docker compose up -d
| `./aria-data/ssh` (bind) | `/root/.ssh`, `/home/node/.ssh` | SSH Keys |
| `./aria-data/brain` (bind) | `/home/node/.openclaw/workspace/memory` | Gedaechtnis |
| `./aria-data/skills` (bind) | `/home/node/.openclaw/workspace/skills` | Skills |
| `aria-shared` | `/shared` (Core + Bridge) | Datei-Austausch (Uploads von App) |
| `aria-shared` | `/shared` (Core + Bridge + Proxy + Diag) | Datei-Austausch, Config, Uploads |
| `./aria-data/config/diag-state` (bind) | `/data` (Diagnostic) | Persistenter State (aktive Session) |
---
@@ -569,8 +653,15 @@ docker exec aria-core ssh aria-wohnung hostname
- [x] Android App (Chat + Sprache + Uploads)
- [x] Tool-Permissions (alle Tools freigeschaltet)
- [x] SSH-Zugriff auf VM (aria-wohnung)
- [x] Diagnostic Web-UI
- [x] Diagnostic Web-UI + Einstellungen
- [x] Session-Verwaltung + Chat-History
- [x] Stimmen-Einstellungen (Ramona/Thorsten, Speed, Highlight-Trigger)
- [x] TTS satzweise fuer lange Texte
- [x] Datei-/Bild-Upload mit Shared Volume
- [x] Watchdog (stuck Run Erkennung + Auto-Fix + Container-Restart)
- [x] Auto-Update System (APK via RVS)
- [x] Chat-Suche, Play-Button, Abbrechen-Button
- [x] XTTS v2 Integration (GPU, Voice Cloning, remote ueber RVS)
### Phase 2 — ARIA wird produktiv
@@ -578,7 +669,8 @@ docker exec aria-core ssh aria-wohnung hostname
- [ ] Gitea-Integration
- [ ] VM einrichten (Desktop, Browser, Tools)
- [ ] Heartbeat (periodische Selbst-Checks)
- [ ] Lokales LLM als Wächter (Triage vor Claude-Call)
- [ ] Lokales LLM als Waechter (Triage vor Claude-Call)
- [ ] Auto-Compacting / Memory-Verwaltung
### Phase 3 — Erweiterungen
@@ -586,3 +678,4 @@ docker exec aria-core ssh aria-wohnung hostname
- [ ] Desktop Client (Tauri)
- [ ] bKVM Remote IT-Support
- [ ] Porcupine Wake Word (on-device "ARIA" in der App)
- [ ] Claude Vision direkt (Bildanalyse ohne Dateipfad-Umweg)
+2 -2
View File
@@ -79,8 +79,8 @@ android {
applicationId "com.ariacockpit"
minSdkVersion rootProject.ext.minSdkVersion
targetSdkVersion rootProject.ext.targetSdkVersion
versionCode 109
versionName "0.0.1.9"
versionCode 204
versionName "0.0.2.4"
// Fallback fuer Libraries mit Product Flavors
missingDimensionStrategy 'react-native-camera', 'general'
}
+2 -3
View File
@@ -1,6 +1,6 @@
{
"name": "aria-cockpit",
"version": "0.0.1.9",
"version": "0.0.2.4",
"private": true,
"scripts": {
"android": "react-native run-android",
@@ -24,8 +24,7 @@
"react-native-camera-kit": "^13.0.0",
"@react-native-async-storage/async-storage": "^1.21.0",
"react-native-fs": "^2.20.0",
"react-native-audio-recorder-player": "^3.6.7",
"react-native-live-audio-stream": "^1.1.1"
"react-native-audio-recorder-player": "^3.6.7"
},
"devDependencies": {
"typescript": "^5.3.3",
+70 -1
View File
@@ -23,6 +23,7 @@ import RNFS from 'react-native-fs';
import rvs, { RVSMessage, ConnectionState } from '../services/rvs';
import audioService from '../services/audio';
import wakeWordService from '../services/wakeword';
import updateService from '../services/updater';
import VoiceButton from '../components/VoiceButton';
import FileUpload, { FileData } from '../components/FileUpload';
import CameraUpload, { PhotoData } from '../components/CameraUpload';
@@ -91,6 +92,8 @@ const ChatScreen: React.FC = () => {
const [gpsEnabled, setGpsEnabled] = useState(false);
const [wakeWordActive, setWakeWordActive] = useState(false);
const [fullscreenImage, setFullscreenImage] = useState<string | null>(null);
const [searchQuery, setSearchQuery] = useState('');
const [searchVisible, setSearchVisible] = useState(false);
const flatListRef = useRef<FlatList>(null);
const messageIdCounter = useRef(0);
@@ -260,6 +263,16 @@ const ChatScreen: React.FC = () => {
};
}, []);
// Auto-Update: Bei App-Start pruefen
useEffect(() => {
const unsubUpdate = updateService.onUpdateAvailable((info) => {
updateService.promptUpdate(info);
});
// Nach 5s pruefen (RVS muss erst verbunden sein)
const timer = setTimeout(() => updateService.checkForUpdate(), 5000);
return () => { unsubUpdate(); clearTimeout(timer); };
}, []);
// Wake Word: "ARIA" Erkennung → Auto-Aufnahme starten
useEffect(() => {
const unsubWake = wakeWordService.onWakeWord(async () => {
@@ -581,6 +594,18 @@ const ChatScreen: React.FC = () => {
{item.text}
</Text>
)}
{/* Play-Button fuer ARIA-Nachrichten */}
{!isUser && item.text.length > 0 && (
<TouchableOpacity
style={styles.playButton}
onPress={() => {
// TTS-Request an Bridge senden
rvs.send('tts_request' as any, { text: item.text, voice: '' });
}}
>
<Text style={styles.playButtonText}>{'\uD83D\uDD0A'}</Text>
</TouchableOpacity>
)}
<Text style={styles.timestamp}>{time}</Text>
</View>
);
@@ -603,12 +628,32 @@ const ChatScreen: React.FC = () => {
{connectionState === 'connected' ? 'Verbunden' :
connectionState === 'connecting' ? 'Verbinde...' : 'Getrennt'}
</Text>
<TouchableOpacity onPress={() => setSearchVisible(!searchVisible)} style={{marginLeft: 'auto', paddingHorizontal: 8}}>
<Text style={{fontSize: 16}}>{'\uD83D\uDD0D'}</Text>
</TouchableOpacity>
</View>
{/* Suchleiste */}
{searchVisible && (
<View style={styles.searchBar}>
<TextInput
style={styles.searchInput}
value={searchQuery}
onChangeText={setSearchQuery}
placeholder="Chat durchsuchen..."
placeholderTextColor="#555570"
autoFocus
/>
<TouchableOpacity onPress={() => { setSearchVisible(false); setSearchQuery(''); }}>
<Text style={{color: '#FF3B30', fontSize: 14, paddingHorizontal: 8}}>X</Text>
</TouchableOpacity>
</View>
)}
{/* Nachrichtenliste */}
<FlatList
ref={flatListRef}
data={messages}
data={searchQuery ? messages.filter(m => m.text.toLowerCase().includes(searchQuery.toLowerCase())) : messages}
keyExtractor={item => item.id}
renderItem={renderMessage}
contentContainerStyle={styles.messageList}
@@ -887,6 +932,30 @@ const styles = StyleSheet.create({
wakeWordIcon: {
fontSize: 16,
},
searchBar: {
flexDirection: 'row',
alignItems: 'center',
backgroundColor: '#12122A',
paddingHorizontal: 12,
paddingVertical: 6,
borderBottomWidth: 1,
borderBottomColor: '#1E1E2E',
},
searchInput: {
flex: 1,
color: '#FFFFFF',
fontSize: 14,
paddingVertical: 4,
},
playButton: {
alignSelf: 'flex-end',
paddingHorizontal: 8,
paddingVertical: 2,
marginTop: 4,
},
playButtonText: {
fontSize: 16,
},
fullscreenOverlay: {
flex: 1,
backgroundColor: 'rgba(0,0,0,0.95)',
+63 -5
View File
@@ -74,6 +74,8 @@ const SettingsScreen: React.FC = () => {
const [ttsEnabled, setTtsEnabled] = useState(true);
const [defaultVoice, setDefaultVoice] = useState('ramona');
const [highlightVoice, setHighlightVoice] = useState('thorsten');
const [speedRamona, setSpeedRamona] = useState(1.0);
const [speedThorsten, setSpeedThorsten] = useState(1.0);
const [editingPath, setEditingPath] = useState(false);
const [tempPath, setTempPath] = useState('');
@@ -103,6 +105,12 @@ const SettingsScreen: React.FC = () => {
AsyncStorage.getItem('aria_highlight_voice').then(saved => {
if (saved) setHighlightVoice(saved);
});
AsyncStorage.getItem('aria_speed_ramona').then(saved => {
if (saved) setSpeedRamona(parseFloat(saved));
});
AsyncStorage.getItem('aria_speed_thorsten').then(saved => {
if (saved) setSpeedThorsten(parseFloat(saved));
});
}, []);
// Speichergroesse berechnen
@@ -482,7 +490,7 @@ const SettingsScreen: React.FC = () => {
<View style={{flexDirection: 'row', gap: 8, marginTop: 8}}>
<TouchableOpacity
style={[styles.voiceBtn, defaultVoice === 'ramona' && styles.voiceBtnActive]}
onPress={() => { setDefaultVoice('ramona'); AsyncStorage.setItem('aria_default_voice', 'ramona'); }}
onPress={() => { setDefaultVoice('ramona'); AsyncStorage.setItem('aria_default_voice', 'ramona'); rvs.send('config' as any, { defaultVoice: 'ramona' }); }}
>
<Text style={styles.voiceBtnIcon}>{'\uD83D\uDE4E\u200D\u2640\uFE0F'}</Text>
<Text style={[styles.voiceBtnText, defaultVoice === 'ramona' && styles.voiceBtnTextActive]}>Ramona</Text>
@@ -490,7 +498,7 @@ const SettingsScreen: React.FC = () => {
</TouchableOpacity>
<TouchableOpacity
style={[styles.voiceBtn, defaultVoice === 'thorsten' && styles.voiceBtnActive]}
onPress={() => { setDefaultVoice('thorsten'); AsyncStorage.setItem('aria_default_voice', 'thorsten'); }}
onPress={() => { setDefaultVoice('thorsten'); AsyncStorage.setItem('aria_default_voice', 'thorsten'); rvs.send('config' as any, { defaultVoice: 'thorsten' }); }}
>
<Text style={styles.voiceBtnIcon}>{'\uD83E\uDDD4'}</Text>
<Text style={[styles.voiceBtnText, defaultVoice === 'thorsten' && styles.voiceBtnTextActive]}>Thorsten</Text>
@@ -506,14 +514,14 @@ const SettingsScreen: React.FC = () => {
<View style={{flexDirection: 'row', gap: 8, marginTop: 8}}>
<TouchableOpacity
style={[styles.voiceBtn, highlightVoice === 'thorsten' && styles.voiceBtnActive]}
onPress={() => { setHighlightVoice('thorsten'); AsyncStorage.setItem('aria_highlight_voice', 'thorsten'); }}
onPress={() => { setHighlightVoice('thorsten'); AsyncStorage.setItem('aria_highlight_voice', 'thorsten'); rvs.send('config' as any, { highlightVoice: 'thorsten' }); }}
>
<Text style={styles.voiceBtnIcon}>{'\uD83E\uDDD4'}</Text>
<Text style={[styles.voiceBtnText, highlightVoice === 'thorsten' && styles.voiceBtnTextActive]}>Thorsten</Text>
</TouchableOpacity>
<TouchableOpacity
style={[styles.voiceBtn, highlightVoice === 'ramona' && styles.voiceBtnActive]}
onPress={() => { setHighlightVoice('ramona'); AsyncStorage.setItem('aria_highlight_voice', 'ramona'); }}
onPress={() => { setHighlightVoice('ramona'); AsyncStorage.setItem('aria_highlight_voice', 'ramona'); rvs.send('config' as any, { highlightVoice: 'ramona' }); }}
>
<Text style={styles.voiceBtnIcon}>{'\uD83D\uDE4E\u200D\u2640\uFE0F'}</Text>
<Text style={[styles.voiceBtnText, highlightVoice === 'ramona' && styles.voiceBtnTextActive]}>Ramona</Text>
@@ -521,6 +529,56 @@ const SettingsScreen: React.FC = () => {
</View>
</View>
{/* Sprechgeschwindigkeit Ramona */}
<View style={{marginTop: 16}}>
<Text style={styles.toggleLabel}>Ramona Speed: {speedRamona.toFixed(1)}x</Text>
<View style={{flexDirection: 'row', justifyContent: 'space-around', marginTop: 8}}>
{[0.5, 0.75, 1.0, 1.25, 1.5, 2.0].map(speed => (
<TouchableOpacity
key={speed}
onPress={() => {
setSpeedRamona(speed);
AsyncStorage.setItem('aria_speed_ramona', String(speed));
rvs.send('config' as any, { speedRamona: speed });
}}
style={{
paddingHorizontal: 10, paddingVertical: 6, borderRadius: 6,
backgroundColor: speedRamona === speed ? '#0096FF' : '#1E1E2E',
}}
>
<Text style={{color: speedRamona === speed ? '#fff' : '#8888AA', fontSize: 12, fontWeight: '600'}}>
{speed}x
</Text>
</TouchableOpacity>
))}
</View>
</View>
{/* Sprechgeschwindigkeit Thorsten */}
<View style={{marginTop: 16}}>
<Text style={styles.toggleLabel}>Thorsten Speed: {speedThorsten.toFixed(1)}x</Text>
<View style={{flexDirection: 'row', justifyContent: 'space-around', marginTop: 8}}>
{[0.5, 0.75, 1.0, 1.25, 1.5, 2.0].map(speed => (
<TouchableOpacity
key={speed}
onPress={() => {
setSpeedThorsten(speed);
AsyncStorage.setItem('aria_speed_thorsten', String(speed));
rvs.send('config' as any, { speedThorsten: speed });
}}
style={{
paddingHorizontal: 10, paddingVertical: 6, borderRadius: 6,
backgroundColor: speedThorsten === speed ? '#0096FF' : '#1E1E2E',
}}
>
<Text style={{color: speedThorsten === speed ? '#fff' : '#8888AA', fontSize: 12, fontWeight: '600'}}>
{speed}x
</Text>
</TouchableOpacity>
))}
</View>
</View>
{/* Highlight-Trigger Info */}
<View style={{marginTop: 16, padding: 10, backgroundColor: '#1E1E2E', borderRadius: 8}}>
<Text style={styles.toggleLabel}>{'\u26A1'} Highlight-Trigger</Text>
@@ -690,7 +748,7 @@ const SettingsScreen: React.FC = () => {
<Text style={styles.sectionTitle}>{'\u00DC'}ber</Text>
<View style={styles.card}>
<Text style={styles.aboutTitle}>ARIA Cockpit</Text>
<Text style={styles.aboutVersion}>Version 0.0.1.9 </Text>
<Text style={styles.aboutVersion}>Version 0.0.2.4 </Text>
<Text style={styles.aboutInfo}>
Stefans Kommandozentrale f{'\u00FC'}r ARIA.{'\n'}
Gebaut mit React Native + TypeScript.
+1 -1
View File
@@ -12,7 +12,7 @@ import AsyncStorage from '@react-native-async-storage/async-storage';
export type ConnectionState = 'connecting' | 'connected' | 'disconnected';
export type MessageType = 'chat' | 'audio' | 'file' | 'location' | 'mode' | 'log' | 'event';
export type MessageType = 'chat' | 'audio' | 'file' | 'location' | 'mode' | 'log' | 'event' | 'update_available' | string;
export interface RVSMessage {
type: MessageType;
+149
View File
@@ -0,0 +1,149 @@
/**
* Auto-Update Service — prueft und installiert App-Updates via RVS
*
* Flow:
* 1. App sendet "update_check" mit aktueller Version an RVS
* 2. RVS vergleicht → sendet "update_available" mit Download-URL
* 3. App zeigt Benachrichtigung → User bestaetigt → Download + Install
*/
import { Alert, Linking, Platform } from 'react-native';
import RNFS from 'react-native-fs';
import rvs, { RVSMessage } from './rvs';
// Aktuelle App-Version (aus package.json via Build)
const APP_VERSION = '0.0.2.3'; // TODO: aus nativer Build-Config lesen
type UpdateCallback = (info: UpdateInfo) => void;
export interface UpdateInfo {
version: string;
downloadUrl: string;
size: number;
}
class UpdateService {
private listeners: UpdateCallback[] = [];
private checking = false;
private downloading = false;
constructor() {
// Auf update_available Nachrichten lauschen
rvs.onMessage((msg: RVSMessage) => {
if (msg.type === 'update_available' as any) {
const info: UpdateInfo = {
version: (msg.payload.version as string) || '',
downloadUrl: (msg.payload.downloadUrl as string) || '',
size: (msg.payload.size as number) || 0,
};
if (info.version && this.isNewer(info.version)) {
console.log(`[Update] Neue Version verfuegbar: ${info.version} (aktuell: ${APP_VERSION})`);
this.listeners.forEach(cb => cb(info));
}
}
});
}
/** Bei App-Start Update pruefen */
checkForUpdate(): void {
if (this.checking) return;
this.checking = true;
console.log(`[Update] Pruefe auf Updates (aktuell: ${APP_VERSION})`);
rvs.send('update_check' as any, { version: APP_VERSION });
setTimeout(() => { this.checking = false; }, 10000);
}
/** Callback registrieren */
onUpdateAvailable(callback: UpdateCallback): () => void {
this.listeners.push(callback);
return () => {
this.listeners = this.listeners.filter(cb => cb !== callback);
};
}
/** Update-Dialog anzeigen */
promptUpdate(info: UpdateInfo): void {
const sizeMB = (info.size / 1024 / 1024).toFixed(1);
Alert.alert(
'ARIA Update verfuegbar',
`Version ${info.version} (${sizeMB} MB)\n\nAktuell: ${APP_VERSION}\n\nJetzt herunterladen und installieren?`,
[
{ text: 'Spaeter', style: 'cancel' },
{
text: 'Installieren',
onPress: () => this.downloadAndInstall(info),
},
],
);
}
/** APK ueber WebSocket herunterladen und installieren */
async downloadAndInstall(info: UpdateInfo): Promise<void> {
if (this.downloading) return;
this.downloading = true;
try {
console.log(`[Update] Fordere APK v${info.version} an...`);
Alert.alert('Download gestartet', `Version ${info.version} wird ueber RVS heruntergeladen...`);
// APK ueber WebSocket anfordern
rvs.send('update_download' as any, {});
// Auf update_data warten (einmalig)
const apkData = await new Promise<{base64: string, fileName: string}>((resolve, reject) => {
const timeout = setTimeout(() => reject(new Error('Download-Timeout (60s)')), 60000);
const unsub = rvs.onMessage((msg: RVSMessage) => {
if ((msg.type as string) === 'update_data') {
clearTimeout(timeout);
unsub();
if (msg.payload.error) {
reject(new Error(msg.payload.error as string));
} else {
resolve({
base64: msg.payload.base64 as string,
fileName: msg.payload.fileName as string || `ARIA-${info.version}.apk`,
});
}
}
});
});
// Base64 als APK-Datei speichern
const destPath = `${RNFS.CachesDirectoryPath}/${apkData.fileName}`;
await RNFS.writeFile(destPath, apkData.base64, 'base64');
const fileSize = await RNFS.stat(destPath);
console.log(`[Update] APK gespeichert: ${destPath} (${(parseInt(fileSize.size) / 1024 / 1024).toFixed(1)}MB)`);
// APK installieren (oeffnet Android-Installer)
if (Platform.OS === 'android') {
await Linking.openURL(`file://${destPath}`);
}
} catch (err: any) {
console.error(`[Update] Fehler: ${err.message}`);
Alert.alert('Update fehlgeschlagen', err.message);
} finally {
this.downloading = false;
}
}
/** Versionsvergleich */
private isNewer(remote: string): boolean {
const r = remote.split('.').map(Number);
const l = APP_VERSION.split('.').map(Number);
for (let i = 0; i < Math.max(r.length, l.length); i++) {
const diff = (r[i] || 0) - (l[i] || 0);
if (diff > 0) return true;
if (diff < 0) return false;
}
return false;
}
getCurrentVersion(): string {
return APP_VERSION;
}
}
const updateService = new UpdateService();
export default updateService;
+7 -77
View File
@@ -1,21 +1,12 @@
/**
* Wake Word Service — "ARIA" Erkennung
*
* Nutzt react-native-live-audio-stream fuer kontinuierliches Mikrofon-Monitoring.
* Erkennt Sprache per Energie-Schwellwert und sendet kurze Audio-Clips
* zur serverseitigen Wake-Word-Pruefung (openwakeword in der Bridge).
* Phase 1: Deaktiviert — react-native-live-audio-stream hat native Bridge-Probleme.
* Nutzt stattdessen Tap-to-Talk (VoiceButton) als primaeren Eingabemodus.
*
* Architektur:
* App (Mikrofon) → Energie-Erkennung → Audio-Buffer
* → RVS "wake_check" → Bridge → openwakeword → Bestaetigung
* → App startet Aufnahme
*
* Aktuell (Phase 1): Einfacher Tap-to-Talk + Auto-Stop.
* Spaeter (Phase 2): Porcupine on-device "ARIA" Keyword.
* Phase 2: Porcupine on-device "ARIA" Keyword (geplant).
*/
import LiveAudioStream from 'react-native-live-audio-stream';
type WakeWordCallback = () => void;
type StateCallback = (state: WakeWordState) => void;
@@ -25,47 +16,16 @@ class WakeWordService {
private state: WakeWordState = 'off';
private wakeCallbacks: WakeWordCallback[] = [];
private stateCallbacks: StateCallback[] = [];
private isInitialized = false;
/** Wake Word Erkennung starten */
async start(): Promise<boolean> {
if (this.state === 'listening') return true;
try {
if (!this.isInitialized) {
LiveAudioStream.init({
sampleRate: 16000,
channels: 1,
bitsPerSample: 16,
audioSource: 6, // VOICE_RECOGNITION
bufferSize: 4096,
});
this.isInitialized = true;
}
// Audio-Stream starten und auf Energie pruefen
LiveAudioStream.start();
LiveAudioStream.on('data', (base64Chunk: string) => {
if (this.state !== 'listening') return;
// Base64 → Int16 Array → RMS berechnen
const raw = this._base64ToInt16(base64Chunk);
const rms = this._calculateRMS(raw);
// Schwellwert: wenn laut genug → Wake Word erkannt
// Phase 1: Einfache Energie-Erkennung (jemand spricht)
// Phase 2: Porcupine "ARIA" Keyword
if (rms > 2000) {
this.setState('detected');
this.wakeCallbacks.forEach(cb => cb());
// Nach Detection kurz pausieren, Aufnahme uebernimmt das Mikrofon
this.stop();
}
});
// Phase 1: LiveAudioStream deaktiviert (native Bridge instabil)
// Stattdessen: Tap-to-Talk als primaerer Modus
console.log('[WakeWord] Wake Word ist in Phase 1 noch nicht verfuegbar — nutze Tap-to-Talk');
this.setState('listening');
console.log('[WakeWord] Listening gestartet');
return true;
} catch (err) {
console.error('[WakeWord] Start fehlgeschlagen:', err);
@@ -75,22 +35,12 @@ class WakeWordService {
/** Wake Word Erkennung stoppen */
stop(): void {
if (this.state === 'off') return;
try {
LiveAudioStream.stop();
} catch {}
this.setState('off');
console.log('[WakeWord] Gestoppt');
}
/** Nach Aufnahme erneut starten */
async resume(): Promise<void> {
// Kurze Pause damit Aufnahme das Mikrofon freigeben kann
setTimeout(() => {
if (this.state === 'off') {
this.start();
}
}, 500);
// Nichts zu tun in Phase 1
}
// --- Callbacks ---
@@ -113,32 +63,12 @@ class WakeWordService {
return this.state;
}
// --- Hilfsfunktionen ---
private setState(state: WakeWordState): void {
if (this.state !== state) {
this.state = state;
this.stateCallbacks.forEach(cb => cb(state));
}
}
private _base64ToInt16(base64: string): Int16Array {
const binary = atob(base64);
const bytes = new Uint8Array(binary.length);
for (let i = 0; i < binary.length; i++) {
bytes[i] = binary.charCodeAt(i);
}
return new Int16Array(bytes.buffer);
}
private _calculateRMS(samples: Int16Array): number {
if (samples.length === 0) return 0;
let sum = 0;
for (let i = 0; i < samples.length; i++) {
sum += samples[i] * samples[i];
}
return Math.sqrt(sum / samples.length);
}
}
const wakeWordService = new WakeWordService();
+245 -33
View File
@@ -38,6 +38,7 @@ import websockets
from faster_whisper import WhisperModel
from openwakeword.model import Model as WakeWordModel
from piper import PiperVoice
from piper.config import SynthesisConfig
from modes import Mode, detect_mode_switch, should_speak
@@ -72,7 +73,7 @@ BLOCK_SIZE = 1280 # 80ms bei 16kHz — gut fuer Wake-Word-Erkennung
RECORD_SECONDS = 8 # Max. Aufnahmedauer nach Wake-Word
# Epische Trigger — bei diesen Woertern spricht Thorsten
EPIC_TRIGGERS = [
EPIC_TRIGGERS_DEFAULT = [
"deploy",
"erfolgreich",
"alarm",
@@ -84,6 +85,24 @@ EPIC_TRIGGERS = [
"aufgabe abgeschlossen",
]
# Trigger aus Shared-Config laden (von Diagnostic gespeichert)
TRIGGERS_FILE = "/shared/config/highlight_triggers.json"
def load_epic_triggers():
"""Laedt Highlight-Trigger aus Shared-Config oder nutzt Defaults."""
try:
if os.path.exists(TRIGGERS_FILE):
with open(TRIGGERS_FILE) as f:
triggers = json.load(f)
if isinstance(triggers, list) and len(triggers) > 0:
logger.info("Highlight-Trigger geladen: %d aus %s", len(triggers), TRIGGERS_FILE)
return triggers
except Exception as e:
logger.warning("Highlight-Trigger laden fehlgeschlagen: %s — nutze Defaults", e)
return EPIC_TRIGGERS_DEFAULT
EPIC_TRIGGERS = load_epic_triggers()
def load_config() -> dict[str, str]:
"""Laedt Konfiguration aus /config/aria.env."""
@@ -111,6 +130,9 @@ class VoiceEngine:
def __init__(self, voices_dir: Path) -> None:
self.voices_dir = voices_dir
self.voices: dict[str, PiperVoice] = {}
self.default_voice = "ramona"
self.highlight_voice = "thorsten"
self.speech_speed = {"ramona": 1.0, "thorsten": 1.0}
def initialize(self) -> None:
"""Laedt die Piper-Stimmen aus dem Voices-Verzeichnis."""
@@ -154,14 +176,14 @@ class VoiceEngine:
if requested_voice and requested_voice in self.voices:
return requested_voice
# Epische Trigger pruefen
# Highlight-Trigger pruefen
text_lower = text.lower()
for trigger in EPIC_TRIGGERS:
if trigger in text_lower:
logger.info("Epischer Trigger erkannt: '%s'Thorsten spricht", trigger)
return "thorsten"
logger.info("Highlight-Trigger erkannt: '%s'%s spricht", trigger, self.highlight_voice)
return self.highlight_voice
return "ramona"
return self.default_voice
def synthesize(self, text: str, voice_name: str = "ramona") -> Optional[bytes]:
"""Erzeugt Audio-Daten aus Text mit der gewaehlten Stimme.
@@ -179,23 +201,50 @@ class VoiceEngine:
return None
try:
# Piper gibt PCM-Samples zurueck, wir schreiben sie als WAV
# Langen Text in Saetze aufteilen (Piper hat Limits bei langen Texten)
import re
sentences = re.split(r'(?<=[.!?])\s+', text.strip())
# Markdown-Formatierung entfernen
sentences = [re.sub(r'\*\*([^*]+)\*\*', r'\1', s).strip() for s in sentences if s.strip()]
if not sentences:
return None
# Jeden Satz einzeln synthetisieren und WAVs zusammenfuegen
all_audio = b""
sample_rate = None
for sentence in sentences:
if not sentence:
continue
with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
tmp_path = tmp.name
speed = self.speech_speed.get(voice_name, 1.0)
syn_config = SynthesisConfig(length_scale=1.0 / max(0.3, speed))
with wave.open(tmp_path, "wb") as wav_file:
voice.synthesize_wav(sentence, wav_file, syn_config=syn_config)
with wave.open(tmp_path, "rb") as wav_file:
if sample_rate is None:
sample_rate = wav_file.getframerate()
all_audio += wav_file.readframes(wav_file.getnframes())
Path(tmp_path).unlink(missing_ok=True)
# Zusammengefuegtes WAV erstellen
with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
tmp_path = tmp.name
with wave.open(tmp_path, "wb") as wav_file:
final_path = tmp.name
with wave.open(final_path, "wb") as wav_file:
wav_file.setnchannels(1)
wav_file.setsampwidth(2) # 16-bit
wav_file.setframerate(voice.config.sample_rate)
voice.synthesize(text, wav_file)
wav_file.setsampwidth(2)
wav_file.setframerate(sample_rate or 22050)
wav_file.writeframes(all_audio)
audio_data = Path(tmp_path).read_bytes()
Path(tmp_path).unlink(missing_ok=True)
audio_data = Path(final_path).read_bytes()
Path(final_path).unlink(missing_ok=True)
logger.info(
"TTS: %d bytes erzeugt mit %s'%s'",
"TTS: %d bytes erzeugt mit %s (%d Saetze)'%s'",
len(audio_data),
voice_name,
len(sentences),
text[:60],
)
return audio_data
@@ -440,6 +489,25 @@ class ARIABridge:
# Komponenten
self.voice_engine = VoiceEngine(VOICES_DIR)
self.tts_enabled = True
# Gespeicherte Voice-Config laden
try:
vc_path = "/shared/config/voice_config.json"
if os.path.exists(vc_path):
with open(vc_path) as f:
vc = json.load(f)
self.voice_engine.default_voice = vc.get("defaultVoice", "ramona")
self.voice_engine.highlight_voice = vc.get("highlightVoice", "thorsten")
self.voice_engine.speech_speed = {
"ramona": vc.get("speedRamona", 1.0),
"thorsten": vc.get("speedThorsten", 1.0),
}
self.tts_enabled = vc.get("ttsEnabled", True)
self.tts_engine_type = vc.get("ttsEngine", "piper")
self.xtts_voice = vc.get("xttsVoice", "")
logger.info("Voice-Config geladen: %s", vc)
except Exception as e:
logger.warning("Voice-Config laden fehlgeschlagen: %s", e)
self.stt_engine = STTEngine(
model_size=self.config.get("WHISPER_MODEL", WHISPER_MODEL),
language=self.config.get("WHISPER_LANGUAGE", WHISPER_LANGUAGE),
@@ -464,17 +532,20 @@ class ARIABridge:
# Voice-Engine IMMER laden — rendert Audio fuer die App (auch ohne Soundkarte)
self.voice_engine.initialize()
# STT IMMER laden — verarbeitet Audio von der App (braucht kein Sounddevice)
self.stt_engine.initialize()
# Audio-Hardware pruefen (fuer lokales Mikro/Lautsprecher)
self.audio_available = False
try:
sd.query_devices()
devices = sd.query_devices()
sd.query_devices(kind='output')
self.audio_available = True
logger.info("Audio-Geraet gefunden — Wake-Word und lokale TTS aktiv")
self.stt_engine.initialize()
self.wake_word.initialize()
except (sd.PortAudioError, Exception):
logger.warning("Kein Audio-Geraet — Wake-Word und lokale TTS deaktiviert")
logger.info("Piper TTS rendert Audio fuer die App (via RVS)")
logger.warning("Kein Audio-Geraet — Wake-Word und lokale Wiedergabe deaktiviert")
logger.info("TTS rendert fuer App (via RVS), STT verarbeitet App-Audio")
logger.info("Alle Komponenten initialisiert")
logger.info("aria-core: %s", self.ws_url)
@@ -776,18 +847,48 @@ class ARIABridge:
})
# TTS-Audio rendern und an die App senden (wenn Modus es erlaubt)
if should_speak(self.current_mode, is_critical):
audio_data = self.voice_engine.synthesize(text, voice_name)
if audio_data:
audio_b64 = base64.b64encode(audio_data).decode("ascii")
await self._send_to_rvs({
"type": "audio",
"payload": {
"base64": audio_b64,
"mimeType": "audio/wav",
"voice": voice_name,
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
if getattr(self, 'tts_enabled', True) and should_speak(self.current_mode, is_critical):
tts_engine = getattr(self, 'tts_engine_type', 'piper')
if tts_engine == "xtts":
# XTTS: Request ueber RVS an Gaming-PC senden
xtts_voice = getattr(self, 'xtts_voice', '')
try:
await self._send_to_rvs({
"type": "xtts_request",
"payload": {
"text": text,
"voice": xtts_voice,
"language": "de",
"requestId": str(uuid.uuid4()),
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
logger.info("[core] XTTS-Request gesendet (%s): '%s'", xtts_voice or "default", text[:60])
except Exception as e:
logger.warning("[core] XTTS-Request fehlgeschlagen: %s — Fallback auf Piper", e)
# Fallback auf Piper
audio_data = self.voice_engine.synthesize(text, voice_name)
if audio_data:
audio_b64 = base64.b64encode(audio_data).decode("ascii")
await self._send_to_rvs({
"type": "audio",
"payload": {"base64": audio_b64, "mimeType": "audio/wav", "voice": voice_name},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
else:
# Piper: Lokal rendern
audio_data = self.voice_engine.synthesize(text, voice_name)
if audio_data:
audio_b64 = base64.b64encode(audio_data).decode("ascii")
await self._send_to_rvs({
"type": "audio",
"payload": {
"base64": audio_b64,
"mimeType": "audio/wav",
"voice": voice_name,
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
logger.info("[core] TTS-Audio gesendet: %d bytes (%s)", len(audio_data), voice_name)
@@ -896,10 +997,22 @@ class ARIABridge:
retry_delay = min(retry_delay * 2, 30)
async def _rvs_heartbeat(self) -> None:
"""Sendet Heartbeats an den RVS damit die Verbindung offen bleibt."""
"""Sendet Heartbeats + WebSocket Pings an den RVS damit die Verbindung offen bleibt."""
while True:
await asyncio.sleep(25)
await asyncio.sleep(15)
if self.ws_rvs:
try:
# WebSocket Protocol-Level Ping (haelt TCP-Verbindung am Leben)
pong = await self.ws_rvs.ping()
await asyncio.wait_for(pong, timeout=10)
except Exception:
logger.warning("[rvs] Ping fehlgeschlagen — Verbindung tot, erzwinge Reconnect")
try:
await self.ws_rvs.close()
except Exception:
pass
self.ws_rvs = None
break
try:
await self.ws_rvs.send(json.dumps({
"type": "heartbeat",
@@ -932,6 +1045,105 @@ class ARIABridge:
sender = payload.get("sender", "")
if sender in ("aria", "stt"):
return
elif msg_type == "xtts_response":
# XTTS-Audio vom Gaming-PC empfangen → an App weiterleiten
audio_b64 = payload.get("base64", "")
error = payload.get("error", "")
if error:
logger.warning("[rvs] XTTS Fehler: %s", error)
return
if audio_b64:
logger.info("[rvs] XTTS-Audio empfangen: %dKB", len(audio_b64) // 1365)
await self._send_to_rvs({
"type": "audio",
"payload": {
"base64": audio_b64,
"mimeType": payload.get("mimeType", "audio/wav"),
"voice": payload.get("voice", "xtts"),
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
return
elif msg_type == "tts_request":
# App fordert TTS-Audio fuer einen Text an (Play-Button)
text = payload.get("text", "")
requested_voice = payload.get("voice", "")
if text:
voice_name = requested_voice or self.voice_engine.select_voice(text)
audio_data = self.voice_engine.synthesize(text, voice_name)
if audio_data:
audio_b64 = base64.b64encode(audio_data).decode("ascii")
try:
await self._send_to_rvs({
"type": "audio",
"payload": {
"base64": audio_b64,
"mimeType": "audio/wav",
"voice": voice_name,
},
"timestamp": int(asyncio.get_event_loop().time() * 1000),
})
logger.info("[rvs] TTS on-demand: %d bytes (%s)", len(audio_data), voice_name)
except Exception as e:
logger.warning("[rvs] TTS on-demand senden fehlgeschlagen: %s", e)
return
elif msg_type == "config":
# Konfiguration von App/Diagnostic empfangen + persistent speichern
changed = False
if "defaultVoice" in payload:
new_voice = payload["defaultVoice"]
if new_voice in self.voice_engine.voices:
self.voice_engine.default_voice = new_voice
logger.info("[rvs] Standard-Stimme gewechselt: %s", new_voice)
changed = True
if "highlightVoice" in payload:
new_voice = payload["highlightVoice"]
if new_voice in self.voice_engine.voices:
self.voice_engine.highlight_voice = new_voice
logger.info("[rvs] Highlight-Stimme gewechselt: %s", new_voice)
changed = True
if "ttsEnabled" in payload:
self.tts_enabled = bool(payload["ttsEnabled"])
logger.info("[rvs] TTS %s", "aktiviert" if self.tts_enabled else "deaktiviert")
changed = True
if "ttsEngine" in payload:
self.tts_engine_type = payload["ttsEngine"]
logger.info("[rvs] TTS-Engine: %s", self.tts_engine_type)
changed = True
if "xttsVoice" in payload:
self.xtts_voice = payload["xttsVoice"]
logger.info("[rvs] XTTS-Stimme: %s", self.xtts_voice)
changed = True
if "speedRamona" in payload:
self.voice_engine.speech_speed["ramona"] = max(0.3, min(2.0, float(payload["speedRamona"])))
logger.info("[rvs] Speed Ramona: %.1f", self.voice_engine.speech_speed["ramona"])
changed = True
if "speedThorsten" in payload:
self.voice_engine.speech_speed["thorsten"] = max(0.3, min(2.0, float(payload["speedThorsten"])))
logger.info("[rvs] Speed Thorsten: %.1f", self.voice_engine.speech_speed["thorsten"])
changed = True
# Persistent speichern in Shared Volume
if changed:
try:
os.makedirs("/shared/config", exist_ok=True)
config_data = {
"defaultVoice": self.voice_engine.default_voice,
"highlightVoice": self.voice_engine.highlight_voice,
"ttsEnabled": getattr(self, "tts_enabled", True),
"ttsEngine": getattr(self, "tts_engine_type", "piper"),
"xttsVoice": getattr(self, "xtts_voice", ""),
"speedRamona": self.voice_engine.speech_speed.get("ramona", 1.0),
"speedThorsten": self.voice_engine.speech_speed.get("thorsten", 1.0),
}
with open("/shared/config/voice_config.json", "w") as f:
json.dump(config_data, f, indent=2)
logger.info("[rvs] Voice-Config gespeichert: %s", config_data)
except Exception as e:
logger.warning("[rvs] Config speichern fehlgeschlagen: %s", e)
return
text = payload.get("text", "")
if text:
logger.info("[rvs] App-Chat: '%s'", text[:80])
+361 -2
View File
@@ -201,8 +201,9 @@
<button class="btn secondary" onclick="toggleChatFullscreen()" id="btn-chat-fs" style="padding:4px 10px;font-size:11px;">Vollbild</button>
</div>
<div class="chat-box" id="chat-box"></div>
<div id="thinking-indicator" style="display:none;padding:6px 10px;font-size:12px;color:#FFD60A;background:#1E1E2E;border-radius:0 0 6px 6px;margin-top:-8px;margin-bottom:8px;">
<span style="animation:pulse 1s infinite;">&#x1F4AD;</span> <span id="thinking-text">ARIA denkt...</span>
<div id="thinking-indicator" style="display:none;padding:6px 10px;font-size:12px;color:#FFD60A;background:#1E1E2E;border-radius:0 0 6px 6px;margin-top:-8px;margin-bottom:8px;display:flex;align-items:center;justify-content:space-between;">
<span><span style="animation:pulse 1s infinite;">&#x1F4AD;</span> <span id="thinking-text">ARIA denkt...</span></span>
<button class="btn secondary" onclick="cancelRequest()" style="padding:2px 10px;font-size:11px;color:#FF3B30;border-color:#FF3B30;">Abbrechen</button>
</div>
<div class="input-row">
<input type="text" id="chat-input" placeholder="Nachricht an ARIA...">
@@ -283,6 +284,7 @@
<button class="tab-btn" data-tab="bridge" onclick="switchTab('bridge')">Bridge <span class="tab-count" id="count-bridge">0</span></button>
<button class="tab-btn" data-tab="server" onclick="switchTab('server')">Server <span class="tab-count" id="count-server">0</span></button>
<button class="tab-btn" data-tab="pipeline" onclick="switchTab('pipeline')" style="margin-left:auto;border-color:#0096FF44;color:#0096FF">Pipeline <span class="tab-count" id="count-pipeline">0</span></button>
<button class="tab-btn" data-tab="tts" onclick="switchTab('tts')" style="border-color:#34C75944;color:#34C759">TTS</button>
</div>
</div>
<div class="log-panel">
@@ -302,6 +304,36 @@
<div class="log-box hidden" id="log-bridge"></div>
<div class="log-box hidden" id="log-server"></div>
<div class="log-box hidden" id="log-pipeline"></div>
<div class="log-box hidden" id="log-tts" style="padding:12px;">
<h3 style="color:#34C759;margin:0 0 12px;">TTS Diagnose</h3>
<div style="display:grid;grid-template-columns:1fr 1fr;gap:8px;margin-bottom:12px;">
<div style="background:#1E1E2E;padding:8px;border-radius:6px;">
<div style="color:#8888AA;font-size:10px;text-transform:uppercase;">Standard-Stimme</div>
<div style="color:#fff;font-size:14px;margin-top:4px;" id="tts-default-voice">Ramona</div>
</div>
<div style="background:#1E1E2E;padding:8px;border-radius:6px;">
<div style="color:#8888AA;font-size:10px;text-transform:uppercase;">Highlight-Stimme</div>
<div style="color:#fff;font-size:14px;margin-top:4px;" id="tts-highlight-voice">Thorsten</div>
</div>
<div style="background:#1E1E2E;padding:8px;border-radius:6px;">
<div style="color:#8888AA;font-size:10px;text-transform:uppercase;">Status</div>
<div style="font-size:14px;margin-top:4px;" id="tts-status">Unbekannt</div>
</div>
<div style="background:#1E1E2E;padding:8px;border-radius:6px;">
<div style="color:#8888AA;font-size:10px;text-transform:uppercase;">Letzter Fehler</div>
<div style="color:#FF6B6B;font-size:12px;margin-top:4px;word-break:break-all;" id="tts-last-error">-</div>
</div>
</div>
<div style="margin-bottom:8px;">
<input type="text" id="tts-test-text" value="Hallo Stefan, ich bin ARIA." placeholder="Test-Text..." style="background:#1E1E2E;border:1px solid #2A2A3E;border-radius:6px;padding:8px;color:#fff;font-size:13px;width:100%;box-sizing:border-box;">
</div>
<div style="display:flex;gap:8px;">
<button class="btn" onclick="testTTS('ramona')" style="flex:1;">Ramona testen</button>
<button class="btn" onclick="testTTS('thorsten')" style="flex:1;">Thorsten testen</button>
<button class="btn secondary" onclick="checkTTSStatus()" style="flex:1;">Status pruefen</button>
</div>
<div id="tts-log" style="margin-top:12px;max-height:200px;overflow-y:auto;font-size:11px;font-family:monospace;color:#8888AA;"></div>
</div>
</div>
</div>
@@ -340,6 +372,143 @@
<!-- ══════ TAB: Einstellungen ══════ -->
<div id="tab-settings" class="main-tab">
<!-- Betriebsmodus -->
<div class="settings-section">
<h2>Betriebsmodus</h2>
<div class="card" style="max-width:500px;">
<div id="mode-selector" style="display:grid;grid-template-columns:1fr 1fr;gap:8px;">
<button class="btn mode-btn" data-mode="normal" onclick="setMode('normal')" style="background:#1E1E2E;border:2px solid transparent;">
<span style="font-size:18px;">&#x1F7E2;</span> Normal<br><span style="font-size:10px;color:#8888AA;">Hoert zu, antwortet, spricht</span>
</button>
<button class="btn mode-btn" data-mode="dnd" onclick="setMode('dnd')" style="background:#1E1E2E;border:2px solid transparent;">
<span style="font-size:18px;">&#x1F534;</span> Nicht stoeren<br><span style="font-size:10px;color:#8888AA;">Nur Kritikalarme</span>
</button>
<button class="btn mode-btn" data-mode="whisper" onclick="setMode('whisper')" style="background:#1E1E2E;border:2px solid transparent;">
<span style="font-size:18px;">&#x1F7E1;</span> Fluestern<br><span style="font-size:10px;color:#8888AA;">Nur Text, keine Sprache</span>
</button>
<button class="btn mode-btn" data-mode="hangar" onclick="setMode('hangar')" style="background:#1E1E2E;border:2px solid transparent;">
<span style="font-size:18px;">&#x2708;&#xFE0F;</span> Hangar<br><span style="font-size:10px;color:#8888AA;">Nur wichtige Meldungen</span>
</button>
<button class="btn mode-btn" data-mode="gaming" onclick="setMode('gaming')" style="background:#1E1E2E;border:2px solid transparent;grid-column:1/-1;">
<span style="font-size:18px;">&#x1F3AE;</span> Gaming<br><span style="font-size:10px;color:#8888AA;">Nur direkte Fragen</span>
</button>
</div>
<div style="margin-top:8px;font-size:11px;color:#555570;" id="mode-status">Aktueller Modus: Normal</div>
</div>
</div>
<!-- Stimmen -->
<div class="settings-section">
<h2>Sprachausgabe</h2>
<div class="card" style="max-width:500px;">
<!-- TTS Engine Auswahl -->
<div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">TTS Engine:</label>
<select id="diag-tts-engine" onchange="sendVoiceConfig();toggleXTTSPanel()" style="background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;">
<option value="piper">Piper (lokal, CPU, schnell)</option>
<option value="xtts">XTTS v2 (remote, GPU, natuerlich)</option>
</select>
</div>
<!-- Piper Stimmen (nur bei Engine=piper) -->
<div id="piper-panel">
<div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">Standard-Stimme:</label>
<select id="diag-default-voice" onchange="sendVoiceConfig()" style="background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;">
<option value="ramona">Ramona (weiblich)</option>
<option value="thorsten">Thorsten (maennlich)</option>
</select>
</div>
<div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">Highlight-Stimme:</label>
<select id="diag-highlight-voice" onchange="sendVoiceConfig()" style="background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;">
<option value="thorsten">Thorsten (maennlich)</option>
<option value="ramona">Ramona (weiblich)</option>
</select>
</div>
<div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">TTS aktiv:</label>
<label class="toggle"><input type="checkbox" id="diag-tts-enabled" checked onchange="sendVoiceConfig()"><span class="slider"></span></label>
</div>
<div style="margin-bottom:4px;">
<label style="color:#8888AA;font-size:12px;">Ramona Speed: <span id="speed-ramona-label">1.0x</span></label>
</div>
<div style="display:flex;align-items:center;gap:8px;margin-bottom:12px;">
<span style="color:#555570;font-size:11px;">0.5x</span>
<input type="range" id="diag-speed-ramona" min="0.5" max="2.0" step="0.1" value="1.0"
oninput="document.getElementById('speed-ramona-label').textContent=this.value+'x'"
onchange="sendVoiceConfig()"
style="flex:1;accent-color:#0096FF;">
<span style="color:#555570;font-size:11px;">2.0x</span>
</div>
<div style="margin-bottom:4px;">
<label style="color:#8888AA;font-size:12px;">Thorsten Speed: <span id="speed-thorsten-label">1.0x</span></label>
</div>
<div style="display:flex;align-items:center;gap:8px;">
<span style="color:#555570;font-size:11px;">0.5x</span>
<input type="range" id="diag-speed-thorsten" min="0.5" max="2.0" step="0.1" value="1.0"
oninput="document.getElementById('speed-thorsten-label').textContent=this.value+'x'"
onchange="sendVoiceConfig()"
style="flex:1;accent-color:#0096FF;">
<span style="color:#555570;font-size:11px;">2.0x</span>
</div>
</div><!-- /piper-panel -->
<!-- XTTS Panel (nur bei Engine=xtts) -->
<div id="xtts-panel" style="display:none;">
<div style="display:flex;align-items:center;gap:12px;margin-bottom:12px;">
<label style="color:#8888AA;font-size:12px;">XTTS Stimme:</label>
<select id="diag-xtts-voice" onchange="sendVoiceConfig()" style="background:#1E1E2E;color:#fff;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;font-size:13px;">
<option value="">Standard (XTTS Default)</option>
</select>
<button class="btn secondary" onclick="loadXTTSVoices()" style="padding:4px 10px;font-size:11px;">Laden</button>
</div>
<!-- Voice Cloning -->
<div style="background:#1E1E2E;border-radius:8px;padding:12px;margin-top:8px;">
<div style="color:#0096FF;font-size:13px;font-weight:600;margin-bottom:8px;">Stimme klonen</div>
<div style="color:#8888AA;font-size:11px;margin-bottom:8px;">
Lade ein oder mehrere Audio-Samples hoch (WAV/MP3, min. 6-10 Sekunden).
Mehrere Dateien werden automatisch zusammengefuegt.
</div>
<div style="margin-bottom:8px;">
<input type="text" id="xtts-clone-name" placeholder="Name fuer die Stimme..." style="background:#0D0D1A;border:1px solid #2A2A3E;border-radius:6px;padding:6px 10px;color:#fff;font-size:13px;width:100%;box-sizing:border-box;">
</div>
<div style="margin-bottom:8px;">
<input type="file" id="xtts-clone-files" accept="audio/*" multiple style="color:#8888AA;font-size:12px;">
</div>
<div style="display:flex;gap:8px;">
<button class="btn" onclick="uploadVoiceSamples()" style="flex:1;">Stimme erstellen</button>
</div>
<div id="xtts-clone-status" style="font-size:11px;color:#555570;margin-top:6px;"></div>
</div>
<!-- XTTS Status -->
<div style="margin-top:8px;font-size:11px;color:#555570;" id="xtts-status">
XTTS-Server: Nicht verbunden (starte xtts/ auf dem Gaming-PC)
</div>
</div>
</div>
</div>
<!-- Highlight-Trigger -->
<div class="settings-section">
<h2>Highlight-Trigger</h2>
<div style="font-size:11px;color:#8888AA;margin-bottom:8px;">
Woerter die automatisch die Highlight-Stimme (Thorsten) ausloesen.
Eines pro Zeile. Aenderungen werden in der Bridge gespeichert.
</div>
<div class="card" style="max-width:500px;">
<textarea id="highlight-triggers" rows="8" style="width:100%;box-sizing:border-box;background:#1E1E2E;border:1px solid #2A2A3E;border-radius:6px;padding:8px;color:#fff;font-size:13px;font-family:monospace;resize:vertical;"
placeholder="Lade..."></textarea>
<div style="display:flex;gap:8px;margin-top:8px;">
<button class="btn" onclick="saveHighlightTriggers()" style="flex:1;">Speichern</button>
<button class="btn secondary" onclick="loadHighlightTriggers()" style="flex:1;">Neu laden</button>
</div>
<div id="trigger-status" style="font-size:11px;color:#555570;margin-top:6px;"></div>
</div>
</div>
<!-- Tool-Berechtigungen -->
<div class="settings-section">
<h2>Tool-Berechtigungen</h2>
@@ -420,6 +589,7 @@
bridge: document.getElementById('log-bridge'),
server: document.getElementById('log-server'),
pipeline: document.getElementById('log-pipeline'),
tts: document.getElementById('log-tts'),
};
// Scroll-Pause pro aktivem Tab
@@ -513,11 +683,88 @@
if (msg.type === 'state') { updateState(msg.state); return; }
if (msg.type === 'log') { addLog(msg.entry.level, msg.entry.source, msg.entry.message, msg.entry.ts); return; }
if (msg.type === 'tts_result') {
if (msg.ok) {
ttsLog(`\u2705 ${msg.voice}: ${msg.duration}ms, ${msg.size} bytes`);
document.getElementById('tts-status').textContent = 'OK';
document.getElementById('tts-status').style.color = '#34C759';
} else {
ttsLog(`\u274C Fehler: ${msg.error}`);
document.getElementById('tts-status').textContent = 'Fehler';
document.getElementById('tts-status').style.color = '#FF3B30';
document.getElementById('tts-last-error').textContent = msg.error;
}
return;
}
if (msg.type === 'tts_status') {
document.getElementById('tts-default-voice').textContent = msg.defaultVoice || '?';
document.getElementById('tts-highlight-voice').textContent = msg.highlightVoice || '?';
document.getElementById('tts-status').textContent = msg.ok ? 'OK' : 'Fehler';
document.getElementById('tts-status').style.color = msg.ok ? '#34C759' : '#FF3B30';
if (msg.voices) ttsLog(`Stimmen: ${msg.voices.join(', ')}`);
if (msg.error) { document.getElementById('tts-last-error').textContent = msg.error; ttsLog(`Fehler: ${msg.error}`); }
else { document.getElementById('tts-last-error').textContent = '-'; ttsLog('TTS OK'); }
return;
}
if (msg.type === 'agent_activity') {
updateThinkingIndicator(msg);
return;
}
if (msg.type === 'xtts_voices_list') {
const select = document.getElementById('diag-xtts-voice');
// Behalte erste Option (Default)
while (select.options.length > 1) select.remove(1);
for (const v of (msg.payload?.voices || [])) {
const opt = document.createElement('option');
opt.value = v.name;
opt.textContent = `${v.name} (${(v.size / 1024).toFixed(0)}KB)`;
select.appendChild(opt);
}
document.getElementById('xtts-status').textContent = `XTTS: ${msg.payload?.voices?.length || 0} Stimme(n) verfuegbar`;
document.getElementById('xtts-status').style.color = '#34C759';
return;
}
if (msg.type === 'xtts_voice_saved') {
document.getElementById('xtts-clone-status').textContent = `Stimme "${msg.payload?.name}" gespeichert!`;
document.getElementById('xtts-clone-status').style.color = '#34C759';
loadXTTSVoices(); // Liste neu laden
return;
}
if (msg.type === 'voice_config') {
document.getElementById('diag-default-voice').value = msg.defaultVoice || 'ramona';
document.getElementById('diag-highlight-voice').value = msg.highlightVoice || 'thorsten';
document.getElementById('diag-tts-enabled').checked = msg.ttsEnabled !== false;
const sr = msg.speedRamona || 1.0;
const st = msg.speedThorsten || 1.0;
document.getElementById('diag-speed-ramona').value = sr;
document.getElementById('speed-ramona-label').textContent = sr + 'x';
document.getElementById('diag-speed-thorsten').value = st;
document.getElementById('speed-thorsten-label').textContent = st + 'x';
document.getElementById('diag-tts-engine').value = msg.ttsEngine || 'piper';
document.getElementById('diag-xtts-voice').value = msg.xttsVoice || '';
toggleXTTSPanel();
return;
}
if (msg.type === 'trigger_list') {
const textarea = document.getElementById('highlight-triggers');
textarea.value = (msg.triggers || []).join('\n');
document.getElementById('trigger-status').textContent = msg.triggers.length + ' Trigger geladen';
document.getElementById('trigger-status').style.color = '#8888AA';
return;
}
if (msg.type === 'watchdog') {
const colors = { warning: '#FFD60A', fixing: '#FF9500', fixed: '#34C759', error: '#FF3B30' };
const color = colors[msg.status] || '#FFD60A';
addChat('error', `\u26A0\uFE0F Watchdog: ${msg.message}`, `system — ${msg.status}`);
addLog('warn', 'server', `Watchdog: ${msg.message}`);
return;
}
if (msg.type === 'chat_final') {
addChat('received', msg.text, 'chat:final');
return;
@@ -991,6 +1238,113 @@
}, 120000);
}
// ── XTTS Panel ─────────────────────────────
function toggleXTTSPanel() {
const engine = document.getElementById('diag-tts-engine').value;
document.getElementById('piper-panel').style.display = engine === 'piper' ? 'block' : 'none';
document.getElementById('xtts-panel').style.display = engine === 'xtts' ? 'block' : 'none';
if (engine === 'xtts') loadXTTSVoices();
}
function loadXTTSVoices() {
sendToRVS_raw({ type: 'xtts_list_voices', payload: {}, timestamp: Date.now() });
}
async function uploadVoiceSamples() {
const name = document.getElementById('xtts-clone-name').value.trim();
const files = document.getElementById('xtts-clone-files').files;
if (!name) { alert('Bitte einen Namen eingeben'); return; }
if (!files || files.length === 0) { alert('Bitte Audio-Dateien auswaehlen'); return; }
document.getElementById('xtts-clone-status').textContent = `Lade ${files.length} Datei(en) hoch...`;
const samples = [];
for (const file of files) {
const buffer = await file.arrayBuffer();
const base64 = btoa(String.fromCharCode(...new Uint8Array(buffer)));
samples.push({ base64, name: file.name, size: file.size });
}
const totalSize = samples.reduce((s, f) => s + f.size, 0);
document.getElementById('xtts-clone-status').textContent =
`Sende ${samples.length} Sample(s) (${(totalSize / 1024).toFixed(0)}KB) an XTTS-Server...`;
sendToRVS_raw({
type: 'voice_upload',
payload: { name, samples },
timestamp: Date.now(),
});
}
// ── Abbrechen ──────────────────────────────
function cancelRequest() {
send({ action: 'cancel_request' });
updateThinkingIndicator({ activity: 'idle' });
addChat('error', 'Anfrage abgebrochen', 'system');
}
// ── Stimmen-Config ──────────────────────────
function sendVoiceConfig() {
const defaultVoice = document.getElementById('diag-default-voice').value;
const highlightVoice = document.getElementById('diag-highlight-voice').value;
const ttsEnabled = document.getElementById('diag-tts-enabled').checked;
const speedRamona = parseFloat(document.getElementById('diag-speed-ramona').value);
const speedThorsten = parseFloat(document.getElementById('diag-speed-thorsten').value);
const ttsEngine = document.getElementById('diag-tts-engine').value;
const xttsVoice = document.getElementById('diag-xtts-voice').value;
send({ action: 'send_voice_config', defaultVoice, highlightVoice, ttsEnabled, speedRamona, speedThorsten, ttsEngine, xttsVoice });
}
// ── Highlight-Trigger ────────────────────────
function loadHighlightTriggers() {
send({ action: 'get_triggers' });
}
function saveHighlightTriggers() {
const text = document.getElementById('highlight-triggers').value;
const triggers = text.split('\n').map(t => t.trim()).filter(t => t.length > 0);
send({ action: 'save_triggers', triggers });
document.getElementById('trigger-status').textContent = 'Gespeichert (' + triggers.length + ' Trigger)';
document.getElementById('trigger-status').style.color = '#34C759';
}
// Beim Tab-Wechsel zu Einstellungen: Trigger laden
const origSwitchMainTab = typeof switchMainTab === 'function' ? switchMainTab : null;
// ── Modus-Wechsel ────────────────────────────
let currentMode = 'normal';
const MODE_LABELS = { normal: 'Normal', dnd: 'Nicht stoeren', whisper: 'Fluestern', hangar: 'Hangar', gaming: 'Gaming' };
function setMode(mode) {
currentMode = mode;
// Visuelles Feedback
document.querySelectorAll('.mode-btn').forEach(btn => {
btn.style.borderColor = btn.dataset.mode === mode ? '#0096FF' : 'transparent';
});
document.getElementById('mode-status').textContent = `Aktueller Modus: ${MODE_LABELS[mode] || mode}`;
// An Bridge senden via RVS
sendToRVS(`ARIA, ${MODE_LABELS[mode]}-Modus`, false);
log("info", "server", `Modus gewechselt: ${mode}`);
}
// ── TTS Diagnose ─────────────────────────────
function ttsLog(msg) {
const el = document.getElementById('tts-log');
const time = new Date().toLocaleTimeString('de-DE');
el.innerHTML += `<div>[${time}] ${escapeHtml(msg)}</div>`;
el.scrollTop = el.scrollHeight;
}
function testTTS(voice) {
const text = document.getElementById('tts-test-text').value.trim();
if (!text) return;
ttsLog(`Teste ${voice}: "${text}"...`);
send({ action: 'test_tts', voice, text });
}
function checkTTSStatus() {
ttsLog('Pruefe TTS-Status...');
send({ action: 'check_tts' });
}
function openLightbox(mediaType, url) {
const lb = document.getElementById('lightbox');
if (mediaType === 'video') {
@@ -1389,6 +1743,11 @@
document.querySelectorAll('.main-nav-btn').forEach(b => {
if (b.textContent.trim().toLowerCase().includes(tab === 'main' ? 'main' : 'einstellung')) b.classList.add('active');
});
// Einstellungen: Config + Trigger laden
if (tab === 'settings') {
loadHighlightTriggers();
send({ action: 'get_voice_config' });
}
}
// ── Einstellungen: Tool-Berechtigungen ──────────────────
+248 -42
View File
@@ -336,6 +336,7 @@ function handleGatewayMessage(msg) {
// Genereller Activity-Heartbeat (ARIA denkt)
broadcast({ type: "agent_activity", activity: stream || "thinking" });
updateAgentActivity();
return;
}
@@ -352,6 +353,13 @@ function handleGatewayMessage(msg) {
if (pipelineActive) pipelineEnd(true, `"${text.slice(0, 120)}"`);
broadcast({ type: "chat_final", text, payload });
broadcast({ type: "agent_activity", activity: "idle" });
pendingMessageTime = 0; // Watchdog: Antwort erhalten
updateAgentActivity();
// Antwort in Backup-Log schreiben
try {
const entry = JSON.stringify({ ts: Date.now(), role: "assistant", text: text.slice(0, 2000), session: activeSessionKey }) + "\n";
fs.appendFileSync("/shared/config/chat_backup.jsonl", entry);
} catch {}
return;
}
@@ -424,6 +432,13 @@ function sendToGateway(text, isPipeline) {
const payload = JSON.stringify(msg);
log("debug", "gateway", `RAW >>> ${payload}`);
gatewayWs.send(payload);
pendingMessageTime = Date.now(); // Watchdog: Nachricht gesendet
// Nachricht sofort in Backup-Log schreiben (OpenClaw speichert erst nach Run-Ende)
try {
fs.mkdirSync("/shared/config", { recursive: true });
const entry = JSON.stringify({ ts: Date.now(), role: "user", text, session: activeSessionKey }) + "\n";
fs.appendFileSync("/shared/config/chat_backup.jsonl", entry);
} catch {}
log("info", "gateway", `chat.send [${reqId}]: "${text}"`);
if (isPipeline) plog(`chat.send [${reqId}] an Gateway gesendet — warte auf ACK...`);
@@ -545,55 +560,35 @@ function connectRVS(forcePlain) {
});
}
function sendToRVS(text, isPipeline) {
if (!RVS_HOST || !RVS_TOKEN) {
log("error", "rvs", "Nicht konfiguriert");
if (isPipeline) pipelineEnd(false, "RVS nicht konfiguriert");
return false;
}
// Frische WebSocket-Verbindung fuer jede Nachricht (Zombie-Schutz)
function sendToRVS_raw(msgObj) {
if (!RVS_HOST || !RVS_TOKEN) return;
const proto = RVS_TLS === "true" ? "wss" : "ws";
const url = `${proto}://${RVS_HOST}:${RVS_PORT}?token=${RVS_TOKEN}`;
const msg = JSON.stringify({
const freshWs = new WebSocket(url);
freshWs.on("open", () => {
freshWs.send(JSON.stringify(msgObj));
setTimeout(() => { try { freshWs.close(); } catch (_) {} }, 5000);
});
freshWs.on("error", () => {});
}
function sendToRVS(text, isPipeline) {
// Ueber Gateway senden (zuverlaessig) UND an RVS fuer App-Sichtbarkeit
// Die Bridge empfaengt RVS-Nachrichten von der App zuverlaessig,
// aber die Diagnostic→RVS→Bridge Route hat Zombie-Probleme.
// Deshalb: Gateway fuer ARIA, RVS nur fuer App-Anzeige.
// 1. An Gateway senden (damit ARIA antwortet)
const gatewayOk = sendToGateway(text, isPipeline);
// 2. An RVS senden (damit die App die Nachricht sieht)
sendToRVS_raw({
type: "chat",
payload: { text, sender: "diagnostic" },
timestamp: Date.now(),
});
log("info", "rvs", `Sende via frische Verbindung: ${url.split('?')[0]}`);
const freshWs = new WebSocket(url);
freshWs.on("open", () => {
freshWs.send(msg);
log("info", "rvs", `Gesendet via RVS: "${text}"`);
// Verbindung offen lassen fuer Antwort-Empfang, nach 5min schliessen
setTimeout(() => { try { freshWs.close(); } catch (_) {} }, 300000);
});
freshWs.on("message", (raw) => {
try {
const resp = JSON.parse(raw.toString());
if (resp.type === "chat" && resp.payload) {
const sender = resp.payload.sender || "?";
// Eigene Nachrichten und STT ignorieren (werden von persistenter Verbindung gehandelt)
if (sender === "diagnostic" || sender === "stt") return;
log("info", "rvs", `Chat von ${sender}: "${(resp.payload.text || "").slice(0, 100)}"`);
if (pipelineActive && sender !== "diagnostic") {
pipelineEnd(true, `Antwort via RVS von ${sender}: "${(resp.payload.text || "").slice(0, 120)}"`);
}
broadcast({ type: "rvs_chat", msg: resp });
} else if (resp.type !== "heartbeat") {
log("debug", "rvs", `Nachricht: ${JSON.stringify(resp).slice(0, 150)}`);
}
} catch {}
});
freshWs.on("error", (err) => {
log("error", "rvs", `Sende-Fehler: ${err.message}`);
if (isPipeline) pipelineEnd(false, `RVS Fehler: ${err.message}`);
});
if (isPipeline) plog(`Nachricht an RVS gesendet — warte auf Antwort via RVS...`);
return true;
return gatewayOk;
}
// ── Claude Proxy Test ────────────────────────────────────
@@ -1017,6 +1012,64 @@ function waitForMessage(ws, timeoutMs) {
});
}
// ── Watchdog: Stuck Run Erkennung ────────────────────────
let lastAgentActivity = Date.now();
let watchdogWarned = false;
let watchdogFixAttempted = false;
let pendingMessageTime = 0; // Wann wurde die letzte Nachricht gesendet
function updateAgentActivity() {
lastAgentActivity = Date.now();
watchdogWarned = false;
}
// Watchdog prüft alle 30s ob ARIA nach einer gesendeten Nachricht reagiert
setInterval(async () => {
if (pendingMessageTime === 0) return; // Keine Nachricht gesendet
const waitingMs = Date.now() - pendingMessageTime;
// Nach 2min ohne Agent-Activity: Warnung
if (waitingMs > 120000 && !watchdogWarned) {
watchdogWarned = true;
log("warn", "server", `Watchdog: Keine ARIA-Aktivitaet seit ${Math.round(waitingMs / 1000)}s — moeglicherweise stuck`);
broadcast({ type: "watchdog", status: "warning", waitingMs, message: "ARIA reagiert nicht — moeglicherweise stuck Run" });
}
// Nach 5min: doctor --fix
if (waitingMs > 300000 && watchdogWarned && !watchdogFixAttempted) {
watchdogFixAttempted = true;
log("error", "server", "Watchdog: 5min ohne Antwort — fuehre openclaw doctor --fix aus");
broadcast({ type: "watchdog", status: "fixing", message: "Auto-Fix: openclaw doctor --fix" });
try {
await dockerExec("aria-core", "openclaw doctor --fix 2>/dev/null || true");
log("info", "server", "Watchdog: doctor --fix ausgefuehrt");
broadcast({ type: "watchdog", status: "fixed", message: "doctor --fix ausgefuehrt — warte auf Antwort..." });
} catch (err) {
log("error", "server", `Watchdog: doctor --fix fehlgeschlagen: ${err.message}`);
}
}
// Nach 8min: Container neustarten
if (waitingMs > 480000 && watchdogFixAttempted) {
log("error", "server", "Watchdog: 8min ohne Antwort — starte aria-core + aria-proxy neu");
broadcast({ type: "watchdog", status: "restarting", message: "Container-Restart: aria-core + aria-proxy" });
try {
const { execSync } = require("child_process");
execSync("docker restart aria-core aria-proxy", { timeout: 60000 });
log("info", "server", "Watchdog: Container neugestartet");
broadcast({ type: "watchdog", status: "restarted", message: "Container neugestartet — warte auf Gateway-Reconnect..." });
// Gateway wird sich automatisch neu verbinden
} catch (err) {
log("error", "server", `Watchdog: Container-Restart fehlgeschlagen: ${err.message}`);
broadcast({ type: "watchdog", status: "error", message: `Restart fehlgeschlagen: ${err.message}` });
}
pendingMessageTime = 0;
watchdogWarned = false;
watchdogFixAttempted = false;
}
}, 30000);
// ── HTTP Server + WebSocket fuer Browser ────────────────
const htmlPath = path.join(__dirname, "index.html");
@@ -1103,6 +1156,42 @@ wss.on("connection", (ws) => {
if (ws._sshSock) ws._sshSock.write(msg.data);
} else if (msg.action === "live_ssh_close") {
if (ws._sshSock) { ws._sshSock.end(); ws._sshSock = null; }
} else if (msg.action === "cancel_request") {
// Laufende Anfrage abbrechen — doctor --fix beendet stuck runs
log("warn", "server", "Anfrage abgebrochen — fuehre doctor --fix aus");
pendingMessageTime = 0;
watchdogWarned = false;
watchdogFixAttempted = false;
if (pipelineActive) pipelineEnd(false, "Vom Benutzer abgebrochen");
broadcast({ type: "agent_activity", activity: "idle" });
dockerExec("aria-core", "openclaw doctor --fix 2>/dev/null || true").catch(() => {});
} else if (msg.action === "get_voice_config") {
handleGetVoiceConfig(ws);
} else if (msg.action === "send_voice_config") {
// Stimmen-Config persistent speichern + an Bridge via RVS senden
const voiceConfig = {
defaultVoice: msg.defaultVoice || "ramona",
highlightVoice: msg.highlightVoice || "thorsten",
ttsEnabled: msg.ttsEnabled !== false,
ttsEngine: msg.ttsEngine || "piper",
xttsVoice: msg.xttsVoice || "",
speedRamona: msg.speedRamona || 1.0,
speedThorsten: msg.speedThorsten || 1.0,
};
try {
fs.mkdirSync("/shared/config", { recursive: true });
fs.writeFileSync("/shared/config/voice_config.json", JSON.stringify(voiceConfig, null, 2));
} catch {}
sendToRVS_raw({ type: "config", payload: voiceConfig, timestamp: Date.now() });
log("info", "server", `Voice-Config gespeichert+gesendet: default=${voiceConfig.defaultVoice}, highlight=${voiceConfig.highlightVoice}, tts=${voiceConfig.ttsEnabled}`);
} else if (msg.action === "get_triggers") {
handleGetTriggers(ws);
} else if (msg.action === "save_triggers") {
handleSaveTriggers(ws, msg.triggers || []);
} else if (msg.action === "test_tts") {
handleTestTTS(ws, msg.voice || "ramona", msg.text || "Test");
} else if (msg.action === "check_tts") {
handleCheckTTS(ws);
} else if (msg.action === "check_desktop") {
checkDesktopAvailable(ws);
} else if (msg.action === "load_chat_history") {
@@ -1229,6 +1318,123 @@ function startLiveSSH(clientWs) {
createReq.end(createBody);
}
// ── Voice-Config laden ────────────────────────────────
function handleGetVoiceConfig(clientWs) {
try {
const configPath = "/shared/config/voice_config.json";
if (fs.existsSync(configPath)) {
const config = JSON.parse(fs.readFileSync(configPath, "utf-8"));
clientWs.send(JSON.stringify({ type: "voice_config", ...config }));
} else {
clientWs.send(JSON.stringify({ type: "voice_config", defaultVoice: "ramona", highlightVoice: "thorsten", ttsEnabled: true }));
}
} catch (err) {
clientWs.send(JSON.stringify({ type: "voice_config", defaultVoice: "ramona", highlightVoice: "thorsten", ttsEnabled: true }));
}
}
// ── Highlight-Trigger ─────────────────────────────────
const TRIGGERS_FILE = "/shared/config/highlight_triggers.json";
async function handleGetTriggers(clientWs) {
try {
// Zuerst aus Shared Volume lesen, dann Fallback auf Bridge-Defaults
let triggers;
if (fs.existsSync(TRIGGERS_FILE)) {
triggers = JSON.parse(fs.readFileSync(TRIGGERS_FILE, "utf-8"));
} else {
// Defaults aus der Bridge lesen
const result = await dockerExec("aria-bridge", `python3 -c "
import sys; sys.path.insert(0,'/app')
from aria_bridge import EPIC_TRIGGERS
print('\\n'.join(EPIC_TRIGGERS))
"`);
triggers = result.trim().split("\n").filter(t => t);
}
clientWs.send(JSON.stringify({ type: "trigger_list", triggers }));
} catch (err) {
clientWs.send(JSON.stringify({ type: "trigger_list", triggers: [], error: err.message }));
}
}
async function handleSaveTriggers(clientWs, triggers) {
try {
// In Shared Volume speichern (fuer Bridge lesbar)
fs.mkdirSync("/shared/config", { recursive: true });
fs.writeFileSync(TRIGGERS_FILE, JSON.stringify(triggers, null, 2));
log("info", "server", `${triggers.length} Highlight-Trigger gespeichert`);
// Bridge informieren (wird beim naechsten Start geladen)
clientWs.send(JSON.stringify({ type: "trigger_list", triggers }));
} catch (err) {
log("error", "server", `Trigger speichern fehlgeschlagen: ${err.message}`);
}
}
// ── TTS Diagnose ──────────────────────────────────────
async function handleTestTTS(clientWs, voice, text) {
try {
log("info", "server", `TTS-Test: ${voice} — "${text}"`);
const result = await dockerExec("aria-bridge", `python3 -c "
import time, sys
sys.path.insert(0, '/app')
from piper import PiperVoice
import wave, tempfile, os
voices = {'ramona': '/voices/de_DE-ramona-low.onnx', 'thorsten': '/voices/de_DE-thorsten-high.onnx'}
path = voices.get('${voice}')
if not path or not os.path.exists(path):
print('FEHLER: Stimme nicht gefunden')
sys.exit(1)
v = PiperVoice.load(path)
start = time.time()
tmp = tempfile.NamedTemporaryFile(suffix='.wav', delete=False)
with wave.open(tmp.name, 'wb') as wf:
wf.setnchannels(1)
wf.setsampwidth(2)
wf.setframerate(v.config.sample_rate)
v.synthesize('${text.replace(/'/g, "\\'")}', wf)
size = os.path.getsize(tmp.name)
dur = int((time.time() - start) * 1000)
os.unlink(tmp.name)
print(f'OK:{dur}:{size}')
"`);
const parts = result.trim().split(":");
if (parts[0] === "OK") {
clientWs.send(JSON.stringify({ type: "tts_result", ok: true, voice, duration: parts[1], size: parts[2] }));
} else {
clientWs.send(JSON.stringify({ type: "tts_result", ok: false, voice, error: result.trim() }));
}
} catch (err) {
clientWs.send(JSON.stringify({ type: "tts_result", ok: false, voice, error: err.message }));
}
}
async function handleCheckTTS(clientWs) {
try {
const result = await dockerExec("aria-bridge", `python3 -c "
import os, json
voices = {}
for name, path in [('ramona', '/voices/de_DE-ramona-low.onnx'), ('thorsten', '/voices/de_DE-thorsten-high.onnx')]:
voices[name] = os.path.exists(path)
print(json.dumps(voices))
"`);
const voices = JSON.parse(result.trim());
const available = Object.entries(voices).filter(([,v]) => v).map(([k]) => k);
const missing = Object.entries(voices).filter(([,v]) => !v).map(([k]) => k);
clientWs.send(JSON.stringify({
type: "tts_status",
ok: missing.length === 0,
voices: available,
defaultVoice: "ramona",
highlightVoice: "thorsten",
error: missing.length > 0 ? `Fehlend: ${missing.join(", ")}` : null,
}));
} catch (err) {
clientWs.send(JSON.stringify({ type: "tts_status", ok: false, error: err.message }));
}
}
function checkDesktopAvailable(clientWs) {
// Pruefen ob VNC auf der VM laeuft (Port 5900/5901)
const checkSock = net.connect({ host: "host.docker.internal", port: 5901 }, () => {
+1 -1
View File
@@ -100,7 +100,7 @@ services:
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
- ./aria-data/config/diag-state:/data # Persistenter State (aktive Session etc.)
- aria-shared:/shared:ro # Shared Volume (Uploads lesen fuer Vorschau)
- aria-shared:/shared # Shared Volume (Uploads + Config)
environment:
- ARIA_AUTH_TOKEN=${ARIA_AUTH_TOKEN:-}
- PROXY_URL=http://proxy:3456
+30 -19
View File
@@ -1,25 +1,36 @@
bildupload ghet noch nicht.
# ARIA Issues & Features
#erledigt
sprachnachrichten werden nicht als zweite nachricht dargestellt, damit man weiß was man gesendet hat
# ende
## Erledigt
- [x] Bildupload funktioniert (Shared Volume /shared/uploads/)
- [x] Sprachnachrichten werden als Text angezeigt (STT → Chat-Bubble)
- [x] Cache leeren + Auto-Download von Anhaengen
- [x] ARIA liest Nachrichten vor (TTS via Piper)
- [x] Autoscroll zur letzten Nachricht
- [x] Bilder im Chat groesser + Vollbild-Vorschau
- [x] Ohr-Button Absturz gefixt (LiveAudioStream entfernt, Phase 1 Placeholder)
- [x] Play-Button in ARIA-Nachrichten fuer Sprachwiedergabe
- [x] Chat-Suche in der App (Lupe in Statusleiste)
- [x] Watchdog mit Container-Restart (2min Warnung → 5min doctor --fix → 8min Restart)
- [x] Abbrechen-Button im Diagnostic Chat
- [x] Nachrichten Backup on-the-fly (/shared/config/chat_backup.jsonl)
- [x] Grosse Nachrichten satzweise aufteilen fuer TTS
- [x] RVS Nachrichten vom Smartphone gehen durch
- [x] Stimmen-Einstellungen (Ramona/Thorsten, Speed pro Stimme)
- [x] Highlight-Trigger konfigurierbar in Diagnostic
cache leeren, bilder werden nicht neu geladen beim antippen.
autoload geht nicht
## Offen
wenn man auf das ohr zum hören klickt stürzt ab
### TTS / Stimmen
- [ ] TTS Engine waehlbar: Piper (CPU, schnell) oder Coqui XTTS v2 (GPU, natuerlicher)
- [ ] Piper Voices Download ueber Diagnostic (neue Sprachen/Stimmen)
- [ ] Coqui XTTS v2 Integration (braucht GPU, bessere deutsche Stimme)
aria liest die nachrichten nicht vor
### App
- [ ] Wake Word on-device (Porcupine "ARIA" Keyword, Phase 2)
- [ ] Chat-History zuverlaessiger laden (AsyncStorage Race Condition)
# erledigt autoscroll geht doch noch nicht zur letzten nachricht
unserer memory brain
# ende
bilder im chat größer darstellen
# ende
die viper voices downloaden über die diagnostic
# ende
### Architektur
- [ ] Bilder: Claude Vision direkt nutzen (aktuell nur Dateipfad an ARIA)
- [ ] Auto-Compacting und Memory/Brain Verwaltung (SQLite?)
- [ ] Diagnostic: System-Info Tab (Container-Status, Disk, RAM, CPU)
+17
View File
@@ -170,6 +170,22 @@ else
exit 1
fi
# ── Auto-Update: APK auf RVS-Server kopieren ─
RVS_UPDATE_HOST="${RVS_UPDATE_HOST:-}"
if [ -n "$RVS_UPDATE_HOST" ]; then
echo -e "${GREEN}[6/6] APK auf RVS-Server kopieren (Auto-Update)...${NC}"
scp "$APK_PATH" "${RVS_UPDATE_HOST}:~/ARIA-AGENT/rvs/updates/${APK_NAME}" 2>/dev/null
if [ $? -eq 0 ]; then
echo -e " ${GREEN}${NC} APK auf RVS-Server kopiert — Apps werden benachrichtigt"
else
echo -e " ${YELLOW}APK konnte nicht auf RVS kopiert werden (RVS_UPDATE_HOST=$RVS_UPDATE_HOST)${NC}"
echo -e " ${YELLOW}Manuell: scp $APK_PATH $RVS_UPDATE_HOST:~/ARIA-AGENT/rvs/updates/${APK_NAME}${NC}"
fi
else
echo -e "${YELLOW}Auto-Update uebersprungen (RVS_UPDATE_HOST nicht gesetzt)${NC}"
echo -e "${YELLOW}Setze RVS_UPDATE_HOST in .env fuer automatische Verteilung${NC}"
fi
# ── Fertig ────────────────────────────────────
echo ""
echo -e "${GREEN}╔═══════════════════════════════════════════════════╗${NC}"
@@ -177,4 +193,5 @@ echo -e "${GREEN}║ Release $TAG ist live!$(printf '%*s' $((27 - ${#TAG})) ''
echo -e "${GREEN}╠═══════════════════════════════════════════════════╣${NC}"
echo -e "${GREEN}${NC} $GITEA_URL/$GITEA_REPO/releases/tag/$TAG"
echo -e "${GREEN}${NC} APK: $APK_NAME ($APK_SIZE)"
echo -e "${GREEN}${NC} Auto-Update: ${RVS_UPDATE_HOST:-nicht konfiguriert}"
echo -e "${GREEN}╚═══════════════════════════════════════════════════╝${NC}"
+2
View File
@@ -4,5 +4,7 @@ services:
ports:
- "${RVS_PORT:-443}:3000"
restart: always
volumes:
- ./updates:/updates # APK-Dateien fuer Auto-Update
environment:
- MAX_SESSIONS=10
+113 -1
View File
@@ -1,15 +1,21 @@
"use strict";
const { WebSocketServer } = require("ws");
const fs = require("fs");
const path = require("path");
// ── Konfiguration aus Umgebungsvariablen ────────────────────────────
const PORT = parseInt(process.env.PORT || "3000", 10);
const MAX_SESSIONS = parseInt(process.env.MAX_SESSIONS || "10", 10);
const UPDATES_DIR = process.env.UPDATES_DIR || "/updates";
// Kein Polling — APK wird manuell per git pull bereitgestellt
// Erlaubte Nachrichtentypen — alles andere wird verworfen
const ALLOWED_TYPES = new Set([
"chat", "audio", "file", "location", "mode", "log", "event", "heartbeat",
"file_request", "file_response", "file_saved", "stt_result",
"file_request", "file_response", "file_saved", "stt_result", "config", "tts_request",
"xtts_request", "xtts_response", "xtts_list_voices", "xtts_voices_list", "voice_upload", "xtts_voice_saved",
"update_check", "update_available", "update_download", "update_data",
]);
// Token-Raum: token -> { clients: Set<ws> }
@@ -46,6 +52,9 @@ const wss = new WebSocketServer({ port: PORT });
wss.on("listening", () => {
log(`RVS läuft auf Port ${PORT} | Max Sessions: ${MAX_SESSIONS}`);
// Beim Start pruefen ob eine APK da ist
const apkInfo = getLatestAPK();
if (apkInfo) log(`APK bereit: v${apkInfo.version} (${(fs.statSync(apkInfo.path).size / 1024 / 1024).toFixed(1)}MB)`);
});
wss.on("connection", (ws, req) => {
@@ -107,6 +116,52 @@ function registerClient(ws, token) {
return;
}
// Update-Check: direkt an den anfragenden Client antworten (nicht relay'en)
if (msg.type === "update_check") {
const clientVersion = msg.payload?.version || "0.0.0.0";
const apkInfo = getLatestAPK();
if (apkInfo && compareVersions(apkInfo.version, clientVersion) > 0) {
ws.send(JSON.stringify({
type: "update_available",
payload: {
version: apkInfo.version,
downloadUrl: `/update/latest.apk`,
size: fs.statSync(apkInfo.path).size,
},
timestamp: Date.now(),
}));
}
return;
}
// Update-Download: APK als Base64 ueber WebSocket senden
if (msg.type === "update_download") {
const apkInfo = getLatestAPK();
if (!apkInfo) {
ws.send(JSON.stringify({ type: "update_data", payload: { error: "Keine APK verfuegbar" }, timestamp: Date.now() }));
return;
}
try {
const data = fs.readFileSync(apkInfo.path);
const base64 = data.toString("base64");
const sizeMB = (data.length / 1024 / 1024).toFixed(1);
log(`APK sende: v${apkInfo.version} (${sizeMB}MB) an Client`);
ws.send(JSON.stringify({
type: "update_data",
payload: {
version: apkInfo.version,
base64,
size: data.length,
fileName: `ARIA-v${apkInfo.version}.apk`,
},
timestamp: Date.now(),
}));
} catch (err) {
ws.send(JSON.stringify({ type: "update_data", payload: { error: err.message }, timestamp: Date.now() }));
}
return;
}
// An alle anderen Clients im Raum weiterleiten
for (const client of room.clients) {
if (client !== ws && client.readyState === 1) {
@@ -167,6 +222,63 @@ wss.on("close", () => {
clearInterval(cleanup);
});
// ── Auto-Update: APK-Erkennung + Push ──────────────────────────────
let latestVersion = null;
function getLatestAPK() {
try {
if (!fs.existsSync(UPDATES_DIR)) return null;
const files = fs.readdirSync(UPDATES_DIR)
.filter(f => f.endsWith(".apk"))
.map(f => {
// ARIA-v0.0.2.3.apk oder ARIA-Cockpit-release.apk
const match = f.match(/(\d+\.\d+\.\d+[\.\d]*)/);
return { file: f, path: path.join(UPDATES_DIR, f), version: match ? match[1] : null };
})
.filter(f => f.version)
.sort((a, b) => compareVersions(b.version, a.version)); // Neueste zuerst
return files[0] || null;
} catch {
return null;
}
}
function compareVersions(a, b) {
const pa = a.split(".").map(Number);
const pb = b.split(".").map(Number);
for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
const diff = (pa[i] || 0) - (pb[i] || 0);
if (diff !== 0) return diff;
}
return 0;
}
function notifyClientsAboutUpdate(apkInfo) {
const msg = JSON.stringify({
type: "update_available",
payload: {
version: apkInfo.version,
downloadUrl: `/update/latest.apk`,
size: fs.statSync(apkInfo.path).size,
},
timestamp: Date.now(),
});
// An alle Clients in allen Rooms senden
for (const [, room] of rooms) {
for (const client of room.clients) {
if (client.readyState === 1) {
client.send(msg);
}
}
}
log(`Update-Benachrichtigung gesendet: v${apkInfo.version} (${rooms.size} Raum/Raeume)`);
}
// Kein Polling — Update-Check passiert on-demand (update_check Message von App)
// ── Sauberes Herunterfahren ─────────────────────────────────────────
process.on("SIGTERM", () => {
View File
+11
View File
@@ -0,0 +1,11 @@
# ════════════════════════════════════════════════
# ARIA XTTS v2 — Konfiguration
# Kopieren nach .env und anpassen
# ════════════════════════════════════════════════
# RVS Verbindung (gleiche Daten wie auf der ARIA-VM)
RVS_HOST=mobil.hacker-net.de
RVS_PORT=444
RVS_TLS=true
RVS_TLS_FALLBACK=true
RVS_TOKEN=dein_token_hier
+5
View File
@@ -0,0 +1,5 @@
FROM node:22-alpine
WORKDIR /app
COPY bridge.js package.json ./
RUN npm install --production
CMD ["node", "bridge.js"]
+268
View File
@@ -0,0 +1,268 @@
/**
* ARIA XTTS Bridge — Verbindet XTTS v2 Server mit dem RVS
*
* Empfaengt tts_request ueber RVS → rendert Audio via XTTS API → sendet zurueck
* Empfaengt voice_upload → speichert Voice-Sample fuer Cloning
* Empfaengt xtts_list_voices → listet verfuegbare Stimmen
*/
const WebSocket = require("ws");
const http = require("http");
const https = require("https");
const fs = require("fs");
const path = require("path");
const XTTS_API_URL = process.env.XTTS_API_URL || "http://xtts:8000";
const RVS_HOST = process.env.RVS_HOST || "";
const RVS_PORT = process.env.RVS_PORT || "443";
const RVS_TLS = process.env.RVS_TLS || "true";
const RVS_TLS_FALLBACK = process.env.RVS_TLS_FALLBACK || "true";
const RVS_TOKEN = process.env.RVS_TOKEN || "";
const VOICES_DIR = "/voices";
function log(msg) {
console.log(`[${new Date().toISOString()}] ${msg}`);
}
// ── RVS Verbindung ──────────────────────────────────
let rvsWs = null;
let retryDelay = 2;
function connectRVS(forcePlain) {
if (!RVS_HOST || !RVS_TOKEN) {
log("RVS nicht konfiguriert — beende");
process.exit(1);
}
const useTls = RVS_TLS === "true" && !forcePlain;
const proto = useTls ? "wss" : "ws";
const url = `${proto}://${RVS_HOST}:${RVS_PORT}?token=${RVS_TOKEN}`;
log(`Verbinde zu RVS: ${proto}://${RVS_HOST}:${RVS_PORT}`);
const ws = new WebSocket(url);
ws.on("open", () => {
log("RVS verbunden — warte auf TTS-Requests");
rvsWs = ws;
retryDelay = 2;
// Keepalive
setInterval(() => {
if (ws.readyState === WebSocket.OPEN) {
ws.ping();
ws.send(JSON.stringify({ type: "heartbeat", timestamp: Date.now() }));
}
}, 25000);
});
ws.on("message", async (raw) => {
try {
const msg = JSON.parse(raw.toString());
if (msg.type === "xtts_request") {
await handleTTSRequest(msg.payload);
} else if (msg.type === "voice_upload") {
await handleVoiceUpload(msg.payload);
} else if (msg.type === "xtts_list_voices") {
await handleListVoices();
}
} catch (err) {
log(`Fehler: ${err.message}`);
}
});
ws.on("close", () => {
log("RVS Verbindung geschlossen");
rvsWs = null;
setTimeout(() => connectRVS(), Math.min(retryDelay * 1000, 30000));
retryDelay = Math.min(retryDelay * 2, 30);
});
ws.on("error", (err) => {
log(`RVS Fehler: ${err.message}`);
if (useTls && RVS_TLS_FALLBACK === "true") {
log("TLS fehlgeschlagen — Fallback auf ws://");
ws.removeAllListeners();
try { ws.close(); } catch (_) {}
connectRVS(true);
}
});
}
// ── TTS Request Handler ─────────────────────────────
async function handleTTSRequest(payload) {
const { text, voice, requestId, language } = payload;
if (!text) return;
log(`TTS-Request: "${text.slice(0, 60)}..." (voice: ${voice || "default"}, lang: ${language || "de"})`);
try {
// Voice-Sample Pfad bestimmen
const voiceSample = voice ? path.join(VOICES_DIR, `${voice}.wav`) : null;
const hasCustomVoice = voiceSample && fs.existsSync(voiceSample);
// XTTS API aufrufen
const audioBuffer = await callXTTSAPI(text, language || "de", hasCustomVoice ? voiceSample : null);
if (audioBuffer && audioBuffer.length > 100) {
const base64 = audioBuffer.toString("base64");
log(`TTS fertig: ${audioBuffer.length} bytes (${(audioBuffer.length / 1024).toFixed(0)}KB)`);
sendToRVS({
type: "xtts_response",
payload: {
requestId: requestId || "",
base64,
mimeType: "audio/wav",
voice: voice || "default",
engine: "xtts",
},
timestamp: Date.now(),
});
} else {
log("TTS: Leeres Audio erhalten");
sendToRVS({
type: "xtts_response",
payload: { requestId, error: "Leeres Audio" },
timestamp: Date.now(),
});
}
} catch (err) {
log(`TTS Fehler: ${err.message}`);
sendToRVS({
type: "xtts_response",
payload: { requestId, error: err.message },
timestamp: Date.now(),
});
}
}
function callXTTSAPI(text, language, speakerWav) {
return new Promise((resolve, reject) => {
const body = JSON.stringify({
text,
language,
speaker_wav: speakerWav || "",
});
const url = new URL(`${XTTS_API_URL}/tts_to_audio/`);
const options = {
hostname: url.hostname,
port: url.port,
path: url.pathname,
method: "POST",
headers: {
"Content-Type": "application/json",
"Content-Length": Buffer.byteLength(body),
},
timeout: 60000,
};
const req = http.request(options, (res) => {
const chunks = [];
res.on("data", (chunk) => chunks.push(chunk));
res.on("end", () => {
if (res.statusCode === 200) {
resolve(Buffer.concat(chunks));
} else {
reject(new Error(`XTTS API HTTP ${res.statusCode}: ${Buffer.concat(chunks).toString().slice(0, 200)}`));
}
});
});
req.on("error", reject);
req.on("timeout", () => { req.destroy(); reject(new Error("XTTS API Timeout (60s)")); });
req.write(body);
req.end();
});
}
// ── Voice Upload Handler ────────────────────────────
async function handleVoiceUpload(payload) {
const { name, samples } = payload;
if (!name || !samples || !Array.isArray(samples) || samples.length === 0) {
log("Voice Upload: Ungueltige Daten");
return;
}
log(`Voice Upload: "${name}" (${samples.length} Samples)`);
try {
// Alle Samples zusammenfuegen
const buffers = samples.map(s => Buffer.from(s.base64, "base64"));
const combined = Buffer.concat(buffers);
// Als WAV speichern
fs.mkdirSync(VOICES_DIR, { recursive: true });
const filePath = path.join(VOICES_DIR, `${name.replace(/[^a-zA-Z0-9_-]/g, "_")}.wav`);
fs.writeFileSync(filePath, combined);
log(`Voice gespeichert: ${filePath} (${(combined.length / 1024).toFixed(0)}KB)`);
sendToRVS({
type: "xtts_voice_saved",
payload: { name, size: combined.length, path: filePath },
timestamp: Date.now(),
});
} catch (err) {
log(`Voice Upload Fehler: ${err.message}`);
}
}
// ── Voice List Handler ──────────────────────────────
async function handleListVoices() {
try {
const files = fs.existsSync(VOICES_DIR)
? fs.readdirSync(VOICES_DIR).filter(f => f.endsWith(".wav"))
: [];
const voices = files.map(f => ({
name: path.basename(f, ".wav"),
file: f,
size: fs.statSync(path.join(VOICES_DIR, f)).size,
}));
log(`Stimmen: ${voices.length} verfuegbar`);
sendToRVS({
type: "xtts_voices_list",
payload: { voices },
timestamp: Date.now(),
});
} catch (err) {
log(`Stimmen-Liste Fehler: ${err.message}`);
}
}
// ── RVS senden ──────────────────────────────────────
function sendToRVS(msg) {
if (rvsWs && rvsWs.readyState === WebSocket.OPEN) {
rvsWs.send(JSON.stringify(msg));
}
}
// ── Start ───────────────────────────────────────────
log("ARIA XTTS Bridge startet...");
log(`XTTS API: ${XTTS_API_URL}`);
log(`RVS: ${RVS_HOST}:${RVS_PORT}`);
// Warten bis XTTS API erreichbar ist
function waitForXTTS(callback, attempts) {
if (attempts <= 0) { log("XTTS API nicht erreichbar — starte trotzdem"); callback(); return; }
http.get(`${XTTS_API_URL}/docs`, (res) => {
log("XTTS API erreichbar");
callback();
}).on("error", () => {
log(`XTTS API noch nicht bereit — warte (${attempts} Versuche uebrig)...`);
setTimeout(() => waitForXTTS(callback, attempts - 1), 5000);
});
}
waitForXTTS(() => connectRVS(), 24); // Max 2min warten
+54
View File
@@ -0,0 +1,54 @@
# ════════════════════════════════════════════════
# ARIA XTTS v2 — GPU TTS Server
# Laeuft auf dem Gaming-PC (RTX 3060)
# Verbindet sich zum RVS fuer TTS-Requests
# ════════════════════════════════════════════════
#
# Voraussetzungen:
# - Docker Desktop mit WSL2
# - NVIDIA Container Toolkit
# - .env mit RVS-Verbindungsdaten
#
# Start: docker compose up -d
# Test: curl http://localhost:8000/docs
# ════════════════════════════════════════════════
services:
# ─── XTTS v2 API Server (GPU) ─────────────────
xtts:
image: ghcr.io/daswer123/xtts-api-server:latest
container_name: aria-xtts
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
ports:
- "8000:8000"
volumes:
- xtts-models:/root/.local/share/tts # Model-Cache (~2GB)
- ./voices:/voices # Custom Voice Samples
environment:
- COQUI_TOS_AGREED=1
restart: unless-stopped
# ─── XTTS Bridge (verbindet zu RVS) ───────────
xtts-bridge:
build: .
container_name: aria-xtts-bridge
depends_on:
- xtts
environment:
- XTTS_API_URL=http://xtts:8000
- RVS_HOST=${RVS_HOST}
- RVS_PORT=${RVS_PORT:-443}
- RVS_TLS=${RVS_TLS:-true}
- RVS_TLS_FALLBACK=${RVS_TLS_FALLBACK:-true}
- RVS_TOKEN=${RVS_TOKEN}
restart: unless-stopped
volumes:
xtts-models:
+8
View File
@@ -0,0 +1,8 @@
{
"name": "aria-xtts-bridge",
"version": "1.0.0",
"private": true,
"dependencies": {
"ws": "^8.16.0"
}
}