feat(brain): Multi-Threading via per-request project_id + per-project queue

First step toward real multi-threading for ARIA projects. There is no global
active_project state anymore: every /chat request states which stage it
targets (project_id in the body). Different projects run in parallel; requests
for the same project queue behind an asyncio.Lock.

Backend:
- ChatIn.project_id: the client decides per request where the message goes.
  The bridge routes it.
- /chat: async, acquires a per-project asyncio.Lock. Requests for the same
  project are appended to _project_pending and wait at the lock. Requests for
  different projects run truly in parallel.
- New /projects/queue-status endpoint: per context (incl. the main chat under
  __main__) it reports busy True/False + queue_size. Used for the UI status
  dots.
- Agent.chat() takes project_id + pending_queue params. No more
  projects_mod.get_active() in the hot path.

Queue-aware prompting:
- If further messages are waiting in the queue behind the current turn, the
  system prompt is extended with a QUEUE segment carrying the instruction:
  "Before you solve the current task, check the queue: does a later message
  contradict or cancel it? Then send a skip reply instead of doing the work
  twice."
- Example: task 'title bar red' + queue tail 'actually no, blue' → ARIA skips
  red, and blue comes through cleanly as the next request.
- No extra LLM call, pure prompt injection.

Project tools:
- project_enter/exit are now UI signals (the app switches its view via the
  project_changed event) and NO LONGER change any brain state. The current
  turn stays in its chat context.
- project_list no longer shows an "AKTIV" marker (it no longer makes sense).
- projects_mod.set_active/get_active remain as legacy helpers (no longer
  called from the hot path).

Bridge:
- send_to_core puts project_id into the /chat body.
- The user backup entry is tagged with project_id directly; no brain query
  anymore.

Next steps (upcoming commits):
- App: Focus-One-View with drawer + status dots + OS push
- Diagnostic: dashboard stack with cards
- Voice-Router: 30s sticky + meta-command interception in wakeword.ts

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
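The per-project locking described in the message can be illustrated with a small standalone sketch. It is not the code from this commit: the names `locks`, `log`, `get_lock`, `handle` and `demo` are hypothetical, and `asyncio.sleep` stands in for the slow agent call. It only shows the intended behavior, namely that two requests for the same key run one after the other while a request for a different key overlaps with them.

```python
import asyncio

# Hypothetical names for illustration only; not the identifiers from the commit.
locks: dict[str, asyncio.Lock] = {}
log: list[str] = []


def get_lock(key: str) -> asyncio.Lock:
    # No await between lookup and insert, so on a single event loop
    # two coroutines cannot create two different locks for the same key.
    lock = locks.get(key)
    if lock is None:
        lock = asyncio.Lock()
        locks[key] = lock
    return lock


async def handle(key: str, name: str, work_s: float) -> None:
    # Requests sharing a key wait here; other keys pass straight through.
    async with get_lock(key):
        log.append(f"start {name}")
        await asyncio.sleep(work_s)  # stands in for the slow agent call
        log.append(f"end {name}")


async def demo() -> list[str]:
    # Fresh state per run, so the locks always belong to the current loop.
    locks.clear()
    log.clear()
    await asyncio.gather(
        handle("proj-a", "a1", 0.05),
        handle("proj-a", "a2", 0.01),
        handle("proj-b", "b1", 0.02),
    )
    return list(log)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch `a2` only starts after `a1` has ended, although it would finish faster, while `b1` starts and ends during `a1`.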
2026-07-02 17:57:30 +02:00
parent 5b2c552a88
commit 7927ad05ae
3 changed files with 191 additions and 64 deletions
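The queue-aware prompting from the commit message happens inside Agent.chat(), which is not part of the hunks shown here. The following is therefore only a minimal sketch of how such a QUEUE segment could be assembled from the pending_queue list. The function names, the `max_items` cap and the exact wording of the segment are assumptions, not the author's implementation.

```python
def build_queue_segment(pending_queue: list[str], max_items: int = 5) -> str:
    """Return a prompt segment listing queued follow-ups, or "" if there are none.

    Hypothetical helper: name, cap and wording are illustrative assumptions.
    """
    items = [m.strip() for m in pending_queue if m and m.strip()]
    if not items:
        return ""
    shown = items[:max_items]
    lines = ["QUEUE (messages waiting behind the current task):"]
    lines += [f"{i}. {msg}" for i, msg in enumerate(shown, start=1)]
    if len(items) > len(shown):
        lines.append(f"(+{len(items) - len(shown)} more)")
    lines.append(
        "Before solving the current task, check the queue: if a later message "
        "contradicts or cancels it, reply with a short skip notice instead of "
        "doing the work twice."
    )
    return "\n".join(lines)


def build_system_prompt(base_prompt: str, pending_queue: list[str]) -> str:
    # With an empty queue the prompt stays untouched: no extra text, no extra call.
    segment = build_queue_segment(pending_queue)
    return base_prompt if not segment else f"{base_prompt}\n\n{segment}"
```

Called with the example from the message, `build_system_prompt(base, ["actually no, blue"])` appends the segment to the base prompt, while an empty queue returns the base prompt unchanged.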
@@ -607,6 +607,11 @@ def memory_import_bootstrap(body: BootstrapBundle):
 class ChatIn(BaseModel):
    message: str
    source: str = ""  # "app" / "diagnostic" / "stt" — optional
+    # Multi-threading: the client decides per request which project is meant
+    # (empty = main chat). No global active_project state in the brain anymore:
+    # requests for different projects run truly in parallel, only requests for
+    # the same project queue (per-project lock).
+    project_id: str = ""


 class ChatOut(BaseModel):
@@ -614,36 +619,124 @@ class ChatOut(BaseModel):
    turns: int
    distilling: bool
    events: list = Field(default_factory=list)
-    # Active project ID AFTER the turn (may have been switched by the
-    # project_enter/exit tools during the turn). The bridge passes it on to
-    # the chat-bubble broadcasts so that App + Diagnostic can sort the
-    # message into the right project block.
+    # Echo of the project_id this turn had. The bridge uses it so that the
+    # outgoing chat bubble lands, cleanly tagged, in the right thread lane of
+    # the UI.
    project_id: str = ""


-@app.post("/chat", response_model=ChatOut)
-def chat(body: ChatIn, background: BackgroundTasks):
-    """Main path. The reply is returned in the response itself; memory
-    distillation runs in the background after the response has gone out."""
-    a = agent()
-    try:
-        reply = a.chat(body.message, source=body.source)
-    except ValueError as exc:
-        raise HTTPException(400, str(exc))
-    except RuntimeError as exc:
-        logger.error("chat fehlgeschlagen: %s", exc)
-        raise HTTPException(502, str(exc))
+# Per-project async locks for queue behavior: requests for the same project
+# wait for each other (queue), requests for different projects run truly in
+# parallel. Main chat = lock under key "" (empty string).
+_project_locks: dict[str, asyncio.Lock] = {}
+_project_locks_meta_lock = asyncio.Lock()
+# Per project, a list of not-yet-finished requests. Appended to on enqueue,
+# removed from on completion. Enables queue-aware prompting: while ARIA works
+# on task N, it sees N+1..N+k as a system-prompt hint and can decide whether
+# a later message corrects or cancels the current one; if so, it answers
+# with a skip reply instead of executing.
+_project_pending: dict[str, list[dict]] = {}

-    needs_distill = a.conversation.needs_distill()
-    if needs_distill:
-        background.add_task(a.distill_old_turns)
-    return ChatOut(
-        reply=reply,
-        turns=len(a.conversation.turns),
-        distilling=needs_distill,
-        events=a.pop_events(),
-        project_id=projects_mod.get_active(),
-    )
+
+async def _get_project_lock(project_id: str) -> asyncio.Lock:
+    """Gets (or creates) the asyncio.Lock for a given project.
+    Uses _project_locks_meta_lock to avoid race conditions
+    on the first access per project."""
+    async with _project_locks_meta_lock:
+        lock = _project_locks.get(project_id)
+        if lock is None:
+            lock = asyncio.Lock()
+            _project_locks[project_id] = lock
+        return lock
+
+
+def _project_queue_snapshot() -> dict:
+    """Snapshot for /projects/queue-status: which projects are working right
+    now, how many requests are waiting in each queue, which are idle."""
+    out = {}
+    # Covers every context seen so far: anything with a lock or a pending list
+    seen: set = set()
+    for pid, lock in _project_locks.items():
+        pending = len(_project_pending.get(pid, []))
+        is_busy = lock.locked()
+        # busy: currently being processed. queue_size: N more waiting behind it.
+        # The busy request does NOT count toward queue_size; it stays in pending until it finishes, so it is subtracted here.
+        out[pid or "__main__"] = {
+            "busy": is_busy,
+            "queue_size": max(0, pending - (1 if is_busy else 0)),
+        }
+        seen.add(pid)
+    for pid, pend in _project_pending.items():
+        if pid in seen:
+            continue
+        out[pid or "__main__"] = {"busy": False, "queue_size": len(pend)}
+    return out
+
+
+@app.post("/chat", response_model=ChatOut)
+async def chat(body: ChatIn, background: BackgroundTasks):
+    """Main path. The reply is returned in the response itself; memory
+    distillation runs in the background after the response has gone out.
+
+    Multi-threading: requests for the same project (same project_id) are
+    serialized by the per-project lock (queue behavior). Different
+    projects run in parallel."""
+    pid = (body.project_id or "").strip()
+    lock = await _get_project_lock(pid)
+    # Register in the pending list before taking the lock, so the running task
+    # can see what is queued AFTER it (queue-aware prompting).
+    import uuid as _uuid
+    req_id = _uuid.uuid4().hex
+    _project_pending.setdefault(pid, []).append({
+        "id": req_id, "message": body.message, "source": body.source,
+    })
+    try:
+        async with lock:
+            # Snapshot: what is queued AFTER me?
+            after_me = [
+                e["message"] for e in _project_pending.get(pid, [])
+                if e["id"] != req_id
+            ]
+            a = agent()
+            try:
+                # Run the sync call in the executor so the event loop is not
+                # blocked: chat() makes HTTP calls (proxy) that can take 30-60s.
+                loop = asyncio.get_running_loop()
+                reply = await loop.run_in_executor(
+                    None,
+                    lambda: a.chat(
+                        body.message, source=body.source, project_id=pid,
+                        pending_queue=after_me,
+                    ),
+                )
+            except ValueError as exc:
+                raise HTTPException(400, str(exc))
+            except RuntimeError as exc:
+                logger.error("chat fehlgeschlagen: %s", exc)
+                raise HTTPException(502, str(exc))
+
+            needs_distill = a.conversation.needs_distill()
+            if needs_distill:
+                background.add_task(a.distill_old_turns)
+            return ChatOut(
+                reply=reply,
+                turns=len(a.conversation.turns),
+                distilling=needs_distill,
+                events=a.pop_events(),
+                project_id=pid,
+            )
+    finally:
+        _project_pending[pid] = [
+            e for e in _project_pending.get(pid, []) if e["id"] != req_id
+        ]
+
+
+@app.get("/projects/queue-status")
+def projects_queue_status():
+    """Snapshot: for each project context (incl. the main chat under __main__)
+    - busy: True if a request is currently being processed
+    - queue_size: how many more are waiting behind it"""
+    return {"contexts": _project_queue_snapshot()}


 # ── Projects ────────────────────────────────────────────────────────