# Microphone Monitor Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Detect microphone device changes, switch automatically to the default microphone when the configured microphone is missing, and notify the user via a toast and the tray tooltip.

**Architecture:** New `whisper_local/microphone/` package with a `MicrophoneMonitor` protocol and a `create_monitor()` factory (analogous to `whisper_local/media/`). Windows uses `IMMNotificationClient` via `comtypes` with a fallback to polling; all other platforms use polling (`asyncio.sleep(2.5)`). Notifications go through `notify-py` (cross-platform) and `PystrayApp.set_warning()`.

**Tech Stack:** Python 3.13+, `sounddevice` (device listing), `comtypes` (Windows COM), `notify-py` (toast), `pystray` (tray tooltip), `pytest-asyncio` (tests)

---

## File Overview

| Action | File | Purpose |
|--------|------|---------|
| Create | `whisper_local/microphone/__init__.py` | Protocol + factory |
| Create | `whisper_local/microphone/_poll.py` | Polling implementation |
| Create | `whisper_local/microphone/_win32.py` | Windows IMMNotificationClient |
| Create | `whisper_local/tray/_notification.py` | notify-py wrapper |
| Modify | `whisper_local/tray/_tray.py` | `set_warning()` on `PystrayApp` + `NoOpTray` |
| Modify | `whisper_local/__main__.py` | Monitor integration in `App` |
| Modify | `pyproject.toml` | `notify-py` + `comtypes` as dependencies |
| Create | `tests/test_microphone_monitor.py` | Tests for `PollMonitor` |

---

## Task 1: Dependencies + `_notification.py`

**Files:**
- Modify: `pyproject.toml`
- Create: `whisper_local/tray/_notification.py`

- [ ] **Step 1: Add the dependencies to `pyproject.toml`**

In the `dependencies` list, add the following lines after `"darkdetect>=0.8.0",`:

```toml
"notify-py>=0.3.43",
"comtypes>=1.4.0; sys_platform == 'win32'",
```

- [ ] **Step 2: Update the lock file**

```
uv lock
```

Expected output: `Resolved N packages` with no errors.

- [ ] **Step 3: Create `_notification.py`**

```python
# whisper_local/tray/_notification.py
"""Desktop notifications via notify-py."""

import logging

logger = logging.getLogger(__name__)

_APP_NAME = "whisper-local"


def notify(title: str, message: str) -> None:
    """Show a desktop notification. Failures are only logged."""
    try:
        # Imported lazily so a missing or broken backend cannot break app start-up.
        from notifypy import Notify

        n = Notify()
        n.application_name = _APP_NAME
        n.title = title
        n.message = message
        n.send()
    except Exception:
        # Include the traceback so the cause of the failure is visible in the log.
        logger.warning("Notification failed: %s - %s", title, message, exc_info=True)
```

- [ ] **Step 4: Import test**

```
uv run python -c "from whisper_local.tray._notification import notify; print('OK')"
```

Expected output: `OK`

- [ ] **Step 5: Commit**

```
git add pyproject.toml uv.lock whisper_local/tray/_notification.py
git commit -m "feat(notify): notify-py + _notification.py wrapper"
```

---

## Task 2: `MicrophoneMonitor` Protocol + Factory Skeleton

**Files:**
- Create: `whisper_local/microphone/__init__.py`

- [ ] **Step 1: Create the package**

```python
# whisper_local/microphone/__init__.py
"""Microphone device monitoring with platform-specific backends."""

import sys
from collections.abc import Awaitable, Callable
from typing import Protocol


class MicrophoneMonitor(Protocol):
    # Async callbacks; each may stay None if the caller is not interested.
    on_device_added: Callable[[str], Awaitable[None]] | None
    on_device_removed: Callable[[str], Awaitable[None]] | None
    on_configured_missing: Callable[[], Awaitable[None]] | None

    async def start(self) -> None: ...

    def stop(self) -> None: ...
def create_monitor(configured_device: str | None) -> MicrophoneMonitor:
    """Create the platform-specific microphone monitor."""
    if sys.platform == "win32":
        from whisper_local.microphone._win32 import Win32Monitor

        return Win32Monitor(configured_device)
    from whisper_local.microphone._poll import PollMonitor

    return PollMonitor(configured_device)
```

- [ ] **Step 2: Import test**

```
uv run python -c "from whisper_local.microphone import create_monitor; print('OK')"
```

Expected output: `OK`. The backend modules are imported lazily inside `create_monitor()`, so this import succeeds on every platform even though `_poll.py` and `_win32.py` do not exist yet. Calling `create_monitor()` only works once Task 3 is done (on Windows: Task 5).

- [ ] **Step 3: Commit**

```
git add whisper_local/microphone/__init__.py
git commit -m "feat(microphone): Protocol + create_monitor() factory skeleton"
```

---

## Task 3: `PollMonitor`: Device Detection (TDD)

**Files:**
- Create: `whisper_local/microphone/_poll.py`
- Create: `tests/test_microphone_monitor.py`

- [ ] **Step 1: Create the test file (fails at first)**

```python
# tests/test_microphone_monitor.py
import asyncio
from unittest.mock import AsyncMock, patch

import pytest

from whisper_local.microphone._poll import PollMonitor


def _fake_devices(names: list[str]) -> list[dict]:
    return [{"name": n, "max_input_channels": 1} for n in names]


@pytest.mark.asyncio
async def test_on_device_added_fires_when_device_appears():
    monitor = PollMonitor(configured_device=None, interval=0.05)
    event = asyncio.Event()
    added: list[str] = []

    async def on_added(name: str) -> None:
        added.append(name)
        event.set()

    monitor.on_device_added = on_added

    call_count = 0

    def fake_query():
        # First call is the snapshot taken in start(); later calls see the new device.
        nonlocal call_count
        call_count += 1
        if call_count == 1:
            return _fake_devices(["Mic A"])
        return _fake_devices(["Mic A", "Mic B"])

    with patch("sounddevice.query_devices", side_effect=fake_query):
        await monitor.start()
        await asyncio.wait_for(event.wait(), timeout=1.0)
    monitor.stop()

    assert added == ["Mic B"]


@pytest.mark.asyncio
async def test_on_device_removed_fires_when_device_disappears():
    monitor = PollMonitor(configured_device=None, interval=0.05)
    event = asyncio.Event()
    removed: list[str] = []

    async def on_removed(name: str) -> None:
        removed.append(name)
        event.set()

    monitor.on_device_removed = on_removed

    call_count = 0

    def fake_query():
        # First call is the snapshot taken in start(); later calls miss "Mic B".
        nonlocal call_count
        call_count += 1
        if call_count == 1:
            return _fake_devices(["Mic A", "Mic B"])
        return _fake_devices(["Mic A"])

    with patch("sounddevice.query_devices", side_effect=fake_query):
        await monitor.start()
        await asyncio.wait_for(event.wait(), timeout=1.0)
    monitor.stop()

    assert removed == ["Mic B"]
```

- [ ] **Step 2: Run the tests; they must FAIL**

```
uv run pytest tests/test_microphone_monitor.py -v
```

Expected output: `ModuleNotFoundError: No module named 'whisper_local.microphone._poll'`

- [ ] **Step 3: Implement `_poll.py`**

```python
# whisper_local/microphone/_poll.py
"""Polling-based microphone monitor (cross-platform)."""

import asyncio
import logging
from collections.abc import Awaitable, Callable

import sounddevice as sd

logger = logging.getLogger(__name__)


class PollMonitor:
    def __init__(self, configured_device: str | None, interval: float = 2.5):
        self.configured_device = configured_device
        self.interval = interval
        self.on_device_added: Callable[[str], Awaitable[None]] | None = None
        self.on_device_removed: Callable[[str], Awaitable[None]] | None = None
        self.on_configured_missing: Callable[[], Awaitable[None]] | None = None
        self._task: asyncio.Task | None = None
        self._known_devices: set[str] = set()

    def _current_devices(self) -> set[str]:
        """Return the names of all devices that have at least one input channel."""
        try:
            return {
                dev["name"]
                for dev in sd.query_devices()
                if dev["max_input_channels"] > 0
            }
        except Exception:
            # Keep the last known state so a failed query does not look like a removal.
            logger.exception("Failed to query audio devices")
            return self._known_devices.copy()

    async def start(self) -> None:
        self._known_devices = self._current_devices()
        self._task = asyncio.create_task(self._loop())

    def stop(self) -> None:
        if self._task is not None:
            self._task.cancel()
            self._task = None

    async def _loop(self) -> None:
        while True:
            await asyncio.sleep(self.interval)
            current = self._current_devices()
            # Set differences against the previous snapshot yield the changes.
            added = current - self._known_devices
            removed = self._known_devices - current
            self._known_devices = current
            for name in added:
                if self.on_device_added:
                    await self.on_device_added(name)
            for name in removed:
                if self.on_device_removed:
                    await self.on_device_removed(name)
```

- [ ] **Step 4: Run the tests; they must PASS**

```
uv run pytest tests/test_microphone_monitor.py -v
```

Expected output:

```
tests/test_microphone_monitor.py::test_on_device_added_fires_when_device_appears PASSED
tests/test_microphone_monitor.py::test_on_device_removed_fires_when_device_disappears PASSED
```

- [ ] **Step 5: Commit**

```
git add whisper_local/microphone/_poll.py tests/test_microphone_monitor.py
git commit -m "feat(microphone): PollMonitor with device detection (TDD)"
```

---

## Task 4: `PollMonitor`: Immediate Start-up Check (TDD)

**Files:**
- Modify: `whisper_local/microphone/_poll.py`
- Modify: `tests/test_microphone_monitor.py`

- [ ] **Step 1: Add two new tests to the test file** (insert after the existing tests)

```python
@pytest.mark.asyncio
async def test_on_configured_missing_fires_immediately_at_start():
    monitor = PollMonitor(configured_device="Headset USB", interval=99.0)
    missing_called = asyncio.Event()

    async def on_missing() -> None:
        missing_called.set()

    monitor.on_configured_missing = on_missing

    with patch("sounddevice.query_devices", return_value=_fake_devices(["Mic A"])):
        await monitor.start()

    assert missing_called.is_set()
    monitor.stop()


@pytest.mark.asyncio
async def test_on_configured_missing_does_not_fire_when_device_present():
    monitor = PollMonitor(configured_device="Headset USB", interval=99.0)
    missing_mock = AsyncMock()
    monitor.on_configured_missing = missing_mock

    with patch("sounddevice.query_devices", return_value=_fake_devices(["Headset USB", "Mic A"])):
        await monitor.start()

    missing_mock.assert_not_called()
    monitor.stop()
```

- [ ] **Step 2: Run the two new tests; the first one must FAIL**

```
uv run pytest tests/test_microphone_monitor.py::test_on_configured_missing_fires_immediately_at_start tests/test_microphone_monitor.py::test_on_configured_missing_does_not_fire_when_device_present -v
```

Expected output: `FAILED` for `test_on_configured_missing_fires_immediately_at_start`. `test_on_configured_missing_does_not_fire_when_device_present` already passes at this point, because nothing calls the callback yet; it guards against a regression in Step 3.

- [ ] **Step 3: Extend `PollMonitor.start()` with the immediate check**

In `whisper_local/microphone/_poll.py`, replace the `start()` method:

```python
    async def start(self) -> None:
        self._known_devices = self._current_devices()
        # Report a missing configured device right away instead of waiting for the first poll.
        if (
            self.configured_device
            and self.configured_device not in self._known_devices
            and self.on_configured_missing
        ):
            await self.on_configured_missing()
        self._task = asyncio.create_task(self._loop())
```

- [ ] **Step 4: Run all tests; they must PASS**

```
uv run pytest tests/test_microphone_monitor.py -v
```

Expected output: 4× `PASSED`

- [ ] **Step 5: Commit**

```
git add whisper_local/microphone/_poll.py tests/test_microphone_monitor.py
git commit -m "feat(microphone): PollMonitor reports a missing device immediately at start"
```

---

## Task 5: `Win32Monitor` with IMMNotificationClient

**Files:**
- Create: `whisper_local/microphone/_win32.py`

- [ ] **Step 1: Create `_win32.py`**

```python
# whisper_local/microphone/_win32.py
"""Windows microphone monitor via IMMNotificationClient (Core Audio API)."""

import asyncio
import ctypes
import logging
from collections.abc import Awaitable, Callable

import sounddevice as sd

logger = logging.getLogger(__name__)

_CLSID_MMDeviceEnumerator = "{BCDE0395-E52F-467C-8E3D-C4579291692E}"
_IID_IMMDeviceEnumerator = "{A95664D2-9614-4F35-A746-DE8DB63617E6}"
_IID_IMMNotificationClient = "{7991EEC9-7E89-4D85-8390-6C703CEC60C0}"


def _build_com_interfaces():
    """Define IMMDeviceEnumerator and IMMNotificationClient via comtypes."""
    # HRESULT and POINTER are taken directly from ctypes (HRESULT exists on Windows only).
    from ctypes import HRESULT, POINTER

    from comtypes import COMMETHOD, GUID, IUnknown

    class _IMMNotificationClient(IUnknown):
        _iid_ = GUID(_IID_IMMNotificationClient)
        # Method order must match the vtable order of the COM interface.
        _methods_ = [
            COMMETHOD([], HRESULT, "OnDeviceStateChanged",
                      (["in"], ctypes.c_wchar_p, "pwstrDeviceId"),
                      (["in"], ctypes.c_uint, "dwNewState")),
            COMMETHOD([], HRESULT, "OnDeviceAdded",
                      (["in"], ctypes.c_wchar_p, "pwstrDeviceId")),
            COMMETHOD([], HRESULT, "OnDeviceRemoved",
                      (["in"], ctypes.c_wchar_p, "pwstrDeviceId")),
            COMMETHOD([], HRESULT, "OnDefaultDeviceChanged",
                      (["in"], ctypes.c_int, "flow"),
                      (["in"], ctypes.c_int, "role"),
                      (["in"], ctypes.c_wchar_p, "pwstrDefaultDeviceId")),
            COMMETHOD([], HRESULT, "OnPropertyValueChanged",
                      (["in"], ctypes.c_wchar_p, "pwstrDeviceId"),
                      (["in"], ctypes.c_void_p, "key")),
        ]

    class _IMMDeviceEnumerator(IUnknown):
        _iid_ = GUID(_IID_IMMDeviceEnumerator)
        # Method order must match the vtable order of the COM interface.
        _methods_ = [
            COMMETHOD([], HRESULT, "EnumAudioEndpoints",
                      (["in"], ctypes.c_int, "dataFlow"),
                      (["in"], ctypes.c_uint, "dwStateMask"),
                      (["out"], POINTER(IUnknown), "ppDevices")),
            COMMETHOD([], HRESULT, "GetDefaultAudioEndpoint",
                      (["in"], ctypes.c_int, "dataFlow"),
                      (["in"], ctypes.c_int, "role"),
                      (["out"], POINTER(IUnknown), "ppEndpoint")),
            COMMETHOD([], HRESULT, "GetDevice",
                      (["in"], ctypes.c_wchar_p, "pwstrId"),
                      (["out"], POINTER(IUnknown), "ppDevice")),
            COMMETHOD([], HRESULT, "RegisterEndpointNotificationCallback",
                      (["in"], POINTER(_IMMNotificationClient), "pClient")),
            COMMETHOD([], HRESULT, "UnregisterEndpointNotificationCallback",
                      (["in"], POINTER(_IMMNotificationClient), "pClient")),
        ]

    return _IMMNotificationClient, _IMMDeviceEnumerator


def _build_client_class(IMMNotificationClient, callback):
    """Create a comtypes.COMObject implementation of IMMNotificationClient."""
    import comtypes

    class _NotificationClientImpl(comtypes.COMObject):
        _com_interfaces_ = [IMMNotificationClient]

        # Each handler returns 0 (S_OK). Only state/add/remove events trigger a rescan.
        def OnDeviceStateChanged(self, pwstrDeviceId, dwNewState):
            callback()
            return 0

        def OnDeviceAdded(self, pwstrDeviceId):
            callback()
            return 0

        def OnDeviceRemoved(self, pwstrDeviceId):
            callback()
            return 0

        def OnDefaultDeviceChanged(self, flow, role, pwstrDefaultDeviceId):
            return 0

        def OnPropertyValueChanged(self, pwstrDeviceId, key):
            return 0

    return _NotificationClientImpl()


class Win32Monitor:
    def __init__(self, configured_device: str | None):
        self.configured_device = configured_device
        self.on_device_added: Callable[[str], Awaitable[None]] | None = None
        self.on_device_removed: Callable[[str], Awaitable[None]] | None = None
        self.on_configured_missing: Callable[[], Awaitable[None]] | None = None
        self._loop: asyncio.AbstractEventLoop | None = None
        self._known_devices: set[str] = set()
        self._enumerator = None
        self._client = None
        self._fallback = None

    def _current_devices(self) -> set[str]:
        """Return the names of all devices that have at least one input channel."""
        try:
            return {
                dev["name"]
                for dev in sd.query_devices()
                if dev["max_input_channels"] > 0
            }
        except Exception:
            logger.exception("Failed to query audio devices")
            return self._known_devices.copy()

    async def start(self) -> None:
        self._loop = asyncio.get_running_loop()
        self._known_devices = self._current_devices()
        if (
            self.configured_device
            and self.configured_device not in self._known_devices
            and self.on_configured_missing
        ):
            await self.on_configured_missing()
        try:
            self._start_com()
        except Exception:
            logger.warning(
                "IMMNotificationClient unavailable, falling back to polling",
                exc_info=True,
            )
            from whisper_local.microphone._poll import PollMonitor

            # The start-up check already ran above, so the polling loop is started
            # directly instead of going through PollMonitor.start().
            fallback = PollMonitor(self.configured_device)
            fallback.on_device_added = self.on_device_added
            fallback.on_device_removed = self.on_device_removed
            fallback._known_devices = self._known_devices
            self._fallback = fallback
            self._fallback._task = asyncio.create_task(self._fallback._loop())

    def _start_com(self) -> None:
        import comtypes
        import comtypes.client
        from comtypes import GUID

        comtypes.CoInitialize()
        IMMNotificationClient, IMMDeviceEnumerator = _build_com_interfaces()
        self._enumerator = comtypes.client.CreateObject(
            GUID(_CLSID_MMDeviceEnumerator),
            interface=IMMDeviceEnumerator,
        )
        self._client = _build_client_class(IMMNotificationClient, self._on_com_event)
        self._enumerator.RegisterEndpointNotificationCallback(self._client)

    def _on_com_event(self) -> None:
        # Called from a COM thread, so the work is handed over to the event loop.
        if self._loop is not None:
            self._loop.call_soon_threadsafe(
                lambda: asyncio.ensure_future(self._handle_change())
            )

    async def _handle_change(self) -> None:
        current = self._current_devices()
        added = current - self._known_devices
        removed = self._known_devices - current
        self._known_devices = current
        for name in added:
            if self.on_device_added:
                await self.on_device_added(name)
        for name in removed:
            if self.on_device_removed:
                await self.on_device_removed(name)

    def stop(self) -> None:
        if self._fallback is not None:
            self._fallback.stop()
            return
        if self._enumerator is not None and self._client is not None:
            try:
                self._enumerator.UnregisterEndpointNotificationCallback(self._client)
            except Exception:
                logger.warning("Failed to unregister the notification client")
            try:
                import comtypes

                comtypes.CoUninitialize()
            except Exception:
                pass
```

- [ ] **Step 2: Import test on Windows**

```
uv run python -c "from whisper_local.microphone._win32 import Win32Monitor; print('OK')"
```

Expected output: `OK`

- [ ] **Step 3: Run all tests**

```
uv run pytest tests/ -v
```

Expected output: all existing tests `PASSED`

- [ ] **Step 4: Commit**

```
git add whisper_local/microphone/_win32.py
git commit -m "feat(microphone): Win32Monitor via IMMNotificationClient with polling fallback"
```

---

## Task 6: `PystrayApp.set_warning()` + `NoOpTray.set_warning()`

**Files:**
- Modify: `whisper_local/tray/_tray.py`

- [ ] **Step 1: Add `set_warning()` to `PystrayApp`**

In `whisper_local/tray/_tray.py`, insert the new method after the `set_state` method:

```python
    def set_warning(self, msg: str | None) -> None:
        """Set the tray title to a warning, or reset it to normal (thread-safe)."""
        if self._icon is not None:
            self._icon.title = "whisper-local" if msg is None else f"whisper-local ⚠ {msg}"
```

- [ ] **Step 2: Add `set_warning()` to `NoOpTray`**

In
`NoOpTray`, insert the following after `set_state`:

```python
    def set_warning(self, msg: str | None) -> None:
        pass
```

- [ ] **Step 3: Import test**

```
uv run python -c "from whisper_local.tray._tray import PystrayApp, NoOpTray; print('OK')"
```

Expected output: `OK`

- [ ] **Step 4: Commit**

```
git add whisper_local/tray/_tray.py
git commit -m "feat(tray): set_warning() for tray tooltip warning"
```

---

## Task 7: App Integration

**Files:**
- Modify: `whisper_local/__main__.py`

- [ ] **Step 1: Add the import and create the monitor in `App.__init__`**

In `whisper_local/__main__.py`, extend the import block at the top of the file:

```python
from whisper_local.microphone import create_monitor
```

In `App.__init__`, insert after `self.hotkey = create_listener(key_name=config.hotkey)`:

```python
        self.monitor = create_monitor(config.microphone or None)
        self.monitor.on_device_added = self._on_microphone_added
        self.monitor.on_device_removed = self._on_microphone_removed
        self.monitor.on_configured_missing = self._on_configured_microphone_missing
```

- [ ] **Step 2: Implement the callbacks** (in `App`, after `_open_settings`)

```python
    async def _on_configured_microphone_missing(self) -> None:
        """Configured microphone not found: switch to the default device."""
        from whisper_local.tray._notification import notify

        device_name = self._config.microphone or "Microphone"
        logger.warning("Configured microphone '%s' not found, using default", device_name)
        self.recorder = Recorder(
            sample_rate=self._config.sample_rate,
            channels=self._config.channels,
            min_duration=self._config.min_duration,
            device=None,
        )
        # Single-quoted f-string so the double quotes around the name do not end the literal.
        notify(
            "Microphone not found",
            f'"{device_name}" is not available. The default microphone is used instead.',
        )
        self.tray.set_warning("Microphone not found")

    async def _on_microphone_added(self, device_name: str) -> None:
        """New microphone detected: restore the configured device if it is the one."""
        if device_name != self._config.microphone:
            return
        from whisper_local.tray._notification import notify

        logger.info("Configured microphone '%s' is available again", device_name)
        self.recorder = Recorder(
            sample_rate=self._config.sample_rate,
            channels=self._config.channels,
            min_duration=self._config.min_duration,
            device=self._config.microphone or None,
        )
        notify("Microphone connected", f'"{device_name}" is available again.')
        self.tray.set_warning(None)

    async def _on_microphone_removed(self, device_name: str) -> None:
        """Microphone removed: trigger the fallback if it was the configured device."""
        logger.info("Microphone removed: %s", device_name)
        if device_name == self._config.microphone:
            await self._on_configured_microphone_missing()
```

- [ ] **Step 3: Start the monitor in `App.run()`**

In `App.run()`, insert after `self._hotkey_task = asyncio.create_task(self.hotkey.listen())`:

```python
        asyncio.create_task(self.monitor.start())
```

- [ ] **Step 4: Restart the monitor in `_on_config_reload`**

In `_on_config_reload`, insert after the block containing `self.recorder = Recorder(...)`:

```python
        self.monitor.stop()
        self.monitor = create_monitor(new_config.microphone or None)
        self.monitor.on_device_added = self._on_microphone_added
        self.monitor.on_device_removed = self._on_microphone_removed
        self.monitor.on_configured_missing = self._on_configured_microphone_missing
        # Clear the old warning before starting, so a warning raised by the new
        # monitor's start-up check is not overwritten afterwards.
        self.tray.set_warning(None)
        if self._loop is not None:
            asyncio.run_coroutine_threadsafe(self.monitor.start(), self._loop)
```

- [ ] **Step 5: Run all tests**

```
uv run pytest tests/ -v
```

Expected output: all tests `PASSED`

- [ ] **Step 6: Test the app manually**

```
uv run whisper-local
```

Check:
- The app starts without errors
- Plug in a USB/Bluetooth microphone → no crash
- Unplug the configured microphone (if one is set) → a toast appears and the tray tooltip shows the warning
- Plug the microphone back in → "Microphone connected" toast, the warning disappears

- [ ] **Step 7: Commit**

```
git add whisper_local/__main__.py
git commit -m "feat(app): integrate microphone monitor into App"
```
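
The manual check in Task 7 needs real hardware. As a complement, the switchover logic of the three callbacks can be exercised without any audio device. The following is a minimal sketch under stated assumptions, not part of the plan's implementation: `FakeMonitor`, `FakeApp`, `plug()`, `unplug()`, `run_scenario()` and the warning text are hypothetical stand-ins that only mirror the callback wiring and the start-up check described in Tasks 4 and 7.

```python
from __future__ import annotations

import asyncio


class FakeMonitor:
    """Hypothetical test double with the same callback attributes as MicrophoneMonitor."""

    def __init__(self, configured_device: str | None, present: set[str]) -> None:
        self.configured_device = configured_device
        self.present = set(present)
        self.on_device_added = None
        self.on_device_removed = None
        self.on_configured_missing = None

    async def start(self) -> None:
        # Mirrors the immediate start-up check from Task 4.
        if (
            self.configured_device
            and self.configured_device not in self.present
            and self.on_configured_missing
        ):
            await self.on_configured_missing()

    def stop(self) -> None:
        pass

    async def plug(self, name: str) -> None:
        """Simulate a device appearing."""
        self.present.add(name)
        if self.on_device_added:
            await self.on_device_added(name)

    async def unplug(self, name: str) -> None:
        """Simulate a device disappearing."""
        self.present.discard(name)
        if self.on_device_removed:
            await self.on_device_removed(name)


class FakeApp:
    """Hypothetical reduction of App to the state the three callbacks touch."""

    def __init__(self, configured: str | None, monitor: FakeMonitor) -> None:
        self.configured = configured
        self.active_device = configured  # None stands for the system default
        self.warning: str | None = None
        monitor.on_device_added = self._on_microphone_added
        monitor.on_device_removed = self._on_microphone_removed
        monitor.on_configured_missing = self._on_configured_microphone_missing

    async def _on_configured_microphone_missing(self) -> None:
        self.active_device = None
        self.warning = "Microphone not found"  # illustrative text only

    async def _on_microphone_added(self, name: str) -> None:
        if name != self.configured:
            return  # unrelated devices do not change anything
        self.active_device = self.configured
        self.warning = None

    async def _on_microphone_removed(self, name: str) -> None:
        if name == self.configured:
            await self._on_configured_microphone_missing()


async def run_scenario() -> list[tuple[str | None, str | None]]:
    """Return (active_device, warning) after each step of the manual checklist."""
    monitor = FakeMonitor("Headset USB", present={"Mic A"})
    app = FakeApp("Headset USB", monitor)
    states = []
    await monitor.start()  # configured device is missing at start
    states.append((app.active_device, app.warning))
    await monitor.plug("Mic B")  # unrelated device appears
    states.append((app.active_device, app.warning))
    await monitor.plug("Headset USB")  # configured device comes back
    states.append((app.active_device, app.warning))
    await monitor.unplug("Headset USB")  # configured device is removed again
    states.append((app.active_device, app.warning))
    return states


if __name__ == "__main__":
    for state in asyncio.run(run_scenario()):
        print(state)
```

Keeping the state transitions in plain attributes makes the expected sequence easy to assert: fallback with warning, unchanged, restored without warning, fallback with warning again. A real test would drive `App` itself with a stubbed `Recorder`, tray and `notify`, which this sketch deliberately leaves out.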