Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Microphone Monitor Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.
Goal: Detect microphone device changes, automatically switch to the default microphone when the configured microphone is missing, and notify the user via a toast and the tray tooltip.
Architecture: A new whisper_local/microphone/ package with a MicrophoneMonitor protocol and a create_monitor() factory (analogous to whisper_local/media/). Windows uses IMMNotificationClient via comtypes and falls back to polling if COM setup fails; all other platforms use polling (asyncio.sleep(2.5)). Notifications go through notify-py (cross-platform) and PystrayApp.set_warning().
Tech Stack: Python 3.13+, sounddevice (device listing), comtypes (Windows COM), notify-py (toast), pystray (tray tooltip), pytest-asyncio (tests)
File overview
| Action | File | Purpose |
|---|---|---|
| Create | whisper_local/microphone/__init__.py | Protocol + factory |
| Create | whisper_local/microphone/_poll.py | Polling implementation |
| Create | whisper_local/microphone/_win32.py | Windows IMMNotificationClient |
| Create | whisper_local/tray/_notification.py | notify-py wrapper |
| Modify | whisper_local/tray/_tray.py | set_warning() on PystrayApp + NoOpTray |
| Modify | whisper_local/__main__.py | Monitor integration in App |
| Modify | pyproject.toml | notify-py + comtypes as dependencies |
| Create | tests/test_microphone_monitor.py | Tests for PollMonitor |
Task 1: Dependencies + _notification.py
Files:
- Modify: pyproject.toml
- Create: whisper_local/tray/_notification.py
- Step 1: Add the dependencies to pyproject.toml
In the dependencies list, add the following lines after the "darkdetect>=0.8.0", entry:
"notify-py>=0.3.43",
"comtypes>=1.4.0; sys_platform == 'win32'",
- Step 2: Update the lock file
uv lock
Expected output: Resolved N packages, without errors.
- Step 3: Create _notification.py
# whisper_local/tray/_notification.py
"""Desktop notifications via notify-py."""
import logging

logger = logging.getLogger(__name__)

_APP_NAME = "whisper-local"


def notify(title: str, message: str) -> None:
    """Show a desktop notification. Failures are logged, never raised."""
    try:
        # Imported lazily so a missing or broken backend cannot break startup.
        from notifypy import Notify

        n = Notify()
        n.application_name = _APP_NAME
        n.title = title
        n.message = message
        n.send()
    except Exception:
        # exc_info keeps the traceback so the cause can be diagnosed later.
        logger.warning("Notification failed: %s: %s", title, message, exc_info=True)
- Step 4: Import test
uv run python -c "from whisper_local.tray._notification import notify; print('OK')"
Expected output: OK
- Step 5: Commit
git add pyproject.toml uv.lock whisper_local/tray/_notification.py
git commit -m "feat(notify): notify-py + _notification.py wrapper"
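The import test only proves that the module loads. To check the "log, never raise" behaviour without a notification daemon, one option is to swap in a fake notifypy module. This is a sketch under assumptions: the notify function below is a local copy of the wrapper's shape that returns a bool purely so the outcome is observable, and _BrokenNotify and run_with_broken_backend are invented for the example.

```python
import logging
import sys
import types
from unittest.mock import patch

logger = logging.getLogger("notify_sketch")


def notify(title: str, message: str) -> bool:
    """Same shape as the wrapper; returns True only if send() succeeded."""
    try:
        from notifypy import Notify  # resolved through sys.modules at call time

        n = Notify()
        n.title = title
        n.message = message
        n.send()
        return True
    except Exception:
        logger.warning("notification failed: %s: %s", title, message)
        return False


class _BrokenNotify:
    """Hypothetical stand-in whose send() always fails."""

    def send(self) -> None:
        raise RuntimeError("no notification daemon")


def run_with_broken_backend() -> bool:
    fake = types.ModuleType("notifypy")
    fake.Notify = _BrokenNotify
    # patch.dict restores sys.modules when the block exits.
    with patch.dict(sys.modules, {"notifypy": fake}):
        return notify("title", "message")
```

Because the import happens inside the try block, the same path also covers the case where notify-py is not installed at all.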
Task 2: MicrophoneMonitor protocol + factory skeleton
Files:
- Create: whisper_local/microphone/__init__.py
- Step 1: Create the package
# whisper_local/microphone/__init__.py
"""Microphone device monitoring with platform-specific backends."""
import sys
from collections.abc import Awaitable, Callable
from typing import Protocol


class MicrophoneMonitor(Protocol):
    on_device_added: Callable[[str], Awaitable[None]] | None
    on_device_removed: Callable[[str], Awaitable[None]] | None
    on_configured_missing: Callable[[], Awaitable[None]] | None

    async def start(self) -> None: ...

    def stop(self) -> None: ...


def create_monitor(configured_device: str | None) -> MicrophoneMonitor:
    """Create the platform-specific microphone monitor."""
    # Backends are imported lazily, so importing this package never pulls in
    # Windows-only dependencies on other platforms.
    if sys.platform == "win32":
        from whisper_local.microphone._win32 import Win32Monitor

        return Win32Monitor(configured_device)
    from whisper_local.microphone._poll import PollMonitor

    return PollMonitor(configured_device)
- Step 2: Import test
uv run python -c "from whisper_local.microphone import create_monitor; print('OK')"
Expected output: OK on every platform. create_monitor() imports its backend only when it is called, so this import succeeds even though _poll.py and _win32.py do not exist yet. Calling create_monitor() raises ModuleNotFoundError until Task 3 (polling) or Task 5 (Windows) is done.
- Step 3: Commit
git add whisper_local/microphone/__init__.py
git commit -m "feat(microphone): Protocol + create_monitor() factory skeleton"
Task 3: PollMonitor device detection (TDD)
Files:
- Create: whisper_local/microphone/_poll.py
- Create: tests/test_microphone_monitor.py
- Step 1: Create the test file (it fails at first)
# tests/test_microphone_monitor.py
import asyncio
from unittest.mock import AsyncMock, patch

import pytest

from whisper_local.microphone._poll import PollMonitor


def _fake_devices(names: list[str]) -> list[dict]:
    return [{"name": n, "max_input_channels": 1} for n in names]


@pytest.mark.asyncio
async def test_on_device_added_fires_when_device_appears():
    monitor = PollMonitor(configured_device=None, interval=0.05)
    event = asyncio.Event()
    added: list[str] = []

    async def on_added(name: str) -> None:
        added.append(name)
        event.set()

    monitor.on_device_added = on_added
    call_count = 0

    def fake_query():
        # First call is the baseline snapshot; later calls show the new device.
        nonlocal call_count
        call_count += 1
        if call_count == 1:
            return _fake_devices(["Mic A"])
        return _fake_devices(["Mic A", "Mic B"])

    with patch("sounddevice.query_devices", side_effect=fake_query):
        await monitor.start()
        await asyncio.wait_for(event.wait(), timeout=1.0)
        monitor.stop()
    assert added == ["Mic B"]


@pytest.mark.asyncio
async def test_on_device_removed_fires_when_device_disappears():
    monitor = PollMonitor(configured_device=None, interval=0.05)
    event = asyncio.Event()
    removed: list[str] = []

    async def on_removed(name: str) -> None:
        removed.append(name)
        event.set()

    monitor.on_device_removed = on_removed
    call_count = 0

    def fake_query():
        # First call is the baseline snapshot; later calls lack "Mic B".
        nonlocal call_count
        call_count += 1
        if call_count == 1:
            return _fake_devices(["Mic A", "Mic B"])
        return _fake_devices(["Mic A"])

    with patch("sounddevice.query_devices", side_effect=fake_query):
        await monitor.start()
        await asyncio.wait_for(event.wait(), timeout=1.0)
        monitor.stop()
    assert removed == ["Mic B"]
- Step 2: Run the tests, which must FAIL
uv run pytest tests/test_microphone_monitor.py -v
Expected output: ModuleNotFoundError: No module named 'whisper_local.microphone._poll'
- Step 3: Implement _poll.py
# whisper_local/microphone/_poll.py
"""Polling-based microphone monitor (cross-platform)."""
import asyncio
import logging
from collections.abc import Awaitable, Callable

import sounddevice as sd

logger = logging.getLogger(__name__)


class PollMonitor:
    def __init__(self, configured_device: str | None, interval: float = 2.5):
        self.configured_device = configured_device
        self.interval = interval
        self.on_device_added: Callable[[str], Awaitable[None]] | None = None
        self.on_device_removed: Callable[[str], Awaitable[None]] | None = None
        self.on_configured_missing: Callable[[], Awaitable[None]] | None = None
        self._task: asyncio.Task | None = None
        self._known_devices: set[str] = set()

    def _current_devices(self) -> set[str]:
        """Return the names of all devices that have at least one input channel."""
        try:
            return {
                dev["name"]
                for dev in sd.query_devices()
                if dev["max_input_channels"] > 0
            }
        except Exception:
            # Keep the last known state so a failed query produces no events.
            logger.exception("Failed to query audio devices")
            return self._known_devices.copy()

    async def start(self) -> None:
        self._known_devices = self._current_devices()
        self._task = asyncio.create_task(self._loop())

    def stop(self) -> None:
        if self._task is not None:
            self._task.cancel()
            self._task = None

    async def _loop(self) -> None:
        while True:
            await asyncio.sleep(self.interval)
            current = self._current_devices()
            # Set differences between the two snapshots yield the events.
            added = current - self._known_devices
            removed = self._known_devices - current
            self._known_devices = current
            for name in added:
                if self.on_device_added:
                    await self.on_device_added(name)
            for name in removed:
                if self.on_device_removed:
                    await self.on_device_removed(name)
- Step 4: Run the tests, which must PASS
uv run pytest tests/test_microphone_monitor.py -v
Expected output:
tests/test_microphone_monitor.py::test_on_device_added_fires_when_device_appears PASSED
tests/test_microphone_monitor.py::test_on_device_removed_fires_when_device_disappears PASSED
- Step 5: Commit
git add whisper_local/microphone/_poll.py tests/test_microphone_monitor.py
git commit -m "feat(microphone): PollMonitor with device detection (TDD)"
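The core of the polling approach, comparing two snapshots with set differences, can be tried without any audio hardware. The following is a minimal sketch, not the plan's implementation: watch and scripted are hypothetical helpers, and the query function is injected instead of calling sounddevice.

```python
import asyncio
from collections.abc import Callable, Iterable


async def watch(
    query: Callable[[], Iterable[str]],
    interval: float,
    rounds: int,
) -> list[tuple[str, str]]:
    """Poll query() `rounds` times and collect ("added" | "removed", name) events."""
    known = set(query())
    events: list[tuple[str, str]] = []
    for _ in range(rounds):
        await asyncio.sleep(interval)
        current = set(query())
        # Sets are unordered, so sort to make the event order deterministic.
        events += [("added", name) for name in sorted(current - known)]
        events += [("removed", name) for name in sorted(known - current)]
        known = current
    return events


def scripted(snapshots: list[list[str]]) -> Callable[[], list[str]]:
    """Return a fake query that replays snapshots and then repeats the last one."""
    it = iter(snapshots)
    last: list[str] = []

    def query() -> list[str]:
        nonlocal last
        last = next(it, last)
        return last

    return query


if __name__ == "__main__":
    fake = scripted([["Mic A"], ["Mic A", "Mic B"], ["Mic B"]])
    print(asyncio.run(watch(fake, interval=0.01, rounds=3)))
```

One difference worth noting: PollMonitor iterates the raw sets, so when several devices change within one interval the callback order is unspecified. Tests that assert on ordering should change one device at a time, as the tests in this task do.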
Task 4: PollMonitor immediate startup check (TDD)
Files:
- Modify: whisper_local/microphone/_poll.py
- Modify: tests/test_microphone_monitor.py
- Step 1: Add two new tests to the test file (insert them after the existing tests)
@pytest.mark.asyncio
async def test_on_configured_missing_fires_immediately_at_start():
    # The long interval ensures the callback comes from start(), not the loop.
    monitor = PollMonitor(configured_device="Headset USB", interval=99.0)
    missing_called = asyncio.Event()

    async def on_missing() -> None:
        missing_called.set()

    monitor.on_configured_missing = on_missing
    with patch("sounddevice.query_devices", return_value=_fake_devices(["Mic A"])):
        await monitor.start()
    assert missing_called.is_set()
    monitor.stop()


@pytest.mark.asyncio
async def test_on_configured_missing_does_not_fire_when_device_present():
    monitor = PollMonitor(configured_device="Headset USB", interval=99.0)
    missing_mock = AsyncMock()
    monitor.on_configured_missing = missing_mock
    with patch("sounddevice.query_devices", return_value=_fake_devices(["Headset USB", "Mic A"])):
        await monitor.start()
    missing_mock.assert_not_called()
    monitor.stop()
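- Step 2: Run the two new tests; the first must FAIL
uv run pytest tests/test_microphone_monitor.py::test_on_configured_missing_fires_immediately_at_start tests/test_microphone_monitor.py::test_on_configured_missing_does_not_fire_when_device_present -v
Expected output: FAILED for test_on_configured_missing_fires_immediately_at_start. test_on_configured_missing_does_not_fire_when_device_present already passes at this point, because start() does not call on_configured_missing at all yet; it guards against false positives once the check exists.
- Step 3: Extend PollMonitor.start() with an immediate check
In whisper_local/microphone/_poll.py, replace the start() method: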
    async def start(self) -> None:
        self._known_devices = self._current_devices()
        # Report a missing configured device right away, before the first poll.
        if (
            self.configured_device
            and self.configured_device not in self._known_devices
            and self.on_configured_missing
        ):
            await self.on_configured_missing()
        self._task = asyncio.create_task(self._loop())
- Step 4: Run all tests, which must PASS
uv run pytest tests/test_microphone_monitor.py -v
Expected output: 4× PASSED
- Step 5: Commit
git add whisper_local/microphone/_poll.py tests/test_microphone_monitor.py
git commit -m "feat(microphone): PollMonitor reports a missing device immediately at startup"
Task 5: Win32Monitor with IMMNotificationClient
Files:
- Create: whisper_local/microphone/_win32.py
- Step 1: Create _win32.py
# whisper_local/microphone/_win32.py
"""Windows microphone monitor via IMMNotificationClient (Core Audio API)."""
import asyncio
import ctypes
import logging
from collections.abc import Awaitable, Callable

import sounddevice as sd

logger = logging.getLogger(__name__)

_CLSID_MMDeviceEnumerator = "{BCDE0395-E52F-467C-8E3D-C4579291692E}"
_IID_IMMDeviceEnumerator = "{A95664D2-9614-4F35-A746-DE8DB63617E6}"
_IID_IMMNotificationClient = "{7991EEC9-7E89-4D85-8390-6C703CEC60C0}"


def _build_com_interfaces():
    """Define IMMDeviceEnumerator and IMMNotificationClient via comtypes."""
    # HRESULT and POINTER are taken from ctypes itself (HRESULT is Windows-only).
    from ctypes import HRESULT, POINTER

    from comtypes import COMMETHOD, GUID, IUnknown

    class _IMMNotificationClient(IUnknown):
        _iid_ = GUID(_IID_IMMNotificationClient)
        # The method order must match the COM vtable layout.
        _methods_ = [
            COMMETHOD([], HRESULT, "OnDeviceStateChanged",
                      (["in"], ctypes.c_wchar_p, "pwstrDeviceId"),
                      (["in"], ctypes.c_uint, "dwNewState")),
            COMMETHOD([], HRESULT, "OnDeviceAdded",
                      (["in"], ctypes.c_wchar_p, "pwstrDeviceId")),
            COMMETHOD([], HRESULT, "OnDeviceRemoved",
                      (["in"], ctypes.c_wchar_p, "pwstrDeviceId")),
            COMMETHOD([], HRESULT, "OnDefaultDeviceChanged",
                      (["in"], ctypes.c_int, "flow"),
                      (["in"], ctypes.c_int, "role"),
                      (["in"], ctypes.c_wchar_p, "pwstrDefaultDeviceId")),
            COMMETHOD([], HRESULT, "OnPropertyValueChanged",
                      (["in"], ctypes.c_wchar_p, "pwstrDeviceId"),
                      (["in"], ctypes.c_void_p, "key")),
        ]

    class _IMMDeviceEnumerator(IUnknown):
        _iid_ = GUID(_IID_IMMDeviceEnumerator)
        _methods_ = [
            COMMETHOD([], HRESULT, "EnumAudioEndpoints",
                      (["in"], ctypes.c_int, "dataFlow"),
                      (["in"], ctypes.c_uint, "dwStateMask"),
                      (["out"], POINTER(IUnknown), "ppDevices")),
            COMMETHOD([], HRESULT, "GetDefaultAudioEndpoint",
                      (["in"], ctypes.c_int, "dataFlow"),
                      (["in"], ctypes.c_int, "role"),
                      (["out"], POINTER(IUnknown), "ppEndpoint")),
            COMMETHOD([], HRESULT, "GetDevice",
                      (["in"], ctypes.c_wchar_p, "pwstrId"),
                      (["out"], POINTER(IUnknown), "ppDevice")),
            COMMETHOD([], HRESULT, "RegisterEndpointNotificationCallback",
                      (["in"], POINTER(_IMMNotificationClient), "pClient")),
            COMMETHOD([], HRESULT, "UnregisterEndpointNotificationCallback",
                      (["in"], POINTER(_IMMNotificationClient), "pClient")),
        ]

    return _IMMNotificationClient, _IMMDeviceEnumerator


def _build_client_class(IMMNotificationClient, callback):
    """Create a comtypes.COMObject implementing IMMNotificationClient.

    Despite the name, this returns an instance. callback runs on a COM thread.
    """
    import comtypes

    class _NotificationClientImpl(comtypes.COMObject):
        _com_interfaces_ = [IMMNotificationClient]

        def OnDeviceStateChanged(self, pwstrDeviceId, dwNewState):
            callback()
            return 0

        def OnDeviceAdded(self, pwstrDeviceId):
            callback()
            return 0

        def OnDeviceRemoved(self, pwstrDeviceId):
            callback()
            return 0

        def OnDefaultDeviceChanged(self, flow, role, pwstrDefaultDeviceId):
            return 0

        def OnPropertyValueChanged(self, pwstrDeviceId, key):
            return 0

    return _NotificationClientImpl()


class Win32Monitor:
    def __init__(self, configured_device: str | None):
        self.configured_device = configured_device
        self.on_device_added: Callable[[str], Awaitable[None]] | None = None
        self.on_device_removed: Callable[[str], Awaitable[None]] | None = None
        self.on_configured_missing: Callable[[], Awaitable[None]] | None = None
        self._loop: asyncio.AbstractEventLoop | None = None
        self._known_devices: set[str] = set()
        self._enumerator = None
        self._client = None
        self._fallback = None

    def _current_devices(self) -> set[str]:
        """Return the names of all devices that have at least one input channel."""
        try:
            return {
                dev["name"]
                for dev in sd.query_devices()
                if dev["max_input_channels"] > 0
            }
        except Exception:
            logger.exception("Failed to query audio devices")
            return self._known_devices.copy()

    async def start(self) -> None:
        self._loop = asyncio.get_running_loop()
        self._known_devices = self._current_devices()
        if (
            self.configured_device
            and self.configured_device not in self._known_devices
            and self.on_configured_missing
        ):
            await self.on_configured_missing()
        try:
            self._start_com()
        except Exception:
            logger.warning(
                "IMMNotificationClient not available, falling back to polling",
                exc_info=True,
            )
            from whisper_local.microphone._poll import PollMonitor

            # The startup check already ran above, so hand over the snapshot
            # and start only the polling loop instead of calling start().
            fallback = PollMonitor(self.configured_device)
            fallback.on_device_added = self.on_device_added
            fallback.on_device_removed = self.on_device_removed
            fallback._known_devices = self._known_devices
            self._fallback = fallback
            self._fallback._task = asyncio.create_task(self._fallback._loop())

    def _start_com(self) -> None:
        import comtypes
        import comtypes.client
        from comtypes import GUID

        comtypes.CoInitialize()
        IMMNotificationClient, IMMDeviceEnumerator = _build_com_interfaces()
        self._enumerator = comtypes.client.CreateObject(
            GUID(_CLSID_MMDeviceEnumerator),
            interface=IMMDeviceEnumerator,
        )
        self._client = _build_client_class(IMMNotificationClient, self._on_com_event)
        self._enumerator.RegisterEndpointNotificationCallback(self._client)

    def _on_com_event(self) -> None:
        # Called on a COM thread: hand the work over to the asyncio loop.
        if self._loop is not None:
            self._loop.call_soon_threadsafe(
                lambda: asyncio.ensure_future(self._handle_change())
            )

    async def _handle_change(self) -> None:
        current = self._current_devices()
        added = current - self._known_devices
        removed = self._known_devices - current
        self._known_devices = current
        for name in added:
            if self.on_device_added:
                await self.on_device_added(name)
        for name in removed:
            if self.on_device_removed:
                await self.on_device_removed(name)

    def stop(self) -> None:
        if self._fallback is not None:
            self._fallback.stop()
            return
        if self._enumerator is not None and self._client is not None:
            try:
                self._enumerator.UnregisterEndpointNotificationCallback(self._client)
            except Exception:
                logger.warning("Failed to unregister the notification client")
            try:
                import comtypes

                comtypes.CoUninitialize()
            except Exception:
                pass
- Step 2: Import test on Windows
uv run python -c "from whisper_local.microphone._win32 import Win32Monitor; print('OK')"
Expected output: OK
- Step 3: Run all tests
uv run pytest tests/ -v
Expected output: all existing tests PASSED
- Step 4: Commit
git add whisper_local/microphone/_win32.py
git commit -m "feat(microphone): Win32Monitor via IMMNotificationClient with polling fallback"
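The COM callbacks arrive on a thread that is not the asyncio thread, so each event has to be handed to the loop with call_soon_threadsafe. The hand-off can be exercised on any platform with plain threads. This is a sketch only: ThreadBridge is a hypothetical class, and unlike _on_com_event it also keeps a reference to each spawned task so that a pending task cannot be garbage-collected.

```python
import asyncio
import threading
from collections.abc import Callable, Coroutine
from typing import Any


class ThreadBridge:
    """Forward calls from foreign threads to a coroutine on the event loop."""

    def __init__(
        self,
        loop: asyncio.AbstractEventLoop,
        handler: Callable[[], Coroutine[Any, Any, None]],
    ) -> None:
        self._loop = loop
        self._handler = handler
        self._tasks: set[asyncio.Task] = set()

    def notify(self) -> None:
        """Safe to call from any thread, for example a COM callback thread."""
        self._loop.call_soon_threadsafe(self._spawn)

    def _spawn(self) -> None:
        # Runs on the loop thread; hold a strong reference until the task ends.
        task = self._loop.create_task(self._handler())
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)


async def demo() -> int:
    calls = 0
    done = asyncio.Event()

    async def handler() -> None:
        nonlocal calls
        calls += 1
        if calls == 3:
            done.set()

    bridge = ThreadBridge(asyncio.get_running_loop(), handler)
    workers = [threading.Thread(target=bridge.notify) for _ in range(3)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    await asyncio.wait_for(done.wait(), timeout=1.0)
    return calls
```

Windows may deliver several callbacks for a single plug event. That is harmless in Win32Monitor because _handle_change diffs against the last snapshot, so repeated calls find nothing new.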
Task 6: PystrayApp.set_warning() + NoOpTray.set_warning()
Files:
- Modify: whisper_local/tray/_tray.py
- Step 1: Add set_warning() to PystrayApp
In whisper_local/tray/_tray.py, insert the new method after the set_state method:
    def set_warning(self, msg: str | None) -> None:
        """Set the tray title to a warning, or reset it to normal (thread-safe)."""
        if self._icon is not None:
            self._icon.title = "whisper-local" if msg is None else f"whisper-local ⚠ {msg}"
- Step 2: Add set_warning() to NoOpTray
In NoOpTray, insert after set_state:
    def set_warning(self, msg: str | None) -> None:
        pass
- Step 3: Import test
uv run python -c "from whisper_local.tray._tray import PystrayApp, NoOpTray; print('OK')"
Expected output: OK
- Step 4: Commit
git add whisper_local/tray/_tray.py
git commit -m "feat(tray): set_warning() for tray tooltip warning"
Task 7: App integration
Files:
- Modify: whisper_local/__main__.py
- Step 1: Add the import and create the monitor in App.__init__
In whisper_local/__main__.py, add to the import block at the top of the file:
from whisper_local.microphone import create_monitor
In App.__init__, insert after self.hotkey = create_listener(key_name=config.hotkey):
        self.monitor = create_monitor(config.microphone or None)
        self.monitor.on_device_added = self._on_microphone_added
        self.monitor.on_device_removed = self._on_microphone_removed
        self.monitor.on_configured_missing = self._on_configured_microphone_missing
- Step 2: Implement the callbacks (in App, after _open_settings)
    async def _on_configured_microphone_missing(self) -> None:
        """Configured microphone not found: switch to the default device."""
        from whisper_local.tray._notification import notify

        device_name = self._config.microphone or "microphone"
        logger.warning("Configured microphone '%s' not found, using default", device_name)
        self.recorder = Recorder(
            sample_rate=self._config.sample_rate,
            channels=self._config.channels,
            min_duration=self._config.min_duration,
            device=None,
        )
        # Single-quoted f-string so the double quotes around the name are literal.
        notify(
            "Microphone not found",
            f'"{device_name}" is not available. Using the default microphone.',
        )
        self.tray.set_warning("Microphone not found")

    async def _on_microphone_added(self, device_name: str) -> None:
        """New microphone detected: restore the configured device if it is the one."""
        if device_name != self._config.microphone:
            return
        from whisper_local.tray._notification import notify

        logger.info("Configured microphone '%s' is available again", device_name)
        self.recorder = Recorder(
            sample_rate=self._config.sample_rate,
            channels=self._config.channels,
            min_duration=self._config.min_duration,
            device=self._config.microphone or None,
        )
        notify("Microphone connected", f'"{device_name}" is available again.')
        self.tray.set_warning(None)

    async def _on_microphone_removed(self, device_name: str) -> None:
        """Microphone removed: trigger the fallback if it was the configured device."""
        logger.info("Microphone removed: %s", device_name)
        if device_name == self._config.microphone:
            await self._on_configured_microphone_missing()
- Step 3: Start the monitor in App.run()
In App.run(), insert after self._hotkey_task = asyncio.create_task(self.hotkey.listen()):
        # Keep a reference so the task cannot be garbage-collected before it finishes.
        # _monitor_task is a new attribute introduced by this step.
        self._monitor_task = asyncio.create_task(self.monitor.start())
- Step 4: Restart the monitor in _on_config_reload
In _on_config_reload, insert after the block containing self.recorder = Recorder(...):
        self.monitor.stop()
        self.monitor = create_monitor(new_config.microphone or None)
        self.monitor.on_device_added = self._on_microphone_added
        self.monitor.on_device_removed = self._on_microphone_removed
        self.monitor.on_configured_missing = self._on_configured_microphone_missing
        if self._loop is not None:
            asyncio.run_coroutine_threadsafe(self.monitor.start(), self._loop)
        self.tray.set_warning(None)
- Step 5: Run all tests
uv run pytest tests/ -v
Expected output: all tests PASSED
- Step 6: Test the app manually
uv run whisper-local
Check:
- The app starts without errors
- Plug in a USB/Bluetooth microphone → no crash
- Unplug the configured microphone (if one is set) → a toast appears and the tray tooltip shows the warning
- Plug the microphone back in → a "connected" toast appears and the warning disappears
- Step 7: Commit
git add whisper_local/__main__.py
git commit -m "feat(app): integrate microphone monitor into App"
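Step 4 calls monitor.stop() directly and then schedules start() with run_coroutine_threadsafe, which suggests that _on_config_reload may run outside the event-loop thread. If that is the case (an assumption; the plan does not say), cancelling an asyncio task from that thread is not thread-safe, and one option is to hand the entire swap to the loop. SleepMonitor, swap and holder below are stand-ins invented for this sketch.

```python
import asyncio


class SleepMonitor:
    """Stand-in monitor: start() spawns a task, stop() cancels it."""

    def __init__(self) -> None:
        self.task: asyncio.Task | None = None

    async def start(self) -> None:
        self.task = asyncio.create_task(asyncio.sleep(3600))

    def stop(self) -> None:
        if self.task is not None:
            self.task.cancel()
            self.task = None


async def swap(holder: dict, new_monitor: SleepMonitor) -> None:
    """Runs on the loop thread: stop the old monitor, then start the new one."""
    holder["monitor"].stop()
    holder["monitor"] = new_monitor
    await new_monitor.start()


async def demo() -> bool:
    loop = asyncio.get_running_loop()
    old, new = SleepMonitor(), SleepMonitor()
    holder = {"monitor": old}
    await old.start()
    old_task = old.task

    def reload_in_thread() -> None:
        # Submit the whole swap to the loop instead of touching tasks here.
        future = asyncio.run_coroutine_threadsafe(swap(holder, new), loop)
        future.result(timeout=1.0)

    # to_thread keeps the loop free while the worker waits for the result.
    await asyncio.to_thread(reload_in_thread)
    await asyncio.sleep(0)  # give the cancellation one more loop iteration
    swapped = holder["monitor"] is new and old_task.cancelled()
    new.stop()
    return swapped
```

In the App this would mean moving the stop, create and start sequence into one coroutine and scheduling only that coroutine from _on_config_reload.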