whisper-local/docs/superpowers/plans/2026-04-15-media-pause-windows-smtc.md
2026-04-15 20:07:05 +02:00

Windows SMTC Media-Controller Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Windows implementation of the media-pause feature via the Global System Media Transport Controls (GSMTC / SMTC). It pauses every media player that is currently playing when recording starts and resumes those players afterwards.

Architecture: SmtcController in whisper_local/media/_smtc.py implements the existing MediaController protocol. The factory in __init__.py gains a win32 dispatch branch. Sessions are identified by their AUMID (source_app_user_model_id); a circuit breaker prevents further reconnect attempts once SMTC has proven unreachable.

Tech Stack: Python 3.13, pywinrt (winrt-Windows.Media.Control, winrt-Windows.Foundation, winrt-Windows.Foundation.Collections), pytest-asyncio, unittest.mock
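
The intended call pattern around a recording can be sketched as follows. This is only an illustration under assumptions: FakeController and record_with_media_pause are hypothetical names that do not appear in the plan, and the real caller in whisper_local may look different.

```python
import asyncio


class FakeController:
    """Hypothetical stand-in that records calls instead of touching real players."""

    def __init__(self) -> None:
        self.events = []

    async def pause(self) -> None:
        self.events.append("pause")

    async def resume(self) -> None:
        self.events.append("resume")


async def record_with_media_pause(controller, record) -> None:
    """Pause media, run the recording coroutine, then always resume."""
    await controller.pause()
    try:
        await record()
    finally:
        # Resume even if recording raises, so players are not left paused.
        await controller.resume()


async def demo():
    controller = FakeController()

    async def fake_record() -> None:
        controller.events.append("record")

    await record_with_media_pause(controller, fake_record)
    return controller.events


events = asyncio.run(demo())
print(events)
```

The try/finally is a design choice of this sketch: it keeps a failed recording from leaving the user's players paused.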


Files

| File | Action | Purpose |
|------|--------|---------|
| pyproject.toml | Modify | Add pywinrt dependencies |
| whisper_local/media/_smtc.py | Create | SmtcController implementation |
| whisper_local/media/__init__.py | Modify | Add win32 dispatch branch |
| tests/test_media_smtc.py | Create | Tests for SmtcController |
| tests/test_media_factory.py | Modify | win32 factory test + noop test fix |

Task 1: Add pywinrt dependencies

Files:

  • Modify: pyproject.toml

  • Step 1: Add the dependencies

In pyproject.toml, extend the dependencies list with three entries (after the pywin32 line):

dependencies = [
    "faster-whisper>=1.1.0",
    "sounddevice>=0.5.0",
    "numpy>=2.0.0",
    "evdev>=1.7.0; sys_platform == 'linux'",
    "PyGObject>=3.50; sys_platform == 'linux'",
    "dbus-next>=0.2.3; sys_platform == 'linux'",
    "pynput>=1.7.0; sys_platform == 'win32'",
    "pywin32>=306; sys_platform == 'win32'",
    "winrt-Windows.Media.Control>=3.2.1; sys_platform == 'win32'",
    "winrt-Windows.Foundation>=3.2.1; sys_platform == 'win32'",
    "winrt-Windows.Foundation.Collections>=3.2.1; sys_platform == 'win32'",
    "pystray>=0.19.0",
    "Pillow>=10.0.0",
    "sv-ttk>=2.6.0",
    "darkdetect>=0.8.0",
]
  • Step 2: Run the sync
uv sync

Expected output: the packages winrt-runtime, winrt-windows-media-control, winrt-windows-foundation and winrt-windows-foundation-collections are resolved (they may already be installed).

  • Step 3: Check the import
uv run python -c "import winrt.windows.media.control; print('OK')"

Expected output: OK
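
Where the same check is needed from Python code rather than the shell, a non-raising probe might look like the sketch below. The helper name smtc_bindings_available is hypothetical and not part of the plan; it relies only on the standard-library importlib.util.find_spec.

```python
import importlib.util
import sys


def smtc_bindings_available() -> bool:
    """Return True only on Windows with the winrt media-control bindings installed."""
    if sys.platform != "win32":
        return False
    try:
        # find_spec imports parent packages, so a missing 'winrt' raises here.
        spec = importlib.util.find_spec("winrt.windows.media.control")
    except ModuleNotFoundError:
        return False
    return spec is not None


print(smtc_bindings_available())
```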

  • Step 4: Commit
git add pyproject.toml uv.lock
git commit -m "build: pywinrt als win32-Dependency hinzufügen"

Task 2: SmtcController: skeleton + circuit breaker (TDD)

Files:

  • Create: whisper_local/media/_smtc.py

  • Create: tests/test_media_smtc.py

  • Step 1: Create the test file with the circuit-breaker tests

Create the file tests/test_media_smtc.py:

"""Tests for SmtcController (Windows/SMTC)."""

import sys
from unittest.mock import AsyncMock, MagicMock

import pytest

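# Skip at collection time: a pytestmark would not stop the module-level winrt
# import below from failing on non-Windows platforms.
if sys.platform != "win32":
    pytest.skip("SMTC is Windows-only", allow_module_level=True)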

from winrt.windows.media.control import (
    GlobalSystemMediaTransportControlsSessionPlaybackStatus as Status,
)

PLAYING = Status.PLAYING
PAUSED = Status.PAUSED


def _make_session(aumid: str, status) -> MagicMock:
    """Create a mocked SMTC session with the given playback status."""
    session = MagicMock()
    session.source_app_user_model_id = aumid
    info = MagicMock()
    info.playback_status = status
    session.get_playback_info = MagicMock(return_value=info)
    session.try_pause_async = AsyncMock()
    session.try_play_async = AsyncMock()
    return session


def _make_manager(sessions: list) -> MagicMock:
    """Create a mocked SMTC manager that returns the given sessions."""
    manager = MagicMock()
    manager.get_sessions = MagicMock(return_value=sessions)
    return manager


@pytest.mark.asyncio
async def test_pause_is_noop_when_smtc_unreachable(monkeypatch, caplog):
    from whisper_local.media._smtc import SmtcController

    controller = SmtcController()
    monkeypatch.setattr(
        controller,
        "_ensure_manager",
        AsyncMock(side_effect=RuntimeError("kein SMTC")),
    )

    with caplog.at_level("WARNING"):
        await controller.pause()

    assert controller._paused == []
    assert any("smtc" in r.message.lower() for r in caplog.records)


@pytest.mark.asyncio
async def test_pause_skips_reconnect_after_smtc_failure(monkeypatch):
    from whisper_local.media._smtc import SmtcController

    call_count = 0

    async def failing_ensure():
        nonlocal call_count
        call_count += 1
        raise RuntimeError("kein SMTC")

    controller = SmtcController()
    monkeypatch.setattr(controller, "_ensure_manager", failing_ensure)

    await controller.pause()
    await controller.pause()
    await controller.pause()

    assert call_count == 1
  • Step 2: Run the tests; they must FAIL
uv run pytest tests/test_media_smtc.py -v

Expected output: ImportError or ModuleNotFoundError (the module does not exist yet).

  • Step 3: Implement the SmtcController skeleton

Create the file whisper_local/media/_smtc.py:

"""Windows SMTC implementation via pywinrt."""

import logging
from typing import Any

logger = logging.getLogger(__name__)


class SmtcController:
    def __init__(self) -> None:
        self._paused: list[str] = []
        self._manager: Any = None
        self._broken: bool = False

    async def _ensure_manager(self) -> Any:
        if self._broken:
            raise RuntimeError("SMTC nicht verfügbar")
        if self._manager is None:
            from winrt.windows.media.control import (
                GlobalSystemMediaTransportControlsSessionManager,
            )
            self._manager = (
                await GlobalSystemMediaTransportControlsSessionManager.request_async()
            )
        return self._manager

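    async def pause(self) -> None:
        # Circuit breaker: once SMTC has failed, skip every further attempt
        # without calling _ensure_manager() again.
        if self._broken:
            self._paused = []
            return
        try:
            await self._ensure_manager()
        except Exception as e:
            logger.warning(
                "SMTC nicht erreichbar, Media-Pause dauerhaft deaktiviert: %s", e
            )
            self._broken = True
            self._paused = []
            return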

    async def resume(self) -> None:
        pass
  • Step 4: Run the tests; they must PASS
uv run pytest tests/test_media_smtc.py::test_pause_is_noop_when_smtc_unreachable tests/test_media_smtc.py::test_pause_skips_reconnect_after_smtc_failure -v

Expected output: 2 passed
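
The behavior these two tests pin down is a one-way circuit breaker: after the first failure, no further connection attempt is made. A minimal, platform-independent sketch of that idea follows; FlakyBackend and GuardedClient are hypothetical names used only for illustration and are not part of the plan's code.

```python
import asyncio


class FlakyBackend:
    """Hypothetical stand-in for an unreachable service; counts connect attempts."""

    def __init__(self) -> None:
        self.connect_attempts = 0

    async def connect(self) -> None:
        self.connect_attempts += 1
        raise RuntimeError("service unavailable")


class GuardedClient:
    """Stops reconnecting after the first failure (one-way circuit breaker)."""

    def __init__(self, backend: FlakyBackend) -> None:
        self._backend = backend
        self._broken = False

    async def call(self) -> bool:
        # Check the flag first so a tripped breaker never touches the backend.
        if self._broken:
            return False
        try:
            await self._backend.connect()
        except Exception:
            self._broken = True
            return False
        return True


async def demo() -> int:
    backend = FlakyBackend()
    client = GuardedClient(backend)
    for _ in range(3):
        await client.call()
    return backend.connect_attempts


attempts = asyncio.run(demo())
print(attempts)
```

The important detail is that the flag is checked in the public method itself, before the connection helper is called. A test that replaces the helper with a counting stub can only observe a single call if the guard sits in front of it.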

  • Step 5: Commit
git add whisper_local/media/_smtc.py tests/test_media_smtc.py
git commit -m "feat(media): SmtcController Skeleton mit circuit-breaker"

Task 3: pause(): session detection and pausing (TDD)

Files:

  • Modify: tests/test_media_smtc.py

  • Modify: whisper_local/media/_smtc.py

  • Step 1: Add tests for pause()

Append to the end of tests/test_media_smtc.py:

@pytest.mark.asyncio
async def test_pause_with_no_sessions_is_noop(monkeypatch):
    from whisper_local.media._smtc import SmtcController

    controller = SmtcController()
    monkeypatch.setattr(
        controller, "_ensure_manager", AsyncMock(return_value=_make_manager([]))
    )

    await controller.pause()

    assert controller._paused == []


@pytest.mark.asyncio
async def test_pause_pauses_all_playing_sessions(monkeypatch):
    from whisper_local.media._smtc import SmtcController

    s1 = _make_session("Spotify", PLAYING)
    s2 = _make_session("msedge", PLAYING)
    controller = SmtcController()
    monkeypatch.setattr(
        controller,
        "_ensure_manager",
        AsyncMock(return_value=_make_manager([s1, s2])),
    )

    await controller.pause()

    s1.try_pause_async.assert_awaited_once()
    s2.try_pause_async.assert_awaited_once()
    assert controller._paused == ["Spotify", "msedge"]


@pytest.mark.asyncio
async def test_pause_skips_already_paused_sessions(monkeypatch):
    from whisper_local.media._smtc import SmtcController

    playing = _make_session("Spotify", PLAYING)
    already_paused = _make_session("msedge", PAUSED)
    controller = SmtcController()
    monkeypatch.setattr(
        controller,
        "_ensure_manager",
        AsyncMock(return_value=_make_manager([playing, already_paused])),
    )

    await controller.pause()

    playing.try_pause_async.assert_awaited_once()
    already_paused.try_pause_async.assert_not_awaited()
    assert controller._paused == ["Spotify"]


@pytest.mark.asyncio
async def test_pause_logs_and_continues_when_session_fails(monkeypatch, caplog):
    from whisper_local.media._smtc import SmtcController

    broken = _make_session("broken", PLAYING)
    broken.try_pause_async = AsyncMock(side_effect=RuntimeError("Verbindung verloren"))
    good = _make_session("Spotify", PLAYING)
    controller = SmtcController()
    monkeypatch.setattr(
        controller,
        "_ensure_manager",
        AsyncMock(return_value=_make_manager([broken, good])),
    )

    with caplog.at_level("WARNING"):
        await controller.pause()

    good.try_pause_async.assert_awaited_once()
    assert controller._paused == ["Spotify"]
    assert any("broken" in r.message for r in caplog.records)
  • Step 2: Run the tests; the behavioral ones must FAIL
uv run pytest tests/test_media_smtc.py::test_pause_with_no_sessions_is_noop tests/test_media_smtc.py::test_pause_pauses_all_playing_sessions tests/test_media_smtc.py::test_pause_skips_already_paused_sessions tests/test_media_smtc.py::test_pause_logs_and_continues_when_session_fails -v

Expected output: 3 failed, 1 passed. test_pause_with_no_sessions_is_noop already passes because pause() so far only fetches the manager and leaves _paused empty.

  • Step 3: Implement pause() and _pause_session()

In whisper_local/media/_smtc.py, replace pause() and add _pause_session(); the complete file then reads:

"""Windows SMTC implementation via pywinrt."""

import logging
from typing import Any

logger = logging.getLogger(__name__)


class SmtcController:
    def __init__(self) -> None:
        self._paused: list[str] = []
        self._manager: Any = None
        self._broken: bool = False

    async def _ensure_manager(self) -> Any:
        if self._broken:
            raise RuntimeError("SMTC nicht verfügbar")
        if self._manager is None:
            from winrt.windows.media.control import (
                GlobalSystemMediaTransportControlsSessionManager,
            )
            self._manager = (
                await GlobalSystemMediaTransportControlsSessionManager.request_async()
            )
        return self._manager

    async def _pause_session(self, session: Any) -> str | None:
        """Pause a session if it is playing. Return its AUMID, otherwise None."""
        from winrt.windows.media.control import (
            GlobalSystemMediaTransportControlsSessionPlaybackStatus,
        )
        aumid = session.source_app_user_model_id
        try:
            info = session.get_playback_info()
            if (
                info.playback_status
                != GlobalSystemMediaTransportControlsSessionPlaybackStatus.PLAYING
            ):
                return None
            await session.try_pause_async()
            return aumid
        except Exception as e:
            logger.warning("Konnte Session %s nicht pausieren: %s", aumid, e)
            return None

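    async def pause(self) -> None:
        # Circuit breaker: once SMTC has failed, skip every further attempt
        # without calling _ensure_manager() again.
        if self._broken:
            self._paused = []
            return
        try:
            manager = await self._ensure_manager()
        except Exception as e:
            logger.warning(
                "SMTC nicht erreichbar, Media-Pause dauerhaft deaktiviert: %s", e
            )
            self._broken = True
            self._paused = []
            return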

        sessions = list(manager.get_sessions())
        paused = []
        for session in sessions:
            result = await self._pause_session(session)
            if result is not None:
                paused.append(result)
        self._paused = paused

    async def resume(self) -> None:
        pass
  • Step 4: Run all tests written so far; they must PASS
uv run pytest tests/test_media_smtc.py -v

Expected output: 6 passed
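
The implementation above pauses sessions one after another. Should that ever become too slow with many active sessions, a concurrent variant would be one option. The sketch below is not what the plan prescribes: it uses plain-Python stand-ins (Status, FakeSession, pause_one and pause_all are hypothetical names, not pywinrt types) and asyncio.gather, which returns results in input order.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PLAYING = "playing"
    PAUSED = "paused"


@dataclass
class FakeSession:
    """Hypothetical stand-in for an SMTC session."""

    aumid: str
    status: Status
    fail: bool = False

    async def try_pause(self) -> None:
        if self.fail:
            raise RuntimeError("connection lost")
        self.status = Status.PAUSED


async def pause_one(session: FakeSession) -> Optional[str]:
    """Pause a playing session; return its id, or None if skipped or failed."""
    if session.status is not Status.PLAYING:
        return None
    try:
        await session.try_pause()
    except Exception:
        return None
    return session.aumid


async def pause_all(sessions) -> list:
    """Pause all sessions concurrently and keep only the ids that were paused."""
    results = await asyncio.gather(*(pause_one(s) for s in sessions))
    return [aumid for aumid in results if aumid is not None]


sessions = [
    FakeSession("Spotify", Status.PLAYING),
    FakeSession("msedge", Status.PAUSED),
    FakeSession("broken", Status.PLAYING, fail=True),
    FakeSession("vlc", Status.PLAYING),
]
paused = asyncio.run(pause_all(sessions))
print(paused)
```

Because each pause_one call swallows its own exception, one failing session cannot cancel the others.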

  • Step 5: Commit
git add whisper_local/media/_smtc.py tests/test_media_smtc.py
git commit -m "feat(media): SmtcController.pause() erkennt und pausiert PLAYING-Sessions"

Task 4: Implement resume() (TDD)

Files:

  • Modify: tests/test_media_smtc.py

  • Modify: whisper_local/media/_smtc.py

  • Step 1: Add tests for resume()

Append to the end of tests/test_media_smtc.py:

@pytest.mark.asyncio
async def test_resume_with_empty_paused_list_is_noop(monkeypatch):
    from whisper_local.media._smtc import SmtcController

    controller = SmtcController()
    controller._paused = []
    ensure = AsyncMock()
    monkeypatch.setattr(controller, "_ensure_manager", ensure)

    await controller.resume()

    ensure.assert_not_awaited()


@pytest.mark.asyncio
async def test_resume_plays_only_previously_paused(monkeypatch):
    from whisper_local.media._smtc import SmtcController

    spotify = _make_session("Spotify", PAUSED)
    edge = _make_session("msedge", PAUSED)
    controller = SmtcController()
    controller._paused = ["Spotify"]
    monkeypatch.setattr(
        controller,
        "_ensure_manager",
        AsyncMock(return_value=_make_manager([spotify, edge])),
    )

    await controller.resume()

    spotify.try_play_async.assert_awaited_once()
    edge.try_play_async.assert_not_awaited()
    assert controller._paused == []


@pytest.mark.asyncio
async def test_resume_skips_disappeared_session(monkeypatch, caplog):
    from whisper_local.media._smtc import SmtcController

    still_there = _make_session("Spotify", PAUSED)
    controller = SmtcController()
    controller._paused = ["gone_app", "Spotify"]
    monkeypatch.setattr(
        controller,
        "_ensure_manager",
        AsyncMock(return_value=_make_manager([still_there])),
    )

    with caplog.at_level("WARNING"):
        await controller.resume()

    still_there.try_play_async.assert_awaited_once()
    assert controller._paused == []
    assert any("gone_app" in r.message for r in caplog.records)


@pytest.mark.asyncio
async def test_resume_logs_and_continues_when_session_fails(monkeypatch, caplog):
    from whisper_local.media._smtc import SmtcController

    broken = _make_session("broken", PAUSED)
    broken.try_play_async = AsyncMock(side_effect=RuntimeError("Verbindung verloren"))
    good = _make_session("Spotify", PAUSED)
    controller = SmtcController()
    controller._paused = ["broken", "Spotify"]
    monkeypatch.setattr(
        controller,
        "_ensure_manager",
        AsyncMock(return_value=_make_manager([broken, good])),
    )

    with caplog.at_level("WARNING"):
        await controller.resume()

    good.try_play_async.assert_awaited_once()
    assert controller._paused == []
    assert any("broken" in r.message for r in caplog.records)
  • Step 2: Run the tests; the behavioral ones must FAIL
uv run pytest tests/test_media_smtc.py::test_resume_with_empty_paused_list_is_noop tests/test_media_smtc.py::test_resume_plays_only_previously_paused tests/test_media_smtc.py::test_resume_skips_disappeared_session tests/test_media_smtc.py::test_resume_logs_and_continues_when_session_fails -v

Expected output: 3 failed, 1 passed. test_resume_with_empty_paused_list_is_noop already passes because resume() is still a no-op.

  • Step 3: Implement resume()

In whisper_local/media/_smtc.py, replace the resume() method:

    async def resume(self) -> None:
        if not self._paused:
            return
        try:
            manager = await self._ensure_manager()
            current = {
                s.source_app_user_model_id: s for s in manager.get_sessions()
            }
        except Exception as e:
            logger.warning("SMTC nicht erreichbar beim Fortsetzen: %s", e)
            self._paused = []
            return
        to_resume = self._paused
        self._paused = []
        for aumid in to_resume:
            session = current.get(aumid)
            if session is None:
                logger.warning(
                    "Session %s nicht mehr vorhanden, wird übersprungen", aumid
                )
                continue
            try:
                await session.try_play_async()
            except Exception as e:
                logger.warning("Konnte Session %s nicht fortsetzen: %s", aumid, e)
  • Step 4: Run all tests; they must PASS
uv run pytest tests/test_media_smtc.py -v

Expected output: 10 passed
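
The core of resume() is a lookup of remembered AUMIDs against the sessions that currently exist. Stripped of all SMTC details, that matching step can be sketched as a pure function. plan_resume is a hypothetical helper for illustration only; the plan's implementation does the lookup inline with a dict.

```python
def plan_resume(paused, current):
    """Split remembered ids into those still present and those that vanished.

    Order follows the `paused` list, mirroring the order in which the
    sessions were paused.
    """
    present = set(current)
    to_play = [aumid for aumid in paused if aumid in present]
    missing = [aumid for aumid in paused if aumid not in present]
    return to_play, missing


to_play, missing = plan_resume(["gone_app", "Spotify"], ["Spotify", "msedge"])
print(to_play, missing)
```

Sessions that were never paused by the controller ("msedge" here) are not touched, and vanished ones are only reported. One caveat of matching by id: if two sessions were ever to report the same AUMID, a dict keyed by AUMID would keep only the last of them.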

  • Step 5: Commit
git add whisper_local/media/_smtc.py tests/test_media_smtc.py
git commit -m "feat(media): SmtcController.resume() stellt nur eigene Pausen wieder her"

Task 5: Factory dispatch for win32 + protocol check (TDD)

Files:

  • Modify: tests/test_media_factory.py

  • Modify: whisper_local/media/__init__.py

  • Step 1: Update the factory tests

In tests/test_media_factory.py:

  1. Rename the existing test test_factory_returns_noop_on_non_linux and patch it to "darwin" (previously "win32" was the test case, which now gets its own test):
def test_factory_returns_noop_on_other_platforms():
    with patch.object(sys, "platform", "darwin"):
        controller = create_media_controller(enabled=True)
    assert isinstance(controller, NoopController)
  2. Add a new test for win32 (at the end of the file):
@pytest.mark.skipif(sys.platform != "win32", reason="SmtcController nur auf Windows")
def test_factory_returns_smtc_on_win32_when_enabled():
    from whisper_local.media._smtc import SmtcController

    controller = create_media_controller(enabled=True)
    assert isinstance(controller, SmtcController)
  • Step 2: Run the tests; the win32 test must FAIL
uv run pytest tests/test_media_factory.py -v

Expected output: test_factory_returns_smtc_on_win32_when_enabled FAILS (the factory still returns NoopController); all other tests PASS.

  • Step 3: Extend the factory with the win32 dispatch

In whisper_local/media/__init__.py, insert the win32 branch before the noop fallback:

"""Media control: platform-specific backends behind a common interface."""

import sys
from typing import Protocol, runtime_checkable


@runtime_checkable
class MediaController(Protocol):
    async def pause(self) -> None: ...
    async def resume(self) -> None: ...


def create_media_controller(enabled: bool) -> MediaController:
    """Create the platform-specific media controller.

    `enabled=False` always yields a NoopController. On unsupported platforms
    the NoopController is returned as well.
    """
    if not enabled:
        from whisper_local.media._noop import NoopController
        return NoopController()
    if sys.platform == "linux":
        from whisper_local.media._mpris import MprisController
        return MprisController()
    if sys.platform == "win32":
        from whisper_local.media._smtc import SmtcController
        return SmtcController()
    from whisper_local.media._noop import NoopController
    return NoopController()
  • Step 4: Run all factory tests; they must PASS
uv run pytest tests/test_media_factory.py -v

Expected output: all passed
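
The dispatch logic itself can be exercised on any operating system by patching sys.platform, as the "darwin" test does. The following self-contained sketch shows the pattern with placeholder classes; Noop, Mpris, Smtc, pick_backend and backend_name_for are hypothetical names standing in for the real controllers and factory.

```python
import sys
from unittest.mock import patch


class Noop:
    """Placeholder for the no-op controller."""


class Mpris:
    """Placeholder for the Linux controller."""


class Smtc:
    """Placeholder for the Windows controller."""


def pick_backend(enabled: bool) -> type:
    """Mirror the factory's branch order: disabled, linux, win32, fallback."""
    if not enabled:
        return Noop
    if sys.platform == "linux":
        return Mpris
    if sys.platform == "win32":
        return Smtc
    return Noop


def backend_name_for(platform: str, enabled: bool = True) -> str:
    """Run the dispatch with sys.platform temporarily replaced."""
    with patch.object(sys, "platform", platform):
        return pick_backend(enabled).__name__


print(backend_name_for("win32"), backend_name_for("darwin"))
```

In the real factory the backend modules are imported inside each branch. That is what keeps a platform-only import such as winrt from running on other systems, and it is also why the win32 factory test is skipped off Windows rather than patched.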

  • Step 5: Commit
git add whisper_local/media/__init__.py tests/test_media_factory.py
git commit -m "feat(media): Factory dispatcht auf win32 zum SmtcController"

Task 6: Full test run + protocol conformance check

Files: no changes

  • Step 1: Run all tests
uv run pytest tests/test_media_smtc.py tests/test_media_factory.py -v

Expected output: all tests green; on Windows nothing is skipped except the Linux-specific tests.

  • Step 2: Check protocol conformance
uv run python -c "
from whisper_local.media import MediaController, create_media_controller
ctrl = create_media_controller(enabled=True)
assert isinstance(ctrl, MediaController), f'Protocol not satisfied: {type(ctrl)}'
print('OK:', type(ctrl).__name__, 'satisfies the MediaController protocol')
"

Expected output: OK: SmtcController satisfies the MediaController protocol
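
Note that an isinstance check against a runtime_checkable protocol only verifies that the named methods exist; it does not check signatures or whether they are coroutines. The sketch below illustrates the difference. MediaControllerLike, SyncController, Incomplete and is_async_controller are hypothetical names, and the stricter helper is just one possible extra check, not something the plan requires.

```python
import inspect
from typing import Protocol, runtime_checkable


@runtime_checkable
class MediaControllerLike(Protocol):
    async def pause(self) -> None: ...
    async def resume(self) -> None: ...


class SyncController:
    """Has both methods, but they are plain functions, not coroutines."""

    def pause(self) -> None: ...
    def resume(self) -> None: ...


class Incomplete:
    """Lacks resume(), so the structural check fails."""

    async def pause(self) -> None: ...


def is_async_controller(obj) -> bool:
    """Stricter check: methods must exist and be coroutine functions."""
    return isinstance(obj, MediaControllerLike) and all(
        inspect.iscoroutinefunction(getattr(obj, name))
        for name in ("pause", "resume")
    )


print(isinstance(SyncController(), MediaControllerLike))
print(isinstance(Incomplete(), MediaControllerLike))
print(is_async_controller(SyncController()))
```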

  • Step 3: Run the complete test suite
uv run pytest -v

Expected output: all tests green (Linux tests are skipped on Windows).