feat: MiniMax provider for image/video/podcast skills + new music-generation skill (#3437)

* docs(spec): MiniMax integration for generation skills + new music skill Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(plan): MiniMax generation providers implementation plan Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(skills): add importlib loader + FakeResp for skill tests * test(skills): register loaded module in sys.modules; raise requests.HTTPError in FakeResp * feat(image-generation): add MiniMax provider with env auto-detect Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(image-generation): guard unknown provider, derive ref MIME, strengthen tests Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(video-generation): add MiniMax provider with async poll/download Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(video-generation): surface base_resp errors while polling; add timeout test * feat(podcast-generation): add MiniMax t2a_v2 provider with env auto-detect Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(podcast-generation): restore TTS credential guard; add volcengine + voice tests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(music-generation): new MiniMax music skill via skill-creator Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(music-generation): treat empty lyrics as absent; test no-audio-data path * refactor(skills): add request timeouts to MiniMax network calls Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Potential fix for pull request finding 'Explicit returns mixed with implicit (fall through) returns' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> * fix(models): strip inconsistent user-message names for MiniMax chat DeerFlow middlewares tag user messages with provenance names (user-input, summary, loop_warning); langchain serializes them into the OpenAI-compatible payload and MiniMax rejects mismatched user-message names with "user name must be consistent (2013)". PatchedChatMiniMax now drops the per-message name from user-role messages. Point the config.example MiniMax models at PatchedChatMiniMax so they also get reasoning_content mapping. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(image-generation): MiniMax sends JSON prompt field, guard 1500-char limit MiniMax image-01 takes one text string capped at 1500 chars, but the skill was sending the whole structured JSON. The MiniMax provider now extracts the JSON `prompt` field (relying on prompt_optimizer to expand it) and fails fast with a clear error before calling the API when that field exceeds 1500 chars. Authoring stays provider-agnostic; Gemini still receives the full JSON. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(podcast-generation): per-provider TTS concurrency and retry/backoff Each TTS provider owns its concurrency internally — MiniMax runs single-threaded to reduce rate-limit failures, Volcengine keeps 4 workers — with automatic retry and backoff on transient HTTP and base_resp errors. No caller-facing concurrency knob. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(skills): address Copilot review comments on generation skills - video: add raise_for_status + timeout to the Gemini download/POST/poll calls so non-2xx responses surface as clear HTTP errors instead of JSON/KeyError or hangs - video: check the task Fail status before the generic base_resp check so the failure keeps its task_id context - video/image: create the output file parent directory before writing (matching music-generation) so nested output paths do not raise FileNotFoundError - music: require a non-empty prompt and fail fast with ValueError instead of sending an empty prompt to the API Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(scripts): reclaim dev ports across worktrees in make stop/dev All deer-flow worktrees (main checkout + linked worktrees) hardcode the same dev ports (8001/3000/2026), so a service started from any worktree must be reclaimable from another. stop_all now resolves the set of worktree roots (DEERFLOW_ROOTS) and treats a process as deer-flow-owned when its open files live under any of them. It also force-kills survivors on 2026 alongside 8001/3000, fixing `make dev` aborting on the nginx port preflight when a prior nginx lingered on 2026. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(view-image): hide the injected image-context message from the UI ViewImageMiddleware injects a HumanMessage (text + base64 images) so the vision model can see viewed images, but it was the only internal injector that set neither hide_from_ui nor a hidden name, so it leaked into the chat UI (and IM channels) as a user bubble reading "Here are the images you've viewed:". Mark it with additional_kwargs={"hide_from_ui": True}, matching todo/dynamic_context injections, which the frontend isHiddenFromUIMessage and the channel sender already honor. The model still receives the full content. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(minimax): mark M2.7 models as text-only (no vision) MiniMax M2.7 / M2.7-highspeed do not support vision; only M3 does. The provider config asserted vision support for M2.7 in four places. - config.example.yaml: 4 M2.7 entries -> supports_vision: false - backend/docs/CONFIGURATION.md: M2.7 + highspeed -> supports_vision: false - wizard: add LLMProvider.model_vision_overrides + extra_config_for() so selecting an M2.7 model writes supports_vision: false while M3 (default) keeps vision; wire it through setup_wizard.py - tests: M2.7-highspeed fixture -> supports_vision=False; add test_minimax_vision_is_per_model Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
2026-06-10 09:25:57 +00:00 · 2026-06-08 22:04:38 +08:00
parent 1651d1f1f5
commit cd5bedaa74
27 changed files with 3564 additions and 365 deletions
@@ -0,0 +1,39 @@
+"""Load a skill's scripts/generate.py as an importable module, by file path.
+
+Skills live in skills/public/<name>/scripts/generate.py and are NOT a package,
+so tests load them via importlib. Tests then mock the module's `requests`.
+"""
+import importlib.util
+import sys
+from pathlib import Path
+
+import requests
+
+REPO_ROOT = Path(__file__).resolve().parents[2]
+
+
+def load(skill_name: str):
+    """Return the generate.py module for skills/public/<skill_name>."""
+    path = REPO_ROOT / "skills" / "public" / skill_name / "scripts" / "generate.py"
+    mod_name = skill_name.replace("-", "_") + "_generate"
+    spec = importlib.util.spec_from_file_location(mod_name, path)
+    module = importlib.util.module_from_spec(spec)
+    sys.modules[mod_name] = module  # standard pattern; lets the module resolve itself
+    spec.loader.exec_module(module)
+    return module
+
+
+class FakeResp:
+    """Minimal stand-in for requests.Response."""
+
+    def __init__(self, json_data=None, content=b"", status_code=200):
+        self._json = json_data if json_data is not None else {}
+        self.content = content
+        self.status_code = status_code
+
+    def raise_for_status(self):
+        if self.status_code >= 400:
+            raise requests.HTTPError(f"HTTP {self.status_code}")
+
+    def json(self):
+        return self._json
@@ -0,0 +1,195 @@
+import base64
+import sys
+from pathlib import Path
+
+import pytest
+
+sys.path.insert(0, str(Path(__file__).resolve().parent))
+from skill_loader import FakeResp, load  # noqa: E402
+
+img = load("image-generation")
+
+
+@pytest.fixture(autouse=True)
+def clean_env(monkeypatch):
+    for k in ["GEMINI_API_KEY", "MINIMAX_API_KEY", "IMAGE_GENERATION_PROVIDER",
+              "MINIMAX_API_HOST", "MINIMAX_IMAGE_MODEL"]:
+        monkeypatch.delenv(k, raising=False)
+
+
+def test_resolve_prefers_gemini(monkeypatch):
+    monkeypatch.setenv("GEMINI_API_KEY", "g")
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    assert img._resolve_provider("IMAGE_GENERATION_PROVIDER", "gemini", True) == "gemini"
+
+
+def test_resolve_falls_back_to_minimax(monkeypatch):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    assert img._resolve_provider("IMAGE_GENERATION_PROVIDER", "gemini", False) == "minimax"
+
+
+def test_resolve_override_wins(monkeypatch):
+    monkeypatch.setenv("GEMINI_API_KEY", "g")
+    monkeypatch.setenv("IMAGE_GENERATION_PROVIDER", "MiniMax")
+    assert img._resolve_provider("IMAGE_GENERATION_PROVIDER", "gemini", True) == "minimax"
+
+
+def test_resolve_errors_when_none(monkeypatch):
+    with pytest.raises(ValueError):
+        img._resolve_provider("IMAGE_GENERATION_PROVIDER", "gemini", False)
+
+
+def test_minimax_builds_payload_and_writes(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    raw = b"PNGBYTES"
+    captured = {}
+
+    def fake_post(url, headers=None, json=None, **kw):
+        captured["url"] = url
+        captured["headers"] = headers
+        captured["json"] = json
+        return FakeResp({"data": {"image_base64": [base64.b64encode(raw).decode()]},
+                         "base_resp": {"status_code": 0, "status_msg": "success"}})
+
+    monkeypatch.setattr(img.requests, "post", fake_post)
+    out = tmp_path / "o.jpg"
+    prompt_file = tmp_path / "p.json"
+    prompt_file.write_text("a red apple", encoding="utf-8")
+    msg = img.generate_image(str(prompt_file), [], str(out), "16:9")
+
+    assert out.read_bytes() == raw
+    assert captured["url"].endswith("/v1/image_generation")
+    assert captured["headers"]["Authorization"] == "Bearer m"
+    assert captured["json"]["model"] == "image-01"
+    assert captured["json"]["response_format"] == "base64"
+    assert captured["json"]["aspect_ratio"] == "16:9"
+    assert captured["json"]["n"] == 1
+    assert captured["json"]["prompt_optimizer"] is True
+    assert "Successfully generated image" in msg
+
+
+def test_minimax_reference_image_as_data_url(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    captured = {}
+
+    def fake_post(url, headers=None, json=None, **kw):
+        captured["json"] = json
+        return FakeResp({"data": {"image_base64": [base64.b64encode(b"x").decode()]},
+                         "base_resp": {"status_code": 0}})
+
+    monkeypatch.setattr(img.requests, "post", fake_post)
+    ref = tmp_path / "ref.jpg"
+    ref.write_bytes(b"\xff\xd8refbytes")
+    prompt_file = tmp_path / "p.json"
+    prompt_file.write_text("scene", encoding="utf-8")
+    img.generate_image(str(prompt_file), [str(ref)], str(tmp_path / "o.jpg"), "1:1")
+
+    subj = captured["json"]["subject_reference"]
+    assert subj[0]["type"] == "character"
+    assert subj[0]["image_file"].startswith("data:image/jpeg;base64,")
+    import base64 as _b64
+    encoded = subj[0]["image_file"].split(",", 1)[1]
+    assert _b64.b64decode(encoded) == b"\xff\xd8refbytes"
+
+
+def test_minimax_raises_on_base_resp_error(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+
+    def fake_post(url, headers=None, json=None, **kw):
+        return FakeResp({"base_resp": {"status_code": 1004, "status_msg": "auth failed"}})
+
+    monkeypatch.setattr(img.requests, "post", fake_post)
+    prompt_file = tmp_path / "p.json"
+    prompt_file.write_text("x", encoding="utf-8")
+    with pytest.raises(Exception) as e:
+        img.generate_image(str(prompt_file), [], str(tmp_path / "o.jpg"), "1:1")
+    assert "1004" in str(e.value)
+
+
+def test_minimax_extracts_json_prompt_field(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    captured = {}
+
+    def fake_post(url, headers=None, json=None, **kw):
+        captured["json"] = json
+        return FakeResp({"data": {"image_base64": [base64.b64encode(b"x").decode()]},
+                         "base_resp": {"status_code": 0}})
+
+    monkeypatch.setattr(img.requests, "post", fake_post)
+    prompt_file = tmp_path / "p.json"
+    prompt_file.write_text(
+        '{"prompt": "a red barn at dawn", "style": "watercolor", '
+        '"composition": "rule of thirds", "negative_prompt": "blurry"}',
+        encoding="utf-8",
+    )
+    img.generate_image(str(prompt_file), [], str(tmp_path / "o.jpg"), "16:9")
+
+    # Only the JSON `prompt` field reaches MiniMax — no other fields, no JSON syntax.
+    assert captured["json"]["prompt"] == "a red barn at dawn"
+    assert captured["json"]["prompt_optimizer"] is True
+
+
+def test_minimax_plaintext_prompt_passes_through(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    captured = {}
+
+    def fake_post(url, headers=None, json=None, **kw):
+        captured["json"] = json
+        return FakeResp({"data": {"image_base64": [base64.b64encode(b"x").decode()]},
+                         "base_resp": {"status_code": 0}})
+
+    monkeypatch.setattr(img.requests, "post", fake_post)
+    prompt_file = tmp_path / "p.txt"
+    prompt_file.write_text("a red apple on a table", encoding="utf-8")
+    img.generate_image(str(prompt_file), [], str(tmp_path / "o.jpg"), "1:1")
+
+    assert captured["json"]["prompt"] == "a red apple on a table"
+
+
+def test_minimax_rejects_overlong_prompt_without_calling_api(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+
+    def fake_post(url, headers=None, json=None, **kw):  # pragma: no cover
+        raise AssertionError("must not call the API when the prompt is over the limit")
+
+    monkeypatch.setattr(img.requests, "post", fake_post)
+    prompt_file = tmp_path / "p.json"
+    prompt_file.write_text('{"prompt": "' + "x" * 1600 + '"}', encoding="utf-8")
+    out = tmp_path / "o.jpg"
+    msg = img.generate_image(str(prompt_file), [], str(out), "16:9")
+
+    assert "1500" in msg
+    assert "character" in msg.lower()
+    assert not out.exists()
+
+
+def test_minimax_creates_nested_output_dir(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+
+    def fake_post(url, headers=None, json=None, **kw):
+        return FakeResp({"data": {"image_base64": [base64.b64encode(b"img").decode()]},
+                         "base_resp": {"status_code": 0}})
+
+    monkeypatch.setattr(img.requests, "post", fake_post)
+    prompt_file = tmp_path / "p.txt"
+    prompt_file.write_text("a cat", encoding="utf-8")
+    out = tmp_path / "nested" / "dir" / "o.jpg"
+    img.generate_image(str(prompt_file), [], str(out), "1:1")
+
+    assert out.read_bytes() == b"img"
+
+
+def test_unknown_provider_raises(monkeypatch, tmp_path):
+    monkeypatch.setenv("IMAGE_GENERATION_PROVIDER", "openai")
+    monkeypatch.setenv("GEMINI_API_KEY", "g")
+    pf = tmp_path / "p.json"
+    pf.write_text("x", encoding="utf-8")
+    with pytest.raises(ValueError):
+        img.generate_image(str(pf), [], str(tmp_path / "o.jpg"), "1:1")
+
+
+def test_guess_mime_by_extension():
+    assert img._guess_mime("/a/b.png") == "image/png"
+    assert img._guess_mime("/a/b.webp") == "image/webp"
+    assert img._guess_mime("/a/b.jpg") == "image/jpeg"
+    assert img._guess_mime("/a/b.unknown") == "image/jpeg"
@@ -0,0 +1,135 @@
+import sys
+from pathlib import Path
+
+import pytest
+
+sys.path.insert(0, str(Path(__file__).resolve().parent))
+from skill_loader import FakeResp, load  # noqa: E402
+
+mus = load("music-generation")
+
+
+@pytest.fixture(autouse=True)
+def clean_env(monkeypatch):
+    for k in ["MINIMAX_API_KEY", "MINIMAX_API_HOST", "MINIMAX_MUSIC_MODEL"]:
+        monkeypatch.delenv(k, raising=False)
+
+
+def _post_ok(captured):
+    def fake_post(url, headers=None, json=None, **kw):
+        captured["url"] = url
+        captured["headers"] = headers
+        captured["json"] = json
+        return FakeResp({"data": {"audio": b"songbytes".hex(), "status": 2},
+                         "base_resp": {"status_code": 0}})
+    return fake_post
+
+
+def test_with_lyrics_payload_and_writes(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    captured = {}
+    monkeypatch.setattr(mus.requests, "post", _post_ok(captured))
+    spec = tmp_path / "s.json"
+    spec.write_text('{"title":"X","prompt":"pop, happy","lyrics":"[verse]\\nla la"}',
+                    encoding="utf-8")
+    out = tmp_path / "o.mp3"
+    msg = mus.generate_music(str(spec), str(out))
+    assert out.read_bytes() == b"songbytes"
+    assert captured["url"].endswith("/v1/music_generation")
+    assert captured["headers"]["Authorization"] == "Bearer m"
+    assert captured["json"]["model"] == "music-2.6-free"
+    assert captured["json"]["lyrics"] == "[verse]\nla la"
+    assert captured["json"]["output_format"] == "hex"
+    assert "Successfully generated music" in msg
+
+
+def test_instrumental_sets_flag(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    captured = {}
+    monkeypatch.setattr(mus.requests, "post", _post_ok(captured))
+    spec = tmp_path / "s.json"
+    spec.write_text('{"prompt":"lofi beats","is_instrumental":true}', encoding="utf-8")
+    mus.generate_music(str(spec), str(tmp_path / "o.mp3"))
+    assert captured["json"]["is_instrumental"] is True
+    assert "lyrics" not in captured["json"]
+    assert "lyrics_optimizer" not in captured["json"]
+
+
+def test_no_lyrics_uses_optimizer(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    captured = {}
+    monkeypatch.setattr(mus.requests, "post", _post_ok(captured))
+    spec = tmp_path / "s.json"
+    spec.write_text('{"prompt":"sad ballad"}', encoding="utf-8")
+    mus.generate_music(str(spec), str(tmp_path / "o.mp3"))
+    assert captured["json"]["lyrics_optimizer"] is True
+    assert "lyrics" not in captured["json"]
+
+
+def test_model_override(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    monkeypatch.setenv("MINIMAX_MUSIC_MODEL", "music-2.6")
+    captured = {}
+    monkeypatch.setattr(mus.requests, "post", _post_ok(captured))
+    spec = tmp_path / "s.json"
+    spec.write_text('{"prompt":"jazz","lyrics":"[verse]\\nhi"}', encoding="utf-8")
+    mus.generate_music(str(spec), str(tmp_path / "o.mp3"))
+    assert captured["json"]["model"] == "music-2.6"
+
+
+def test_raises_on_base_resp_error(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+
+    def fake_post(url, headers=None, json=None, **kw):
+        return FakeResp({"base_resp": {"status_code": 1008, "status_msg": "no balance"}})
+
+    monkeypatch.setattr(mus.requests, "post", fake_post)
+    spec = tmp_path / "s.json"
+    spec.write_text('{"prompt":"x","lyrics":"[verse]\\ny"}', encoding="utf-8")
+    with pytest.raises(Exception) as e:
+        mus.generate_music(str(spec), str(tmp_path / "o.mp3"))
+    assert "1008" in str(e.value)
+
+
+def test_missing_api_key_returns_message(monkeypatch, tmp_path):
+    spec = tmp_path / "s.json"
+    spec.write_text('{"prompt":"x"}', encoding="utf-8")
+    msg = mus.generate_music(str(spec), str(tmp_path / "o.mp3"))
+    assert "MINIMAX_API_KEY" in msg
+
+
+def test_raises_on_missing_audio_data(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+
+    def fake_post(url, headers=None, json=None, **kw):
+        return FakeResp({"base_resp": {"status_code": 0}})  # no "data" key
+
+    monkeypatch.setattr(mus.requests, "post", fake_post)
+    spec = tmp_path / "s.json"
+    spec.write_text('{"prompt":"x"}', encoding="utf-8")
+    with pytest.raises(Exception, match="no audio data"):
+        mus.generate_music(str(spec), str(tmp_path / "o.mp3"))
+
+
+def test_empty_prompt_raises(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+
+    def fake_post(url, headers=None, json=None, **kw):  # pragma: no cover
+        raise AssertionError("must not call the API when prompt is missing")
+
+    monkeypatch.setattr(mus.requests, "post", fake_post)
+    spec = tmp_path / "s.json"
+    spec.write_text('{"title":"X","lyrics":"[verse]\\nhi"}', encoding="utf-8")  # no prompt
+    with pytest.raises(ValueError, match="prompt"):
+        mus.generate_music(str(spec), str(tmp_path / "o.mp3"))
+
+
+def test_empty_lyrics_falls_back_to_optimizer(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    captured = {}
+    monkeypatch.setattr(mus.requests, "post", _post_ok(captured))
+    spec = tmp_path / "s.json"
+    spec.write_text('{"prompt":"x","lyrics":""}', encoding="utf-8")
+    mus.generate_music(str(spec), str(tmp_path / "o.mp3"))
+    assert captured["json"]["lyrics_optimizer"] is True
+    assert "lyrics" not in captured["json"]
@@ -0,0 +1,253 @@
+import sys
+from pathlib import Path
+
+import pytest
+
+sys.path.insert(0, str(Path(__file__).resolve().parent))
+from skill_loader import FakeResp, load  # noqa: E402
+
+pod = load("podcast-generation")
+
+
+@pytest.fixture(autouse=True)
+def clean_env(monkeypatch):
+    for k in ["VOLCENGINE_TTS_APPID", "VOLCENGINE_TTS_ACCESS_TOKEN", "VOLCENGINE_TTS_CLUSTER",
+              "MINIMAX_API_KEY", "PODCAST_GENERATION_PROVIDER", "MINIMAX_API_HOST",
+              "MINIMAX_TTS_MODEL", "MINIMAX_TTS_VOICE_MALE", "MINIMAX_TTS_VOICE_FEMALE",
+              "MINIMAX_TTS_MAX_RETRIES"]:
+        monkeypatch.delenv(k, raising=False)
+    # never actually sleep during backoff in tests
+    monkeypatch.setattr(pod.time, "sleep", lambda *_: None)
+
+
+def test_resolve_prefers_volcengine(monkeypatch):
+    monkeypatch.setenv("VOLCENGINE_TTS_APPID", "a")
+    monkeypatch.setenv("VOLCENGINE_TTS_ACCESS_TOKEN", "t")
+    assert pod._resolve_tts_provider() == "volcengine"
+
+
+def test_resolve_falls_back_to_minimax(monkeypatch):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    assert pod._resolve_tts_provider() == "minimax"
+
+
+def test_resolve_override(monkeypatch):
+    monkeypatch.setenv("VOLCENGINE_TTS_APPID", "a")
+    monkeypatch.setenv("VOLCENGINE_TTS_ACCESS_TOKEN", "t")
+    monkeypatch.setenv("PODCAST_GENERATION_PROVIDER", "minimax")
+    assert pod._resolve_tts_provider() == "minimax"
+
+
+def test_resolve_unknown_raises(monkeypatch):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    monkeypatch.setenv("PODCAST_GENERATION_PROVIDER", "openai")
+    with pytest.raises(ValueError):
+        pod._resolve_tts_provider()
+
+
+def test_minimax_tts_decodes_hex(monkeypatch):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    captured = {}
+
+    def fake_post(url, headers=None, json=None, **kw):
+        captured["url"] = url
+        captured["json"] = json
+        return FakeResp({"data": {"audio": b"audiobytes".hex(), "status": 2},
+                         "base_resp": {"status_code": 0}})
+
+    monkeypatch.setattr(pod.requests, "post", fake_post)
+    out = pod.text_to_speech_minimax("hello", "male-qn-qingse")
+    assert out == b"audiobytes"
+    assert captured["url"].endswith("/v1/t2a_v2")
+    assert captured["json"]["voice_setting"]["voice_id"] == "male-qn-qingse"
+    assert captured["json"]["output_format"] == "hex"
+
+
+def test_process_line_minimax_voice_mapping(monkeypatch):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    seen = {}
+
+    def fake_tts(text, voice_id):
+        seen["voice_id"] = voice_id
+        return b"x"
+
+    monkeypatch.setattr(pod, "text_to_speech_minimax", fake_tts)
+    line = pod.ScriptLine(speaker="female", paragraph="hi")
+    idx, audio = pod._process_line((0, line, 1, "minimax"))
+    assert audio == b"x"
+    assert seen["voice_id"] == "female-tianmei"
+
+
+def test_generate_podcast_minimax_end_to_end(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+
+    def fake_post(url, headers=None, json=None, **kw):
+        return FakeResp({"data": {"audio": b"chunk".hex(), "status": 2},
+                         "base_resp": {"status_code": 0}})
+
+    monkeypatch.setattr(pod.requests, "post", fake_post)
+    script = tmp_path / "s.json"
+    script.write_text(
+        '{"title":"T","locale":"en","lines":[{"speaker":"male","paragraph":"a"},'
+        '{"speaker":"female","paragraph":"b"}]}',
+        encoding="utf-8",
+    )
+    out = tmp_path / "o.mp3"
+    msg = pod.generate_podcast(str(script), str(out), None)
+    assert out.read_bytes() == b"chunkchunk"
+    assert "Successfully generated podcast" in msg
+
+
+def test_volcengine_tts_decodes_base64(monkeypatch):
+    import base64
+    monkeypatch.setenv("VOLCENGINE_TTS_APPID", "a")
+    monkeypatch.setenv("VOLCENGINE_TTS_ACCESS_TOKEN", "t")
+
+    def fake_post(url, headers=None, json=None, **kw):
+        return FakeResp({"code": 3000, "data": base64.b64encode(b"volcbytes").decode()})
+
+    monkeypatch.setattr(pod.requests, "post", fake_post)
+    out = pod.text_to_speech_volcengine("hi", "zh_male_yangguangqingnian_moon_bigtts")
+    assert out == b"volcbytes"
+
+
+def test_volcengine_without_creds_raises(monkeypatch):
+    monkeypatch.setenv("PODCAST_GENERATION_PROVIDER", "volcengine")
+    script = pod.Script(lines=[pod.ScriptLine("male", "a")])
+    with pytest.raises(ValueError):
+        pod.tts_node(script)
+
+
+def test_process_line_minimax_male_and_override(monkeypatch):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    seen = []
+
+    def fake_tts(text, voice_id):
+        seen.append(voice_id)
+        return b"x"
+
+    monkeypatch.setattr(pod, "text_to_speech_minimax", fake_tts)
+    male = pod.ScriptLine(speaker="male", paragraph="hi")
+    pod._process_line((0, male, 1, "minimax"))
+    assert seen[-1] == "male-qn-qingse"
+    monkeypatch.setenv("MINIMAX_TTS_VOICE_MALE", "custom-male")
+    pod._process_line((0, male, 1, "minimax"))
+    assert seen[-1] == "custom-male"
+
+
+def _seq_post(responses):
+    """Return a fake requests.post that yields the given responses in order."""
+    calls = {"n": 0}
+
+    def fake_post(*a, **k):
+        resp = responses[min(calls["n"], len(responses) - 1)]
+        calls["n"] += 1
+        return resp
+
+    return fake_post, calls
+
+
+def test_minimax_retries_on_rate_limit_code(monkeypatch):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    fake_post, calls = _seq_post([
+        FakeResp({"base_resp": {"status_code": 1002, "status_msg": "rate limit"}}),
+        FakeResp({"base_resp": {"status_code": 1039, "status_msg": "tpm limit"}}),
+        FakeResp({"data": {"audio": b"ok".hex()}, "base_resp": {"status_code": 0}}),
+    ])
+    monkeypatch.setattr(pod.requests, "post", fake_post)
+    out = pod.text_to_speech_minimax("hi", "male-qn-qingse", max_retries=3)
+    assert out == b"ok"
+    assert calls["n"] == 3  # two retries then success
+
+
+def test_minimax_retries_on_http_429(monkeypatch):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    fake_post, calls = _seq_post([
+        FakeResp({}, status_code=429),
+        FakeResp({"data": {"audio": b"ok".hex()}, "base_resp": {"status_code": 0}}),
+    ])
+    monkeypatch.setattr(pod.requests, "post", fake_post)
+    out = pod.text_to_speech_minimax("hi", "male-qn-qingse", max_retries=3)
+    assert out == b"ok"
+    assert calls["n"] == 2
+
+
+def test_minimax_no_retry_on_auth_error(monkeypatch):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    fake_post, calls = _seq_post([
+        FakeResp({"base_resp": {"status_code": 1004, "status_msg": "auth failed"}}),
+        FakeResp({"data": {"audio": b"never".hex()}, "base_resp": {"status_code": 0}}),
+    ])
+    monkeypatch.setattr(pod.requests, "post", fake_post)
+    out = pod.text_to_speech_minimax("hi", "male-qn-qingse", max_retries=3)
+    assert out is None
+    assert calls["n"] == 1  # permanent error: no retry
+
+
+def test_minimax_gives_up_after_max_retries(monkeypatch):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    fake_post, calls = _seq_post([
+        FakeResp({"base_resp": {"status_code": 1002, "status_msg": "rate limit"}}),
+    ])
+    monkeypatch.setattr(pod.requests, "post", fake_post)
+    out = pod.text_to_speech_minimax("hi", "male-qn-qingse", max_retries=2)
+    assert out is None
+    assert calls["n"] == 3  # initial attempt + 2 retries
+
+
+def test_tts_node_raises_on_partial_failure(monkeypatch):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    calls = {"n": 0}
+
+    def fake_tts(text, voice_id, **kw):
+        calls["n"] += 1
+        return b"x" if calls["n"] == 1 else None
+
+    monkeypatch.setattr(pod, "text_to_speech_minimax", fake_tts)
+    script = pod.Script(lines=[pod.ScriptLine("male", "a"), pod.ScriptLine("female", "b")])
+    with pytest.raises(ValueError) as e:
+        pod.tts_node(script)
+    assert "2" in str(e.value)  # mentions failed line number 2
+
+
+def test_tts_node_defaults_to_one_worker_for_minimax(monkeypatch):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    captured = {}
+    real_executor = pod.ThreadPoolExecutor
+
+    class CapturingExecutor(real_executor):
+        def __init__(self, *args, **kwargs):
+            captured["max_workers"] = kwargs.get("max_workers", args[0] if args else None)
+            super().__init__(*args, **kwargs)
+
+    def fake_tts(text, voice_id):
+        return b"x"
+
+    monkeypatch.setattr(pod, "ThreadPoolExecutor", CapturingExecutor)
+    monkeypatch.setattr(pod, "text_to_speech_minimax", fake_tts)
+    script = pod.Script(lines=[pod.ScriptLine("male", "a"), pod.ScriptLine("female", "b")])
+
+    assert pod.tts_node(script) == [b"x", b"x"]
+    assert captured["max_workers"] == 1
+
+
+def test_tts_node_keeps_four_worker_default_for_volcengine(monkeypatch):
+    monkeypatch.setenv("VOLCENGINE_TTS_APPID", "a")
+    monkeypatch.setenv("VOLCENGINE_TTS_ACCESS_TOKEN", "t")
+    captured = {}
+    real_executor = pod.ThreadPoolExecutor
+
+    class CapturingExecutor(real_executor):
+        def __init__(self, *args, **kwargs):
+            captured["max_workers"] = kwargs.get("max_workers", args[0] if args else None)
+            super().__init__(*args, **kwargs)
+
+    def fake_tts(text, voice_type):
+        return b"x"
+
+    monkeypatch.setattr(pod, "ThreadPoolExecutor", CapturingExecutor)
+    monkeypatch.setattr(pod, "text_to_speech_volcengine", fake_tts)
+    script = pod.Script(lines=[pod.ScriptLine("male", "a"), pod.ScriptLine("female", "b")])
+
+    assert pod.tts_node(script) == [b"x", b"x"]
+    assert captured["max_workers"] == 4
@@ -0,0 +1,187 @@
+import sys
+from pathlib import Path
+
+import pytest
+import requests
+
+sys.path.insert(0, str(Path(__file__).resolve().parent))
+from skill_loader import FakeResp, load  # noqa: E402
+
+vid = load("video-generation")
+
+
+@pytest.fixture(autouse=True)
+def clean_env(monkeypatch):
+    for k in ["GEMINI_API_KEY", "MINIMAX_API_KEY", "VIDEO_GENERATION_PROVIDER",
+              "MINIMAX_API_HOST", "MINIMAX_VIDEO_MODEL"]:
+        monkeypatch.delenv(k, raising=False)
+    monkeypatch.setattr(vid.time, "sleep", lambda *_: None)
+
+
+def test_resolve_prefers_gemini():
+    assert vid._resolve_provider("VIDEO_GENERATION_PROVIDER", "gemini", True) == "gemini"
+
+
+def test_resolve_falls_back_to_minimax(monkeypatch):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    assert vid._resolve_provider("VIDEO_GENERATION_PROVIDER", "gemini", False) == "minimax"
+
+
+def test_resolve_override(monkeypatch):
+    monkeypatch.setenv("VIDEO_GENERATION_PROVIDER", "minimax")
+    assert vid._resolve_provider("VIDEO_GENERATION_PROVIDER", "gemini", True) == "minimax"
+
+
+def test_unknown_provider_raises(monkeypatch, tmp_path):
+    monkeypatch.setenv("VIDEO_GENERATION_PROVIDER", "openai")
+    monkeypatch.setenv("GEMINI_API_KEY", "g")
+    pf = tmp_path / "p.json"
+    pf.write_text("x", encoding="utf-8")
+    with pytest.raises(ValueError):
+        vid.generate_video(str(pf), [], str(tmp_path / "v.mp4"), "16:9")
+
+
+def test_minimax_full_flow(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    posts = {}
+
+    def fake_post(url, headers=None, json=None, **kw):
+        posts["url"] = url
+        posts["json"] = json
+        return FakeResp({"task_id": "T1", "base_resp": {"status_code": 0}})
+
+    def fake_get(url, headers=None, params=None, **kw):
+        if url.endswith("/v1/query/video_generation"):
+            assert params["task_id"] == "T1"
+            return FakeResp({"status": "Success", "file_id": "F1",
+                             "base_resp": {"status_code": 0}})
+        if url.endswith("/v1/files/retrieve"):
+            assert params["file_id"] == "F1"
+            return FakeResp({"file": {"download_url": "https://dl/v.mp4"},
+                             "base_resp": {"status_code": 0}})
+        return FakeResp(content=b"MP4DATA")  # the actual download
+
+    monkeypatch.setattr(vid.requests, "post", fake_post)
+    monkeypatch.setattr(vid.requests, "get", fake_get)
+
+    out = tmp_path / "v.mp4"
+    pf = tmp_path / "p.json"
+    pf.write_text("a cat runs", encoding="utf-8")
+    msg = vid.generate_video(str(pf), [], str(out), "16:9")
+
+    assert out.read_bytes() == b"MP4DATA"
+    assert posts["url"].endswith("/v1/video_generation")
+    assert posts["json"]["model"] == "MiniMax-Hailuo-2.3"
+    assert "successfully" in msg.lower()
+
+
+def test_minimax_reference_first_frame(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+    posts = {}
+
+    def fake_post(url, headers=None, json=None, **kw):
+        posts["json"] = json
+        return FakeResp({"task_id": "T1", "base_resp": {"status_code": 0}})
+
+    def fake_get(url, headers=None, params=None, **kw):
+        if url.endswith("/v1/query/video_generation"):
+            return FakeResp({"status": "Success", "file_id": "F1", "base_resp": {"status_code": 0}})
+        if url.endswith("/v1/files/retrieve"):
+            return FakeResp({"file": {"download_url": "https://dl/v.mp4"}, "base_resp": {"status_code": 0}})
+        return FakeResp(content=b"X")
+
+    monkeypatch.setattr(vid.requests, "post", fake_post)
+    monkeypatch.setattr(vid.requests, "get", fake_get)
+    ref = tmp_path / "f.jpg"
+    ref.write_bytes(b"\xff\xd8img")
+    pf = tmp_path / "p.json"
+    pf.write_text("x", encoding="utf-8")
+    vid.generate_video(str(pf), [str(ref)], str(tmp_path / "v.mp4"), "16:9")
+    assert posts["json"]["first_frame_image"].startswith("data:image/jpeg;base64,")
+
+
+def test_minimax_task_fail(monkeypatch, tmp_path):
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+
+    def fake_post(url, headers=None, json=None, **kw):
+        return FakeResp({"task_id": "T1", "base_resp": {"status_code": 0}})
+
+    def fake_get(url, headers=None, params=None, **kw):
+        return FakeResp({"status": "Fail", "base_resp": {"status_code": 1027, "status_msg": "blocked"}})
+
+    monkeypatch.setattr(vid.requests, "post", fake_post)
+    monkeypatch.setattr(vid.requests, "get", fake_get)
+    pf = tmp_path / "p.json"
+    pf.write_text("x", encoding="utf-8")
+    with pytest.raises(Exception):
+        vid.generate_video(str(pf), [], str(tmp_path / "v.mp4"), "16:9")
+
+
+def test_minimax_poll_timeout(monkeypatch):
+    def fake_get(url, headers=None, params=None, **kw):
+        return FakeResp({"status": "Processing", "base_resp": {"status_code": 0}})
+
+    monkeypatch.setattr(vid.requests, "get", fake_get)
+    with pytest.raises(Exception) as e:
+        vid._poll_video_task("https://h", "Bearer m", "T1", max_attempts=3, interval=0)
+    assert "timed out" in str(e.value)
+
+
+def test_minimax_task_fail_keeps_task_context(monkeypatch, tmp_path):
+    # A Fail status takes priority over the generic base_resp check, so the
+    # error keeps the task_id and the task-level failure message.
+    monkeypatch.setenv("MINIMAX_API_KEY", "m")
+
+    def fake_post(url, headers=None, json=None, **kw):
+        return FakeResp({"task_id": "T1", "base_resp": {"status_code": 0}})
+
+    def fake_get(url, headers=None, params=None, **kw):
+        return FakeResp({"status": "Fail", "base_resp": {"status_code": 1027, "status_msg": "blocked"}})
+
+    monkeypatch.setattr(vid.requests, "post", fake_post)
+    monkeypatch.setattr(vid.requests, "get", fake_get)
+    pf = tmp_path / "p.json"
+    pf.write_text("x", encoding="utf-8")
+    with pytest.raises(Exception, match="task T1 failed"):
+        vid.generate_video(str(pf), [], str(tmp_path / "v.mp4"), "16:9")
+
+
+def test_gemini_download_raises_on_http_error(monkeypatch, tmp_path):
+    monkeypatch.setenv("GEMINI_API_KEY", "g")
+    calls = {}
+
+    def fake_get(url, headers=None, **kw):
+        calls["timeout"] = kw.get("timeout")
+        return FakeResp(content=b"error page", status_code=500)
+
+    monkeypatch.setattr(vid.requests, "get", fake_get)
+    out = tmp_path / "sub" / "v.mp4"
+    with pytest.raises(requests.HTTPError):
+        vid.download("https://dl/v.mp4", str(out))
+    assert calls["timeout"]  # a timeout is now passed
+    assert not out.exists()
+
+
+def test_gemini_download_writes_nested_dir(monkeypatch, tmp_path):
+    monkeypatch.setenv("GEMINI_API_KEY", "g")
+
+    def fake_get(url, headers=None, **kw):
+        return FakeResp(content=b"VIDEO")
+
+    monkeypatch.setattr(vid.requests, "get", fake_get)
+    out = tmp_path / "nested" / "dir" / "v.mp4"
+    vid.download("https://dl/v.mp4", str(out))
+    assert out.read_bytes() == b"VIDEO"
+
+
+def test_gemini_post_raises_on_http_error(monkeypatch, tmp_path):
+    monkeypatch.setenv("GEMINI_API_KEY", "g")
+
+    def fake_post(url, headers=None, json=None, **kw):
+        return FakeResp(status_code=503)
+
+    monkeypatch.setattr(vid.requests, "post", fake_post)
+    pf = tmp_path / "p.json"
+    pf.write_text("a cat", encoding="utf-8")
+    with pytest.raises(requests.HTTPError):
+        vid.generate_video(str(pf), [], str(tmp_path / "v.mp4"), "16:9")