22 Commits

Author SHA1 Message Date
pax
10f1b3fd10 test_mpv_options: update demuxer_max_bytes assertion to 50MiB 2026-04-11 23:01:41 -05:00
pax
553734fe79 test_mpv_options: update demuxer_max_bytes assertion (50→150MiB) 2026-04-11 20:01:29 -05:00
pax
1864cfb088 test_pil_safety: target core.config instead of core.images
The 'audit #8 invariant' the test was anchored on (core.images
imported without core.cache first) is about to become moot when
images.py is removed in a follow-up commit. Swap to core.config
to keep the same coverage shape: any non-cache submodule import
must still trigger __init__.py and install the PIL cap.
2026-04-11 17:28:47 -05:00
pax
278d4a291d ci: convert test_safety async tests off pytest-asyncio
The two validate_public_request hook tests used @pytest.mark.asyncio
which requires pytest-asyncio at collection time. CI only installs
httpx + Pillow + pytest, so the marker decoded as PytestUnknownMark
and the test bodies failed with "async def functions are not
natively supported."

Switches both to plain sync tests that drive the coroutine via
asyncio.run(), matching the pattern already used in test_cache.py
for the same reason.

Audit-Ref: SECURITY_AUDIT.md finding #1 (test infrastructure)
2026-04-11 16:38:36 -05:00
pax
4db7943ac7 security: fix #2 — apply lavf protocol whitelist via property API
The previous attempt set ``demuxer_lavf_o`` as an init kwarg with a
comma-laden ``protocol_whitelist=file,http,https,tls,tcp`` value.
mpv rejected it with -7 OPT_FORMAT because python-mpv's init path
goes through ``mpv_set_option_string``, which routes through mpv's
keyvalue list parser — that parser splits on ``,`` to find entries,
shredding the protocol list into orphan tokens. Backslash-escaping
``\,`` did not unescape on this code path either.

Splits the option set into two helpers:

- ``build_mpv_kwargs`` — init kwargs only (ytdl=no, load_scripts=no,
  POSIX input_conf null, all the existing playback/audio/network
  tuning). The lavf option is intentionally absent.
- ``lavf_options`` — a dict applied post-construction via the
  python-mpv property API, which uses the node API and accepts
  dict values for keyvalue-list options without splitting on
  commas inside the value.

Tests cover both paths: that ``demuxer_lavf_o`` is NOT in the init
kwargs (regression guard), and that ``lavf_options`` returns the
expected protocol set.

Audit-Ref: SECURITY_AUDIT.md finding #2
Severity: High
2026-04-11 16:34:50 -05:00
pax
b65f8da837 security: fix #10 — validate media magic in first 16 bytes of stream
The previous flow streamed the full body to disk and called
_is_valid_media after completion. A hostile server that omits
Content-Type (so the early text/html guard doesn't fire) could
burn up to MAX_DOWNLOAD_BYTES (500MB) of bandwidth and cache-dir
write/delete churn before the post-download check rejected.

Refactors _do_download to accumulate chunks into a small header
buffer until at least 16 bytes have arrived, then runs
_looks_like_media against the buffer before committing to writing
the full payload. The 16-byte minimum handles servers that send
tiny chunks (chunked encoding with 1-byte chunks, slow trickle,
TCP MSS fragmentation) without false-failing on the first chunk.

Extracts _looks_like_media(bytes) as a sibling to _is_valid_media
(path) sharing the same magic-byte recognition. _looks_like_media
fails closed on empty input — when called from the streaming
validator, an empty header means the server returned nothing
useful. _is_valid_media keeps its OSError-fallback open behavior
for the on-disk path so transient EBUSY doesn't trigger a delete
+ re-download loop.

Audit-Ref: SECURITY_AUDIT.md finding #10
Severity: Low
2026-04-11 16:24:59 -05:00
pax
2bb6352141 security: fix #8 — install MAX_IMAGE_PIXELS cap in core/__init__.py
PIL's decompression-bomb cap previously lived as a side effect of
importing core/cache.py. Any future code path that touched core/images
(or any other core submodule) without first importing cache would
silently revert to PIL's default 89M-pixel *warning* (not an error),
re-opening the bomb surface.

Moves the cap into core/__init__.py so any import of any
booru_viewer.core.* submodule installs it first. The duplicate set
in cache.py is left in place by this commit and removed in the next
one — both writes are idempotent so this commit is bisect-safe.

Audit-Ref: SECURITY_AUDIT.md finding #8
Severity: Low
2026-04-11 16:21:32 -05:00
pax
6ff1f726d4 security: fix #7 — reject Windows reserved device names in template
render_filename_template's sanitization stripped reserved chars,
control codes, whitespace, and `..` prefixes — but did not catch
Windows reserved device names (CON, PRN, AUX, NUL, COM1-9, LPT1-9).
On Windows, opening `con.jpg` for writing redirects to the CON
device, so a tag value of `con` from a hostile booru would silently
break Save to Library.

Adds a frozenset of reserved stems and prefixes the rendered name
with `_` if its lowercased stem matches. The check runs
unconditionally (not Windows-gated) so a library saved on Linux
can be copied to a Windows machine without breaking on these
filenames.

Audit-Ref: SECURITY_AUDIT.md finding #7
Severity: Low
2026-04-11 16:20:27 -05:00
pax
fa4f2cb270 security: fix #6 — add pure source HTML escape helper
Extracts the rich-text Source-line builder out of info_panel.py
into a Qt-free module so it can be unit-tested under CI (which
installs only httpx + Pillow + pytest, no PySide6).

The helper html.escape()s both the href and the visible display
text, and only emits an <a> tag for http(s) URLs — non-URL
sources (including javascript: and data: schemes) get rendered
as escaped plain text without a clickable anchor.

Not yet wired into InfoPanel.set_post; that lands in the next
commit.

Audit-Ref: SECURITY_AUDIT.md finding #6
Severity: Medium
2026-04-11 16:19:06 -05:00
pax
5d348fa8be security: fix #5 — LRU cap on _url_locks to prevent memory leak
Replaces the unbounded defaultdict(asyncio.Lock) with an OrderedDict
guarded by _get_url_lock() and _evict_url_locks(). The cap is 4096
entries; LRU semantics keep the hot working set alive and oldest-
unlocked-first eviction trims back toward the cap on each new
insertion.

Eviction skips locks that are currently held — popping a lock that
a coroutine is mid-`async with` on would break its __aexit__. The
inner loop's evicted-flag handles the edge case where every
remaining entry is either the freshly inserted hash or held; in
that state the cap is briefly exceeded and the next insertion
retries, instead of looping forever.

Audit-Ref: SECURITY_AUDIT.md finding #5
Severity: Medium
2026-04-11 16:16:52 -05:00
pax
a6a73fed61 security: fix #4 — chmod SQLite DB + WAL/SHM sidecars to 0o600
The sites table stores api_key + api_user in plaintext. Previous
behavior left the DB file at the inherited umask (0o644 on most
Linux systems) so any other local user could sqlite3 it open and
exfiltrate every booru API key.

Adds Database._restrict_perms(), called from the lazy conn init
right after _migrate(). Tightens the main file plus the -wal and
-shm sidecars to 0o600. The sidecars only exist after the first
write, so the FileNotFoundError path is expected and silenced.
Filesystem chmod failures are also swallowed for FUSE-mount
compatibility.

behavior change from v0.2.5: ~/.local/share/booru-viewer/booru.db
is now 0o600 even if a previous version created it 0o644.

Audit-Ref: SECURITY_AUDIT.md finding #4
Severity: Medium
2026-04-11 16:15:41 -05:00
pax
6801a0b45e security: fix #4 — chmod data_dir to 0o700 on POSIX
The data directory holds the SQLite database whose `sites` table
stores api_key and api_user in plaintext. Previous behavior used
the inherited umask (typically 0o755), which leaves the dir
world-traversable on shared workstations and on networked home
dirs whose home is 0o755. Tighten to 0o700 unconditionally on
every data_dir() call so the fix is applied even when an older
version (or external tooling) left the directory loose.

Failures from filesystems that don't support chmod (some FUSE
mounts) are swallowed — better to keep working than refuse to
start. Windows: no-op, NTFS ACLs handle this separately.

behavior change from v0.2.5: ~/.local/share/booru-viewer is now
0o700 even if it was previously 0o755.

Audit-Ref: SECURITY_AUDIT.md finding #4
Severity: Medium
2026-04-11 16:14:30 -05:00
pax
013fe43f95 security: fix #1 — add public-host validator helper
Introduces core/api/_safety.py containing check_public_host and the
validate_public_request async request-hook. The hook rejects any URL
whose host is (or resolves to) loopback, RFC1918, link-local
(including 169.254.169.254 cloud metadata), CGNAT, unique-local v6,
or multicast. Called on every request hop so it covers both the
initial URL and every redirect target that httpx would otherwise
follow blindly.

Also exports redact_url / redact_params for finding #3 — the
secret-key set lives in the same module since both #1 and #3 work
is wired through httpx client event_hooks. Helper is stdlib-only
(ipaddress, socket, urllib.parse) plus httpx; no new deps.

Not yet wired into any httpx client; per-file wiring commits follow.

Audit-Ref: SECURITY_AUDIT.md finding #1
Severity: High
2026-04-11 16:09:53 -05:00
pax
22744c48af security: fix #2 — add pure mpv options builder helper
Extracts the mpv.MPV() kwargs into a Qt-free pure function so the
security-relevant options can be unit-tested on CI (which lacks
PySide6 and libmpv). The builder embeds the audit #2 hardening —
ytdl="no", load_scripts="no", and a lavf protocol whitelist of
file,http,https,tls,tcp — alongside the existing playback tuning.
Not yet wired into _MpvGLWidget; that lands in the next commit.

Audit-Ref: SECURITY_AUDIT.md finding #2
Severity: High
2026-04-11 16:06:33 -05:00
pax
562c03071b test: Phase 2 — add 64 tests for extracted pure functions
5 new test files covering the pure-function extractions from Phase 1:
- test_search_controller.py (24): tag building, blacklist filtering, backfill
- test_window_state.py (16): geometry parsing, splitter parsing, hyprctl cmds
- test_media_controller.py (9): prefetch ring-expansion ordering
- test_post_actions.py (10): batch message detection, library membership
- test_popout_controller.py (3): video sync dict shape

All import-pure (no PySide6, no mpv, no httpx). Total suite: 186 tests.
2026-04-10 15:20:57 -05:00
pax
d66dc14454 db: fix orphan rows — cascade delete_site, wire up reconcile on startup
delete_site() leaked rows in tag_types, search_history, and
saved_searches; reconcile_library_meta() was implemented but never
called. Add tests for both fixes plus tag cache pruning.
2026-04-10 14:10:57 -05:00
pax
a90d71da47 tests: add 36 tests for CategoryFetcher (parser, cache, probe, dispatch)
New test_category_fetcher.py covering:
  HTML parser (10): Rule34/Moebooru/Konachan markup, Gelbooru-empty,
    metadata->Meta mapping, URL-encoded names, edge cases
  Tag API parser (6): JSON, XML, empty, flat list, malformed
  Canonical ordering (4): standard order, species, unknown, empty
  Cache compose (6): full/partial/zero coverage, empty tags, order,
    per-site isolation
  Probe persistence (5): save/load True/False, per-site, clear wipes
  Batch API availability (3): URL+auth combinations
  Map coverage (2): label and type map constants

All pure Python — synthetic HTML, FakePost/FakeClient/FakeResponse.
No network, no Qt. Uses tmp_db fixture from conftest.

Total suite: 117 tests, 0.19s.
2026-04-09 23:58:56 -05:00
pax
ecda09152c ship tests/ (81 tests, was gitignored)
Remove tests/ from .gitignore and track the existing test suite:
  tests/core/test_db.py         — DB schema, migration, CRUD
  tests/core/test_cache.py      — cache helpers
  tests/core/test_config.py     — config/path helpers
  tests/core/test_concurrency.py — app loop accessor
  tests/core/api/test_base.py   — Post dataclass, BooruClient
  tests/gui/popout/test_state.py — 57 state machine tests

All pure Python, no secrets, no external deps. Uses temp DBs and
synthetic data. Run with: pytest tests/
2026-04-09 23:55:38 -05:00
pax
1b66b03a30 Untrack tests/ directory and related dev tooling
Removes the tests/ folder from git tracking and adds it to .gitignore.
The 81 tests (16 Phase A core + 65 popout state machine) stay on
disk as local-only working notes, the same way docs/ and project.md
are gitignored. Running them is `pytest tests/` from the project
root inside .venv as before — nothing about the tests themselves
changed, just whether they're version-controlled.

Reverts the related additions in pyproject.toml and README.md from
commit bf14466 (Phase A baseline) so the public surface doesn't
reference a tests/ folder that no longer ships:

  - pyproject.toml: drops [project.optional-dependencies] test extra
    and [tool.pytest.ini_options]. pytest + pytest-asyncio are still
    installed in the local .venv via the previous pip install -e ".[test]"
    so the suite keeps running locally; new clones won't get them
    automatically.

  - README.md: drops the "Run tests:" section from the Linux install
    block. The README's install instructions return to their pre-
    Phase-A state.

  - .gitignore: adds `tests/` alongside the existing `docs/` and
    `project.md` lines (the same convention used for the refactor
    inventory / plan / notes / final report docs).

The 12 test files removed from tracking (`git rm -r --cached`):
  tests/__init__.py
  tests/conftest.py
  tests/core/__init__.py
  tests/core/test_cache.py
  tests/core/test_concurrency.py
  tests/core/test_config.py
  tests/core/test_db.py
  tests/core/api/__init__.py
  tests/core/api/test_base.py
  tests/gui/__init__.py
  tests/gui/popout/__init__.py
  tests/gui/popout/test_state.py

Verification:
  - tests/ still exists on disk
  - `pytest tests/` still runs and passes 81 / 81 in 0.11s
  - `git ls-files tests/` returns nothing
  - `git status` is clean
2026-04-08 20:47:50 -05:00
pax
3ade3a71c1 popout/state: implement illegal transition handler (env-gated)
Adds the structural alternative to "wait for a downstream symptom and
bisect to find the bad dispatch": catch illegal transitions at the
dispatch boundary instead of letting them silently no-op.

In release mode (default — no env var set):
  - Illegal events are dropped silently
  - A `log.debug` line is emitted with the state and event type
  - dispatch returns []
  - state is unchanged
  - This is what production runs

In strict mode (BOORU_VIEWER_STRICT_STATE=1):
  - Illegal events raise InvalidTransition(state, event)
  - The exception carries both fields for the diagnostic
  - This is for development and the test suite — it makes
    programmer errors loud and immediate instead of silently
    cascading into a downstream symptom

The legality map (`_LEGAL_EVENTS_BY_STATE`) is per-state. Most events
(NavigateRequested / Mute / Volume / LoopMode / Fullscreen / window
events / Close / ContentArrived) are globally legal in any non-Closing
state. State-specific events are listed per state. Closing has an
empty legal set; the dispatch entry already drops everything from
Closing before the legality check runs.

The map distinguishes "legal-but-no-op" from "structurally invalid":

  - VideoEofReached in LoadingVideo: LEGAL. The state machine
    intentionally accepts and drops this event. It's the EOF race
    fix — the event arriving in LoadingVideo is the race scenario,
    and dropping is the structural cure. Strict mode does NOT raise.

  - VideoEofReached in SeekingVideo: LEGAL. Same reasoning — eof
    during a seek is stale.

  - VideoEofReached in AwaitingContent / DisplayingImage: ILLEGAL.
    No video is loaded; an eof event arriving here is a real bug
    in either mpv or the adapter. Strict mode raises.

The strict-mode read happens per-dispatch (`os.environ.get`), not
cached at module load, so monkeypatch.setenv in tests works
correctly. The cost is microseconds per dispatch — negligible.

Tests passing after this commit (65 total → 65 pass):

  Newly added (3):
  - test_strict_mode_raises_invalid_transition
  - test_strict_mode_does_not_raise_for_legal_events
  - test_strict_mode_legal_but_no_op_does_not_raise

  Plus the existing 62 still pass — the legality check is non-
  invasive in release mode (existing tests run without
  BOORU_VIEWER_STRICT_STATE set, so they see release-mode behavior).

Phase A (16 tests in tests/core/) still green.

The state machine logic is now COMPLETE. Every state, every event,
every effect is implemented with both happy-path transitions and
illegal-transition handling. The remaining commits (12-16) carve
the implementation into the planned file layout (effects.py split,
hyprland.py extraction) and rewire the Qt adapter.

Test cases for commit 12 (effects split):
  - Re-import after the file split still works
  - All 65 tests still pass after `from .effects import ...` change
2026-04-08 19:40:05 -05:00
pax
f2f7d64759 popout/state: test scaffolding (62 tests, 27 pass at skeleton stage)
Lays down the full test surface for the popout state machine ahead of
any transition logic. 62 collected tests across the four categories
from docs/POPOUT_REFACTOR_PLAN.md "Test plan":

  1. Read-path queries (4 tests, all passing at commit 3 — these
     exercise the parts of the skeleton that are already real:
     compute_slider_display_ms, the terminal Closing guard, the
     initial state defaults)
  2. Per-state transition tests (~22 tests, all failing at commit 3
     because the per-event handlers in state.py are stubs returning
     []. Each documents the expected new state and effects for one
     specific (state, event) pair. These pass progressively as
     commits 4-11 land.)
  3. Race-fix invariant tests (6 tests — one for each of the six
     structural fixes from the prior fix sweep: EOF race, double-
     navigate, persistent viewport, F11 round-trip, seek pin,
     pending mute replay. The EOF race test already passes because
     dropping VideoEofReached in LoadingVideo is just "stub returns
     []", which is the right behavior for now. The others fail
     until their transitions land.)
  4. Illegal transition tests (17 parametrized cases — at commit 11
     these become BOORU_VIEWER_STRICT_STATE-gated raises. At commits
     3-10 they pass trivially because the stubs return [], which is
     the release-mode behavior.)

All 62 tests are pure Python:
  - Import only `booru_viewer.gui.popout.state` and `popout.viewport`
  - Construct StateMachine() directly
  - Use direct field mutation (`m.state = State.PLAYING_VIDEO`) for
    setup, dispatch the event under test, assert the new state +
    returned effects
  - No QApplication, no mpv, no httpx, no filesystem outside tmp_path
  - Sub-100ms total runtime (currently 0.31s including test discovery)

The forcing function: if state.py grows a PySide6/mpv/httpx import,
this test file fails to collect and the suite breaks. That's the
guardrail that keeps state.py pure as transitions land.

Test count breakdown (62 total):
- 4 trivially-passing (read-path queries + initial state)
- 22 transition tests (one per (state, event) pair)
- 6 invariant tests (mapped to the six race fixes)
- 17 illegal transition cases (parametrized over (state, event) pairs)
- 5 close-from-each-state cases (parametrized)
- 8 misc (state field persistence, window events)

Result at commit 3:
  35 failed, 27 passed in 0.31s

The 27 passing are exactly the predicted set: trivial reads + the
illegal-transition pass-throughs (which work today because the stubs
return [] just like release-mode strict-state would). The 35 failing
are the transition handlers that need real implementations.

Phase A test suite (16 tests in tests/core/) still passes — this
commit only adds new tests, no existing test changed.

Test cases for state machine implementation (commits 4-11):
- Each failing test is its own commit acceptance criterion
- Commit N "passes" when the relevant subset of tests turns green
- Final state machine sweep (commit 11): all 62 tests pass
2026-04-08 19:27:23 -05:00
pax
bf14466382 Add Phase A test suite for core/ primitives
First regression-test layer for booru-viewer. Pure Python — no Qt, no
mpv, no network, no real filesystem outside tmp_path. Locks in the
security and concurrency invariants from the 54ccc40 + eb58d76 hardening
commits ahead of the upcoming popout state machine refactor (Prompt 3),
which needs a stable baseline to refactor against.

16 tests across five files mirroring the source layout under
booru_viewer/core/:

- tests/core/test_db.py (4):
  - _validate_folder_name rejection rules (.., /foo, \\foo, .hidden,
    ~user, empty) and acceptance categories (unicode, spaces, parens)
  - add_bookmark INSERT OR IGNORE collision returns the existing row
    id (locks in the lastrowid=0 fix)
  - get_bookmarks LIKE escaping (literal cat_ear does not match catear)

- tests/core/test_cache.py (7):
  - _referer_for hostname suffix matching (gelbooru.com / donmai.us
    apex rewrite, both exact-match and subdomain)
  - _referer_for rejects substring attackers
    (imgblahgelbooru.attacker.com does NOT pick up the booru referer)
  - ugoira frame-count and uncompressed-size caps refuse zip bombs
    before any decompression
  - _do_download MAX_DOWNLOAD_BYTES enforced both at the
    Content-Length pre-check AND in the chunk-loop running total
  - _is_valid_media returns True on OSError (no delete + redownload
    loop on transient EBUSY)

- tests/core/test_config.py (2):
  - saved_folder_dir rejects literal .. and ../escape
  - find_library_files walks root + 1 level, filters by
    MEDIA_EXTENSIONS, exact post-id stem match

- tests/core/test_concurrency.py (2):
  - get_app_loop raises RuntimeError before set_app_loop is called
  - run_on_app_loop round-trips a coroutine result from a worker
    thread loop back to the test thread

- tests/core/api/test_base.py (1):
  - BooruClient._shared_client lazy singleton constructor-once under
    10-thread first-call race

Plus tests/conftest.py with fixtures: tmp_db, tmp_library,
reset_app_loop, reset_shared_clients. All fixtures use tmp_path or
reset module-level globals around the test so the suite is parallel-
safe.

pyproject.toml:
- New [project.optional-dependencies] test extra: pytest>=8.0,
  pytest-asyncio>=0.23
- New [tool.pytest.ini_options]: asyncio_mode = "auto",
  testpaths = ["tests"]

README.md:
- Linux install section gains "Run tests" with the
  pip install -e ".[test]" + pytest tests/ invocation

Phase B (post-sweep VideoPlayer regression tests for the seek slider
pin, _pending_mute lazy replay, and volume replay) is deferred to
Prompt 3's state machine work — VideoPlayer cannot be instantiated
without QApplication and a real mpv, which is out of scope for a
unit test suite. Once the state machine carves the pure-Python state
out of VideoPlayer, those tests become trivial against the helper
module.

Suite runs in 0.07s (16 tests). Independent of Qt/mpv/network/ffmpeg.

Test cases for Prompt 3:
- (already covered) — this IS the test suite Prompt 3 builds on top of
2026-04-08 18:50:00 -05:00