54 Commits

Author SHA1 Message Date
pax
f0fe52c886 fix: HTML parser two-pass rewrite + fire-and-forget prefetch
Three fixes:

1. HTML parser completely rewritten with two-pass approach:
   - Pass 1: regex finds each tag-type element and its full inner
     content (up to closing </li|span|td|div>)
   - Pass 2: within the content, extracts the tag name from the
     tags=NAME URL parameter in the search link
   The old single-pass regex captured the ? wiki-link (first <a>)
   instead of the tag name (second <a>). The URL-param extraction
   works on Rule34 (40 tags), Safebooru.org (47 tags), and
   yande.re (3 tags). Gelbooru proper returns 0 (post page only
   has ? links with no tags= param) which is correct — Gelbooru
   uses the batch tag API instead.

2. prefetch_batch is now truly fire-and-forget:
   gelbooru.py and moebooru.py use asyncio.create_task instead of
   await for prefetch_batch. search() returns immediately. The
   probe + batch/HTML fetch runs in the background. Previously
   search() blocked on the probe, which made Rule34 searches take
   5+ seconds (slow/broken Rule34 API response time).

3. Partial cache compose already fixed in the previous commit
   complements this: posts with 49/50 cached tags now show all
   available categories instead of nothing.
2026-04-09 19:31:43 -05:00
pax
165733c6e0 category_fetcher: compose from partial cache coverage
try_compose_from_cache previously required 100% cache coverage —
every tag in the post had to have a cached label or it returned
False and populated nothing. One rare uncached tag out of 50
blocked the entire composition, leaving the post with zero
categories even though 49/50 labels were available.

Fix: compose whatever IS cached, return True when at least one
tag got categorized. Tags not in the cache are simply absent from
the categories dict (they stay in the flat tags string). The
return value now means "the post has usable categories" rather
than "the post has complete categories." This distinction matters
because the dispatch logic uses the return value to decide
whether to skip the fetch path — partial coverage is better than
no coverage, and the missing tags get cached eventually when
other posts that contain them get fetched.

Verified against Gelbooru: post with 50 tags where 49 were cached
now gets 49/50 categorized (Artist, Character, Copyright, General,
Meta) instead of 0/50.
2026-04-09 19:23:57 -05:00
pax
8f8db62a5a library_save: ensure categories before template render
save_post_file is now async and gains an optional
category_fetcher parameter. When the template uses any category
token (%artist%, %character%, %copyright%, %general%, %meta%,
%species%) AND the post's tag_categories is empty AND a fetcher
is available, it awaits ensure_categories(post) before calling
render_filename_template. This guarantees the filename is
correct even when saving a post the user hasn't clicked
(bypassing the info panel's on-display trigger).

When the template uses only non-category tokens (%id%, %md5%,
%score%, %rating%, %ext%) or is empty, the ensure check is
skipped entirely — no HTTP overhead for the common case.

Every existing caller already runs from _run_async closures,
so the sync→async signature change is mechanical. The callers
are updated in the next two commits to pass category_fetcher.
2026-04-09 19:18:13 -05:00
pax
f5954d1387 api: factory constructs CategoryFetcher for Gelbooru + Moebooru sites
client_for_type gains optional db + site_id kwargs. When both are
passed and api_type is gelbooru or moebooru, a CategoryFetcher is
constructed and assigned to client.category_fetcher. The fetcher
owns the per-tag cache, the batch tag API fast path, and the
per-post HTML scrape fallback.

Danbooru and e621 never get a fetcher — their inline JSON
categorization is already optimal.

Test Connection dialog and scripts don't pass db/site_id, so they
get fetcher-less clients with the existing search behavior.
2026-04-09 19:15:57 -05:00
pax
834deecf57 moebooru: implement _post_view_url + prefetch wiring
Override _post_view_url to return /post/show/{id} for the per-post
HTML scrape path. No _tag_api_url override — Moebooru has no batch
tag DAPI; the CategoryFetcher dispatch goes straight to per-post
HTML for these sites.

search() and get_post() now call prefetch_batch when a fetcher is
attached, same fire-and-forget pattern as gelbooru.py.
2026-04-09 19:15:34 -05:00
pax
7f897df4b2 gelbooru: implement _post_view_url + _tag_api_url + prefetch wiring
Overrides both URL methods from the base class:
  _post_view_url(post) -> /index.php?page=post&s=view&id={id}
    Universal HTML scrape path — works on Gelbooru proper, Rule34,
    Safebooru.org without auth.
  _tag_api_url() -> {base_url}/index.php
    Batch tag DAPI fast path. The CategoryFetcher's probe-and-cache
    determines at runtime whether the endpoint actually honors
    names=. Gelbooru proper: probe succeeds. Rule34: probe fails
    (garbage response), falls back to HTML. Safebooru.org: no auth,
    dispatch skips batch entirely.

search() and get_post() now call
    await self.category_fetcher.prefetch_batch(posts)
after building the post list, when a fetcher is attached. The
prefetch is fire-and-forget — search returns immediately and the
background tasks fill categories as the user reads. When no
fetcher is attached (Test Connection dialog, scripts), this is a
no-op and behavior is unchanged.
2026-04-09 19:15:02 -05:00
pax
5ba0441be7 e621: populate categories in get_post (latent bug fix) 2026-04-09 19:14:19 -05:00
pax
9001808951 danbooru: populate categories in get_post (latent bug fix) 2026-04-09 19:13:52 -05:00
pax
8f298e51fc api: BooruClient virtual _post_view_url + _tag_api_url + category_fetcher attr
Three additions to the base class, all default-inactive:

  _post_view_url(post) -> str | None
    Override to provide the post-view HTML URL for the per-post
    category scrape path. Default None (Danbooru/e621 skip it).

  _tag_api_url() -> str | None
    Override to provide the batch tag DAPI base URL for the fast
    path in CategoryFetcher. Default None. Only Gelbooru proper
    benefits — the fetcher's probe-and-cache determines at runtime
    whether the endpoint actually honors the names= parameter.

  self.category_fetcher = None
    Set externally by the factory (client_for_type) when db and
    site_id are available. Gelbooru-shape and Moebooru clients use
    it; Danbooru/e621 leave it None.

No behavior change at this commit. Existing clients inherit the
defaults and continue working identically.
2026-04-09 19:13:21 -05:00
pax
e00d88e1ec api: CategoryFetcher module with HTML scrape + batch tag API + cache
New module core/api/category_fetcher.py — the unified tag-category
fetcher for boorus that don't return categories inline.

Public surface:
  try_compose_from_cache(post) — instant, no HTTP. Builds
    post.tag_categories from cached (site_id, name) -> label
    entries. Returns True if every tag in the post is cached.
  fetch_via_tag_api(posts) — batch fast path. Collects uncached
    tags across posts, chunks into 500-name batches, GETs the
    tag DAPI. Only available when the client declares _tag_api_url
    AND has credentials (Gelbooru proper). Includes JSON/XML
    sniffing parser ported from the reverted code.
  fetch_post(post) — universal fallback. HTTP GETs the post-view
    HTML page, regex-extracts class="tag-type-X">name</a>
    markup. Works on every Gelbooru fork and every Moebooru
    deployment. Does NOT require auth.
  ensure_categories(post) — idempotent dispatch: cache compose ->
    batch API (if available) -> HTML scrape. Coalesces concurrent
    calls for the same post.id via an in-flight task dict.
  prefetch_batch(posts) — fire-and-forget background prefetch.
    ONE fetch path per invocation (no mixing batch + HTML).

Probe-and-cache for the batch tag API:
  _batch_api_works = None -> not yet probed OR transient error
                              (retry next call)
  _batch_api_works = True -> batch works (Gelbooru proper)
  _batch_api_works = False -> clean 200 + zero matching names
                               (Rule34's broken names= filter)
  Transition to True/False is permanent per instance. Transient
  errors (HTTP error, timeout, parse exception) leave None so the
  next search retries the probe.

HTML regex handles both standard tag-type-artist and combined-
class forms like tag-link tag-type-artist (Konachan). Tag names
normalized to underscore-separated lowercase.

Canonical category order: Artist > Character > Copyright >
Species > General > Meta > Lore (matches danbooru/e621 inline).

Dead code at this commit — no integration yet.
2026-04-09 19:12:43 -05:00
pax
5395569213 db: re-add tag_types cache table with string labels + auto-prune
Per-site tag-type cache for boorus that don't return categories
inline. Uses string labels ("Artist", "Character", "Copyright",
"General", "Meta") instead of the integer codes the reverted
version used — the labels come directly from HTML class names,
no mapping step needed.

Schema: tag_types(site_id, name, label TEXT, fetched_at)
        PRIMARY KEY (site_id, name)

Methods:
  get_tag_labels(site_id, names) — chunked 500-name SELECT
  set_tag_labels(site_id, mapping) — bulk INSERT OR REPLACE,
    auto-prunes oldest entries when the table exceeds 50k rows
  clear_tag_cache(site_id=None) — manual wipe, for future
    Settings UI "Clear tag cache" button

The 50k row cap prevents unbounded growth over months of
browsing multiple boorus. Normal usage (a few thousand unique
tags per site) never reaches it. When exceeded, the oldest
entries by fetched_at are pruned first — these are the tags the
user hasn't encountered recently and would be re-fetched cheaply
if needed.

Migration: CREATE TABLE IF NOT EXISTS in _migrate(), non-breaking
for existing databases.
2026-04-09 19:10:37 -05:00
pax
150970b56f cache: delete_from_library cleans up library_meta + matches templated names
Two related fixes that the old delete flow was missing:

1. delete_from_library now accepts an optional `db` parameter which
   it forwards to find_library_files. Without `db`, only digit-stem
   files match (the old behavior — preserved as a fallback). With
   `db`, templated filenames stored in library_meta also match,
   so post-refactor saves like 12345_hatsune_miku.jpg get unlinked
   too. Without this fix, "Unsave from Library" on a templated
   save was a silent no-op.

2. Always cleans up the library_meta row when called with `db`, not
   just when files were unlinked. Two cases this matters for:
     a. Files were on disk and unlinked → meta is now stale.
     b. Files were already gone but the meta lingered (orphan from
        a previous broken delete) → user asked to "unsave," meta
        should reflect that.
   This is the missing half of the cleanup that left some libraries
   with pathologically more meta rows than actual files.
2026-04-09 18:25:21 -05:00
pax
5976a81bb6 db: add reconcile_library_meta to clean up orphan meta rows
The old delete_from_library deleted files from disk but never
cleaned up the matching library_meta row. Result: pathologically
the meta table can have many more rows than there are files on
disk. This was harmless when the only consumer was tag-search (the
meta would just match nothing useful), but it becomes a real
problem the moment is_post_in_library / get_saved_post_ids start
driving UI state — the saved-dot indicator would light up for
posts whose files have been gone for ages.

reconcile_library_meta() walks saved_dir() shallowly (root + one
level of subdirs), collects every present post_id (digit-stem
files plus templated filenames looked up via library_meta.filename),
and DELETEs every meta row whose post_id isn't in that set.
Returns the count of removed rows.

Defensive: if saved_dir() exists but has zero files (e.g. removable
drive temporarily unmounted), the method refuses to reconcile and
returns 0. The cost of a false positive — wiping every meta row
for a perfectly intact library — is higher than the cost of
leaving stale rows around for one more session.

The cache.py fix in the next commit makes future delete_from_library
calls clean up after themselves. This method is the one-time
catch-up for libraries that were already polluted before that fix.
2026-04-09 18:25:21 -05:00
pax
6f59de0c64 config: find_library_files now matches templated filenames
When given an optional db handle, find_library_files queries
library_meta for templated filenames belonging to the post and
matches them alongside the legacy digit-stem stem == str(post_id)
heuristic. Without db it degrades to the legacy-only behavior, so
existing callers don't break — but every caller in the gui layer
has a Database instance and will be updated to pass it.

This is the foundation for the bookmark/browse saved-dot indicator
fix and the delete_from_library fix in the next three commits.
2026-04-09 18:25:21 -05:00
pax
28348fa9ab db: add is_post_in_library / get_saved_post_ids helpers
The pre-template world used find_library_files(post_id) — a
filesystem walk matching files whose stem equals str(post_id) — for
"is this post saved?" checks across the bookmark dot indicator,
browse dot indicator, Unsave menu visibility, etc. With templated
filenames (e.g. 12345_hatsune_miku.jpg) the stem no longer equals
the post id and the dots silently stop lighting up.

Two new helpers, both indexed:
- is_post_in_library(post_id) -> bool   single check, SELECT 1
- get_saved_post_ids() -> set[int]      batch fetch for grid scans

Both go through library_meta which is keyed by post_id, so they're
format-agnostic — they don't care whether the on-disk filename is
12345.jpg, mon3tr_(arknights).jpg, or anything else, as long as the
save flow wrote a meta row. Every save site does this since the
unified save_post_file refactor landed.
2026-04-09 18:25:21 -05:00
pax
f0b1fc9052 config: render_filename_template now matches the API client key casing
The danbooru and e621 API clients store tag_categories with
Capitalized keys ("Artist", "Character", "Copyright", "General",
"Meta", "Species") — that's the convention info_panel and
preview_pane already iterate against. render_filename_template was
looking up lowercase keys, so every category token rendered empty
even on Danbooru posts where the data was right there. Templates
like "%id%_%character%" silently collapsed back to "{id}.{ext}".

Fix: look up the Capitalized form, with a fallback chain (exact ->
.lower() -> .capitalize()) so future drift between API clients in
either direction won't silently break templates again.

Verified against a real Danbooru save in the user's library: post
11122211 with tag_categories containing Artist=["yun_ze"],
Character=["mon3tr_(arknights)"], etc. now renders
"%id%_%character%" -> "11122211_mon3tr_(arknights).jpg" instead of
"11122211.jpg".
2026-04-09 18:25:21 -05:00
pax
9248dd77aa library: add unified save_post_file for the upcoming refactor
New module core/library_save.py with one public function and two
private helpers. Dead code at this commit — Phase 2 commits route the
eight save sites through it one at a time.

save_post_file(src, post, dest_dir, db, in_flight=None, explicit_name=None)
- Renders the basename from library_filename_template, or uses
  explicit_name when set (Save As path).
- Resolves collisions: same-post-on-disk hits return the basename
  unchanged so re-saves are idempotent; different-post collisions get
  sequential _1, _2, _3 suffixes. in_flight is consulted alongside
  on-disk state for batch members claimed earlier in the same call.
- Conditionally writes library_meta when the resolved destination is
  inside saved_dir(), regardless of which save path called us.
- Returns the resolved Path so callers can build status messages.

_same_post_on_disk uses get_library_post_id_by_filename, falling back
to the legacy v0.2.3 digit-stem heuristic for rows whose filename
column is empty. Mirrors the digit-stem checks already in gui/library.py.

Boundary rule: imports core.cache, core.config, core.db only. No gui/
imports — that's how main_window.py and bookmarks.py will both call in
without circular imports.
2026-04-09 18:25:21 -05:00
pax
6075f31917 library: scaffold filename templates + DB column
Adds the foundation that the unified save flow refactor builds on. No
behavior change at this commit — empty default template means every save
site still produces {id}{ext} like v0.2.3.

- core/db.py: library_meta.filename column with non-breaking migration
  for legacy databases. Index on filename. New
  get_library_post_id_by_filename() lookup. filename kwarg on
  save_library_meta (defaults to "" for legacy callers).
  library_filename_template added to _DEFAULTS.
- core/config.py: render_filename_template() with %id% %md5% %ext%
  %rating% %score% %artist% %character% %copyright% %general% %meta%
  %species% tokens. Sanitizes filesystem-reserved chars, collapses
  whitespace, strips leading dots/.., caps the rendered stem at 200
  characters, falls back to post id when sanitization yields empty.
- gui/settings.py: Library filename template input field next to the
  Library directory row, with a help label listing tokens and noting
  that Gelbooru/Moebooru can only resolve the basic ones.
2026-04-09 18:25:21 -05:00
pax
250b144806 Decouple bookmark folders from library folders, add move-aware save + submenu pickers everywhere
Bookmark folders and library folders used to share identity through
_db.get_folders() — the same string was both a row in favorite_folders
and a directory under saved_dir. They look like one concept but they're
two stores, and the cross-bleed produced a duplicate-on-move bug and
made "Save to Library" silently re-file the bookmark too.

Now they're independent name spaces:
  - library_folders() in core.config reads filesystem subdirs of
    saved_dir; the source of truth for every Save-to-Library menu
  - find_library_files(post_id) walks the library shallowly and is the
    new "is this saved?" / delete primitive
  - bookmark folders stay DB-backed and are only used for bookmark
    organization (filter combo, Move to Folder)
  - delete_from_library no longer takes a folder hint — walks every
    library folder by post id and deletes every match (also cleans up
    duplicates left by the old save-to-folder copy bug)
  - _save_to_library is move-aware: if the post is already in another
    library folder, atomic Path.rename() into the destination instead
    of re-copying from cache (the duplicate bug fix)
  - bookmark "Move to Folder" no longer also calls _copy_to_library;
    Save to Library no longer also calls move_bookmark_to_folder
  - settings export/import unchanged; favorite_folders table preserved
    so no migration

UI additions:
  - Library tab right-click: Move to Folder submenu (single + multi),
    uses Path.rename for atomic moves
  - Bookmarks tab: − Folder button next to + Folder for deleting the
    selected bookmark folder (DB-only, library filesystem untouched)
  - Browse tab right-click: "Bookmark" replaced with "Bookmark as"
    submenu when not yet bookmarked (Unfiled / folders / + New); flat
    "Remove Bookmark" when already bookmarked
  - Embedded preview Bookmark button: same submenu shape via new
    bookmark_to_folder signal + set_bookmark_folders_callback
  - Popout Bookmark button: same shape — works in both browse and
    bookmarks tab modes
  - Popout Save button: Save-to-Library submenu via new save_to_folder
    + unsave_requested signals (drops save_toggle_requested + the
    _save_toggle_from_popout indirection)
  - Popout in library mode: Save button stays visible as Unsave; the
    rest of the toolbar (Bookmark / BL Tag / BL Post) is hidden

State plumbing:
  - _update_fullscreen_state mirrors the embedded preview's
    _is_bookmarked / _is_saved instead of re-querying DB+filesystem,
    eliminating the popout state drift during async bookmark adds
  - Library tab Save button reads "Unsave" the entire time; Save
    button width bumped 60→75 so the label doesn't clip on tight themes
  - Embedded preview tracks _is_bookmarked alongside _is_saved so the
    new Bookmark-as submenu can flip to a flat unbookmark when active

Naming:
  - "Unsorted" renamed to "Unfiled" everywhere user-facing — library
    Unfiled and bookmarks Unfiled now share one label. Internal
    comparison in library.py:_scan_files updated to match the combo.
2026-04-07 19:50:39 -05:00
pax
eb58d76bc0 Route async work through one persistent loop, lock shared httpx + DB writes
Mixing `threading.Thread + asyncio.run` workers with the long-lived
asyncio loop in gui/app.py is a real loop-affinity bug: the first worker
thread to call `asyncio.run` constructs a throwaway loop, which the
shared httpx clients then attach to, and the next call from the
persistent loop fails with "Event loop is closed" / "attached to a
different loop". This commit eliminates the pattern across the GUI and
adds the locking + cleanup that should have been there from the start.

Persistent loop accessor (core/concurrency.py — new)
- set_app_loop / get_app_loop / run_on_app_loop. BooruApp registers the
  one persistent loop at startup; everything that wants to schedule
  async work calls run_on_app_loop instead of spawning a thread that
  builds its own loop. Three functions, ~30 lines, single source of
  truth for "the loop".

Lazy-init lock + cleanup on shared httpx clients (core/api/base.py,
core/api/e621.py, core/cache.py)
- Each shared singleton (BooruClient._shared_client, E621Client._e621_client,
  cache._shared_client) now uses fast-path / locked-slow-path lazy init.
  Concurrent first-callers from the same loop can no longer both build
  a client and leak one (verified: 10 racing callers => 1 httpx instance).
- Each module exposes an aclose helper that BooruApp.closeEvent runs via
  run_coroutine_threadsafe(...).result(timeout=5) BEFORE stopping the
  loop. The connection pool, keepalive sockets, and TLS state finally
  release cleanly instead of being abandoned at process exit.
- E621Client tracks UA-change leftovers in _e621_to_close so the old
  client doesn't leak when api_user changes — drained in aclose_shared.

GUI workers routed through the persistent loop (gui/sites.py,
gui/bookmarks.py)
- SiteDialog._on_detect / _on_test: replaced
  `threading.Thread(target=lambda: asyncio.run(...))` with
  run_on_app_loop. Results marshaled back through Qt Signals connected
  with QueuedConnection. Added _closed flag + _inflight futures list:
  closeEvent cancels pending coroutines and shorts out the result emit
  if the user closes the dialog mid-detect (no use-after-free on
  destroyed QObject).
- BookmarksView._load_thumb_async: same swap. The existing thumb_ready
  signal already used QueuedConnection so the marshaling side was
  already correct.

DB write serialization (core/db.py)
- Database._write_lock = threading.RLock() — RLock not Lock so a
  writing method can call another writing method on the same thread
  without self-deadlocking.
- New _write() context manager composes the lock + sqlite3's connection
  context manager (the latter handles BEGIN / COMMIT / ROLLBACK
  atomically). Every write method converted: add_site, update_site,
  delete_site, add_bookmark, add_bookmarks_batch, remove_bookmark,
  update_bookmark_cache_path, add_folder, remove_folder, rename_folder,
  move_bookmark_to_folder, add/remove_blacklisted_tag,
  add/remove_blacklisted_post, save_library_meta, remove_library_meta,
  set_setting, add_search_history, clear_search_history,
  remove_search_history, add_saved_search, remove_saved_search.
- _migrate keeps using the lock + raw _conn context manager because
  it runs from inside the conn property's lazy init (where _write()
  would re-enter conn).
- Reads stay lock-free and rely on WAL for reader concurrency. Verified
  under contention: 5 threads × 50 add_bookmark calls => 250 rows,
  zero corruption, zero "database is locked" errors.

Smoke-tested with seven scenarios: get_app_loop raises before set,
run_on_app_loop round-trips, lazy init creates exactly one client,
10 concurrent first-callers => 1 httpx, aclose_shared cleans up,
RLock allows nested re-acquire, multi-threaded write contention.
2026-04-07 17:24:23 -05:00
pax
54ccc40477 Defensive hardening across core/* and popout overlay fix
Sweep of defensive hardening across the core layers plus a related popout
overlay regression that surfaced during verification.

Database integrity (core/db.py)
- Wrap delete_site, add_search_history, remove_folder, rename_folder,
  and _migrate in `with self.conn:` so partial commits can't leave
  orphan rows on a crash mid-method.
- add_bookmark re-SELECTs the existing id when INSERT OR IGNORE
  collides on (site_id, post_id). Was returning Bookmark(id=0)
  silently, which then no-op'd update_bookmark_cache_path the next
  time the post was bookmarked.
- get_bookmarks LIKE clauses now ESCAPE '%', '_', '\\' so user search
  literals stop acting as SQL wildcards (cat_ear no longer matches
  catear).

Path traversal (core/db.py + core/config.py)
- Validate folder names at write time via _validate_folder_name —
  rejects '..', os.sep, leading '.' / '~'. Permits Unicode/spaces/
  parens so existing folders keep working.
- saved_folder_dir() resolves the candidate path and refuses anything
  that doesn't relative_to the saved-images base. Defense in depth
  against folder strings that bypass the write-time validator.
- gui/bookmarks.py and gui/app.py wrap add_folder calls in try/except
  ValueError and surface a QMessageBox.warning instead of crashing.

Download safety (core/cache.py)
- New _do_download(): payloads >=50MB stream to a tempfile in the
  destination dir and atomically os.replace into place; smaller
  payloads keep the existing buffer-then-write fast path. Both
  enforce a 500MB hard cap against the advertised Content-Length AND
  the running total inside the chunk loop (servers can lie).
- Per-URL asyncio.Lock coalesces concurrent downloads of the same
  URL so two callers don't race write_bytes on the same path.
- Image.MAX_IMAGE_PIXELS = 256M with DecompressionBombError handling
  in both converters.
- _convert_ugoira_to_gif checks frame count + cumulative uncompressed
  size against UGOIRA_MAX_FRAMES / UGOIRA_MAX_UNCOMPRESSED_BYTES from
  ZipInfo headers BEFORE decompressing — defends against zip bombs.
- _convert_animated_to_gif writes a .convfailed sentinel sibling on
  failure to break the re-decode-on-every-paint loop for malformed
  animated PNGs/WebPs.
- _is_valid_media returns True (don't delete) on OSError so a
  transient EBUSY/permissions hiccup no longer triggers a delete +
  re-download loop on every access.
- _referer_for() uses proper hostname suffix matching, not substring
  `in` (imgblahgelbooru.attacker.com no longer maps to gelbooru.com).
- PIL handles wrapped in `with` blocks for deterministic cleanup.

API client retry + visibility (core/api/*)
- base.py: _request retries on httpx.NetworkError + ConnectError in
  addition to TimeoutException. test_connection no longer echoes the
  HTTP response body in the error string (it was an SSRF body-leak
  gadget when used via detect_site_type's redirect-following client).
- detect.py + danbooru.py + e621.py + gelbooru.py + moebooru.py:
  every previously-swallowed exception in search/autocomplete/probe
  paths now logs at WARNING with type, message, and (where relevant)
  the response body prefix. Debugging "the site isn't working" used
  to be a total blackout.

main_gui.py
- file_dialog_platform DB probe failure prints to stderr instead of
  vanishing.

Popout overlay (gui/preview.py + gui/app.py)
- preview.py:79,141 — setAttribute(WA_StyledBackground, True) on
  _slideshow_toolbar and _slideshow_controls. Plain QWidget parents
  silently ignore QSS `background:` declarations without this
  attribute, which is why the popout overlay strip was rendering
  fully transparent (buttons styled, bar behind them showing the
  letterbox color).
- app.py: bake _BASE_POPOUT_OVERLAY_QSS as a fallback prepended
  before the user's custom.qss in the loader. Custom themes that
  don't define overlay rules now still get a translucent black
  bar with white text + hairline borders. Bundled themes win on
  tie because their identical-specificity rules come last in the
  prepended string.
2026-04-07 17:24:19 -05:00
pax
463f77d8bb Make info panel tag colors QSS-targetable, delete dead theme.py + green palette constants 2026-04-07 13:15:31 -05:00
pax
72150fc98b Add BOORU_VIEWER_NO_HYPR_RULES + BOORU_VIEWER_NO_POPOUT_ASPECT_LOCK env vars for ricers with their own windowrules 2026-04-07 12:27:22 -05:00
pax
74f948a3e8 Speed up page loads — pre-fetch bookmarks/cache as sets, off-load PIL conversion to a worker 2026-04-07 11:36:23 -05:00
pax
2fbf2f6472 0.2.0: mpv backend, popout viewer, preview toolbar, API retry, SearchState refactor
Video:
- Replace Qt Multimedia with mpv via python-mpv + OpenGL render API
- Hardware-accelerated decoding, frame-accurate seeking, proper EOF detection
- Translucent overlay controls in both preview and popout
- LC_NUMERIC=C for mpv locale compatibility

Popout viewer (renamed from slideshow):
- Floating toolbar + controls overlay with auto-hide (2s)
- Window auto-resizes to content aspect ratio on navigation
- Hyprland: hyprctl resizewindowpixel + keep_aspect_ratio prop
- Window geometry persisted to DB across sessions
- Smart F11 exit sizing (60% monitor, centered)

Preview toolbar:
- Bookmark, Save, BL Tag, BL Post, Popout buttons above preview
- Save opens folder picker menu, shows Save/Unsave state
- Blacklist actions have confirmation dialogs
- Per-tab button visibility (Library: Save + Popout only)
- Cross-tab state management with grid selection clearing

Search & pagination:
- SearchState dataclass replaces 8 scattered attrs + defensive getattr
- Media type filter dropdown (All/Animated/Video/GIF/Audio)
- API retry with backoff on 429/503/timeout
- Infinite scroll dedup fix (local seen set per backfill round)
- Prev/Next buttons hide at boundaries, "(end)" status indicator

Grid:
- Rubber band drag selection
- Saved/bookmarked dots update instantly across all tabs
- Library/bookmarks emit signals on file deletion for cross-tab sync

Settings & misc:
- Default site option
- Max thumbnail cache setting (500MB default)
- Source URLs clickable in info panel
- Long URLs truncated to prevent splitter blowout
- Bulk save no longer auto-bookmarks
2026-04-06 13:43:46 -05:00
pax
1a5dbff1bb Clean up dead code and unused imports 2026-04-05 21:30:47 -05:00
pax
8467c0696b Add post date to info line 2026-04-05 21:15:22 -05:00
pax
39733e4865 Convert animated PNG and WebP to GIF for Qt playback
PIL extracts frames with durations, saves as animated GIF.
Non-animated PNG/WebP kept as-is. Converted on download and
on cache access (for already-cached files). Same pattern as
ugoira zip conversion.
2026-04-05 19:41:34 -05:00
pax
ee329519de Handle non-JSON API responses gracefully
Some boorus return empty/HTML responses for tag-limited queries.
All API clients now catch JSON parse errors and return empty
results instead of crashing.
2026-04-05 19:31:43 -05:00
pax
1807f77dd4 Fix Gelbooru CDN — pass Referer header per-request
Shared client doesn't set Referer globally since it varies per
domain. Now passed as per-request header so Gelbooru CDN doesn't
return HTML captcha pages.
2026-04-05 17:45:57 -05:00
pax
96c57d16a9 Share HTTP client across all API calls for Windows performance
Single shared httpx.AsyncClient for all BooruClient instances
(Danbooru, Gelbooru, Moebooru) with connection pooling.
E621 gets its own shared client (custom User-Agent required).
Site detection also reuses the shared client.
Eliminates per-request TLS handshakes on Windows.
2026-04-05 17:22:30 -05:00
pax
4987765520 Code audit fixes: crash guards, memory caps, unused imports, bounds checks
- Fix pop(0) crash on empty append queue
- Cap page cache to 10 pages (pagination mode only)
- Bounds check before data[0] in gelbooru/moebooru get_post
- Move json import to top level in db.py
- Remove unused imports (Slot, contextmanager)
- Safe dict access in _row_to_bookmark
- Remove redundant datetime import in save_library_meta
- Add threading import for future DB locking
2026-04-05 17:18:27 -05:00
pax
1e87ca4216 Fix missing field import in db.py 2026-04-05 17:12:02 -05:00
pax
d2aae5cd82 Store tag categories in bookmarks, tag click switches to Browse
- Bookmarks DB now stores tag_categories as JSON
- Migration adds column to existing favorites table
- Bookmark info panel uses stored categories directly
- Falls back to library_meta if bookmark has no categories
- Tag click in info panel: clears preview, switches to Browse, searches
2026-04-05 17:09:01 -05:00
pax
87c42f806e Fix library/bookmark info panel, save indicator, DB migration
- Migrate library_meta to add tag_categories column
- Info panel always updates (not gated on isVisible)
- Library info shows status bar with score/rating
- Save indicator: changed elif to if so Saved always triggers
- Bookmark info panel always populated
2026-04-05 16:58:22 -05:00
pax
29ffe0be7a Store tag categories in library metadata, unsave from bookmarks
- tag_categories stored as JSON in library_meta table
- Library and Bookmarks info panels show categorized tags
- Bookmarks falls back to library_meta for categories
- Added Unsave from Library to bookmarks right-click menu
2026-04-05 16:46:48 -05:00
pax
337d5d8087 Library metadata: store tags on save, search by tag in Library
- library_meta table stores tags, score, rating, source per post
- Metadata saved automatically when saving from Browse
- Search box in Library tab filters by tags via DB lookup
- Works with all file types
2026-04-05 16:37:12 -05:00
pax
ee9d06755f Connection pooling for thumbnails, wider score spinbox
Shared httpx.AsyncClient reuses TLS connections across thumbnail
downloads — avoids 40 separate TLS handshakes per page on Windows.
Score spinbox widened to 90px with explicit arrow buttons.
2026-04-05 16:19:10 -05:00
pax
7115d34504 Infinite scroll mode — toggle in Settings > General
When enabled, hides prev/next buttons and loads more posts
automatically when scrolling to the bottom. Posts appended
to the grid, deduped against already-shown posts. Restart
required to toggle.
2026-04-05 13:37:38 -05:00
pax
a2302e2caa Add missing defaults, log all caught exceptions
- Add prefetch_adjacent, clear_cache_on_exit, slideshow_monitor,
  library_dir to _DEFAULTS for fresh installs
- Replace all silent except-pass with logged warnings
2026-04-05 04:01:35 -05:00
pax
f8582e83fa Configurable library directory in Settings > Paths
Browse button to pick a custom library save directory.
Applied on startup via set_library_dir(). Restart required.
2026-04-05 02:01:44 -05:00
pax
72e4d5c5a2 v0.1.4 — Library rewrite: Browse | Bookmarks | Library
Major restructure of the favorites/library system:

- Rename "Favorites" to "Bookmarks" throughout (DB API, GUI, signals)
- Add Library tab for browsing saved files on disk with sorting
- Decouple bookmark from save — independent operations now
- Two indicators on thumbnails: star (bookmarked), green dot (saved)
- Both indicators QSS-controllable (qproperty-bookmarkedColor/savedColor)
- Unbookmarking no longer deletes saved files
- Saving no longer auto-bookmarks
- Library tab: folder sidebar, sort by date/name/size, async thumbnails
- DB table kept as "favorites" internally for migration safety
2026-04-05 01:38:41 -05:00
pax
f311326e73 Optional prefetch with progress bar, post blacklist, theme template link
- Prefetch adjacent posts is now a toggle in Settings > General (off by default)
- Prefetch progress bar on thumbnails shows download state
- Blacklist Post: right-click to hide a specific post by URL
- "Create from Template" opens themes reference on git.pax.moe
  and spawns the default text editor with custom.qss
2026-04-05 00:45:53 -05:00
pax
50d22932fa Decode HTML entities in Gelbooru tags
Fixes &#039; showing instead of apostrophes in tag names.
2026-04-04 21:52:20 -05:00
pax
b6c6a6222a Categorized tags in info panel with color coding
- Artist (gold), Character (green), Copyright (purple), Species
  (red), General, Meta/Lore (gray)
- Danbooru and e621 provide categories from API
- Gelbooru/Moebooru fall back to flat tag list
2026-04-04 21:34:01 -05:00
pax
9f636532c0 Fix blacklist: enable by default, re-search after blacklisting tag
- blacklist_enabled defaults to "1" so it works out of the box
- Right-click blacklist auto-enables and re-searches immediately
2026-04-04 21:26:43 -05:00
pax
e8f72c6fe6 Add Network tab to settings — shows all connected hosts
Logs every outgoing connection (API requests and image downloads)
with timestamps. Network tab in Settings shows all hosts contacted
this session with request counts. No telemetry, just transparency.
2026-04-04 21:11:01 -05:00
pax
afa08ff007 Performance: persistent event loop, batch DB, directory pre-scan
- Replace per-operation thread spawning with a single persistent
  asyncio event loop (saves ~10-50ms per async operation)
- Pre-scan saved directories into sets instead of per-post
  exists() calls (~80+ syscalls reduced to a few iterdir())
- Add add_favorites_batch() for single-transaction bulk inserts
- Add missing indexes on favorites.folder and favorites.favorited_at
2026-04-04 20:19:22 -05:00
pax
8c64f20171 Add per-item delete button to search history dropdown
Each recent search now has an x button to remove it individually.
Clicking the search text still loads it as before.
2026-04-04 20:07:03 -05:00
pax
148c1c3a26 Session cache option, zip conversion fix, page boundary nav
- Add "Clear cache on exit" checkbox in Settings > Cache
- Fix ugoira zip conversion: don't crash if frames fail, verify gif
  exists before deleting zip
- Arrow key past last/first post loads next/prev page
2026-04-04 19:49:09 -05:00