pax 44a20ac057 Search: instrument _do_search and _on_reached_bottom with per-filter drop counts
Note #3 in REFACTOR_NOTES.md (search result count + end-of-results
flag mismatch) reproduced once during the refactor verification sweep
and not again at later commits, so it's intermittent — likely
scenario-dependent (specific tag, blacklist hit rate, page-size /
limit interaction). The bug is real but not reliably reproducible, so
the right move is to add logging now and capture real data on the
next reproduction instead of guessing at a fix.

Both _do_search (paginated) and _on_reached_bottom (infinite scroll
backfill) now log a `do_search:` / `on_reached_bottom:` line with the
following fields:

  - limit              the configured page_size
  - api_returned_total raw count of posts the API returned across all
                       fetched pages (sum of every batch the loop saw)
  - kept               post-filter, post-clamp count actually emitted
  - drops_bl_tags      posts dropped by the blacklist-tags filter
  - drops_bl_posts     posts dropped by the blacklist-posts filter
  - drops_dedup        posts dropped by the dedup-against-seen filter
  - api_short_signal   (do_search only) whether the LAST batch came
                       back smaller than limit — the implicit "API ran
                       out" hint
  - api_exhausted      (on_reached_bottom only) the explicit
                       api_exhausted flag the loop sets when len(batch)
                       falls short
  - last_page          (on_reached_bottom only) the highest page index
                       the backfill loop touched

_on_search_done also gets a one-liner with displayed_count, limit,
and the at_end decision so the user-visible "(end)" flag can be
correlated with the upstream numbers.

Implementation note: the per-filter drop counters live in a closure-
captured `drops` dict that the `_filter` closure mutates as it walks
its three passes (bl_tags → bl_posts → dedup). Same dict shape in
both `_do_search` and `_on_reached_bottom` so the two log lines are
directly comparable. Both async closures also accumulate `raw_total`
across the loop iterations to capture the API's true return count,
since the existing locals only kept the last batch's length.
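
A minimal sketch of that closure pattern, for illustration only. It
is synchronous where the real closures are async, and the post shape
(`id`, `tags`, `url`), the `make_filter` / `run_search` helpers, the
loop's stop condition, and the exact log format are all assumptions,
not the code in this commit:

```python
import logging

log = logging.getLogger("booru_viewer")  # logger name assumed


def make_filter(bl_tags, bl_posts, seen):
    """Build a _filter closure plus the drops dict it mutates (hypothetical helper)."""
    drops = {"bl_tags": 0, "bl_posts": 0, "dedup": 0}

    def _filter(batch):
        # Pass 1: drop posts carrying any blacklisted tag.
        before = len(batch)
        batch = [p for p in batch if not (bl_tags & set(p["tags"]))]
        drops["bl_tags"] += before - len(batch)

        # Pass 2: drop posts whose URL is individually blacklisted.
        before = len(batch)
        batch = [p for p in batch if p["url"] not in bl_posts]
        drops["bl_posts"] += before - len(batch)

        # Pass 3: drop posts already seen earlier in this search.
        before = len(batch)
        kept = []
        for p in batch:
            if p["id"] in seen:
                continue
            seen.add(p["id"])
            kept.append(p)
        drops["dedup"] += before - len(kept)
        return kept

    return _filter, drops


def run_search(pages, limit, bl_tags, bl_posts):
    """Walk batches, accumulating raw_total across every batch the loop sees."""
    _filter, drops = make_filter(bl_tags, bl_posts, set())
    raw_total = 0
    last_len = 0
    collected = []
    for batch in pages:
        raw_total += len(batch)  # sum of all batches, not just the last
        last_len = len(batch)
        collected.extend(_filter(batch))
        if len(collected) >= limit or last_len < limit:
            break
    kept = collected[:limit]  # clamp to the configured page size
    log.debug(
        "do_search: limit=%d api_returned_total=%d kept=%d "
        "drops_bl_tags=%d drops_bl_posts=%d drops_dedup=%d api_short_signal=%s",
        limit, raw_total, len(kept),
        drops["bl_tags"], drops["bl_posts"], drops["dedup"], last_len < limit,
    )
    return kept, raw_total, drops
```

Returning `drops` next to the closure lets the caller read the
counters after the loop without `_filter` having to return them on
every call.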

All logging is `log.debug`, so it is silent whenever the effective
level is INFO or higher. To capture: set the booru_viewer logger to
DEBUG, or rely on the existing setLevel call at main_window.py:440,
which already sets DEBUG by default.
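
For an ad-hoc capture outside the app, something like the following
should be enough. It uses only the standard `logging` module; the
logger name `booru_viewer` and the output format are assumptions:

```python
import logging

# Attach a stderr handler to the root logger; the format is arbitrary.
logging.basicConfig(format="%(levelname)s %(name)s: %(message)s")

# Lower only the app's logger (name assumed) so other libraries stay quiet.
logger = logging.getLogger("booru_viewer")
logger.setLevel(logging.DEBUG)
```

Child loggers with no level of their own inherit DEBUG from this
one, so the new lines show up without touching each module.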

This commit DOES NOT fix #3 — the symptom is still intermittent and
the root cause is unknown. It just makes the next reproduction
diagnosable in one shot instead of requiring a second instrumented
run.
2026-04-08 16:32:32 -05:00