booru-viewer/booru_viewer/gui/_source_html.py
pax fa4f2cb270 security: fix #6 — add pure source HTML escape helper
Extracts the rich-text Source-line builder out of info_panel.py
into a Qt-free module so it can be unit-tested under CI (which
installs only httpx + Pillow + pytest, no PySide6).

The helper html.escape()s both the href and the visible display
text, and only emits an <a> tag for http(s) URLs — non-URL
sources (including javascript: and data: schemes) get rendered
as escaped plain text without a clickable anchor.

Not yet wired into InfoPanel.set_post; that lands in the next
commit.

Audit-Ref: SECURITY_AUDIT.md finding #6
Severity: Medium
2026-04-11 16:19:06 -05:00

"""Pure helper for the info-panel Source line.

Lives in its own module so the helper can be unit-tested from CI
without pulling in PySide6. ``info_panel.py`` imports it.
"""

from __future__ import annotations

from html import escape


def build_source_html(source: str | None) -> str:
    """Build the rich-text fragment for the Source line in the info panel.

    The fragment is inserted into a QLabel set to RichText format with
    setOpenExternalLinks(True), so Qt's rich-text engine parses any HTML
    in *source* as markup. Without escaping, a hostile booru can break
    out of the href attribute, inject ``<img>`` tracking pixels, or make
    the visible text disagree with the click target.

    The href is only emitted for an http(s) URL; everything else is
    rendered as escaped plain text. Both the href value and the visible
    display text are HTML-escaped (audit finding #6).
    """
    if not source:
        return "none"

    # Truncate display text but keep the full URL for the link target.
    display = source if len(source) <= 60 else source[:57] + "..."
    if source.startswith(("http://", "https://")):
        return (
            f'<a href="{escape(source, quote=True)}" '
            f'style="color: #4fc3f7;">{escape(display)}</a>'
        )
    return escape(display)
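
The attribute breakout described in the docstring can be illustrated with a minimal, standalone sketch. It uses only the standard-library `html.escape`; the `hostile` value and both example domains are hypothetical and do not come from the audit.

```python
from html import escape

# Hypothetical hostile Source value: the embedded double quote tries to
# close the href attribute and smuggle in an extra <img> tag.
hostile = 'https://example.com/"><img src="https://tracker.example/p.gif'

# Interpolating the raw value lets the quote end the attribute early,
# so the <img> tag is parsed as real markup.
unescaped = f'<a href="{hostile}">link</a>'

# With quote=True, double quotes become &quot; and angle brackets become
# &lt; / &gt;, so the whole value stays inside the href attribute.
escaped = f'<a href="{escape(hostile, quote=True)}">link</a>'

print(unescaped)
print(escaped)
```

Only the escaped form keeps the hostile value confined to the attribute, which is why the helper escapes the href as well as the display text.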