duck.contrib.sitemap
Sitemap builder for Duck.
Class-based sitemap builder that walks the application’s RouteRegistry and builds an XML sitemap using Duck’s component system (duck.html.components.to_component).
Example

    from duck.contrib.sitemap import SitemapBuilder

    builder = SitemapBuilder(
        server_url=None,  # Passing None automatically resolves the server URL
        save_to_file=True,
        filepath="/etc/sitemap.xml",
        extra_urls=["/about", "https://example.com/contact"],
        exclude_patterns=["^/admin", "https://example.com/secret", "^/api/.*"],
        default_priority=0.5,
        default_changefreq="weekly",
    )
    xml = builder.build(return_content=True)
Module Contents
Classes
SitemapBuilder | Build an XML sitemap for a Duck application.
Data
API
- duck.contrib.sitemap.DEFAULT_EXCLUDES
None
- class duck.contrib.sitemap.SitemapBuilder(server_url: str = None, filepath: Optional[str | pathlib.Path] = None, save_to_file: bool = True, extra_urls: Optional[Iterable[str]] = None, exclude_patterns: Optional[Iterable[str]] = None, default_priority: Optional[float] = 0.5, default_changefreq: Optional[str] = 'monthly', apply_default_excludes: bool = True, excludes_ignorecase: bool = True)
Build an XML sitemap for a Duck application.
The builder walks RouteRegistry.url_map, filters out dynamic or regex-like routes, supports explicit extra URLs, supports exclude patterns (absolute or relative, plain or regex), and emits a sitemap using Duck components.
Initialization
Initialize the builder.
- Parameters:
filepath – Optional path to save the sitemap XML.
save_to_file – Whether to persist the sitemap to disk. Requires filepath to be provided.
extra_urls – Extra URL strings (absolute or path) to include in addition to the registered routes.
exclude_patterns – URL strings or regex patterns to exclude. Absolute excludes match against the final URL; non-absolute excludes match against the registered route path and the final URL.
default_priority – Default <priority> value for URLs (0.0 - 1.0). If None, the element is omitted.
default_changefreq – Default <changefreq> value for URLs (e.g., "daily", "weekly"). If None, the element is omitted.
apply_default_excludes – Whether to apply the default exclude patterns in addition to your list of exclude_patterns. Defaults to True.
excludes_ignorecase – Whether to use re.IGNORECASE when compiling exclude patterns. Defaults to True.
- _REGEX_META_CHARS
'[\^\$\*\+\?\[\]\(\)\]'
- __slots__
('server_url', 'filepath', 'save_to_file', 'extra_urls', 'exclude_patterns', 'default_priority', 'de…
- _build_url_component(url_obj: duck.utils.urlcrack.URL, lastmod_iso: str, changefreq: Optional[str], priority: Optional[float])
Construct a <url> component for a given URL.
- Parameters:
url_obj – The URL object to include.
lastmod_iso – ISO formatted last modified date string.
changefreq – Optional changefreq value.
priority – Optional priority between 0.0 and 1.0.
- Returns:
Component instance for the <url> element.
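As a rough illustration of what one sitemap entry looks like, the sketch below builds a single <url> element with the standard library's xml.etree.ElementTree instead of Duck components. The function name build_url_element and the one-decimal priority formatting are assumptions made for this example, not Duck's implementation. It follows the documented rule that an optional value set to None is omitted.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def build_url_element(
    loc: str,
    lastmod_iso: str,
    changefreq: Optional[str] = None,
    priority: Optional[float] = None,
) -> ET.Element:
    """Build a sitemap <url> element (hypothetical stand-in for a Duck component)."""
    url_el = ET.Element("url")
    ET.SubElement(url_el, "loc").text = loc
    ET.SubElement(url_el, "lastmod").text = lastmod_iso
    # Optional children are skipped entirely when their value is None.
    if changefreq is not None:
        ET.SubElement(url_el, "changefreq").text = changefreq
    if priority is not None:
        # Assumption: one decimal place is enough for the 0.0 - 1.0 range.
        ET.SubElement(url_el, "priority").text = f"{priority:.1f}"
    return url_el


element = build_url_element("https://example.com/about", "2024-01-01", "weekly", 0.5)
xml_text = ET.tostring(element, encoding="unicode")
```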
- _collect_extra_urls(existing_set: Set[str]) → List[duck.utils.urlcrack.URL]
Normalize and filter explicitly provided extra URLs.
- Parameters:
existing_set – Set of absolute URL strings already collected.
- Returns:
A list of extra absolute URL objects to include.
- _collect_registered_urls() → List[duck.utils.urlcrack.URL]
Collect absolute URLs from RouteRegistry that are valid sitemap candidates.
- Returns:
A list of absolute URL objects derived from registered routes.
- _is_excluded(full_url_str: str, registered_route_pattern: str) → bool
Decide whether a candidate URL should be excluded.
Excludes in self.exclude_patterns can be:
- absolute URL strings (or regexes), which match the full URL,
- relative paths or patterns, matched against the registered route pattern or the full URL,
- plain strings (exact match) or regex patterns.
- Parameters:
full_url_str – The absolute URL string to evaluate.
registered_route_pattern – The registered route string or compiled pattern string to use for relative-match comparisons.
- Returns:
True if the URL should be excluded.
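The matching rules described above can be sketched with the standard re module. This is a minimal illustration under stated assumptions, not Duck's code: the function is_excluded, its explicit exclude_patterns argument, the META_CHARS set, and the "starts with http:// or https://" test for absolute patterns are all hypothetical choices made for the example.

```python
import re
from typing import Iterable

# Assumption: a pattern is treated as a regex if it contains any of these characters.
META_CHARS = set("^$*+?[]()")


def is_excluded(
    full_url: str,
    route_pattern: str,
    exclude_patterns: Iterable[str],
    ignorecase: bool = True,
) -> bool:
    """Return True if any exclude pattern matches the candidate URL."""
    flags = re.IGNORECASE if ignorecase else 0
    for pattern in exclude_patterns:
        # Absolute excludes are compared with the full URL only;
        # relative excludes are tried against the route and the full URL.
        is_absolute = pattern.lstrip("^").startswith(("http://", "https://"))
        candidates = (full_url,) if is_absolute else (route_pattern, full_url)
        if META_CHARS & set(pattern):
            if any(re.search(pattern, c, flags) for c in candidates):
                return True
        else:
            # Plain strings require an exact match.
            fold = str.casefold if ignorecase else str
            if any(fold(pattern) == fold(c) for c in candidates):
                return True
    return False


patterns = ["^/admin", "https://example.com/secret", "^/api/.*"]
admin_hit = is_excluded("https://example.com/admin/users", "/admin/users", patterns)
about_hit = is_excluded("https://example.com/about", "/about", patterns)
```

Here admin_hit is True because "^/admin" matches the route path, while about_hit is False because no pattern matches either the route or the full URL.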
- static _looks_like_regex(path: str) → bool
Return True if path contains characters that look like a regex.
- Parameters:
path – Registered route string.
- Returns:
True if the string contains regex meta characters.
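A check of this kind can be written as a single character-class search. The sketch below is illustrative only: the character class is an assumption (the module's actual _REGEX_META_CHARS value may differ), and static_routes is a hypothetical helper showing how such a check could filter regex-like routes out of a route list.

```python
import re
from typing import Iterable, List

# Assumed character class; the module's real _REGEX_META_CHARS may differ.
REGEX_META_CHARS = re.compile(r"[\^\$\*\+\?\[\]\(\)\\]")


def looks_like_regex(path: str) -> bool:
    """Return True if the path contains any regex meta character."""
    return REGEX_META_CHARS.search(path) is not None


def static_routes(routes: Iterable[str]) -> List[str]:
    """Keep only routes that are plain paths (hypothetical helper)."""
    return [route for route in routes if not looks_like_regex(route)]


kept = static_routes(["/", "/about", "^/blog/(.+)$"])
```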
- _to_absolute_url(raw: str) → duck.utils.urlcrack.URL
Convert a raw URL or path into an absolute URL object.
- Parameters:
raw – Absolute URL string or path.
- Returns:
An absolute URL object.
- Return type:
duck.utils.urlcrack.URL
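The path-to-absolute conversion can be approximated with urllib.parse from the standard library. This sketch returns plain strings rather than duck.utils.urlcrack.URL objects, and the explicit server_url argument is an assumption made to keep the example self-contained; how Duck resolves the server URL is not shown here.

```python
from urllib.parse import urljoin, urlsplit


def to_absolute_url(raw: str, server_url: str) -> str:
    """Return raw unchanged if it is already absolute, else join it to server_url."""
    if urlsplit(raw).scheme in ("http", "https"):
        return raw
    # Normalize slashes so the path is appended to the base instead of replacing it.
    return urljoin(server_url.rstrip("/") + "/", raw.lstrip("/"))


about_url = to_absolute_url("/about", "https://example.com")
contact_url = to_absolute_url("https://example.com/contact", "https://example.com")
```

Stripping and re-adding the slash is a deliberate choice in this sketch: it keeps any path prefix in server_url (for example "https://example.com/app") instead of letting a leading "/" in raw discard it.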
- build(return_content: bool = True) → Optional[str]
Build the sitemap XML.
- Parameters:
return_content – If True, return the sitemap XML as a string. If False, return None (but still save to file if configured).
- Returns:
The sitemap XML string when return_content is True, otherwise None.
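To make the return_content behavior concrete, the following sketch assembles a minimal sitemap document with xml.etree.ElementTree. It is not Duck's build(): the build_sitemap function, its urls and filepath arguments, and the decision to write the file whenever filepath is given are assumptions for illustration. The namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Iterable, Optional

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(
    urls: Iterable[str],
    filepath: Optional[str] = None,
    return_content: bool = True,
) -> Optional[str]:
    """Wrap URLs in a <urlset>, optionally save it, and optionally return the XML."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = loc
    xml = '<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(urlset, encoding="unicode")
    # Saving happens regardless of return_content, mirroring the documented behavior.
    if filepath is not None:
        Path(filepath).write_text(xml, encoding="utf-8")
    return xml if return_content else None


content = build_sitemap(["https://example.com/", "https://example.com/about"])
```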
- duck.contrib.sitemap.to_component
None