duck.utils.path
Module for path operations, e.g. path sanitization, manipulation, joining, etc.
Module Contents
Functions
build_absolute_uri | Builds an absolute URL from the provided root_url and path.
is_absolute_url | Checks whether a URL is a complete URL, including the scheme (e.g. "https").
is_good_url_path | Validates that the URL path conforms to RFC 3986. Only specific special characters are allowed. Also checks for disallowed characters such as space, tilde (~), etc.
joinpaths | Returns the joined paths, making sure every path is included in the final path, unlike os.path.join.
normalize_url_path | Normalizes a URL path by removing consecutive slashes, adding a leading slash, removing trailing slashes, removing disallowed characters (e.g. "<" and string quotes), replacing backslashes, and lowercasing the scheme.
paths_are_same | Checks if two paths point to the same location, handling case-insensitivity and different separators.
replace_hostname | Replaces the hostname in a URL.
sanitize_path_segment | Sanitizes a path segment to prevent directory traversal attacks (same as normalize_url_path).
url_normalize | Normalizes a URL by removing consecutive slashes, adding a leading slash, removing trailing slashes, removing disallowed characters (e.g. "<" and string quotes), replacing backslashes, and lowercasing the scheme.
Data
API
- duck.utils.path.URL_PATH_REGEX
"^[a-zA-Z0-9-._~:/?#\[\]@!$&'()*+,;=%]*$"
- duck.utils.path.build_absolute_uri(root_url: str, path: str, normalization_ignore_chars: Optional[List[str]] = None) -> str
Builds an absolute URL from the provided root_url and path.
- Parameters:
path – The path to join with the root URL.
normalization_ignore_chars – List of characters to ignore when normalizing the URL path. By default, all unsafe characters are stripped.
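A minimal sketch of the idea, not duck's implementation: the hypothetical helper below joins a root URL and a path with exactly one slash between them. The helper name and the example URL are assumptions made for illustration, and the sketch skips the normalization step that normalization_ignore_chars controls.

```python
def absolute_uri_sketch(root_url: str, path: str) -> str:
    """Hypothetical stand-in: join root_url and path with a single slash."""
    return root_url.rstrip("/") + "/" + path.lstrip("/")


# Example values are illustrative only.
print(absolute_uri_sketch("https://example.com/", "/users/42"))
```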
- duck.utils.path.is_absolute_url(url: str)
Checks whether a URL is a complete URL, including the scheme (e.g. "https").
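One way to picture this check, using only the standard library: treat a URL as absolute when urllib.parse finds both a scheme and a network location. The helper name looks_absolute is hypothetical, and duck's own check may apply different rules.

```python
from urllib.parse import urlsplit


def looks_absolute(url: str) -> bool:
    """Hypothetical stand-in: True when the URL has a scheme and a host."""
    parts = urlsplit(url)
    return bool(parts.scheme and parts.netloc)


print(looks_absolute("https://example.com/docs"))  # scheme and host present
print(looks_absolute("/docs/index.html"))          # path only
```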
- duck.utils.path.is_good_url_path(url_path: str) -> bool
Validates that the URL path conforms to RFC 3986. Only specific special characters are allowed. Also checks for disallowed characters such as space, tilde (~), etc.
- Parameters:
url_path – The URL path string to validate.
- Returns:
True if the URL is in the specified format and has no disallowed characters, False otherwise.
- Return type:
bool
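The two-stage check described above (an allow-list pattern, then an explicit disallow list) could look roughly like the sketch below. Both the pattern and the disallowed set are assumptions based on the description, not the module's actual values, and path_is_well_formed is a hypothetical name.

```python
import re

# Assumed allow-list: RFC 3986 unreserved and reserved characters plus "%".
ALLOWED_PATTERN = re.compile(r"[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]*")

# Assumed extra disallow list, taken from the description (space, tilde).
DISALLOWED_CHARS = {" ", "~"}


def path_is_well_formed(url_path: str) -> bool:
    """Hypothetical stand-in for a two-stage URL path check."""
    if ALLOWED_PATTERN.fullmatch(url_path) is None:
        return False
    return not any(ch in DISALLOWED_CHARS for ch in url_path)


print(path_is_well_formed("/docs/page-1"))
print(path_is_well_formed("/my docs"))
```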
- duck.utils.path.joinpaths(path1: Union[str, pathlib.Path], path2: Union[str, pathlib.Path], *more)
Returns the joined paths, making sure every path is included in the final path. This differs from os.path.join, which discards earlier components when a later one is absolute.
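The difference from os.path.join is easiest to see side by side. The sketch below uses posixpath so that the result is the same on every platform. join_keeping_all is a hypothetical helper written for this illustration and is not joinpaths itself.

```python
import posixpath


def join_keeping_all(first, second, *more) -> str:
    """Hypothetical helper: join segments so a leading slash never resets the path."""
    result = str(first).rstrip("/")
    for part in (second, *more):
        piece = str(part).strip("/")
        if piece:
            result = result + "/" + piece
    return result


# The standard join drops "/var/www" because "/static" is absolute.
print(posixpath.join("/var/www", "/static", "css"))
# The sketch keeps every segment.
print(join_keeping_all("/var/www", "/static", "css"))
```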
- duck.utils.path.normalize_url_path(url_path: str, ignore_chars: Optional[List[str]] = None) -> str
Normalizes a URL path by removing consecutive slashes, adding a leading slash, removing trailing slashes, removing disallowed characters (e.g. "<" and string quotes), replacing backslashes, and lowercasing the scheme.
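The listed steps can be sketched as a small pipeline. This is an illustration under stated assumptions, not the library's code: the set of unsafe characters is a guess based on the examples in the description, ignore_chars is not modeled, and normalize_path_sketch is a hypothetical name.

```python
import re

# Assumed set of characters to strip; the real list may differ.
UNSAFE_CHARS = "<>\"'"


def normalize_path_sketch(url_path: str) -> str:
    """Hypothetical stand-in that applies the documented steps in order."""
    path = url_path.replace("\\", "/")                          # backslashes -> slashes
    path = "".join(ch for ch in path if ch not in UNSAFE_CHARS)  # drop unsafe characters
    path = re.sub(r"/{2,}", "/", path)                          # collapse repeated slashes
    return "/" + path.strip("/")                                # one leading slash, no trailing


print(normalize_path_sketch("\\docs//api/<v1>/"))
```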
- duck.utils.path.paths_are_same(path1, path2)
Checks if two paths point to the same location, handling case-insensitivity and different separators.
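A rough, platform-independent sketch of such a comparison: unify the separators, normalize the path, and compare case-insensitively. same_location is a hypothetical name, and the real function may rely on different rules (for example, os.path helpers).

```python
import posixpath


def same_location(path1, path2) -> bool:
    """Hypothetical stand-in: compare paths ignoring case and separator style."""

    def canonical(path) -> str:
        unified = str(path).replace("\\", "/")       # treat both separators alike
        return posixpath.normpath(unified).lower()   # drop trailing slash, fold case

    return canonical(path1) == canonical(path2)


print(same_location("C:\\Site\\Static", "c:/site/static/"))
print(same_location("/a/b", "/a/c"))
```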
- duck.utils.path.replace_hostname(url: str, hostname: str) -> str
Replaces the hostname in a URL.
If the URL has no scheme (e.g. https) or is only a URL path, no modifications are made.
- Parameters:
url – The target URL.
hostname – The new hostname or domain.
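The behavior described above can be approximated with urllib.parse. In this sketch, swap_hostname is a hypothetical name, and keeping the original port is a choice made for the illustration; the source does not say how the real function treats ports or credentials.

```python
from urllib.parse import urlsplit, urlunsplit


def swap_hostname(url: str, hostname: str) -> str:
    """Hypothetical stand-in: replace the host, leave scheme-less input untouched."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        return url  # a bare path or a URL without a scheme is returned as-is
    netloc = hostname
    if parts.port is not None:
        netloc = f"{hostname}:{parts.port}"  # assumption: the port is preserved
    return urlunsplit(parts._replace(netloc=netloc))


print(swap_hostname("https://old.example.com:8443/a?b=1#c", "new.example.org"))
print(swap_hostname("/a/b", "new.example.org"))
```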
- duck.utils.path.sanitize_path_segment(segment)
Sanitizes a path segment to prevent directory traversal attacks (same as normalize_url_path).
- Parameters:
segment – The path segment to sanitize.
- Returns:
The sanitized path segment.
- Return type:
str
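To illustrate what preventing directory traversal means in practice, the sketch below drops "." and ".." components so the result can never climb out of its parent directory. It is a generic technique shown under a hypothetical name, sanitize_segment_sketch, and its exact output may differ from what the documented function returns.

```python
def sanitize_segment_sketch(segment: str) -> str:
    """Hypothetical stand-in: remove components that could escape the base directory."""
    pieces = segment.replace("\\", "/").split("/")
    safe = [piece for piece in pieces if piece not in ("", ".", "..")]
    return "/".join(safe)


print(sanitize_segment_sketch("../../etc/passwd"))
print(sanitize_segment_sketch("..\\secret.txt"))
```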
- duck.utils.path.url_normalize(url: str, ignore_chars: Optional[List[str]] = None) -> str
Normalizes a URL by removing consecutive slashes, adding a leading slash, removing trailing slashes, removing disallowed characters (e.g. "<" and string quotes), replacing backslashes, and lowercasing the scheme.
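For a full URL, the same idea applies to the path component while the scheme is lowercased. The following sketch relies on urllib.parse and is only an approximation: normalize_url_sketch is a hypothetical name, and the removal of disallowed characters and the ignore_chars option are left out.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def normalize_url_sketch(url: str) -> str:
    """Hypothetical stand-in: lowercase the scheme and tidy the slashes in the path."""
    parts = urlsplit(url.replace("\\", "/"))
    path = re.sub(r"/{2,}", "/", parts.path)  # collapse repeated slashes
    path = "/" + path.strip("/")              # one leading slash, no trailing slash
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc, path, parts.query, parts.fragment)
    )


print(normalize_url_sketch("HTTPS://example.com//docs///api/"))
```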