zimscraperlib.rewriting.rx_replacer
Classes:
- RxRewriter – RxRewriter is a generic rewriter based on regex.
Functions:
- add_around – Create a rewrite_function which adds a prefix and a suffix around the match.
- add_prefix – Create a rewrite_function which adds the prefix to the matching str.
- add_suffix – Create a rewrite_function which adds the suffix to the matching str.
- m2str – Call a rewrite_function with a string instead of a match object.
- replace – Create a rewrite_function replacing src with target in the matching str.
- replace_all – Create a rewrite_function which replaces the whole match with text.
- replace_prefix_from – Returns a function which replaces everything before match with prefix.
Attributes:
TransformationAction
module-attribute
TransformationRule
module-attribute
TransformationRule = tuple[
Pattern[str], TransformationAction
]
RxRewriter
RxRewriter(
rules: Iterable[TransformationRule] | None = None,
)
RxRewriter is a generic rewriter based on regex.
The main "input" is a list of rules, each rule being a tuple (regex,
rewriting_function). We want to apply each rule to the content, but doing so naively
is counter-productive: it would mean N separate replacement passes over the content
(N == number of rules).
To avoid that, we create one unique regex (compiled_rule) equivalent to
(regex0|regex1|regex2|...) and we do only one replacement with this regex.
When we have a match, we do up to N regex searches to find out which rule it
corresponds to, and we apply the associated rewriting_function.
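The single-pass idea described above can be sketched as follows. This is a minimal, self-contained illustration and not the library's implementation: rewrite_once, Action and Rule are hypothetical names, and the action here is assumed to receive only the match object, whereas the library's TransformationAction may take further arguments.

```python
import re
from collections.abc import Callable, Iterable

# Hypothetical simplified types: an action receives only the match object.
Action = Callable[[re.Match[str]], str]
Rule = tuple[re.Pattern[str], Action]


def rewrite_once(text: str, rules: Iterable[Rule]) -> str:
    """Apply all rules in one pass using a single combined alternation."""
    rules = list(rules)
    if not rules:
        return text
    # One non-capturing group per rule: (?:regex0)|(?:regex1)|...
    combined = re.compile("|".join(f"(?:{rx.pattern})" for rx, _ in rules))

    def dispatch(m: re.Match[str]) -> str:
        # Find which individual rule produced this match, then apply its action.
        # Simplification: the rule is re-tested on the matched text alone, so
        # patterns relying on lookarounds into surrounding text would not work.
        for rx, action in rules:
            sub = rx.fullmatch(m.group(0))
            if sub:
                return action(sub)
        return m.group(0)  # no rule claimed the match; leave it unchanged

    return combined.sub(dispatch, text)


rules = [
    (re.compile(r"http://"), lambda m: "https://"),
    (re.compile(r"\bfoo\b"), lambda m: m.group(0).upper()),
]
result = rewrite_once("see http://example.com/foo now", rules)
print(result)  # see https://example.com/FOO now
```

The content is scanned once regardless of how many rules exist; the per-rule checks only run on the short matched substrings.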
Methods:
- rewrite – Apply the unique compiled_rule pattern and replace the content.
Attributes:
- compiled_rule (Pattern[str] | None) –
- rules –
Source code in src/zimscraperlib/rewriting/rx_replacer.py
rules
instance-attribute
rules = rules or []
rewrite
Apply the unique compiled_rule pattern and replace the content.
Source code in src/zimscraperlib/rewriting/rx_replacer.py
add_around
add_around(
prefix: str, suffix: str
) -> TransformationAction
Create a rewrite_function which adds a prefix and a suffix around the match.
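As an illustration of the behavior described, here is a small stand-in written against the standard re module only. wrap_match is a hypothetical name, and the one-argument action is an assumption made for the sketch; it does not reproduce the library's TransformationAction signature.

```python
import re
from collections.abc import Callable


def wrap_match(prefix: str, suffix: str) -> Callable[[re.Match[str]], str]:
    """Build an action that surrounds the matched text with prefix and suffix."""

    def action(m: re.Match[str]) -> str:
        # m.group(0) is the whole matched text.
        return prefix + m.group(0) + suffix

    return action


bold = wrap_match("<b>", "</b>")
out = re.sub(r"\d+", bold, "order 66 shipped")
print(out)  # order <b>66</b> shipped
```

Passing an empty suffix or an empty prefix gives the prefix-only and suffix-only variants.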
Source code in src/zimscraperlib/rewriting/rx_replacer.py
add_prefix
add_prefix(prefix: str) -> TransformationAction
Create a rewrite_function which adds the prefix to the matching str.
Source code in src/zimscraperlib/rewriting/rx_replacer.py
add_suffix
add_suffix(suffix: str) -> TransformationAction
Create a rewrite_function which adds the suffix to the matching str.
Source code in src/zimscraperlib/rewriting/rx_replacer.py
m2str
m2str(
function: Callable[[str], str],
) -> TransformationAction
Call a rewrite_function with a string instead of a match object. Many rewrite functions don't need the match object because they work directly on text. This decorator can be applied to a rewrite_function that takes a str.
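The decorator pattern described here can be sketched as below. match_to_str and shout are hypothetical names, and the wrapper is assumed to receive only the match object; the library's wrapper may accept additional arguments.

```python
import re
from collections.abc import Callable


def match_to_str(function: Callable[[str], str]) -> Callable[[re.Match[str]], str]:
    """Adapt a str -> str function so it can be called with a match object."""

    def wrapper(m: re.Match[str]) -> str:
        # Hand only the matched text to the wrapped function.
        return function(m.group(0))

    return wrapper


@match_to_str
def shout(text: str) -> str:
    return text.upper()


out = re.sub(r"\bwarn\w*", shout, "warning: disk low")
print(out)  # WARNING: disk low
```

The wrapped function stays a plain string transformation, which keeps it easy to test without building match objects.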
Source code in src/zimscraperlib/rewriting/rx_replacer.py
replace
replace(src: str, target: str) -> TransformationAction
Create a rewrite_function replacing src with target in the matching str.
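The point worth illustrating is that the substitution applies only inside the matched text, not to the whole content. The sketch below uses a hypothetical stand-in, swap_in_match, with an assumed one-argument action; it is not the library's code.

```python
import re
from collections.abc import Callable


def swap_in_match(src: str, target: str) -> Callable[[re.Match[str]], str]:
    """Build an action that replaces src with target inside the matched text."""

    def action(m: re.Match[str]) -> str:
        return m.group(0).replace(src, target)

    return action


html = '<img src="http://a.test/x.png"> http://b.test'
# Only text matched by the pattern (the img tag) is rewritten.
out = re.sub(r"<img[^>]*>", swap_in_match("http://", "https://"), html)
print(out)  # <img src="https://a.test/x.png"> http://b.test
```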
Source code in src/zimscraperlib/rewriting/rx_replacer.py
replace_all
replace_all(text: str) -> TransformationAction
Create a rewrite_function which replaces the whole match with text.
Source code in src/zimscraperlib/rewriting/rx_replacer.py
replace_prefix_from
replace_prefix_from(
prefix: str, match: str
) -> TransformationAction
Returns a function which replaces everything before match with prefix.
Source code in src/zimscraperlib/rewriting/rx_replacer.py