zimscraperlib.rewriting.js
JS Rewriting
This module contains tools to rewrite JS retrieved from an online source so that it can safely operate within a ZIM. It assumes that wombat.js is used for proper JS operation, intercepting all HTTP requests and rewriting them as needed. The main purpose of the rewriting is therefore simply to include and configure wombat.js properly.
This module assumes that:
- every HTML page in the ZIM has been properly rewritten to include wombat.js and set it up
appropriately
- a specific JS file (provided in the statics folder) for JS modules is included in the
ZIM at _zim_static/__wb_module_decl.js
This code is based on https://github.com/webrecorder/wabac.js/blob/main/src/rewrite/jsrewriter.ts
Last backport of upstream changes is from wabac.js commit: Feb 20, 2026 - 25061cb53ff113d5cff28f2f1354819f6c41034b
Classes:
- JsRewriter – JsRewriter is in charge of rewriting the js code stored in our zim file.
Functions:
- add_suffix_non_prop – Create a rewrite_function which adds a suffix to the match str.
- create_js_rules – This function creates all the transformation rules.
- remove_args_if_strict – Replace 'arguments' with '[]' if the code is in strict mode.
- replace_import – Create a rewrite_function replacing src by target in the matching str.
- replace_this – Create a rewrite_function replacing "this" by this_rw in the matching str.
- replace_this_non_prop – Create a rewrite_function replacing "this" by this_rw.
Attributes:
- GLOBALS_RX
- GLOBAL_OVERRIDES
- IMPORT_EXPORT_HTTP_RX
- IMPORT_EXPORT_MATCH_RX
- REWRITE_JS_RULES
- this_rw
GLOBALS_RX
module-attribute
GLOBALS_RX = compile(
    "("
    + join(
        [
            ("(?:^|[^$.])\\b" + x + "\\b(?:$|[^$])")
            for x in GLOBAL_OVERRIDES
        ]
    )
    + ")"
)
GLOBAL_OVERRIDES
module-attribute
GLOBAL_OVERRIDES = [
    "window",
    "globalThis",
    "self",
    "document",
    "location",
    "top",
    "parent",
    "frames",
    "opener",
]
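To make the effect of GLOBALS_RX concrete, here is a minimal sketch that rebuilds the pattern locally and probes it with a few strings. The rendered value above does not show the separator passed to join, so the "|" used here is an assumption, and uses_global is a hypothetical helper that is not part of zimscraperlib.

```python
import re

GLOBAL_OVERRIDES = [
    "window", "globalThis", "self", "document", "location",
    "top", "parent", "frames", "opener",
]

# Assumption: the alternatives are joined with "|" (the separator is not
# visible in the rendered attribute value).
GLOBALS_RX = re.compile(
    "("
    + "|".join(
        r"(?:^|[^$.])\b" + name + r"\b(?:$|[^$])" for name in GLOBAL_OVERRIDES
    )
    + ")"
)


def uses_global(js: str) -> bool:
    """Hypothetical helper: does the script reference an overridden global?"""
    return GLOBALS_RX.search(js) is not None


print(uses_global("var w = window.innerWidth;"))  # True: bare identifier
print(uses_global("obj.window.foo"))              # False: preceded by "."
print(uses_global("$location = 1"))               # False: preceded by "$"
print(uses_global("var mywindow = 1;"))           # False: no word boundary
```

The leading `[^$.]` guard is what keeps property accesses and `$`-prefixed identifiers from being treated as references to the global object.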
IMPORT_EXPORT_HTTP_RX
module-attribute
IMPORT_EXPORT_HTTP_RX = compile(
    "((?:im|ex)port(?:['\"\\s]*(?:[\\w*${}\\s,]+from\\s*)?['\"\\s]?['\"\\s]))((?:https?|[./]).*?)(['\"\\s])"
)
IMPORT_EXPORT_MATCH_RX
module-attribute
IMPORT_EXPORT_MATCH_RX = compile(
    "(^|;)\\s*?(?:im|ex)port(?:['\"\\s]*(?:[\\w*${}\\s,]+from\\s*)?['\"\\s]?['\"\\s])(?:.*?)['\"\\s]"
)
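The two import/export patterns are easier to read when applied to a sample statement. The sketch below recompiles them locally in raw-string form, assuming no regex flags are passed. The helpers looks_like_module and split_import are hypothetical names used only for illustration; how the library itself consumes these patterns is not shown on this page.

```python
import re

# Same patterns as above, written as raw strings. Assumption: no flags.
IMPORT_EXPORT_HTTP_RX = re.compile(
    r"""((?:im|ex)port(?:['"\s]*(?:[\w*${}\s,]+from\s*)?['"\s]?['"\s]))((?:https?|[./]).*?)(['"\s])"""
)
IMPORT_EXPORT_MATCH_RX = re.compile(
    r"""(^|;)\s*?(?:im|ex)port(?:['"\s]*(?:[\w*${}\s,]+from\s*)?['"\s]?['"\s])(?:.*?)['"\s]"""
)


def looks_like_module(js: str) -> bool:
    """Hypothetical helper: does the script contain an import/export statement?"""
    return IMPORT_EXPORT_MATCH_RX.search(js) is not None


def split_import(js: str):
    """Hypothetical helper: return (prefix, url, closing delimiter) or None."""
    match = IMPORT_EXPORT_HTTP_RX.search(js)
    return match.groups() if match else None


print(looks_like_module('import { a } from "./mod.js";'))  # True
print(looks_like_module("var important = 1;"))             # False
print(split_import('import { a } from "./mod.js";'))
# ('import { a } from "', './mod.js', '"')
```

The second capture group of IMPORT_EXPORT_HTTP_RX isolates the module URL, which is the part a rewriter would need to replace.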
this_rw
module-attribute
this_rw = '_____WB$wombat$check$this$function_____(this)'
JsRewriter
JsRewriter(
    url_rewriter: ArticleUrlRewriter,
    base_href: str | None,
    notify_js_module: Callable[[ZimPath], None] | None,
)
Bases: RxRewriter
JsRewriter is in charge of rewriting the js code stored in our zim file.
Methods:
- rewrite – Rewrite the js code in text.
Attributes:
- base_href
- compiled_rule (Pattern[str] | None)
- first_buff
- last_buff
- notify_js_module
- rules
- url_rewriter
Source code in src/zimscraperlib/rewriting/js.py
base_href
instance-attribute
base_href = base_href
last_buff
instance-attribute
last_buff = '\n}'
notify_js_module
instance-attribute
notify_js_module = notify_js_module
rules
instance-attribute
rules = rules or []
url_rewriter
instance-attribute
url_rewriter = url_rewriter
rewrite
Rewrite the js code in text.
Source code in src/zimscraperlib/rewriting/js.py
add_suffix_non_prop
add_suffix_non_prop(suffix: str) -> TransformationAction
Create a rewrite_function which adds a suffix to the match str.
The suffix is added only if the match is not preceded by . or $.
Applies strict mode transformation to handle 'arguments' keyword.
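The "not preceded by . or $" condition can be sketched as follows. This is a hypothetical stand-in, not the library's implementation: make_suffix_adder and its (matched, offset, full_string) calling convention are assumptions made for illustration, and the strict mode handling of 'arguments' is deliberately left out.

```python
from typing import Callable


def make_suffix_adder(suffix: str) -> Callable[[str, int, str], str]:
    """Hypothetical factory illustrating the non-property suffix rule."""

    def rewrite(matched: str, offset: int, full_string: str) -> str:
        # Leave the match alone when it is a property (".name") or part of
        # a "$"-prefixed identifier ("$name").
        if offset > 0 and full_string[offset - 1] in ".$":
            return matched
        return matched + suffix

    return rewrite


add_x = make_suffix_adder("_x")
print(add_x("eval", 0, "eval(1)"))      # eval_x
print(add_x("eval", 4, "obj.eval(1)"))  # eval
```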
Source code in src/zimscraperlib/rewriting/js.py
create_js_rules
create_js_rules() -> list[TransformationRule]
This function creates all the transformation rules.
A transformation rule is a tuple (Regex, rewrite_function).
If the regex matches in the rewritten script, the corresponding match object will be
passed to rewrite_function.
Every rewrite_function must also take an opts dictionary, which is the opts
passed to the JsRewriter.rewrite function.
This is mostly as if we were calling re.sub(regex, rewrite_function, script_text).
The regexes are combined and match any non-overlapping text, so the first rule to match is applied, potentially preventing further rules from matching.
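The combined-regex behaviour described above can be illustrated with a small, self-contained sketch. It is not the library's code: apply_rules, the named-group dispatch, and the two toy rules are assumptions chosen only to show how one matching rule can consume text that a later rule would otherwise have matched.

```python
import re
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical rule shape: (regex source, rewrite_function(match, opts)).
Rule = Tuple[str, Callable[["re.Match[str]", Dict[str, Any]], str]]


def apply_rules(text: str, rules: List[Rule], opts: Dict[str, Any]) -> str:
    """Combine all rules into one alternation and dispatch on the matching one."""
    combined = re.compile(
        "|".join(f"(?P<rule{i}>{rx})" for i, (rx, _) in enumerate(rules))
    )

    def dispatch(match: "re.Match[str]") -> str:
        # Find which rule's named group took part in this match.
        for i, (_, rewrite_function) in enumerate(rules):
            if match.group(f"rule{i}") is not None:
                return rewrite_function(match, opts)
        return match.group(0)

    return combined.sub(dispatch, text)


toy_rules: List[Rule] = [
    (r"foo\.bar", lambda m, opts: "A"),  # matches first, consumes "bar" too
    (r"bar", lambda m, opts: "B"),       # only sees text no earlier rule took
]
print(apply_rules("foo.bar + bar", toy_rules, {}))  # A + B
```

Because matches never overlap, the "bar" inside "foo.bar" is never offered to the second rule.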
Source code in src/zimscraperlib/rewriting/js.py
remove_args_if_strict
remove_args_if_strict(
    target: str,
    opts: dict[str, Any] | None,
    offset: int,
    full_string: str,
) -> str
Replace 'arguments' with '[]' if the code is in strict mode. In strict mode, the arguments keyword is not allowed.
Source code in src/zimscraperlib/rewriting/js.py
replace_import
replace_import(
    src: str, target: str
) -> TransformationAction
Create a rewrite_function replacing src by target in the matching str.
This "replace" function is intended to be used to replace in import ... as it
adds an import.meta.url if we are in a module.
Source code in src/zimscraperlib/rewriting/js.py
replace_this
replace_this() -> TransformationAction
Create a rewrite_function replacing "this" by this_rw in the matching str.
Source code in src/zimscraperlib/rewriting/js.py
replace_this_non_prop
replace_this_non_prop() -> TransformationAction
Create a rewrite_function replacing "this" by this_rw.
Replacement happens only if "this" is not a property of an object.
Source code in src/zimscraperlib/rewriting/js.py