zimscraperlib.zim.indexing
Special item with customized index data and helper classes
Classes:
- IndexData: IndexData to properly pass indexing title and content to the libzim
Functions:
- get_pdf_index_data: Returns the IndexData information for a given PDF
Attributes:
IGNORED_MUPDF_MESSAGES
module-attribute
IGNORED_MUPDF_MESSAGES = [
    "lcms: not an ICC profile, invalid signature.",
    "format error: cmsOpenProfileFromMem failed",
    "ignoring broken ICC profile",
]
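One plausible use of such a list is to drop known-harmless MuPDF warnings before reporting the rest. The sketch below is only an illustration: the helper name `keep_relevant_messages` is hypothetical, and exact whole-line matching is an assumption that this page does not confirm.

```python
IGNORED_MUPDF_MESSAGES = [
    "lcms: not an ICC profile, invalid signature.",
    "format error: cmsOpenProfileFromMem failed",
    "ignoring broken ICC profile",
]


def keep_relevant_messages(raw_warnings: str) -> list[str]:
    """Hypothetical helper: split a warnings blob into lines and drop ignored ones."""
    kept = []
    for line in raw_warnings.splitlines():
        message = line.strip()
        # Assumption: a message is ignored only when it matches an entry exactly.
        if message and message not in IGNORED_MUPDF_MESSAGES:
            kept.append(message)
    return kept
```

Blank lines are skipped, so a blob that contains only ignored messages yields an empty list.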
IndexData
Bases: libzim.writer.IndexData
IndexData to properly pass indexing title and content to the libzim
Both title and content have to be customized (title may or may not be identical to the item title).
keywords is optional, since it can be empty.
wordcount is optional; if not passed, it is automatically computed from content.
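The wordcount fallback described above can be illustrated with a small stand-in class. This is a sketch, not the library's code: `SketchIndexData` is a hypothetical name, it does not subclass anything from libzim, and counting whitespace-separated tokens is an assumption about how the word count might be derived.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SketchIndexData:
    """Hypothetical stand-in exposing the getters documented on this page."""

    title: str
    content: str
    keywords: str = ""  # optional: may stay empty
    wordcount: int | None = None  # optional: derived from content when omitted

    def __post_init__(self) -> None:
        if self.wordcount is None:
            # Assumption: words are whitespace-separated tokens of content.
            self.wordcount = len(self.content.split())

    def has_indexdata(self) -> bool:
        # Assumption: there is something to index if title or content is non-empty.
        return bool(self.title or self.content)

    def get_title(self) -> str:
        return self.title

    def get_content(self) -> str:
        return self.content

    def get_keywords(self) -> str:
        return self.keywords

    def get_wordcount(self) -> int:
        return self.wordcount or 0
```

With this stand-in, `SketchIndexData(title="Hello", content="one two three")` reports a word count derived from the content, while an explicit `wordcount=` argument is returned unchanged.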
Source code in src/zimscraperlib/zim/indexing.py
content
property
writable
content
keywords
instance-attribute
keywords = keywords
title
instance-attribute
title = title
wordcount
instance-attribute
wordcount = wordcount
get_content
get_content() -> str
Source code in src/zimscraperlib/zim/indexing.py
get_keywords
get_keywords() -> str
Source code in src/zimscraperlib/zim/indexing.py
get_title
get_title() -> str
Source code in src/zimscraperlib/zim/indexing.py
get_wordcount
get_wordcount() -> int
Source code in src/zimscraperlib/zim/indexing.py
has_indexdata
has_indexdata() -> bool
Source code in src/zimscraperlib/zim/indexing.py
get_pdf_index_data
get_pdf_index_data(
*,
content: str | bytes | None = None,
fileobj: BytesIO | None = None,
filepath: Path | None = None,
) -> IndexData
Returns the IndexData information for a given PDF
The PDF can be passed as raw content (content), as a file object (fileobj), or as a file path (filepath). All three arguments are keyword-only.
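To show what accepting a PDF in three forms can look like, here is a minimal sketch of an input normalizer. The helper `read_pdf_bytes` is hypothetical and is not part of zimscraperlib; requiring exactly one argument and encoding `str` content as UTF-8 are assumptions made for the example, since this page does not say how the library resolves several inputs.

```python
from __future__ import annotations

from io import BytesIO
from pathlib import Path


def read_pdf_bytes(
    *,
    content: str | bytes | None = None,
    fileobj: BytesIO | None = None,
    filepath: Path | None = None,
) -> bytes:
    """Hypothetical helper: return the PDF payload as bytes from any one input."""
    provided = [arg for arg in (content, fileobj, filepath) if arg is not None]
    if len(provided) != 1:
        # Assumption: ambiguous or missing input is rejected rather than guessed.
        raise ValueError("pass exactly one of content, fileobj or filepath")
    if content is not None:
        # Assumption: text content is encoded as UTF-8.
        return content.encode("utf-8") if isinstance(content, str) else content
    if fileobj is not None:
        return fileobj.getvalue()
    assert filepath is not None  # guaranteed by the check above
    return Path(filepath).read_bytes()
```

A caller would then use one keyword at a time, for example `read_pdf_bytes(fileobj=BytesIO(data))`, mirroring the keyword-only signature of get_pdf_index_data.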
Source code in src/zimscraperlib/zim/indexing.py