
zimscraperlib.filesystem

File manipulation tools

Shortcuts to retrieve MIME types using magic

MIME_OVERRIDES module-attribute

MIME_OVERRIDES = {'image/svg': 'image/svg+xml'}
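The mapping is applied as a dictionary lookup that uses the detected value as its own fallback, so any type not listed passes through unchanged. A minimal sketch of that pattern, using a local copy of the dictionary; the `normalize` helper is hypothetical and not part of zimscraperlib:

```python
# Local copy of the mapping shown above; real code would import it from
# zimscraperlib.filesystem instead.
MIME_OVERRIDES = {"image/svg": "image/svg+xml"}


def normalize(detected_mime: str) -> str:
    """Hypothetical helper: return the override if one exists, else the input."""
    return MIME_OVERRIDES.get(detected_mime, detected_mime)


svg = normalize("image/svg")  # listed in the mapping, so it is rewritten
png = normalize("image/png")  # not listed, so it is returned unchanged
```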

delete_callback

delete_callback(fpath: Path)

Helper that deletes the file at the passed path; a missing file is ignored.

Source code in src/zimscraperlib/filesystem.py
def delete_callback(fpath: pathlib.Path):
    """helper deleting passed filepath"""
    fpath.unlink(missing_ok=True)
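Because `unlink` is called with `missing_ok=True`, the helper can safely be called more than once on the same path. The sketch below uses a local copy of the function so that it runs without the library installed; the file name is an arbitrary example:

```python
import pathlib
import tempfile


def delete_callback(fpath: pathlib.Path):
    """Local copy of the helper shown above, for illustration."""
    fpath.unlink(missing_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    target = pathlib.Path(tmp) / "scratch.txt"
    target.write_text("temporary data")
    existed_before = target.exists()

    delete_callback(target)  # removes the file
    delete_callback(target)  # second call is a no-op, no FileNotFoundError
    exists_after = target.exists()
```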

get_content_mimetype

get_content_mimetype(content: bytes | str) -> str

MIME Type of content retrieved from magic headers

Source code in src/zimscraperlib/filesystem.py
def get_content_mimetype(content: bytes | str) -> str:
    """MIME Type of content retrieved from magic headers"""

    try:
        detected_mime = magic.from_buffer(content, mime=True)
        if isinstance(
            detected_mime, bytes
        ):  # pragma: no cover (old python-magic versions returned bytes)
            detected_mime = detected_mime.decode()
    except UnicodeDecodeError:
        return "application/octet-stream"
    return MIME_OVERRIDES.get(detected_mime, detected_mime)
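The control flow above has two exits: a `UnicodeDecodeError` during detection yields `application/octet-stream`, and any other result goes through the override table. The sketch below reproduces that flow without python-magic; `strict_text_detect` is a hypothetical stand-in for `magic.from_buffer(content, mime=True)` and is far cruder than real magic detection:

```python
from typing import Callable

MIME_OVERRIDES = {"image/svg": "image/svg+xml"}  # local copy for illustration


def strict_text_detect(content: bytes) -> str:
    """Hypothetical detector: reports SVG for '<svg', else requires valid UTF-8."""
    text = content.decode("utf-8")  # raises UnicodeDecodeError on invalid bytes
    return "image/svg" if text.lstrip().startswith("<svg") else "text/plain"


def mimetype_with_fallback(content: bytes, detect: Callable[[bytes], str]) -> str:
    """Same flow as above: fall back on decode errors, then apply overrides."""
    try:
        detected_mime = detect(content)
    except UnicodeDecodeError:
        return "application/octet-stream"
    return MIME_OVERRIDES.get(detected_mime, detected_mime)


svg_mime = mimetype_with_fallback(b"<svg xmlns='http://www.w3.org/2000/svg'/>", strict_text_detect)
binary_mime = mimetype_with_fallback(b"\xff\xfe\x00\x01", strict_text_detect)
```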

get_file_mimetype

get_file_mimetype(fpath: Path) -> str

MIME Type of file retrieved from magic headers

Source code in src/zimscraperlib/filesystem.py
def get_file_mimetype(fpath: pathlib.Path) -> str:
    """MIME Type of file retrieved from magic headers"""

    # detected_mime = magic.detect_from_filename(fpath).mime_type
    # return MIME_OVERRIDES.get(detected_mime, detected_mime)

    with open(fpath, "rb") as fh:
        return get_content_mimetype(fh.read(2048))
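Only the first 2048 bytes of the file are read, so detection cost does not grow with file size. A rough sketch of that pattern with a pluggable detector; `toy_detect` and `sniff_file` are hypothetical names, and the toy detector only recognises the PNG signature:

```python
import pathlib
import tempfile
from typing import Callable

SNIFF_SIZE = 2048  # same number of bytes the function above reads


def toy_detect(head: bytes) -> str:
    """Hypothetical stand-in for a magic-based detector."""
    if head.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    return "application/octet-stream"


def sniff_file(fpath: pathlib.Path, detect: Callable[[bytes], str]) -> str:
    """Read only the head of the file and hand it to the detector."""
    with open(fpath, "rb") as fh:
        return detect(fh.read(SNIFF_SIZE))


with tempfile.TemporaryDirectory() as tmp:
    sample = pathlib.Path(tmp) / "sample.bin"
    # PNG signature followed by padding well beyond the sniffed window
    sample.write_bytes(b"\x89PNG\r\n\x1a\n" + b"\x00" * 10_000)
    detected = sniff_file(sample, toy_detect)
```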

path_from

path_from(path: Path | TemporaryDirectory[Any] | str)

Context manager to get a Path from a path as string, Path or TemporaryDirectory

Since scraperlib wants to manipulate only Path, scrapers might often need this to create a path from what they have, especially since the TemporaryDirectory context manager returns a string, which is not really handy.

Source code in src/zimscraperlib/filesystem.py
@contextmanager
def path_from(path: pathlib.Path | TemporaryDirectory[Any] | str):
    """Context manager to get a Path from a path as string, Path or TemporaryDirectory

    Since scraperlib wants to manipulate only Path, scrapers might often need this
    to create a path from what they have, especially since the TemporaryDirectory
    context manager returns a string, which is not really handy.
    """
    if isinstance(path, pathlib.Path):
        yield path
    elif isinstance(path, TemporaryDirectory):
        with path as pathname:
            yield pathlib.Path(pathname)
    else:
        yield pathlib.Path(path)
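A short usage sketch, assuming a local copy of the context manager so that it runs without the library installed. It shows a string being wrapped in a Path, and a TemporaryDirectory being entered and cleaned up on exit; the path `output/site.zim` is an arbitrary example and is never touched on disk:

```python
import pathlib
from contextlib import contextmanager
from tempfile import TemporaryDirectory


@contextmanager
def path_from(path):
    """Local copy of the context manager shown above, for illustration."""
    if isinstance(path, pathlib.Path):
        yield path
    elif isinstance(path, TemporaryDirectory):
        with path as pathname:
            yield pathlib.Path(pathname)
    else:
        yield pathlib.Path(path)


# A plain string is simply wrapped in a Path.
with path_from("output/site.zim") as from_str:
    str_is_path = isinstance(from_str, pathlib.Path)

# A TemporaryDirectory is entered for us and removed when the block exits.
with path_from(TemporaryDirectory()) as workdir:
    (workdir / "index.html").write_text("<html></html>")
    existed_inside = workdir.is_dir()
exists_after = workdir.exists()
```

Note that only the TemporaryDirectory branch performs cleanup; a Path or string passed in is left on disk as it was.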