HAR-to-Book is a Python script that converts a HAR (HTTP Archive) file into a book by extracting images from the HAR file and generating a PDF document.
- Converts a HAR file into a book
- Extracts images from the HAR file
- Generates a PDF document from the extracted images
The main.py script takes a .har file and creates a book from the images contained within it. The HAR file is a JSON file that stores all the requests made, including the images represented as Base64 strings.
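To make the file layout concrete, here is a minimal sketch of how image responses can be pulled out of a HAR file. It relies on the HAR 1.2 layout, where each item in `log.entries` has a `response.content` object with `mimeType`, `text`, and (for binary bodies) `encoding: "base64"`. The function name `iter_image_entries` and the sample data are illustrative assumptions, not code taken from main.py.

```python
import base64


def iter_image_entries(har):
    """Yield (url, mime_type, raw_bytes) for each Base64-encoded image response.

    Hypothetical helper for illustration; main.py may be organized differently.
    """
    for entry in har.get("log", {}).get("entries", []):
        content = entry.get("response", {}).get("content", {})
        mime_type = content.get("mimeType") or ""
        text = content.get("text")
        # Keep only image responses that actually carry a body.
        if not mime_type.startswith("image/") or not text:
            continue
        # Binary bodies are stored as Base64 and flagged with "encoding".
        if content.get("encoding") != "base64":
            continue
        url = entry.get("request", {}).get("url", "")
        yield url, mime_type, base64.b64decode(text)


# Tiny in-memory stand-in for a real HAR file (made-up URLs and bytes).
sample_har = {
    "log": {
        "entries": [
            {
                "request": {"url": "https://example.com/page1.png"},
                "response": {
                    "content": {
                        "mimeType": "image/png",
                        "encoding": "base64",
                        "text": base64.b64encode(b"fake image bytes").decode("ascii"),
                    }
                },
            },
            {
                "request": {"url": "https://example.com/index.html"},
                "response": {
                    "content": {"mimeType": "text/html", "text": "<html></html>"}
                },
            },
        ]
    }
}

images = list(iter_image_entries(sample_har))
```

With a real capture, `sample_har` would instead come from `json.load()` on the saved HAR file. Entries appear in the order the requests were recorded, which is why paging through the book in order matters.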
Before running the script, follow these steps:
- Go to the website where you want to create a book.
- Open the browser's Developer Tools by right-clicking anywhere on the page and selecting the "Inspect" option.
- Switch to the "Network" tab in the Developer Tools and minimize it.
- Manually browse through the book, clicking the next button as quickly as possible (you don't have to wait for each image to load).
- Tip: Zoom in for high-quality images.
- Once you reach the last page, look for a button to download the `.HAR` file.
- This file contains all the requests you made, including the images stored as Base64 strings.
- Save the file in the "raw" folder and rename it to "book.json".
- Download the original `.har`, `book.json`, and `deep.pdf` files from the Google Drive.
- Make sure you have the following dependencies installed:
- Python 3
- Pillow library: Install it by running the command `python3 -m pip install --upgrade Pillow`.
- Run the script by executing the following command in your terminal or command prompt: `python3 main.py`.
- The script will perform the following actions:
- Convert the `book.json` file into a more manageable format and save it as `sample.json`.
- Remove unnecessary data from `sample.json` and save the modified version as `sample2.json`.
- Convert all Base64-encoded images in `sample2.json` into PNG files and save them in the "pageImages" folder.
- Combine all PNG images into a single PDF document named `deep.pdf`.
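The last two actions (Base64 to PNG, then PNG to PDF) can be sketched with Pillow roughly as follows. This is an illustrative outline under stated assumptions, not the actual contents of main.py: the function names and the `page_0001.png` naming scheme are hypothetical, and the input is assumed to be a list of Base64 strings already extracted from the JSON.

```python
import base64
import io
from pathlib import Path

from PIL import Image


def save_pages_as_png(encoded_images, out_dir):
    """Decode Base64 image strings and save each one as a numbered PNG file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, encoded in enumerate(encoded_images, start=1):
        image = Image.open(io.BytesIO(base64.b64decode(encoded)))
        # Zero-padded names (an assumption) keep pages sorted in reading order.
        path = out_dir / f"page_{index:04d}.png"
        image.save(path, format="PNG")
        paths.append(path)
    return paths


def combine_into_pdf(png_paths, pdf_path):
    """Combine PNG files, in the given order, into one multi-page PDF."""
    # Convert to RGB first, since Pillow's PDF writer may reject
    # images that carry an alpha channel.
    pages = [Image.open(p).convert("RGB") for p in png_paths]
    if not pages:
        raise ValueError("no pages to write")
    # save_all + append_images writes the remaining pages after the first.
    pages[0].save(pdf_path, format="PDF", save_all=True, append_images=pages[1:])
```

A call such as `combine_into_pdf(save_pages_as_png(encoded_images, "pageImages"), "deep.pdf")` would mirror the folder and file names used above. All pages are held in memory at once, which is simple but may be heavy for very long books.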
Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.