Welcome to PDF2Presentation, a Python-based utility designed to transform your PDF files into fully-fledged PowerPoint presentations.
⚠️ Alpha Version: The project is currently in its alpha stage of development. As such, please note that certain features may not be fully stable or optimized. Your patience, feedback, and contributions are appreciated!
PDF2Presentation combines several third-party libraries (nltk, PyPDF2, PyMuPDF (imported as fitz), openai, torch, python-pptx (imported as pptx), diffusers, and Pillow (imported as PIL)) with the standard-library modules re, io, and os to convert your PDF files into presentations. The tool not only extracts the text and images from your PDFs, but also uses OpenAI's GPT models to generate section titles and summaries, further enriching your presentations.
From a user's perspective, PDF2Presentation offers several advantages:
- Saves Time: Automatically converting a PDF into a presentation saves you hours of manual work.
- Enhances Understanding: Auto-generated summaries help highlight key points and improve the overall comprehension of the content.
- Improves Aesthetics: The tool generates a cover image and inserts images from the PDF, making your presentations more visually appealing.
- Prepares Presenter Notes: Auto-generated presenter notes can guide your speech and help maintain a smooth flow during your presentation.
Ensure you have Python 3.6 or later installed. The necessary dependencies can be installed via pip:

    pip install nltk PyPDF2 pymupdf openai torch python-pptx diffusers Pillow

To use this script, simply run the main Python file:

    python main.py

Please note that "document.pdf" in the main() function is a sample document. Replace it with the path to your own PDF document. Also, remember to set your own OpenAI key as shown below:

    openai.api_key = "your_openai_key"

Since the code is in the alpha stage, there may be some bugs or issues that need to be resolved. Please feel free to report any problems you encounter, or better yet, contribute to improving the code!
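For the OpenAI key mentioned above, one option is to read it from an environment variable instead of writing it into the source file. The following is only a minimal sketch: the helper name `load_openai_key` and the variable name `OPENAI_API_KEY` are assumptions for illustration and are not part of the project's code.

```python
import os


def load_openai_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Hypothetical helper: the function name and the default variable
    name are illustrative assumptions, not taken from main.py.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running main.py"
        )
    return key


# In main.py, the hard-coded assignment could then become:
# openai.api_key = load_openai_key()
```

Keeping the key out of the source file makes it harder to commit it to version control by accident.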
We appreciate your interest in our project and welcome contributions. Feel free to open issues or pull requests to help improve PDF2Presentation.
Enjoy transforming your PDFs into impressive presentations!
The changelog contains a record of all notable changes made to PDF2Presentation. These include new features, bug fixes, and other improvements.
- Improved Text Corrections: The project now uses OpenAI's GPT-4 model through the Chat Completions API. This results in more accurate and contextually aware corrections.
- Double Validation Run: To minimize errors, we've implemented a second validation run that helps correct potential typos.
- Summary Storage: To facilitate manual review and editing, the code now saves each generated summary in a .txt file.
- Improved Text Summarization: Integrated the T5 model for better text summarization through tokenization.
- NLP Enhancements: Uses the spaCy library for text pre-processing and for key phrase extraction during title generation.
- Time Elapsed Counter: Introduced a counter at the end of the script to display the total execution time.
- Initial release of PDF2Presentation, a Python-based utility designed to transform your PDF files into fully-fledged PowerPoint presentations.
- Extraction of text and images from PDF files.
- Section titles and summaries generated by OpenAI's text-davinci-003 model.
- Intelligent generation of a cover image and insertion of images from the PDF.
- Auto-generated presenter notes for a smoother presentation.
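Two of the changelog items above, summary storage and the time elapsed counter, can be illustrated with the standard library alone. This is a rough sketch under stated assumptions rather than the project's actual code: the function names, the `summaries` output folder, and the `summary_NN.txt` naming scheme are all hypothetical.

```python
import time
from pathlib import Path


def save_summaries(summaries, out_dir="summaries"):
    """Write each summary to its own numbered .txt file and return the paths.

    The folder name and file naming scheme are illustrative assumptions.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, summary in enumerate(summaries, start=1):
        file_path = out_path / f"summary_{index:02d}.txt"
        file_path.write_text(summary, encoding="utf-8")
        paths.append(file_path)
    return paths


def format_elapsed(seconds):
    """Format a duration in seconds as 'Xm YYs' (fractions are truncated)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m {secs:02d}s"


if __name__ == "__main__":
    start = time.perf_counter()  # monotonic clock, suited to measuring durations
    saved = save_summaries(["First section summary.", "Second section summary."])
    print(f"Saved {len(saved)} summaries")
    print(f"Time elapsed: {format_elapsed(time.perf_counter() - start)}")
```

Writing one file per summary keeps each section easy to review and edit by hand before the slides are rebuilt.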