Description
This is one of the very few PDF-to-Markdown solutions I've found that preserves italics and bold from text PDFs, which is great. However, when I try to convert an image PDF, the only output is a Markdown file containing a link to the extracted image. All dependencies are installed, including opencv and pytesseract.
This is the terminal output:
2025-10-25 20:36:04,152 - main - INFO - Image captioning model set up successfully.
2025-10-25 20:36:04,185 - main - INFO - Extracted 0 tables from the PDF.
2025-10-25 20:36:04,185 - main - INFO - Processing page 1
2025-10-25 20:36:04,497 - main - INFO - Extracted 0 links from the page.
2025-10-25 20:36:05,672 - main - INFO - Markdown content saved successfully.
2025-10-25 20:36:05,672 - main - INFO - Markdown content has been saved to pdf_to_MD_output/Test-italics.md
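A quick check that may help narrow this down, written as a sketch rather than anything taken from this project's code. It tests two things: whether a Markdown output contains nothing except image links (the symptom described above), and whether the `tesseract` executable is on the PATH, since pytesseract is only a wrapper and needs that binary installed separately. The helper names `has_no_text` and `tesseract_on_path` and the sample string are hypothetical.

```python
import re
import shutil

# Matches Markdown image links such as ![alt](path/to/image.png)
IMAGE_LINK = re.compile(r"!\[[^\]]*\]\([^)]*\)")


def has_no_text(markdown: str) -> bool:
    """Return True if nothing but whitespace remains after removing image links."""
    remainder = IMAGE_LINK.sub("", markdown)
    return remainder.strip() == ""


def tesseract_on_path() -> bool:
    """Return True if a `tesseract` executable can be found on the PATH.

    pytesseract only wraps the Tesseract OCR engine; installing the Python
    package does not install the binary itself.
    """
    return shutil.which("tesseract") is not None


if __name__ == "__main__":
    # Hypothetical sample standing in for the converter's output file.
    sample = "![page 1 image](images/page_1.png)\n"
    print("output has no text:", has_no_text(sample))
    print("tesseract binary found:", tesseract_on_path())
```

If the output has no text and the binary is missing, installing Tesseract itself would be the first thing to try. If the binary is present, the absence of any OCR-related line in the log may mean the OCR step is not being reached for this page.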