Add optional PDF and Office document readers #36
Reference: archeious/luminos#36
The agent currently sees only filename and size for binary document formats. Add optional content extraction as lazy deps (gated behind `--install-extras`):

- `pdfminer` or `pypdf` for PDF text extraction
- `openpyxl` for Excel schema and sheet enumeration
- `python-docx` for Word document text

Particularly valuable for the documents and data domains.
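One possible shape for the lazy loading, as a rough sketch rather than the project's actual design: each reader is imported only when a matching file is seen, and a missing package falls back to the filename-and-size view the agent has today. The names `load_optional` and `describe_document` are hypothetical, the choice of `pypdf` over `pdfminer` is arbitrary, and how `--install-extras` would install these packages is not covered here.

```python
import importlib
import os


def load_optional(module_name):
    """Import a module on first use; return None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def describe_document(path):
    """Return name and size, plus extracted content when a reader is available.

    Hypothetical helper: the dict layout is an assumption for illustration.
    """
    info = {
        "name": os.path.basename(path),
        "size": os.path.getsize(path),
        "content": None,  # stays None when no optional reader applies
    }
    ext = os.path.splitext(path)[1].lower()

    if ext == ".pdf":
        pypdf = load_optional("pypdf")
        if pypdf is not None:
            reader = pypdf.PdfReader(path)
            # extract_text() can yield nothing for image-only pages
            info["content"] = "\n".join(
                (page.extract_text() or "") for page in reader.pages
            )
    elif ext == ".xlsx":
        openpyxl = load_optional("openpyxl")
        if openpyxl is not None:
            workbook = openpyxl.load_workbook(path, read_only=True)
            info["content"] = {"sheets": list(workbook.sheetnames)}
            workbook.close()
    elif ext == ".docx":
        docx = load_optional("docx")  # import name of the python-docx package
        if docx is not None:
            document = docx.Document(path)
            info["content"] = "\n".join(p.text for p in document.paragraphs)

    return info
```

With none of the optional packages installed, `describe_document` still returns the name and size for every file, so the extras only ever add information.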