Provide a Document model including a FileField and a custom storage. Uploaded documents live outside of MEDIA_ROOT and must be downloaded through a view that does security checks. Using django-autocomplete-light and django-generic-m2m, allow to attach any object to it.
Document Image Classification with Intra-Domain Transfer Learning and Stacked Generalization of Deep Convolutional Neural Networks
A python module that automatically summarizes text documents and web pages
Extract embedded files and macros from office documents.
Python library for parsing .docx (Office Open XML) files
PyTorch deep learning models for document classification
Using Scikit-learn, machine learning library for the Python programming language.
Static analysis tools for Microsoft Office Open XML files and documents
olefile is a Python package to parse, read and write Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office 97-2003 documents, vbaProject.bin in MS Office 2007+ files, Image Composer and FlashPix files, Outlook messages, StickyNotes, several Microscopy file formats, McAfee antivirus quarantine files, etc.