Stars
10
Forks
2
Language
Java
Last Updated
Mar 12, 2022
Similar Repos
Language | Stars | Description | Updated At |
---|---|---|---|
Ruby | 478 | Read text and metadata from files and documents (.doc, .docx, .pages, .odt, .rtf, .pdf) | Oct 05, 2022 | |
Go | 987 | Converts PDF, DOC, DOCX, XML, HTML, RTF, etc to plain text | Aug 24, 2022 | |
Go | 2 | Converts PDF, DOC, DOCX, XML, HTML, RTF, etc to plain text | Sep 18, 2021 | |
Python | 3 | Extract text from .pdf, .docx, .hwp, .txt format | Sep 19, 2022 | |
Go | 72 | Extract text from plaintext, .docx, .odt and .rtf files. Pure go. | Oct 04, 2022 | |
Python | 3 | Read texts from different formats (including .doc, .docx, .txt, .pdf) | Mar 25, 2022 | |
HTML | 3 | node.js module for extracting text from html, pdf, doc, docx, xls, xlsx, csv, pptx, png, … | Nov 17, 2022 | |
HTML | 3 | node.js module for extracting text from html, pdf, doc, docx, xls, xlsx, csv, pptx, png, … | Jan 12, 2022 | |
HTML | 1487 | node.js module for extracting text from html, pdf, doc, docx, xls, xlsx, csv, pptx, png, … | Oct 07, 2022 | |
Python | 8 | get emails and phone numbers from text files like .pdf, .txt, .docx | Nov 06, 2021 | |
Python | 4 | extract text from pdf to txt | Nov 03, 2020 | |
Java | 104 | Produce doc/docx/pdf format from doc/docx template | Mar 07, 2022 | |
Python | 2 | Narrates your .docx, .pdf and .txt files | Feb 03, 2022 | |
Python | 31 | Extract payload URLs from Follina (CVE-2022-30190) docx and rtf files | Apr 08, 2023 | |
JavaScript | 2 | This is a simple NodeJS library to parse PDF, DOCX, DOC and TXT files to … | Sep 13, 2022 | |
Java | 2 | Documents reader (TXT, DOC, HTML, PDF, etc) | Oct 17, 2021 | |
Python | 2 | Use python to search text in doc,xls,pdf,txt files | May 07, 2020 | |
JavaScript | 5 | Extract text from PDF files | Dec 22, 2019 | |
Kotlin | 78 | This library reads word documents (.doc and .docx), txt and PDF files, and gives the … | Apr 13, 2023 | |
Python | 3 | GPT-3 based Question Answering System that reads text from PDF, DOCX, or TXT files and … | Apr 30, 2023 | |
PHP | 67 | Old PHP scripts to read text content from different binary formats: PDF, DOC, PPT, RTF … | Apr 29, 2022 | |
Protocol Buffer | 2 | A Go library about to be Microservice to convert PDF, DOC, DOCX, XML, HTML, RTF, … | Nov 27, 2019 | |
JavaScript | 8 | A simple Nodejs (Docker and S3 ready) server for extracting text from pdf, doc, docx, … | May 08, 2022 | |
C | 6 | Extract plain text from pdf files. | Jun 29, 2022 | |
Go | 2 | Extract raw text from PDF files | Jul 31, 2022 | |
JavaScript | 2 | Convert doc/docx to pdf. | Nov 22, 2020 | |
JavaScript | 10 | A module that converts pdf, rtf, docx, and hwp to html | Jul 09, 2022 | |
Perl | 10 | LF Aligner helps translators create translation memories from texts and their translations. It relies on … | Mar 22, 2022 | |
PHP | 4 | A PHP class to read documents like doc, docx, pdf, txt, zip... | May 26, 2020 | |
Java | 14 | The command line tool to generate DOCX, HTML, ODT, PDF, PNG, PPTX, RTF, XLS, XLSX … | Jun 01, 2022 | |
JavaScript | 33 | Convert doc, docx to pdf file | Jul 19, 2022 | |
C | 7 | A simple file reader that can extract content from .pdf, .doc, .ppt, .xls... files without other tools | Jun 01, 2020 | |
C++ | 9 | Convert PDF to RTF, ASCII or text files. | Mar 01, 2020 | |
Ruby | 9 | Extract html, text and attachments from *.eml files. | Apr 27, 2022 | |
Java | 77 | The EbookReader Android App. Supports file formats like epub, pdf, txt, html, mobi, azw, azw3, … | Apr 26, 2023 | |
PHP | 4 | Extract text from a Word Doc | Jun 01, 2021 | |
C | 8 | C/Python library to extract text from MS doc files | Aug 15, 2022 | |
C# | 61 | GemBox.Document is a .NET component that enables you to read, write, convert, and print document … | Apr 11, 2023 | |
Go | 2 | Extract raw text from PDF files (PDF2.0/PDF1.7) | Mar 07, 2023 | |
C# | 2 | An application to extract text from pdf files | Feb 02, 2022 | |
Python | 2 | Extract content from MS doc, ppt, etc. | Jun 08, 2020 | |
TypeScript | 3 | Normalize dirty HTML and DOCX/RTF documents into clean, understandable HTML | Apr 21, 2023 | |
PHP | 7 | Imports OpenOffice-compatible files (doc, docx, etc) into SilverStripe pages and content | Jun 20, 2022 | |
C# | 56 | A simple office file reader can extract content and summary information from .doc,.docx,.ppt,.pptx files without … | Aug 29, 2022 | |
Go | 13 | Library to extract text from HTML files | May 14, 2020 | |
Python | 372 | A pure python based utility to extract text and images from docx files. | Oct 09, 2022 | |
Python | 2 | Script to extract highlighted text from a pdf file to a txt file. | Oct 07, 2022 | |
Python | 2 | Compiles multiple *.txt or *.rtf files to FDA specified format and save them to PDF | Jun 22, 2021 | |
Python | 5 | extract meaningful text content from html of web page | Nov 30, 2020 | |
Go | 10 | Extract content from HTML by removing unwanted boilerplate text. | Sep 27, 2022 |