C# .NET document parsing API to extract text, images, metadata, and encoding from databases, PDF, Word, Excel, presentations, web, email, EPUB, and ZIP file formats. ...
Text: TXT, RTF. Markup: HTML, XHTML, MHTML, MD, XML. Portable Formats ...
PDF, JP2, EMLX, TAR, ONE, MHTML, XHTML, EML, DOCX, XLTX, XLS, XLSX, CHM, JPEG ...