C# .NET API to Extract and Parse Document Content or Fetch Metadata


Mar 8, 2016
Extract raw or formatted text and metadata from Microsoft Word, PDF, text and OneNote documents, Excel spreadsheets, PowerPoint presentations, and email and ZIP files using the .NET text extractor API. Application developers can build a variety of apps on the .NET platform that incorporate content extraction with simple integration. You can parse password-protected files, search for text or regular expressions, obtain images, extract text areas and choose between standard and fast extraction modes. GroupDocs.Parser for .NET offers feature-rich solutions for your document text extraction and parsing needs.

The code snippet below shows how to extract metadata from PDF documents:
// For complete examples and data files, please go to https://github.com/groupdocs-parser/GroupDocs.Parser-for-.NET
// Get the file's path (Common.getFilePath is a helper from the linked examples)
string filePath = Common.getFilePath(fileName);
// Extract the metadata of the PDF document
PdfMetadataExtractor extractor = new PdfMetadataExtractor();
MetadataCollection metadata = extractor.ExtractMetadata(filePath);
// Print each metadata entry as "key = value"
foreach (string key in metadata.Keys)
    Console.WriteLine("{0} = {1}", key, metadata[key]);
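
The regular-expression search mentioned above can be sketched with plain .NET once text has been extracted. This is only an illustration and does not use the GroupDocs.Parser API: the sample string stands in for text already extracted from a document, and the ExtractedTextSearch class and FindMatches helper are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ExtractedTextSearch
{
    // Hypothetical helper: returns every case-insensitive match of a pattern
    // in text that has already been extracted from a document.
    public static List<string> FindMatches(string text, string pattern)
    {
        var results = new List<string>();
        foreach (Match match in Regex.Matches(text, pattern, RegexOptions.IgnoreCase))
            results.Add(match.Value);
        return results;
    }

    public static void Main()
    {
        // Placeholder for text obtained from a document by any extractor.
        string text = "Invoice INV-1001 was paid; invoice inv-1002 is pending.";

        // Find invoice numbers such as INV-1001, ignoring case.
        foreach (string hit in FindMatches(text, @"INV-\d+"))
            Console.WriteLine(hit);
    }
}
```

Keeping the search in a separate helper means the same pattern can be run over text from any document type, regardless of which extractor produced it.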

Download your free trial today – https://bit.ly/2QPM8Id

Watch YouTube video tutorials of the GroupDocs API – https://bit.ly/2Dv0Os6
If someone needs PDF generation functionality for a .NET application, what do you think is the best .NET SDK for adding PDF rendering and printing support? I have looked at ZetPDF, and they say it is designed to meet most developers' needs for PDF rendering. What are your thoughts on it?