public final class PdfTextExtractor extends Object
| Modifier and Type | Method and Description |
|---|---|
| static String | getTextFromPage(PdfReader reader, int pageNumber): Extract text from a specified page using the default strategy. |
| static String | getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy): Extract text from a specified page using an extraction strategy. |
| static String | getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy, Map<String,ContentOperator> additionalContentOperators): Extract text from a specified page using an extraction strategy. |
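
Each overload extracts a single page, so collecting the text of a whole document means calling it once per page. The following is a minimal sketch of that loop, not part of iText: `PageTextJoiner` and `PageExtractor` are hypothetical names introduced here so that the example compiles without the library on the classpath. It assumes page numbers start at 1, as they do in iText's PdfReader.

```java
import java.io.IOException;
import java.util.StringJoiner;

/** Hypothetical helper (not part of iText): joins the text of every page. */
final class PageTextJoiner {

    /**
     * Hypothetical stand-in for a per-page extraction call such as
     * PdfTextExtractor.getTextFromPage(reader, pageNumber).
     */
    @FunctionalInterface
    interface PageExtractor {
        String extract(int pageNumber) throws IOException;
    }

    /**
     * Calls the extractor for pages 1..pageCount (assumed 1-based) and
     * joins the results with a newline between pages.
     */
    static String extractAll(int pageCount, PageExtractor extractor) throws IOException {
        if (pageCount < 0) {
            throw new IllegalArgumentException("pageCount must not be negative");
        }
        StringJoiner joined = new StringJoiner("\n");
        for (int page = 1; page <= pageCount; page++) {
            joined.add(extractor.extract(page));
        }
        return joined.toString();
    }
}
```

With iText available, the extractor argument would presumably be a lambda such as `page -> PdfTextExtractor.getTextFromPage(reader, page)`, with `reader.getNumberOfPages()` as the page count. Passing a strategy explicitly through the three-argument overload keeps the output stable if the default strategy changes.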
public static String getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy, Map<String,ContentOperator> additionalContentOperators) throws IOException
Extract text from a specified page using an extraction strategy.
Parameters:
reader - the reader to extract text from
pageNumber - the page to extract text from
strategy - the strategy to use for extracting text
additionalContentOperators - an optional map of custom ContentOperators for rendering instructions
Throws:
IOException - if any operation fails while reading from the provided PdfReader

public static String getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy) throws IOException
Extract text from a specified page using an extraction strategy.
Parameters:
reader - the reader to extract text from
pageNumber - the page to extract text from
strategy - the strategy to use for extracting text
Throws:
IOException - if any operation fails while reading from the provided PdfReader

public static String getTextFromPage(PdfReader reader, int pageNumber) throws IOException
Extract text from a specified page using the default strategy.
Note: the default strategy is subject to change. If using a specific strategy is important, use getTextFromPage(PdfReader, int, TextExtractionStrategy).
Parameters:
reader - the reader to extract text from
pageNumber - the page to extract text from
Throws:
IOException - if any operation fails while reading from the provided PdfReader

Copyright © 2017. All rights reserved.