info.aduna.infosource.crawl.extract
Class ExtractionUtil
java.lang.Object
info.aduna.infosource.crawl.extract.ExtractionUtil
public class ExtractionUtil
- extends Object
- Author:
- Herko ter Horst
Method Summary
static void extract(ThreadedExtractorContainer tec, InputStream stream, org.semanticdesktop.aperture.rdf.RDFContainer metadata)
    Extract full text and metadata from the specified stream and add statements representing the extracted data to the specified RDFContainer.
static InputStream getMarkSupportedStream(InputStream in, int minBufferSize)
    Get an InputStream that supports mark and reset based on the specified InputStream.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ExtractionUtil
public ExtractionUtil()
extract
public static void extract(ThreadedExtractorContainer tec,
InputStream stream,
org.semanticdesktop.aperture.rdf.RDFContainer metadata)
throws org.semanticdesktop.aperture.extractor.ExtractorException,
IOException
- Extract full text and metadata from the specified stream and add
statements representing the extracted data to the specified RDFContainer.
- Parameters:
tec - the ThreadedExtractorContainer to notify of the extraction process
stream - the stream to extract from
metadata - the RDFContainer to add data to
- Throws:
org.semanticdesktop.aperture.extractor.ExtractorException - if something goes wrong during the extraction process
IOException - if something goes wrong reading from the stream
getMarkSupportedStream
public static InputStream getMarkSupportedStream(InputStream in,
int minBufferSize)
throws IOException
- Get an InputStream that supports mark and reset based on the specified
InputStream.
- Parameters:
in - the InputStream to use
minBufferSize - the minimum buffer size the result should use
- Returns:
- the input, if it already supports mark and reset, or a
BufferedInputStream with at least the specified minimum buffer
size
- Throws:
IOException - if the wrapping stream could not be created
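The contract described for getMarkSupportedStream can be illustrated with plain java.io classes. The following is a minimal sketch, not the Aduna implementation: the class name MarkSupportSketch and the helpers ensureMarkSupported and peek are hypothetical, and the sketch assumes the wrapper is a BufferedInputStream created with exactly minBufferSize (the documented contract only promises "at least" that size). It shows why a mark-capable stream is useful before extraction: the leading bytes can be inspected and the stream rewound, so a later consumer still sees the data from the start.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class MarkSupportSketch {

    // Hypothetical stand-in for getMarkSupportedStream: returns the input
    // unchanged if it already supports mark/reset, otherwise wraps it.
    // Assumption: minBufferSize is positive (BufferedInputStream rejects size <= 0).
    static InputStream ensureMarkSupported(InputStream in, int minBufferSize) {
        if (in.markSupported()) {
            return in;
        }
        return new BufferedInputStream(in, minBufferSize);
    }

    // Reads up to `count` leading bytes, then resets the stream so the
    // next reader starts at the first byte again.
    static byte[] peek(InputStream in, int count) throws IOException {
        in.mark(count);
        byte[] head = new byte[count];
        int total = 0;
        while (total < count) {
            int n = in.read(head, total, count - total);
            if (n < 0) {
                break; // end of stream reached before `count` bytes
            }
            total += n;
        }
        in.reset();
        return Arrays.copyOf(head, total);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "%PDF-1.4 example".getBytes(StandardCharsets.US_ASCII);

        // PushbackInputStream reports markSupported() == false, so it gets wrapped.
        InputStream raw = new PushbackInputStream(new ByteArrayInputStream(data));
        InputStream in = ensureMarkSupported(raw, 64);

        byte[] head = peek(in, 4);
        System.out.println(new String(head, StandardCharsets.US_ASCII)); // prints "%PDF"
        System.out.println((char) in.read());                            // prints "%"
    }
}
```

In this sketch the mark read limit passed to peek stays below the buffer size, which keeps reset() valid; a caller of the real method would presumably choose minBufferSize with the same consideration in mind.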
Copyright © 1997-2008 Aduna. All Rights Reserved.