java.lang.Object
info.aduna.infosource.crawl.CrawlingRepository
public class CrawlingRepository
CrawlingRepository provides a Repository capable of populating itself using Aperture Crawlers and Extractors.
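The following is a minimal sketch of one plausible call order for the lifecycle methods listed on this page: set the data directory, initialize, crawl, and always shut down. It does not use the real class. `CrawlableRepo`, `RecordingRepo` and `LifecycleSketch` are hypothetical stand-ins so the example compiles without Aperture or Sesame on the classpath, the real `crawl` also takes a `CrawlingListener`, and the real class would additionally need `setDataSource` to be called before crawling. The ordering itself is an assumption and is not stated on this page.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the subset of CrawlingRepository used here.
// It is NOT the real class; only the method names mirror this page.
interface CrawlableRepo {
    void setDataDir(File dataDir);
    void initialize() throws Exception;                 // real method throws RepositoryException
    void crawl(boolean fullRecrawl) throws IOException; // real method also takes a CrawlingListener
    void shutDown() throws Exception;                   // real method throws RepositoryException
}

// Fake implementation that only records the order in which it is called.
class RecordingRepo implements CrawlableRepo {
    final List<String> calls = new ArrayList<>();

    public void setDataDir(File dataDir) { calls.add("setDataDir"); }
    public void initialize() { calls.add("initialize"); }
    public void crawl(boolean fullRecrawl) { calls.add("crawl"); }
    public void shutDown() { calls.add("shutDown"); }
}

class LifecycleSketch {
    // Configure, initialize, crawl once, and shut down even if the crawl fails.
    static void runOnce(CrawlableRepo repo, File dataDir) throws Exception {
        repo.setDataDir(dataDir);
        repo.initialize();
        try {
            repo.crawl(false);
        } finally {
            repo.shutDown();
        }
    }
}
```

The `try`/`finally` block is the main design point: `shutDown()` runs whether or not `crawl` throws, so a failed crawl does not leave the repository open.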
| Field Summary | |
|---|---|
| static org.semanticdesktop.aperture.mime.identifier.magic.MagicMimeTypeIdentifierFactory | IDENTIFIER_FACTORY |
| Constructor Summary | |
|---|---|
| CrawlingRepository() | |
| Method Summary | |
|---|---|
| void | crawl(CrawlingListener listener, boolean fullRecrawl) - Instructs this CrawlingRepository to crawl its DataSource for new, changed or deleted information and update its repository accordingly. |
| static org.semanticdesktop.aperture.accessor.DataAccessorRegistry | getAccessorRegistry() |
| org.openrdf.repository.RepositoryConnection | getConnection() |
| static org.semanticdesktop.aperture.crawler.CrawlerRegistry | getCrawlerRegistry() |
| File | getDataDir() |
| org.semanticdesktop.aperture.datasource.DataSource | getDataSource() |
| static org.semanticdesktop.aperture.extractor.ExtractorRegistry | getExtractorRegistry() |
| boolean | getIncludeInListCrawl() - Returns whether this CrawlingRepository wants to be included in a refresh of all CrawlingRepositories. |
| LuceneIndex | getIndex() |
| static LanguageIdentifier | getLanguageIdentifier() |
| static org.semanticdesktop.aperture.hypertext.linkextractor.LinkExtractorRegistry | getLinkExtractorRegistry() |
| ProcessorHook | getProcessorHook() |
| org.openrdf.model.ValueFactory | getValueFactory() |
| void | initialize() |
| boolean | isWritable() |
| org.openrdf.model.URI | prepareAccess(org.openrdf.model.URI uri) - Returns a URI (typically representing a file or web page) that can be used to view the contents of the specified URI. |
| void | setDataDir(File dataDir) |
| void | setDataSource(org.semanticdesktop.aperture.datasource.DataSource dataSource) |
| void | setIncludeInListCrawl(boolean includeInListCrawl) - Sets whether this CrawlingRepository should be included in a refresh of all CrawlingRepositories. |
| void | setProcessorHook(ProcessorHook processorHook) |
| void | shutDown() |
| void | stopCrawling() - Instructs this CrawlingRepository to stop any ongoing crawling processes. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final org.semanticdesktop.aperture.mime.identifier.magic.MagicMimeTypeIdentifierFactory IDENTIFIER_FACTORY
| Constructor Detail |
|---|
public CrawlingRepository()
| Method Detail |
|---|
public org.semanticdesktop.aperture.datasource.DataSource getDataSource()
public void setDataSource(org.semanticdesktop.aperture.datasource.DataSource dataSource)
public File getDataDir()
Specified by: getDataDir in interface org.openrdf.repository.Repository
public void setDataDir(File dataDir)
Specified by: setDataDir in interface org.openrdf.repository.Repository
public ProcessorHook getProcessorHook()
public void setProcessorHook(ProcessorHook processorHook)
public void initialize()
throws org.openrdf.repository.RepositoryException
Specified by: initialize in interface org.openrdf.repository.Repository
Throws: org.openrdf.repository.RepositoryException
public org.openrdf.model.ValueFactory getValueFactory()
Specified by: getValueFactory in interface org.openrdf.repository.Repository
public org.openrdf.repository.RepositoryConnection getConnection()
throws org.openrdf.repository.RepositoryException
Specified by: getConnection in interface org.openrdf.repository.Repository
Throws: org.openrdf.repository.RepositoryException
public boolean isWritable()
throws org.openrdf.repository.RepositoryException
Specified by: isWritable in interface org.openrdf.repository.Repository
Throws: org.openrdf.repository.RepositoryException
public void shutDown()
throws org.openrdf.repository.RepositoryException
Specified by: shutDown in interface org.openrdf.repository.Repository
Throws: org.openrdf.repository.RepositoryException
public LuceneIndex getIndex()
public org.openrdf.model.URI prepareAccess(org.openrdf.model.URI uri)
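Description copied from interface: InfoSource
Returns a URI (typically representing a file or web page) that can be used to view the contents of the specified URI.
Specified by: prepareAccess in interface InfoSource
public static org.semanticdesktop.aperture.crawler.CrawlerRegistry getCrawlerRegistry()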
public static org.semanticdesktop.aperture.accessor.DataAccessorRegistry getAccessorRegistry()
public static org.semanticdesktop.aperture.extractor.ExtractorRegistry getExtractorRegistry()
public static LanguageIdentifier getLanguageIdentifier()
public static org.semanticdesktop.aperture.hypertext.linkextractor.LinkExtractorRegistry getLinkExtractorRegistry()
public void crawl(CrawlingListener listener,
boolean fullRecrawl)
throws IOException
Parameters:
listener - A CrawlingListener to send events about the progress to.
fullRecrawl - Flag that indicates whether we are in full recrawl mode or not. This is purely meant for logging/reporting; the CrawlingRepository is not responsible for actually clearing any info when this flag is on.
Throws:
IOException
public void stopCrawling()
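The sketch below illustrates two points from the `crawl` and `stopCrawling` descriptions: the `fullRecrawl` flag is only reported and does not clear anything, and a running crawl can be asked to stop. It is a toy model, not the real implementation. `ProgressListener` is a hypothetical stand-in for `CrawlingListener` (whose methods are not listed on this page), `ToyCrawler` and `CrawlSketch` are invented for illustration, and the cooperative stop flag is an assumption about how stopping might work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for CrawlingListener; the real interface is not shown on this page.
interface ProgressListener {
    void crawlStarted(boolean fullRecrawl);
    void objectProcessed(String uri);
}

// Toy model of a crawlable store, NOT the real CrawlingRepository.
class ToyCrawler {
    private final List<String> source;
    final List<String> stored = new ArrayList<>();
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);

    ToyCrawler(List<String> source) {
        this.source = source;
    }

    // fullRecrawl is only forwarded to the listener for reporting; nothing is cleared here.
    void crawl(ProgressListener listener, boolean fullRecrawl) {
        stopRequested.set(false);
        listener.crawlStarted(fullRecrawl);
        for (String uri : source) {
            if (stopRequested.get()) {
                break; // cooperative stop, checked between objects (an assumption)
            }
            if (!stored.contains(uri)) {
                stored.add(uri);
            }
            listener.objectProcessed(uri);
        }
    }

    void stopCrawling() {
        stopRequested.set(true);
    }
}

class CrawlSketch {
    // Crawls and requests a stop once 'limit' objects were reported; returns the reported count.
    static int crawlAtMost(ToyCrawler crawler, int limit, boolean fullRecrawl) {
        final int[] seen = {0};
        crawler.crawl(new ProgressListener() {
            public void crawlStarted(boolean full) {
                System.out.println("crawl started, fullRecrawl=" + full);
            }

            public void objectProcessed(String uri) {
                if (++seen[0] >= limit) {
                    crawler.stopCrawling();
                }
            }
        }, fullRecrawl);
        return seen[0];
    }
}
```

Because the flag is reporting-only, a caller that really wants a clean slate would presumably have to clear previously crawled data itself before calling `crawl` with `fullRecrawl` set to `true`.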
public boolean getIncludeInListCrawl()
public void setIncludeInListCrawl(boolean includeInListCrawl)
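As an illustration of how the `includeInListCrawl` flag might be consumed, the sketch below filters a list of repositories down to those that opted in to a "refresh all" pass. `ListCrawlCandidate`, `SimpleCandidate` and `RefreshAllSketch` are hypothetical; only the `getIncludeInListCrawl()` and `setIncludeInListCrawl(boolean)` method names come from this page, and how the real application drives a refresh of all CrawlingRepositories is not documented here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in: only getIncludeInListCrawl() comes from this page.
// name() exists purely so the demo can report what was selected.
interface ListCrawlCandidate {
    String name();
    boolean getIncludeInListCrawl();
}

// Minimal implementation used for the demo.
class SimpleCandidate implements ListCrawlCandidate {
    private final String name;
    private boolean includeInListCrawl;

    SimpleCandidate(String name, boolean includeInListCrawl) {
        this.name = name;
        this.includeInListCrawl = includeInListCrawl;
    }

    public String name() {
        return name;
    }

    public boolean getIncludeInListCrawl() {
        return includeInListCrawl;
    }

    public void setIncludeInListCrawl(boolean includeInListCrawl) {
        this.includeInListCrawl = includeInListCrawl;
    }
}

class RefreshAllSketch {
    // Returns the names of the candidates that opted in, in input order.
    static List<String> selectForRefresh(List<? extends ListCrawlCandidate> all) {
        List<String> selected = new ArrayList<>();
        for (ListCrawlCandidate candidate : all) {
            if (candidate.getIncludeInListCrawl()) {
                selected.add(candidate.name());
            }
        }
        return selected;
    }
}
```

A caller would then crawl each selected repository in turn, leaving the opted-out ones untouched until they are refreshed individually.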