C G H I O P R S U V W Z

C

CONTENT_COLUMN_FAMILY - Static variable in class org.archive.io.hbase.HBaseParameters
Default options.
CONTENT_COLUMN_NAME - Static variable in class org.archive.io.hbase.HBaseParameters
 
CURI_COLUMN_FAMILY - Static variable in class org.archive.io.hbase.HBaseParameters
 

G

getByteArrayFromInputStream(ReplayInputStream, int) - Method in class org.archive.io.hbase.HBaseWriter
Read the ReplayInputStream and write it to the given BatchUpdate with the given column.
getClient() - Method in class org.archive.io.hbase.HBaseWriter
Gets the HTable client.
getContentColumnFamily() - Method in class org.archive.io.hbase.HBaseParameters
 
getContentColumnName() - Method in class org.archive.io.hbase.HBaseParameters
 
getCuriColumnFamily() - Method in class org.archive.io.hbase.HBaseParameters
 
getHbaseOptions() - Method in class org.archive.io.hbase.HBaseWriter
 
getHbaseParameters() - Method in class org.archive.modules.writer.HBaseWriterProcessor
 
getHbaseTable() - Method in class org.archive.modules.writer.HBaseWriterProcessor
 
getIpColumnName() - Method in class org.archive.io.hbase.HBaseParameters
 
getIsSeedColumnName() - Method in class org.archive.io.hbase.HBaseParameters
 
getMetadata() - Method in class org.archive.modules.writer.HBaseWriterProcessor
 
getPathFromSeedColumnName() - Method in class org.archive.io.hbase.HBaseParameters
 
getRecordIDGenerator() - Method in class org.archive.modules.writer.HBaseWriterProcessor
 
getRequestColumnName() - Method in class org.archive.io.hbase.HBaseParameters
 
getUrlColumnName() - Method in class org.archive.io.hbase.HBaseParameters
 
getViaColumnName() - Method in class org.archive.io.hbase.HBaseParameters
 
getZkClientPort() - Method in class org.archive.modules.writer.HBaseWriterProcessor
 
getZkQuorum() - Method in class org.archive.modules.writer.HBaseWriterProcessor
Getters and setters
getZookeeperClientPort() - Method in class org.archive.io.hbase.HBaseParameters
 

H

HBaseParameters - Class in org.archive.io.hbase
Configures the values of the column family/qualifier used for the crawl.
HBaseParameters() - Constructor for class org.archive.io.hbase.HBaseParameters
 
HBaseWriter - Class in org.archive.io.hbase
HBase implementation.
HBaseWriter(String, int, String, HBaseParameters) - Constructor for class org.archive.io.hbase.HBaseWriter
Instantiates a new HBaseWriter for the WriterPool to use in Heritrix.
HBaseWriterProcessor - Class in org.archive.modules.writer
A Heritrix 3 processor that writes to Hadoop HBase.
HBaseWriterProcessor() - Constructor for class org.archive.modules.writer.HBaseWriterProcessor
 

I

initializeCrawlTable(Configuration, String) - Method in class org.archive.io.hbase.HBaseWriter
Creates the crawl table in HBase.
innerProcessResult(CrawlURI) - Method in class org.archive.modules.writer.HBaseWriterProcessor
 
IP_COLUMN_NAME - Static variable in class org.archive.io.hbase.HBaseParameters
 
IS_SEED_COLUMN_NAME - Static variable in class org.archive.io.hbase.HBaseParameters
 

O

onlyProcessNewRecords() - Method in class org.archive.modules.writer.HBaseWriterProcessor
 
onlyWriteNewRecords() - Method in class org.archive.modules.writer.HBaseWriterProcessor
 
org.archive.io.hbase - package org.archive.io.hbase
Provides an HBase writer for Heritrix.
org.archive.modules.writer - package org.archive.modules.writer
 

P

PATH_FROM_SEED_COLUMN_NAME - Static variable in class org.archive.io.hbase.HBaseParameters
 
processContent(Put, ReplayInputStream, int) - Method in class org.archive.io.hbase.HBaseWriter
Stub method that exists so subclasses can override it to add custom content parsing, manipulate data, or populate new columns.

R

REQUEST_COLUMN_NAME - Static variable in class org.archive.io.hbase.HBaseParameters
 

S

setContentColumnFamily(String) - Method in class org.archive.io.hbase.HBaseParameters
 
setContentColumnName(String) - Method in class org.archive.io.hbase.HBaseParameters
 
setCuriColumnFamily(String) - Method in class org.archive.io.hbase.HBaseParameters
 
setHbaseParameters(HBaseParameters) - Method in class org.archive.modules.writer.HBaseWriterProcessor
 
setHbaseTable(String) - Method in class org.archive.modules.writer.HBaseWriterProcessor
 
setIpColumnName(String) - Method in class org.archive.io.hbase.HBaseParameters
 
setIsSeedColumnName(String) - Method in class org.archive.io.hbase.HBaseParameters
 
setOnlyProcessNewRecords(boolean) - Method in class org.archive.modules.writer.HBaseWriterProcessor
 
setOnlyWriteNewRecords(boolean) - Method in class org.archive.modules.writer.HBaseWriterProcessor
 
setPathFromSeedColumnName(String) - Method in class org.archive.io.hbase.HBaseParameters
 
setRequestColumnName(String) - Method in class org.archive.io.hbase.HBaseParameters
 
setupPool(AtomicInteger) - Method in class org.archive.modules.writer.HBaseWriterProcessor
 
setUrlColumnName(String) - Method in class org.archive.io.hbase.HBaseParameters
 
setViaColumnName(String) - Method in class org.archive.io.hbase.HBaseParameters
 
setZkClientPort(int) - Method in class org.archive.modules.writer.HBaseWriterProcessor
 
setZkQuorum(String) - Method in class org.archive.modules.writer.HBaseWriterProcessor
 
shouldProcess(CrawlURI) - Method in class org.archive.modules.writer.HBaseWriterProcessor
 
shouldWrite(CrawlURI) - Method in class org.archive.modules.writer.HBaseWriterProcessor
Whether the given CrawlURI should be written to archive files.

U

URL_COLUMN_NAME - Static variable in class org.archive.io.hbase.HBaseParameters
 

V

VIA_COLUMN_NAME - Static variable in class org.archive.io.hbase.HBaseParameters
 

W

write(CrawlURI, String, RecordingOutputStream, RecordingInputStream) - Method in class org.archive.io.hbase.HBaseWriter
Write the crawled output to the configured HBase table.
write(CrawlURI, long, InputStream) - Method in class org.archive.modules.writer.HBaseWriterProcessor
Write to HBase.

Z

ZOOKEEPER_CLIENT_PORT - Static variable in class org.archive.io.hbase.HBaseParameters
 

Copyright © 2007-2011. All Rights Reserved.