C G H I M P S T W

C

checkBytesWritten(StateProvider) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
Check bytes written.
close() - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
 
com.powerset.heritrix.writer - package com.powerset.heritrix.writer
Provides an HBase writer for Heritrix.
CONTENT_COLUMN - Static variable in class com.powerset.heritrix.writer.HBaseWriter
The Constant CONTENT_COLUMN.
CONTENT_COLUMN_FAMILY - Static variable in class com.powerset.heritrix.writer.HBaseWriter
The Constant CONTENT_COLUMN_FAMILY.
CONTENT_MAX_SIZE - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
Maximum allowable content size.
createCrawlTable(HBaseConfiguration, String) - Method in class com.powerset.heritrix.writer.HBaseWriter
Creates the crawl table.
CURI_COLUMN_FAMILY - Static variable in class com.powerset.heritrix.writer.HBaseWriter
The Constant CURI_COLUMN_FAMILY.

G

getClient() - Method in class com.powerset.heritrix.writer.HBaseWriter
Gets the client.
getHostAddress(ProcessorURI) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
Returns the IP address of the given URI, suitable for recording (as in a classic ARC 5-field header line).
getMaster() - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
Gets the master.
getMaxActive() - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
Gets the maximum number of active pool elements.
getMaxWait() - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
Gets the maximum time to wait on a pool element.
getPool() - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
Gets the pool.
getTable() - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
Gets the table.
getTotalBytesWritten() - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
Gets the total bytes written.

H

HBaseWriter - Class in com.powerset.heritrix.writer
Write to HBase.
HBaseWriter(String, String) - Constructor for class com.powerset.heritrix.writer.HBaseWriter
Instantiates a new HBaseWriter.
HBaseWriterPool - Class in com.powerset.heritrix.writer
A pool of HBaseWriters.
HBaseWriterPool(String, String, int, int) - Constructor for class com.powerset.heritrix.writer.HBaseWriterPool
Constructor.
HBaseWriterProcessor - Class in com.powerset.heritrix.writer
A Heritrix 2 processor that writes to Hadoop HBase.
HBaseWriterProcessor() - Constructor for class com.powerset.heritrix.writer.HBaseWriterProcessor
Instantiates a new HBaseWriterProcessor.
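
HBaseWriterPool is described here only as a pool of HBaseWriters, and the processor exposes a maximum active count and a maximum wait in milliseconds. As a rough illustration of how those two limits typically interact, the sketch below builds a bounded pool in plain Java. The names BoundedPool, borrow, and giveBack are hypothetical and not part of this package, and the real pool may behave differently.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical stand-in for illustration; this is NOT the real HBaseWriterPool.
final class BoundedPool<T> {
    private final BlockingQueue<T> idle;
    private final long maxWaitMillis;

    // maxActive caps how many elements exist; maxWaitMillis caps how long borrow() blocks.
    BoundedPool(int maxActive, long maxWaitMillis, Supplier<T> factory) {
        this.idle = new ArrayBlockingQueue<>(maxActive);
        this.maxWaitMillis = maxWaitMillis;
        for (int i = 0; i < maxActive; i++) {
            idle.add(factory.get());
        }
    }

    // Returns an idle element, or null if none became free within maxWaitMillis.
    T borrow() {
        try {
            return idle.poll(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }

    // Hands an element back so another caller can borrow it.
    void giveBack(T element) {
        idle.offer(element);
    }
}
```

With a maximum of one active element, a second borrow() returns null after the wait expires unless the first element has been given back in the meantime.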

I

initialTasks(StateProvider) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
 
innerProcess(ProcessorURI) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
 
innerProcessResult(ProcessorURI) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
 

M

MASTER - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
Location of the HBase master.

P

POOL_MAX_ACTIVE - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
Maximum active files in pool.
POOL_MAX_WAIT - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
Maximum time to wait on pool element (milliseconds).
PROCESS_ONLY_NEW_RECORDS - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
If set to true, only fetch and process URLs that are new rowkey records.
processContent(BatchUpdate) - Method in class com.powerset.heritrix.writer.HBaseWriter
A stub method intended to be overridden for custom content parsing, data manipulation, and populating new columns.
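
The processContent entry above describes an extension hook: the base class calls a method that does nothing, and a subclass overrides it to add or rewrite columns. The sketch below shows that pattern with self-contained stand-ins. RowUpdate, SketchWriter, LengthTaggingWriter, and the column names "content:raw_data" and "content:length" are all assumptions made for illustration; they are not the real HBaseWriter, HBase's BatchUpdate, or this package's column constants.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a row update object; NOT HBase's BatchUpdate.
final class RowUpdate {
    final String rowKey;
    final Map<String, byte[]> columns = new LinkedHashMap<>();

    RowUpdate(String rowKey) {
        this.rowKey = rowKey;
    }

    void put(String column, byte[] value) {
        columns.put(column, value);
    }
}

// Hypothetical stand-in for a writer that exposes an overridable hook.
class SketchWriter {
    // Builds the row, then passes it to the hook before it would be committed.
    RowUpdate write(String rowKey, byte[] content) {
        RowUpdate update = new RowUpdate(rowKey);
        update.put("content:raw_data", content); // assumed column name
        processContent(update);
        return update;
    }

    // Stub: does nothing by default; subclasses add or rewrite columns here.
    protected void processContent(RowUpdate update) {
    }
}

// Example override that populates one extra column with the content length.
class LengthTaggingWriter extends SketchWriter {
    @Override
    protected void processContent(RowUpdate update) {
        byte[] content = update.columns.get("content:raw_data");
        String length = Integer.toString(content.length);
        update.put("content:length", length.getBytes(StandardCharsets.UTF_8));
    }
}
```

Calling write on a LengthTaggingWriter yields a row with both columns, while the base SketchWriter yields only the raw content column.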

S

SERVER_CACHE - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
The Constant SERVER_CACHE.
setPool(WriterPool) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
Sets the pool.
setTotalBytesWritten(long) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
Sets the total bytes written.
setupPool() - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
Sets up the pool.
shouldProcess(ProcessorURI) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
 
shouldWrite(ProcessorURI) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
Whether the given ProcessorURI should be written to archive files.

T

TABLE - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
Name of the HBase table to crawl into.
TOTAL_BYTES_TO_WRITE - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
Total file bytes to write to disk.

W

write(ProcessorURI, String, RecordingOutputStream, RecordingInputStream) - Method in class com.powerset.heritrix.writer.HBaseWriter
Write.
write(ProcessorURI, long, InputStream, String) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
Write.
WRITE_ONLY_NEW_RECORDS - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
If set to true, only write URLs that are new rowkey records.
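
The WRITE_ONLY_NEW_RECORDS entry above implies a check for an existing rowkey before each write. A minimal sketch of that check follows, assuming an in-memory map in place of the HBase table. NewRecordFilter and its methods are hypothetical names for illustration, and the processor's actual lookup against HBase is not shown in this index.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a table keyed by rowkey; not the real processor.
final class NewRecordFilter {
    private final Map<String, byte[]> table = new HashMap<>();
    private final boolean writeOnlyNewRecords;

    NewRecordFilter(boolean writeOnlyNewRecords) {
        this.writeOnlyNewRecords = writeOnlyNewRecords;
    }

    // Returns true if the row was written, false if it was skipped as already present.
    boolean write(String rowKey, byte[] content) {
        if (writeOnlyNewRecords && table.containsKey(rowKey)) {
            return false; // existing rowkey: leave the stored record untouched
        }
        table.put(rowKey, content);
        return true;
    }

    // Returns the stored content for a rowkey, or null if absent.
    byte[] read(String rowKey) {
        return table.get(rowKey);
    }
}
```

With the flag set, a second write to the same rowkey is skipped and the first record is kept; with the flag unset, the second write overwrites the first.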


Copyright © 2007-2009. All Rights Reserved.