C G H I M P S T W
C
checkBytesWritten(StateProvider) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Check bytes written.
close() - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
com.powerset.heritrix.writer - package com.powerset.heritrix.writer
    Provides an HBase writer for Heritrix.
CONTENT_COLUMN - Static variable in class com.powerset.heritrix.writer.HBaseWriter
    The Constant CONTENT_COLUMN.
CONTENT_COLUMN_FAMILY - Static variable in class com.powerset.heritrix.writer.HBaseWriter
    The Constant CONTENT_COLUMN_FAMILY.
CONTENT_MAX_SIZE - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Maximum allowable content size.
createCrawlTable(HBaseConfiguration, String) - Method in class com.powerset.heritrix.writer.HBaseWriter
    Creates the crawl table.
CURI_COLUMN_FAMILY - Static variable in class com.powerset.heritrix.writer.HBaseWriter
    The Constant CURI_COLUMN_FAMILY.
G
getClient() - Method in class com.powerset.heritrix.writer.HBaseWriter
    Gets the client.
getHostAddress(ProcessorURI) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Return IP address of given URI suitable for recording (as in a classic ARC 5-field header line).
getMaster() - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Gets the master.
getMaxActive() - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Gets the max active.
getMaxWait() - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Gets the max wait.
getPool() - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Gets the pool.
getTable() - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Gets the table.
getTotalBytesWritten() - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Gets the total bytes written.
H
HBaseWriter - Class in com.powerset.heritrix.writer
    Write to HBase.
HBaseWriter(String, String) - Constructor for class com.powerset.heritrix.writer.HBaseWriter
    Instantiates a new HBaseWriter.
HBaseWriterPool - Class in com.powerset.heritrix.writer
    A pool of HBaseWriters.
HBaseWriterPool(String, String, int, int) - Constructor for class com.powerset.heritrix.writer.HBaseWriterPool
    Constructor.
HBaseWriterProcessor - Class in com.powerset.heritrix.writer
    A Heritrix 2 processor that writes to Hadoop HBase.
HBaseWriterProcessor() - Constructor for class com.powerset.heritrix.writer.HBaseWriterProcessor
    Instantiates a new HBaseWriterProcessor.
I
initialTasks(StateProvider) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
innerProcess(ProcessorURI) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
innerProcessResult(ProcessorURI) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
M
MASTER - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Location of the HBase master.
P
POOL_MAX_ACTIVE - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Maximum active files in pool.
POOL_MAX_WAIT - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Maximum time to wait on pool element (milliseconds).
PROCESS_ONLY_NEW_RECORDS - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
    If set to true, then only fetch and process URLs that are new rowkey records.
processContent(BatchUpdate) - Method in class com.powerset.heritrix.writer.HBaseWriter
    This is a stub method, provided so that subclasses can override it for custom content parsing, data manipulation, and populating new columns.
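The POOL_MAX_ACTIVE and POOL_MAX_WAIT entries describe a pool with a cap on active elements and a bounded wait in milliseconds. The following is a minimal sketch of that general behaviour, not the HBaseWriterPool source: the class name BoundedPoolSketch, its methods borrow and giveBack, and the choice to throw NoSuchElementException on timeout are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.NoSuchElementException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical stand-in for illustration; not the HBaseWriterPool implementation.
final class BoundedPoolSketch<T> {
    private final Semaphore permits;                 // one permit per allowed active element
    private final Deque<T> idle = new ArrayDeque<>(); // returned elements waiting for reuse
    private final Supplier<T> factory;               // creates an element when none is idle
    private final long maxWaitMillis;

    BoundedPoolSketch(Supplier<T> factory, int maxActive, long maxWaitMillis) {
        this.factory = factory;
        this.permits = new Semaphore(maxActive, true);
        this.maxWaitMillis = maxWaitMillis;
    }

    // Waits at most maxWaitMillis for a free slot, then reuses or creates an element.
    T borrow() {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a pool element", e);
        }
        if (!acquired) {
            throw new NoSuchElementException(
                "no pool element free within " + maxWaitMillis + " ms");
        }
        synchronized (idle) {
            T reused = idle.pollFirst();
            return reused != null ? reused : factory.get();
        }
    }

    // Returns an element to the pool and frees its slot.
    void giveBack(T element) {
        synchronized (idle) {
            idle.addFirst(element);
        }
        permits.release();
    }

    public static void main(String[] args) {
        BoundedPoolSketch<StringBuilder> pool =
            new BoundedPoolSketch<>(StringBuilder::new, 2, 100L);
        StringBuilder a = pool.borrow();
        StringBuilder b = pool.borrow();
        try {
            pool.borrow(); // both slots are taken, so this waits and then fails
        } catch (NoSuchElementException e) {
            System.out.println("third borrow timed out");
        }
        pool.giveBack(a);
        System.out.println(pool.borrow() == a); // the returned element is reused
        pool.giveBack(b);
    }
}
```

In this sketch the first constructor int plays the role of a maximum-active setting and the long plays the role of a maximum wait; how HBaseWriterPool(String, String, int, int) actually interprets its arguments is not stated in this index.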
S
SERVER_CACHE - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
    The Constant SERVER_CACHE.
setPool(WriterPool) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Sets the pool.
setTotalBytesWritten(long) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Sets the total bytes written.
setupPool() - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Sets up the pool.
shouldProcess(ProcessorURI) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
shouldWrite(ProcessorURI) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Whether the given ProcessorURI should be written to archive files.
T
TABLE - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Name of the HBase table to crawl into.
TOTAL_BYTES_TO_WRITE - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Total file bytes to write to disk.
W
write(ProcessorURI, String, RecordingOutputStream, RecordingInputStream) - Method in class com.powerset.heritrix.writer.HBaseWriter
    Write.
write(ProcessorURI, long, InputStream, String) - Method in class com.powerset.heritrix.writer.HBaseWriterProcessor
    Write.
WRITE_ONLY_NEW_RECORDS - Static variable in class com.powerset.heritrix.writer.HBaseWriterProcessor
    If set to true, then only write URLs that are new rowkey records.
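The WRITE_ONLY_NEW_RECORDS description amounts to a check on whether a row key already exists before writing. The sketch below illustrates that decision with an in-memory set; it is an assumption-laden stand-in, not the processor's code. The class NewRecordFilterSketch and its shouldWrite(String) method are hypothetical, and the real processor would presumably consult the HBase table rather than a local set. How row keys are derived from URLs is not stated in this index, so the example takes the key as given.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of a "write only new row keys" check.
final class NewRecordFilterSketch {
    // Stand-in for a lookup against the crawl table (assumption).
    private final Set<String> knownRowKeys = new HashSet<>();
    private final boolean writeOnlyNewRecords;

    NewRecordFilterSketch(boolean writeOnlyNewRecords) {
        this.writeOnlyNewRecords = writeOnlyNewRecords;
    }

    // Returns true if a record with this row key should be written,
    // and remembers the key whenever a write is allowed.
    boolean shouldWrite(String rowKey) {
        boolean alreadyStored = knownRowKeys.contains(rowKey);
        if (writeOnlyNewRecords && alreadyStored) {
            return false; // flag is on and the row exists: skip the write
        }
        knownRowKeys.add(rowKey);
        return true;
    }

    public static void main(String[] args) {
        NewRecordFilterSketch filter = new NewRecordFilterSketch(true);
        System.out.println(filter.shouldWrite("example-row-key")); // first sighting
        System.out.println(filter.shouldWrite("example-row-key")); // repeat sighting
    }
}
```

With the flag set to false, the same sketch allows every write, so a repeated row key would simply be written again.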
Copyright © 2007-2009. All Rights Reserved.