/**
 *
		  GNU LESSER GENERAL PUBLIC LICENSE
		       Version 2.1, February 1999

 Copyright (C) 1991, 1999 Free Software Foundation, Inc.
     59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.

[This is the first released version of the Lesser GPL.  It also counts
 as the successor of the GNU Library Public License, version 2, hence
 the version number 2.1.]

			    Preamble

  The licenses for most software are designed to take away your
freedom to share and change it.  By contrast, the GNU General Public
Licenses are intended to guarantee your freedom to share and change
free software--to make sure the software is free for all its users.

  This license, the Lesser General Public License, applies to some
specially designated software packages--typically libraries--of the
Free Software Foundation and other authors who decide to use it.  You
can use it too, but we suggest you first think carefully about whether
this license or the ordinary General Public License is the better
strategy to use in any particular case, based on the explanations below.

  When we speak of free software, we are referring to freedom of use,
not price.  Our General Public Licenses are designed to make sure that
you have the freedom to distribute copies of free software (and charge
for this service if you wish); that you receive source code or can get
it if you want it; that you can change the software and use pieces of
it in new free programs; and that you are informed that you can do
these things.

  To protect your rights, we need to make restrictions that forbid
distributors to deny you these rights or to ask you to surrender these
rights.  These restrictions translate to certain responsibilities for
you if you distribute copies of the library or if you modify it.

  For example, if you distribute copies of the library, whether gratis
or for a fee, you must give the recipients all the rights that we gave
you.  You must make sure that they, too, receive or can get the source
code.  If you link other code with the library, you must provide
complete object files to the recipients, so that they can relink them
with the library after making changes to the library and recompiling
it.  And you must show them these terms so they know their rights.

  We protect your rights with a two-step method: (1) we copyright the
library, and (2) we offer you this license, which gives you legal
permission to copy, distribute and/or modify the library.

  To protect each distributor, we want to make it very clear that
there is no warranty for the free library.  Also, if the library is
modified by someone else and passed on, the recipients should know
that what they have is not the original version, so that the original
author's reputation will not be affected by problems that might be
introduced by others.

  Finally, software patents pose a constant threat to the existence of
any free program.  We wish to make sure that a company cannot
effectively restrict the users of a free program by obtaining a
restrictive license from a patent holder.  Therefore, we insist that
any patent license obtained for a version of the library must be
consistent with the full freedom of use specified in this license.

  Most GNU software, including some libraries, is covered by the
ordinary GNU General Public License.  This license, the GNU Lesser
General Public License, applies to certain designated libraries, and
is quite different from the ordinary General Public License.  We use
this license for certain libraries in order to permit linking those
libraries into non-free programs.

  When a program is linked with a library, whether statically or using
a shared library, the combination of the two is legally speaking a
combined work, a derivative of the original library.  The ordinary
General Public License therefore permits such linking only if the
entire combination fits its criteria of freedom.  The Lesser General
Public License permits more lax criteria for linking other code with
the library.

  We call this license the "Lesser" General Public License because it
does Less to protect the user's freedom than the ordinary General
Public License.  It also provides other free software developers Less
of an advantage over competing non-free programs.  These disadvantages
are the reason we use the ordinary General Public License for many
libraries.  However, the Lesser license provides advantages in certain
special circumstances.

  For example, on rare occasions, there may be a special need to
encourage the widest possible use of a certain library, so that it becomes
a de-facto standard.  To achieve this, non-free programs must be
allowed to use the library.  A more frequent case is that a free
library does the same job as widely used non-free libraries.  In this
case, there is little to gain by limiting the free library to free
software only, so we use the Lesser General Public License.

  In other cases, permission to use a particular library in non-free
programs enables a greater number of people to use a large body of
free software.  For example, permission to use the GNU C Library in
non-free programs enables many more people to use the whole GNU
operating system, as well as its variant, the GNU/Linux operating
system.

  Although the Lesser General Public License is Less protective of the
users' freedom, it does ensure that the user of a program that is
linked with the Library has the freedom and the wherewithal to run
that program using a modified version of the Library.

  The precise terms and conditions for copying, distribution and
modification follow.  Pay close attention to the difference between a
"work based on the library" and a "work that uses the library".  The
former contains code derived from the library, whereas the latter must
be combined with the library in order to run.

		  GNU LESSER GENERAL PUBLIC LICENSE
   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

  0. This License Agreement applies to any software library or other
program which contains a notice placed by the copyright holder or
other authorized party saying it may be distributed under the terms of
this Lesser General Public License (also called "this License").
Each licensee is addressed as "you".

  A "library" means a collection of software functions and/or data
prepared so as to be conveniently linked with application programs
(which use some of those functions and data) to form executables.

  The "Library", below, refers to any such software library or work
which has been distributed under these terms.  A "work based on the
Library" means either the Library or any derivative work under
copyright law: that is to say, a work containing the Library or a
portion of it, either verbatim or with modifications and/or translated
straightforwardly into another language.  (Hereinafter, translation is
included without limitation in the term "modification".)

  "Source code" for a work means the preferred form of the work for
making modifications to it.  For a library, complete source code means
all the source code for all modules it contains, plus any associated
interface definition files, plus the scripts used to control compilation
and installation of the library.

  Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope.  The act of
running a program using the Library is not restricted, and output from
such a program is covered only if its contents constitute a work based
on the Library (independent of the use of the Library in a tool for
writing it).  Whether that is true depends on what the Library does
and what the program that uses the Library does.

  1. You may copy and distribute verbatim copies of the Library's
complete source code as you receive it, in any medium, provided that
you conspicuously and appropriately publish on each copy an
appropriate copyright notice and disclaimer of warranty; keep intact
all the notices that refer to this License and to the absence of any
warranty; and distribute a copy of this License along with the
Library.

  You may charge a fee for the physical act of transferring a copy,
and you may at your option offer warranty protection in exchange for a
fee.

  2. You may modify your copy or copies of the Library or any portion
of it, thus forming a work based on the Library, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:

    a) The modified work must itself be a software library.

    b) You must cause the files modified to carry prominent notices
    stating that you changed the files and the date of any change.

    c) You must cause the whole of the work to be licensed at no
    charge to all third parties under the terms of this License.

    d) If a facility in the modified Library refers to a function or a
    table of data to be supplied by an application program that uses
    the facility, other than as an argument passed when the facility
    is invoked, then you must make a good faith effort to ensure that,
    in the event an application does not supply such function or
    table, the facility still operates, and performs whatever part of
    its purpose remains meaningful.

    (For example, a function in a library to compute square roots has
    a purpose that is entirely well-defined independent of the
    application.  Therefore, Subsection 2d requires that any
    application-supplied function or table used by this function must
    be optional: if the application does not supply it, the square
    root function must still compute square roots.)

These requirements apply to the modified work as a whole.  If
identifiable sections of that work are not derived from the Library,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works.  But when you
distribute the same sections as part of a whole which is a work based
on the Library, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote
it.

Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Library.

In addition, mere aggregation of another work not based on the Library
with the Library (or with a work based on the Library) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.

  3. You may opt to apply the terms of the ordinary GNU General Public
License instead of this License to a given copy of the Library.  To do
this, you must alter all the notices that refer to this License, so
that they refer to the ordinary GNU General Public License, version 2,
instead of to this License.  (If a newer version than version 2 of the
ordinary GNU General Public License has appeared, then you can specify
that version instead if you wish.)  Do not make any other change in
these notices.

  Once this change is made in a given copy, it is irreversible for
that copy, so the ordinary GNU General Public License applies to all
subsequent copies and derivative works made from that copy.

  This option is useful when you wish to copy part of the code of
the Library into a program that is not a library.

  4. You may copy and distribute the Library (or a portion or
derivative of it, under Section 2) in object code or executable form
under the terms of Sections 1 and 2 above provided that you accompany
it with the complete corresponding machine-readable source code, which
must be distributed under the terms of Sections 1 and 2 above on a
medium customarily used for software interchange.

  If distribution of object code is made by offering access to copy
from a designated place, then offering equivalent access to copy the
source code from the same place satisfies the requirement to
distribute the source code, even though third parties are not
compelled to copy the source along with the object code.

  5. A program that contains no derivative of any portion of the
Library, but is designed to work with the Library by being compiled or
linked with it, is called a "work that uses the Library".  Such a
work, in isolation, is not a derivative work of the Library, and
therefore falls outside the scope of this License.

  However, linking a "work that uses the Library" with the Library
creates an executable that is a derivative of the Library (because it
contains portions of the Library), rather than a "work that uses the
library".  The executable is therefore covered by this License.
Section 6 states terms for distribution of such executables.

  When a "work that uses the Library" uses material from a header file
that is part of the Library, the object code for the work may be a
derivative work of the Library even though the source code is not.
Whether this is true is especially significant if the work can be
linked without the Library, or if the work is itself a library.  The
threshold for this to be true is not precisely defined by law.

  If such an object file uses only numerical parameters, data
structure layouts and accessors, and small macros and small inline
functions (ten lines or less in length), then the use of the object
file is unrestricted, regardless of whether it is legally a derivative
work.  (Executables containing this object code plus portions of the
Library will still fall under Section 6.)

  Otherwise, if the work is a derivative of the Library, you may
distribute the object code for the work under the terms of Section 6.
Any executables containing that work also fall under Section 6,
whether or not they are linked directly with the Library itself.

  6. As an exception to the Sections above, you may also combine or
link a "work that uses the Library" with the Library to produce a
work containing portions of the Library, and distribute that work
under terms of your choice, provided that the terms permit
modification of the work for the customer's own use and reverse
engineering for debugging such modifications.

  You must give prominent notice with each copy of the work that the
Library is used in it and that the Library and its use are covered by
this License.  You must supply a copy of this License.  If the work
during execution displays copyright notices, you must include the
copyright notice for the Library among them, as well as a reference
directing the user to the copy of this License.  Also, you must do one
of these things:

    a) Accompany the work with the complete corresponding
    machine-readable source code for the Library including whatever
    changes were used in the work (which must be distributed under
    Sections 1 and 2 above); and, if the work is an executable linked
    with the Library, with the complete machine-readable "work that
    uses the Library", as object code and/or source code, so that the
    user can modify the Library and then relink to produce a modified
    executable containing the modified Library.  (It is understood
    that the user who changes the contents of definitions files in the
    Library will not necessarily be able to recompile the application
    to use the modified definitions.)

    b) Use a suitable shared library mechanism for linking with the
    Library.  A suitable mechanism is one that (1) uses at run time a
    copy of the library already present on the user's computer system,
    rather than copying library functions into the executable, and (2)
    will operate properly with a modified version of the library, if
    the user installs one, as long as the modified version is
    interface-compatible with the version that the work was made with.

    c) Accompany the work with a written offer, valid for at
    least three years, to give the same user the materials
    specified in Subsection 6a, above, for a charge no more
    than the cost of performing this distribution.

    d) If distribution of the work is made by offering access to copy
    from a designated place, offer equivalent access to copy the above
    specified materials from the same place.

    e) Verify that the user has already received a copy of these
    materials or that you have already sent this user a copy.

  For an executable, the required form of the "work that uses the
Library" must include any data and utility programs needed for
reproducing the executable from it.  However, as a special exception,
the materials to be distributed need not include anything that is
normally distributed (in either source or binary form) with the major
components (compiler, kernel, and so on) of the operating system on
which the executable runs, unless that component itself accompanies
the executable.

  It may happen that this requirement contradicts the license
restrictions of other proprietary libraries that do not normally
accompany the operating system.  Such a contradiction means you cannot
use both them and the Library together in an executable that you
distribute.

  7. You may place library facilities that are a work based on the
Library side-by-side in a single library together with other library
facilities not covered by this License, and distribute such a combined
library, provided that the separate distribution of the work based on
the Library and of the other library facilities is otherwise
permitted, and provided that you do these two things:

    a) Accompany the combined library with a copy of the same work
    based on the Library, uncombined with any other library
    facilities.  This must be distributed under the terms of the
    Sections above.

    b) Give prominent notice with the combined library of the fact
    that part of it is a work based on the Library, and explaining
    where to find the accompanying uncombined form of the same work.

  8. You may not copy, modify, sublicense, link with, or distribute
the Library except as expressly provided under this License.  Any
attempt otherwise to copy, modify, sublicense, link with, or
distribute the Library is void, and will automatically terminate your
rights under this License.  However, parties who have received copies,
or rights, from you under this License will not have their licenses
terminated so long as such parties remain in full compliance.

  9. You are not required to accept this License, since you have not
signed it.  However, nothing else grants you permission to modify or
distribute the Library or its derivative works.  These actions are
prohibited by law if you do not accept this License.  Therefore, by
modifying or distributing the Library (or any work based on the
Library), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Library or works based on it.

  10. Each time you redistribute the Library (or any work based on the
Library), the recipient automatically receives a license from the
original licensor to copy, distribute, link with or modify the Library
subject to these terms and conditions.  You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties with
this License.

  11. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License.  If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Library at all.  For example, if a patent
license would not permit royalty-free redistribution of the Library by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Library.

If any portion of this section is held invalid or unenforceable under any
particular circumstance, the balance of the section is intended to apply,
and the section as a whole is intended to apply in other circumstances.

It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system which is
implemented by public license practices.  Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.

This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.

  12. If the distribution and/or use of the Library is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Library under this License may add
an explicit geographical distribution limitation excluding those countries,
so that distribution is permitted only in or among countries not thus
excluded.  In such case, this License incorporates the limitation as if
written in the body of this License.

  13. The Free Software Foundation may publish revised and/or new
versions of the Lesser General Public License from time to time.
Such new versions will be similar in spirit to the present version,
but may differ in detail to address new problems or concerns.

Each version is given a distinguishing version number.  If the Library
specifies a version number of this License which applies to it and
"any later version", you have the option of following the terms and
conditions either of that version or of any later version published by
the Free Software Foundation.  If the Library does not specify a
license version number, you may choose any version ever published by
the Free Software Foundation.

  14. If you wish to incorporate parts of the Library into other free
programs whose distribution conditions are incompatible with these,
write to the author to ask for permission.  For software which is
copyrighted by the Free Software Foundation, write to the Free
Software Foundation; we sometimes make exceptions for this.  Our
decision will be guided by the two goals of preserving the free status
of all derivatives of our free software and of promoting the sharing
and reuse of software generally.

			    NO WARRANTY

  15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
LIBRARY IS WITH YOU.  SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

  16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
DAMAGES.

		     END OF TERMS AND CONDITIONS

           How to Apply These Terms to Your New Libraries

  If you develop a new library, and you want it to be of the greatest
possible use to the public, we recommend making it free software that
everyone can redistribute and change.  You can do so by permitting
redistribution under these terms (or, alternatively, under the terms of the
ordinary General Public License).

  To apply these terms, attach the following notices to the library.  It is
safest to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least the
"copyright" line and a pointer to where the full notice is found.

    <one line to give the library's name and a brief idea of what it does.>
    Copyright (C) <year>  <name of author>

    This library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
    License as published by the Free Software Foundation; either
    version 2.1 of the License, or (at your option) any later version.

    This library is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
    Lesser General Public License for more details.

    You should have received a copy of the GNU Lesser General Public
    License along with this library; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

Also add information on how to contact you by electronic and paper mail.

You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the library, if
necessary.  Here is a sample; alter the names:

  Yoyodyne, Inc., hereby disclaims all copyright interest in the
  library `Frob' (a library for tweaking knobs) written by James Random Hacker.

  <signature of Ty Coon>, 1 April 1990
  Ty Coon, President of Vice

That's all there is to it!
 */

package org.archive.io.hbase;

import com.google.common.base.Preconditions;

/**
 * Configures the values of the column families and qualifiers used for the
 * crawl. Also contains a full set of default values that match the previous
 * Heritrix2 implementation.
 * 
 * Meant to be configured within the Spring framework, either inline in the
 * HBaseWriterProcessor bean or as a named bean that is referenced later on.
 * 
 * Typically the configuration below is placed in a Heritrix 3.x job config
 * file called crawler-beans.cxml. This cxml file can be edited directly from
 * the Heritrix3 web UI.
 * 
 * <pre>
 * {@code
 * <bean id="hbaseParameterSettings" class="org.archive.io.hbase.HBaseParameters">
 * 	<!-- These settings are required -->
 * 	<property name="zkQuorum" value="localhost" />
 * 	<property name="hbaseTableName" value="crawl" />
 * 
 * 	<!-- This should reflect your installation, but 2181 is the default -->
 * 	<property name="zkPort" value="2181" />
 * 
 * 	<!-- All other settings are optional -->
 * 	<property name="onlyProcessNewRecords" value="false" />
 * 	<property name="onlyWriteNewRecords" value="false" />
 * 	<property name="contentColumnFamily" value="newcontent" />
 * 	<!-- Override more options here -->
 * </bean>
 * 
 * <bean id="hbaseWriterProcessor" class="org.archive.modules.writer.HBaseWriterProcessor">
 * 	<property name="hbaseParameters">
 * 		<ref bean="hbaseParameterSettings"/>
 * 	</property>
 * </bean>
 * 
 * <bean id="dispositionProcessors" class="org.archive.modules.DispositionChain">
 * 	<property name="processors">
 * 		<list>
 * 			<ref bean="hbaseWriterProcessor"/>
 * 			<!-- other references -->
 * 		</list>
 * 	</property>
 * </bean>
 * }
 * </pre>
 * 
 * @see org.archive.modules.writer.HBaseWriterProcessor
 *      HBaseWriterProcessor, for a full example
 * 
 */
562 public class HBaseParameters {
563 
564 	/** DEFAULT OPTIONS *. */
565 	public static final int ZK_PORT = 2181;
566 
567 	public static final String defaultHbaseTableNameSpace = "";
568 
569 	// "n" column family and qualifiers
570 	/** The Constant CONTENT_COLUMN_FAMILY. */
571 	public static final String CONTENT_COLUMN_FAMILY = "n";
572 	
573 	/** The Constant CONTENT_COLUMN_NAME. */
574 	public static final String CONTENT_COLUMN_NAME = "raw";
575 
576 	// "curi" column family and qualifiers
577 	/** The Constant CURI_COLUMN_FAMILY. */
578 	public static final String CURI_COLUMN_FAMILY = "c";
579 	
580 	/** The Constant IP_COLUMN_NAME. */
581 	public static final String IP_COLUMN_NAME = "ip";
582 	
583 	public static final String CONTENT_TYPE_COLUMN_NAME = "ct";
584 
585 	public static final String CONTENT_SIZE_COLUMN_NAME = "cz";
586 
587 	public static final String CONTENT_LENGTH_COLUMN_NAME = "cl";
588 
589 	public static final String FETCH_ATTEMPTS_COLUMN_NAME = "fa";
590 
591 	public static final String FETCH_DURATION_COLUMN_NAME = "fd";
592 
593 	public static final String FETCH_ANNOTATIONS_COLUMN_NAME = "an";
594 
595 	public static final String FETCH_ANNOTATIONS_VALUE_DELIMITER = ", ";
596 
597 	/** The Constant PATH_FROM_SEED_COLUMN_NAME. */
598 	public static final String PATH_FROM_SEED_COLUMN_NAME = "pfs";
599 	
600 	/** The Constant IS_SEED_COLUMN_NAME. */
601 	public static final String IS_SEED_COLUMN_NAME = "is";
602 	
603 	/** The Constant VIA_COLUMN_NAME. */
604 	public static final String VIA_COLUMN_NAME = "via";
605 	
606 	/** The Constant URL_COLUMN_NAME. */
607 	public static final String URL_COLUMN_NAME = "url";
608 	
609 	/** The Constant REQUEST_COLUMN_NAME. */
610 	public static final String REQUEST_COLUMN_NAME = "req";
611 	
612 	/** The Constant DEFAULT_MAX_FILE_SIZE_IN_BYTES (20 MB). */
613 	public static final long DEFAULT_MAX_FILE_SIZE_IN_BYTES = 20L * 1024 * 1024;
614 
615 	// The ZooKeeper client port property name. It has to match the key used in
616 	// hbase-site.xml for the clientPort config attribute.
617 	/** The ZooKeeper client port configuration key. */
618 	public static final String ZOOKEEPER_CLIENT_PORT = "hbase.zookeeper.property.clientPort";
619 
620 	/** ACTUAL OPTIONS INITIALIZED TO DEFAULTS. The zk quorum. */
621 	private String zkQuorum = null;
622 	
623 	/** The zk port. */
624 	private int zkPort = ZK_PORT;
625 	
626 	/** The hbase table name. */
627 	private String hbaseTableName = null;
628 
629 	/** The content column family. */
630 	private String contentColumnFamily = CONTENT_COLUMN_FAMILY;
631 	
632 	/** The content column name. */
633 	private String contentColumnName = CONTENT_COLUMN_NAME;
634 
635 	/** The curi column family. */
636 	private String curiColumnFamily = CURI_COLUMN_FAMILY;
637 	
638 	/** The ip column name. */
639 	private String ipColumnName = IP_COLUMN_NAME;
640 	
641 	private String contentTypeColumnName = CONTENT_TYPE_COLUMN_NAME;
642 
643 	private String contentSizeColumnName = CONTENT_SIZE_COLUMN_NAME;
644 
645 	private String contentLengthColumnName = CONTENT_LENGTH_COLUMN_NAME;
646 
647 	private String fetchAttmptsColumnName = FETCH_ATTEMPTS_COLUMN_NAME;
648 
649 	private String fetchDurationColumnName = FETCH_DURATION_COLUMN_NAME;
650 
651 	private String fetchAnnotationsColumnName = FETCH_ANNOTATIONS_COLUMN_NAME;
652 
653 	private String fetchAnnotationsValueDelimiter = FETCH_ANNOTATIONS_VALUE_DELIMITER;
654 
655 	/** The path from seed column name. */
656 	private String pathFromSeedColumnName = PATH_FROM_SEED_COLUMN_NAME;
657 	
658 	/** The is seed column name. */
659 	private String isSeedColumnName = IS_SEED_COLUMN_NAME;
660 	
661 	/** The via column name. */
662 	private String viaColumnName = VIA_COLUMN_NAME;
663 	
664 	/** The url column name. */
665 	private String urlColumnName = URL_COLUMN_NAME;
666 	
667 	/** The request column name. */
668 	private String requestColumnName = REQUEST_COLUMN_NAME;
669 	
670 	/** The default max file size in bytes. */
671 	private long defaultMaxFileSizeInBytes = DEFAULT_MAX_FILE_SIZE_IN_BYTES;
672 
673 	/** The md5 key. */
674 	private boolean md5Key = false;
675 	
676 	/** The serializer. */
677 	private Serializer serializer = null;
678 
679 	/**
680 	 * Default is false, which will write all urls to the HBase table. If set to
681 	 * true, then only write urls that are new rowkey records. Heritrix is good
682 	 * about not hitting the same url twice, so this feature is to ensure that
683 	 * you can run multiple sessions of the same crawl configuration and not
684 	 * write the same url more than once to the same HBase table. You may just
685 	 * want to crawl a site to see what new urls have been added over time, or
686 	 * continue where you left off on a terminated crawl. Heritrix itself
687 	 * supports this functionality through checkpoints taken during a crawl
688 	 * session, so this option may not be necessary if checkpoints work for
689 	 * you.
690 	 */
691 	private boolean onlyWriteNewRecords = false;
692 
693 	/**
694 	 * Default is false, which will process all urls in the HBase table. If set
695 	 * to true, then HBase-Writer will only process urls that are new rowkey
696 	 * records in the table. In this mode, Heritrix won't even fetch and parse
697 	 * the content served at the url if it already exists as a rowkey in the
698 	 * HBase table.
699 	 */
700 	private boolean onlyProcessNewRecords = false;
701 
702 	/**
703 	 * Gets the zk quorum.
704 	 *
705 	 * @return the zk quorum
706 	 */
707 	public String getZkQuorum() {
708 		Preconditions.checkState(zkQuorum != null && !zkQuorum.isEmpty(), getClass().getName() + " instances need zkQuorum parameter set before accessing");
709 		return zkQuorum;
710 	}
711 
712 	/**
713 	 * Sets the zk quorum.
714 	 *
715 	 * @param quorum the new zk quorum
716 	 */
717 	public void setZkQuorum(String quorum) {
718 		zkQuorum = quorum;
719 	}
720 
721 	/**
722 	 * Gets the zk port.
723 	 *
724 	 * @return the zk port
725 	 */
726 	public int getZkPort() {
727 		return zkPort;
728 	}
729 
730 	/**
731 	 * Sets the zk port.
732 	 *
733 	 * @param port the new zk port
734 	 */
735 	public void setZkPort(int port) {
736 		zkPort = port;
737 	}
738 
739 	/**
740 	 * Gets the hbase table name.
741 	 *
742 	 * @return the hbase table name
743 	 */
744 	public String getHbaseTableName() {
745 		Preconditions.checkState(hbaseTableName != null && !hbaseTableName.isEmpty(), getClass().getName()
746 				+ " instances need hbaseTableName parameter set before accessing");
747 		return hbaseTableName;
748 	}
749 
750 	/**
751 	 * Sets the hbase table name.
752 	 *
753 	 * @param tableName the new hbase table name
754 	 */
755 	public void setHbaseTableName(String tableName) {
756 		hbaseTableName = tableName;
757 	}
758 
759 	/**
760 	 * Gets the content column family.
761 	 *
762 	 * @return the content column family
763 	 */
764 	public String getContentColumnFamily() {
765 		return contentColumnFamily;
766 	}
767 
768 	/**
769 	 * Sets the content column family.
770 	 *
771 	 * @param contentColumnFamily the new content column family
772 	 */
773 	public void setContentColumnFamily(String contentColumnFamily) {
774 		this.contentColumnFamily = contentColumnFamily;
775 	}
776 
777 	/**
778 	 * Gets the content column name.
779 	 *
780 	 * @return the content column name
781 	 */
782 	public String getContentColumnName() {
783 		return contentColumnName;
784 	}
785 
786 	/**
787 	 * Sets the content column name.
788 	 *
789 	 * @param contentColumnName the new content column name
790 	 */
791 	public void setContentColumnName(String contentColumnName) {
792 		this.contentColumnName = contentColumnName;
793 	}
794 
795 	/**
796 	 * Gets the curi column family.
797 	 *
798 	 * @return the curi column family
799 	 */
800 	public String getCuriColumnFamily() {
801 		return curiColumnFamily;
802 	}
803 
804 	/**
805 	 * Sets the curi column family.
806 	 *
807 	 * @param curiColumnFamily the new curi column family
808 	 */
809 	public void setCuriColumnFamily(String curiColumnFamily) {
810 		this.curiColumnFamily = curiColumnFamily;
811 	}
812 
813 	/**
814 	 * Gets the ip column name.
815 	 *
816 	 * @return the ip column name
817 	 */
818 	public String getIpColumnName() {
819 		return ipColumnName;
820 	}
821 
822 	/**
823 	 * Sets the ip column name.
824 	 *
825 	 * @param ipColumnName the new ip column name
826 	 */
827 	public void setIpColumnName(String ipColumnName) {
828 		this.ipColumnName = ipColumnName;
829 	}
830 
831 	/**
832 	 * Gets the path from seed column name.
833 	 *
834 	 * @return the path from seed column name
835 	 */
836 	public String getPathFromSeedColumnName() {
837 		return pathFromSeedColumnName;
838 	}
839 
840 	/**
841 	 * Sets the path from seed column name.
842 	 *
843 	 * @param pathFromSeedColumnName the new path from seed column name
844 	 */
845 	public void setPathFromSeedColumnName(String pathFromSeedColumnName) {
846 		this.pathFromSeedColumnName = pathFromSeedColumnName;
847 	}
848 
849 	/**
850 	 * Gets the is seed column name.
851 	 *
852 	 * @return the is seed column name
853 	 */
854 	public String getIsSeedColumnName() {
855 		return isSeedColumnName;
856 	}
857 
858 	/**
859 	 * Sets the is seed column name.
860 	 *
861 	 * @param isSeedColumnName the new is seed column name
862 	 */
863 	public void setIsSeedColumnName(String isSeedColumnName) {
864 		this.isSeedColumnName = isSeedColumnName;
865 	}
866 
867 	/**
868 	 * Gets the via column name.
869 	 *
870 	 * @return the via column name
871 	 */
872 	public String getViaColumnName() {
873 		return viaColumnName;
874 	}
875 
876 	/**
877 	 * Sets the via column name.
878 	 *
879 	 * @param viaColumnName the new via column name
880 	 */
881 	public void setViaColumnName(String viaColumnName) {
882 		this.viaColumnName = viaColumnName;
883 	}
884 
885 	/**
886 	 * Gets the url column name.
887 	 *
888 	 * @return the url column name
889 	 */
890 	public String getUrlColumnName() {
891 		return urlColumnName;
892 	}
893 
894 	/**
895 	 * Sets the url column name.
896 	 *
897 	 * @param urlColumnName the new url column name
898 	 */
899 	public void setUrlColumnName(String urlColumnName) {
900 		this.urlColumnName = urlColumnName;
901 	}
902 
903 	/**
904 	 * Gets the request column name.
905 	 *
906 	 * @return the request column name
907 	 */
908 	public String getRequestColumnName() {
909 		return requestColumnName;
910 	}
911 
912 	/**
913 	 * Sets the request column name.
914 	 *
915 	 * @param requestColumnName the new request column name
916 	 */
917 	public void setRequestColumnName(String requestColumnName) {
918 		this.requestColumnName = requestColumnName;
919 	}
920 
921 	/**
922 	 * Gets the zookeeper client port key.
923 	 *
924 	 * @return the zookeeper client port key
925 	 */
926 	public String getZookeeperClientPortKey() {
927 		return ZOOKEEPER_CLIENT_PORT;
928 	}
929 
930 	/**
931 	 * Gets the serializer.
932 	 *
933 	 * @return the serializer
934 	 */
935 	public Serializer getSerializer() {
936 		return serializer;
937 	}
938 
939 	/**
940 	 * Sets the serializer.
941 	 *
942 	 * @param serializer the new serializer
943 	 */
944 	public void setSerializer(Serializer serializer) {
945 		this.serializer = serializer;
946 	}
947 
948 	/**
949 	 * Checks whether the md5 key option is enabled.
950 	 *
951 	 * @return true, if the md5 key option is enabled
952 	 */
953 	public boolean isMd5Key() {
954 		return this.md5Key;
955 	}
956 
957 	/**
958 	 * Sets the md5 key.
959 	 *
960 	 * @param md5Key the new md5 key
961 	 */
962 	public void setMd5Key(boolean md5Key) {
963 		this.md5Key = md5Key;
964 	}
965 
966 	/**
967 	 * Checks whether only new rowkey records are written to the HBase table.
968 	 *
969 	 * @return true, if only new records are written
970 	 */
971 	public boolean isOnlyWriteNewRecords() {
972 		return onlyWriteNewRecords;
973 	}
974 
975 	/**
976 	 * Sets whether only new rowkey records are written to the HBase table.
977 	 *
978 	 * @param onlyWriteNewRecords true to write only new records
979 	 */
980 	public void setOnlyWriteNewRecords(boolean onlyWriteNewRecords) {
981 		this.onlyWriteNewRecords = onlyWriteNewRecords;
982 	}
983 
984 	/**
985 	 * Checks whether only new rowkey records are processed.
986 	 *
987 	 * @return true, if only new records are processed
988 	 */
989 	public boolean isOnlyProcessNewRecords() {
990 		return onlyProcessNewRecords;
991 	}
992 
993 	/**
994 	 * Sets whether only new rowkey records are processed.
995 	 *
996 	 * @param onlyProcessNewRecords true to process only new records
997 	 */
998 	public void setOnlyProcessNewRecords(boolean onlyProcessNewRecords) {
999 		this.onlyProcessNewRecords = onlyProcessNewRecords;
1000 	}
1001 
1002 	/**
1003 	 * Gets the default max file size in bytes.
1004 	 *
1005 	 * @return the default max file size in bytes
1006 	 */
1007 	public long getDefaultMaxFileSizeInBytes() {
1008 		return defaultMaxFileSizeInBytes;
1009 	}
1010 
1011 	/**
1012 	 * Sets the default max file size in bytes.
1013 	 *
1014 	 * @param defaultMaxFileSizeInBytes the new default max file size in bytes
1015 	 */
1016 	public void setDefaultMaxFileSizeInBytes(long defaultMaxFileSizeInBytes) {
1017 		this.defaultMaxFileSizeInBytes = defaultMaxFileSizeInBytes;
1018 	}
1019 
1020 	public String getContentTypeColumnName() {
1021 		return contentTypeColumnName;
1022 	}
1023 
1024 	public void setContentTypeColumnName(String contentTypeColumnName) {
1025 		this.contentTypeColumnName = contentTypeColumnName;
1026 	}
1027 
1028 	public String getContentSizeColumnName() {
1029 		return contentSizeColumnName;
1030 	}
1031 
1032 	public void setContentSizeColumnName(String contentSizeColumnName) {
1033 		this.contentSizeColumnName = contentSizeColumnName;
1034 	}
1035 
1036 	public String getFetchAttmptsColumnName() {
1037 		return fetchAttmptsColumnName;
1038 	}
1039 
1040 	public void setFetchAttmptsColumnName(String fetchAttmptsColumnName) {
1041 		this.fetchAttmptsColumnName = fetchAttmptsColumnName;
1042 	}
1043 
1044 	public String getFetchDurationColumnName() {
1045 		return fetchDurationColumnName;
1046 	}
1047 
1048 	public void setFetchDurationColumnName(String fetchDurationColumnName) {
1049 		this.fetchDurationColumnName = fetchDurationColumnName;
1050 	}
1051 
1052 	public String getFetchAnnotationsColumnName() {
1053 		return fetchAnnotationsColumnName;
1054 	}
1055 
1056 	public void setFetchAnnotationsColumnName(String fetchAnnotationsColumnName) {
1057 		this.fetchAnnotationsColumnName = fetchAnnotationsColumnName;
1058 	}
1059 
1060 	public String getContentLengthColumnName() {
1061 		return contentLengthColumnName;
1062 	}
1063 
1064 	public void setContentLengthColumnName(String contentLengthColumnName) {
1065 		this.contentLengthColumnName = contentLengthColumnName;
1066 	}
1067 
1068 	public String getFetchAnnotationsValueDelimiter() {
1069 		return fetchAnnotationsValueDelimiter;
1070 	}
1071 
1072 	public void setFetchAnnotationsValueDelimiter(String fetchAnnotationsValueDelimiter) {
1073 		this.fetchAnnotationsValueDelimiter = fetchAnnotationsValueDelimiter;
1074 	}
1075 
1076 }
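
The getters for zkQuorum and hbaseTableName in the class above fail fast when a required parameter has not been set, while options such as zkPort fall back to a default. The following is a minimal, self-contained sketch of that pattern. It is an illustration under stated assumptions, not part of HBase-Writer: `RequiredParamsSketch`, `quorumMissing` and the host name `zk1.example.org` are hypothetical, and a plain `IllegalStateException` stands in for Guava's `Preconditions.checkState`, which throws the same exception type.

```java
/**
 * Hypothetical stand-in for HBaseParameters, reduced to two options, to
 * illustrate the "required parameter" check. Not part of HBase-Writer.
 */
public class RequiredParamsSketch {

	/** Default ZooKeeper client port, mirroring HBaseParameters.ZK_PORT. */
	public static final int ZK_PORT = 2181;

	private String zkQuorum = null; // required, no default
	private int zkPort = ZK_PORT;   // optional, initialized to the default

	public String getZkQuorum() {
		// Same condition the real getter hands to Preconditions.checkState.
		if (zkQuorum == null || zkQuorum.isEmpty()) {
			throw new IllegalStateException(getClass().getName()
					+ " instances need zkQuorum parameter set before accessing");
		}
		return zkQuorum;
	}

	public void setZkQuorum(String quorum) {
		zkQuorum = quorum;
	}

	public int getZkPort() {
		return zkPort;
	}

	public void setZkPort(int port) {
		zkPort = port;
	}

	/** Returns true if reading the quorum fails because it is unset or empty. */
	public static boolean quorumMissing(RequiredParamsSketch params) {
		try {
			params.getZkQuorum();
			return false;
		} catch (IllegalStateException e) {
			return true;
		}
	}

	public static void main(String[] args) {
		RequiredParamsSketch params = new RequiredParamsSketch();
		System.out.println("quorum missing: " + quorumMissing(params));
		params.setZkQuorum("zk1.example.org"); // hypothetical host name
		System.out.println(params.getZkQuorum() + ":" + params.getZkPort());
	}
}
```

Checking in the getter rather than the setter lets a Spring-style container set properties in any order; the error only surfaces when a consumer actually reads a value that was never configured.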