Interface IDocumentPartitionerExtension2
- All Known Implementing Classes:
DefaultPartitioner, FastPartitioner, RuleBasedPartitioner
Extension interface for IDocumentPartitioner.
Extends the original concept of a document partitioner so that it can report the position categories that are used to manage the partitioning information.
This extension also introduces the concept of open and delimited partitions. A delimited partition has a predefined textual token delimiting its start and end, while an open partition can fill any space between two delimited partitions.
An open partition of length zero can occur between two delimited partitions, thus having the same offset as the following delimited partition. The document start and end are considered to be delimiters of open partitions, i.e. there may be a zero-length partition between the document start and a delimited partition starting at offset 0.
- Since:
- 3.0
-
Method Summary
Modifier and Type, Method, Description
ITypedRegion[] computePartitioning(int offset, int length, boolean includeZeroLengthPartitions)
Returns the partitioning of the given range of the connected document.
String getContentType(int offset, boolean preferOpenPartitions)
Returns the content type of the partition containing the given offset in the connected document.
String[] getManagingPositionCategories()
Returns the position categories that this partitioner uses in order to manage the partitioning information of the documents.
ITypedRegion getPartition(int offset, boolean preferOpenPartitions)
Returns the partition containing the given offset of the connected document.
-
Method Details
-
getManagingPositionCategories
String[] getManagingPositionCategories()
Returns the position categories that this partitioner uses in order to manage the partitioning information of the documents. Returns null if no position category is used.
- Returns:
- the position categories used to manage partitioning information or null
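Because the return value may be null, callers usually need a guard before iterating. The following is a minimal sketch of such a guard. `CategorySource` is a hypothetical stand-in declared only so the example compiles on its own; it is not part of the Eclipse API, and `categoriesOf` is likewise an illustrative helper name.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only: CategorySource and categoriesOf are hypothetical names.
class ManagingCategoriesSketch {

    /** Hypothetical stand-in exposing just the method discussed above. */
    interface CategorySource {
        String[] getManagingPositionCategories(); // may return null
    }

    /** Normalizes the nullable array into a list so callers can iterate safely. */
    static List<String> categoriesOf(CategorySource source) {
        String[] categories = source.getManagingPositionCategories();
        return categories == null ? Collections.emptyList() : Arrays.asList(categories);
    }
}
```

With this helper, a partitioner that uses no position category yields an empty list rather than a null that every caller has to check.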
-
getContentType
String getContentType(int offset, boolean preferOpenPartitions)
Returns the content type of the partition containing the given offset in the connected document. There must be a document connected to this partitioner.
If preferOpenPartitions is true, precedence is given to an open partition ending at offset over a delimited partition starting at offset.
This method replaces IDocumentPartitioner.getContentType(int) and behaves like it when preferOpenPartitions is false, i.e. precedence is always given to the partition that does not end at offset.
- Parameters:
- offset - the offset in the connected document
- preferOpenPartitions - true if precedence should be given to an open partition ending at offset over a delimited partition starting at offset
- Returns:
- the content type of the offset's partition
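The boundary rule can be illustrated with a small toy model. This is a sketch of the rule as described above, not the Eclipse implementation: the `Part` class, the `contentTypeAt` method, and the type names "default" and "comment" are all hypothetical and exist only for this example.

```java
import java.util.List;

// Toy model, NOT the Eclipse implementation: all names here are hypothetical.
class BoundaryLookupSketch {

    /** A partition covering [offset, offset + length); open partitions have no delimiters. */
    static final class Part {
        final int offset;
        final int length;
        final String type;
        final boolean open;

        Part(int offset, int length, String type, boolean open) {
            this.offset = offset;
            this.length = length;
            this.type = type;
            this.open = open;
        }

        int end() {
            return offset + length;
        }
    }

    /** Returns the content type at offset, applying the preferOpenPartitions rule at boundaries. */
    static String contentTypeAt(List<Part> parts, int offset, boolean preferOpenPartitions) {
        if (preferOpenPartitions) {
            for (Part p : parts) {
                // An open partition ending exactly at offset takes precedence (it may be zero-length).
                if (p.open && p.end() == offset) {
                    return p.type;
                }
            }
        }
        for (Part p : parts) {
            // Default rule: the partition that contains offset and does not end there.
            if (p.offset <= offset && offset < p.end()) {
                return p.type;
            }
        }
        return null; // offset lies outside the modeled partitions
    }
}
```

For a text such as `ab/*c*/`, modeled as an open "default" partition at offsets 0 to 2 followed by a delimited "comment" partition at offsets 2 to 7, `contentTypeAt(parts, 2, true)` yields "default" and `contentTypeAt(parts, 2, false)` yields "comment". Away from the boundary, for example at offset 3, the flag makes no difference.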
-
getPartition
ITypedRegion getPartition(int offset, boolean preferOpenPartitions)
Returns the partition containing the given offset of the connected document. There must be a document connected to this partitioner.
If preferOpenPartitions is true, precedence is given to an open partition ending at offset over a delimited partition starting at offset.
This method replaces IDocumentPartitioner.getPartition(int) and behaves like it when preferOpenPartitions is false, i.e. precedence is always given to the partition that does not end at offset.
- Parameters:
- offset - the offset for which to determine the partition
- preferOpenPartitions - true if precedence should be given to an open partition ending at offset over a delimited partition starting at offset
- Returns:
- the partition containing the offset
-
computePartitioning
ITypedRegion[] computePartitioning(int offset, int length, boolean includeZeroLengthPartitions)
Returns the partitioning of the given range of the connected document. There must be a document connected to this partitioner.
If includeZeroLengthPartitions is true, a zero-length partition of an open partition type (usually the default partition) is included between two delimited partitions. If it is false, no zero-length partitions are included.
This method replaces IDocumentPartitioner.computePartitioning(int, int) and behaves like it when includeZeroLengthPartitions is false.
- Parameters:
- offset - the offset of the range of interest
- length - the length of the range of interest
- includeZeroLengthPartitions - true if zero-length partitions should be returned as part of the computed partitioning
- Returns:
- the partitioning of the range
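The effect of includeZeroLengthPartitions can be sketched with a toy model that fills the gaps between delimited partitions with open ones. It is an illustration of the description above under simplifying assumptions, not the Eclipse implementation: it always covers the whole document rather than a sub-range, and the class name, the `partition` method, and the "type@offset+length" string format are hypothetical. Real implementations may differ in edge cases.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, NOT the Eclipse implementation: all names here are hypothetical.
class ZeroLengthPartitioningSketch {

    /**
     * Partitions the whole document [0, docLength), given sorted, non-overlapping
     * delimited regions as {offset, length} pairs. Each result entry is formatted
     * as "type@offset+length".
     */
    static List<String> partition(int[][] delimited, int docLength, boolean includeZeroLengthPartitions) {
        List<String> result = new ArrayList<>();
        int cursor = 0; // end of the previous delimited partition; the document start counts as a delimiter
        for (int[] d : delimited) {
            addOpen(result, cursor, d[0] - cursor, includeZeroLengthPartitions);
            result.add("delimited@" + d[0] + "+" + d[1]);
            cursor = d[0] + d[1];
        }
        // The document end also delimits an open partition.
        addOpen(result, cursor, docLength - cursor, includeZeroLengthPartitions);
        return result;
    }

    /** Adds an open partition; zero-length ones are added only when requested. */
    private static void addOpen(List<String> out, int offset, int length, boolean includeZeroLength) {
        if (length > 0 || includeZeroLength) {
            out.add("open@" + offset + "+" + length);
        }
    }
}
```

For a 10-character text such as `/*a*//*b*/`, with delimited regions `{0, 5}` and `{5, 5}`, passing true yields `open@0+0, delimited@0+5, open@5+0, delimited@5+5, open@10+0`, while passing false yields only the two delimited entries.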
-