ClusterPartitioner

We assume that you have already learned what is described in:

Short Summary

ClusterPartitioner distributes individual input data records among different Cluster nodes.

| Component | Same input metadata | Sorted inputs | Inputs | Outputs | Java | CTL |
|---|---|---|---|---|---|---|
| ClusterPartitioner | - | no | 1 | 1-n (virtual) | yes/no 1) | yes/no 1) |

Legend

1) ClusterPartitioner can use either a transformation or the two other attributes (Ranges and/or Partition key). A transformation must be defined unless at least one of these attributes is specified.

Abstract

ClusterPartitioner distributes individual input data records among different Cluster nodes.

Records can be distributed by a user-defined transformation, by ranges of the Partition key, or by the RoundRobin algorithm. The ranges of the Partition key are either those specified in the Ranges attribute or calculated hash values. A transformation uses the CTL template for Partition or implements the PartitionFunction interface; see CTL Scripting Specifics and Java Interfaces for ClusterPartitioner. No mapping is defined in this component, because it does not change input data records; it only distributes them.
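The two non-transformation strategies mentioned above can be illustrated with a minimal Java sketch. It is not the CloverETL implementation and does not use any CloverETL classes; the class name `NodeSelectionSketch` and its methods are hypothetical, and nodes are assumed to be numbered from 0.

```java
import java.util.List;

/** Illustrative only: a stand-alone sketch, not part of CloverETL. */
final class NodeSelectionSketch {
    private final int nodeCount;
    private int nextNode = 0;

    NodeSelectionSketch(int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        this.nodeCount = nodeCount;
    }

    /** Round robin: cycles through nodes 0..nodeCount-1, ignoring record content. */
    int roundRobin() {
        int node = nextNode;
        nextNode = (nextNode + 1) % nodeCount;
        return node;
    }

    /** Hash-based: records with equal key field values always map to the same node. */
    int byKeyHash(List<String> keyValues) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(keyValues.hashCode(), nodeCount);
    }
}
```

Round robin spreads records evenly but scatters records with the same key across nodes; hashing the key keeps equal keys together, at the cost of a possibly uneven load.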

Ports

| Port type | Number | Required | Description | Metadata |
|---|---|---|---|---|
| Input | 0 | yes | For input data records | Any |
| Output | 0 | yes | For output data records | Input 0 1) |

Legend:

1): Metadata can be propagated through this component.

Partition Attributes

| Attribute | Req | Description | Possible values |
|---|---|---|---|
| Basic | | | |
| Partition | 1) | Definition of how records should be distributed among Cluster nodes, written in the graph in CTL or Java. | |
| Partition URL | 1) | Name of an external file, including path, containing the definition of how records should be distributed among Cluster nodes, written in CTL or Java. | |
| Partition class | 1) | Name of an external class defining how records should be distributed among Cluster nodes. | |
| Ranges | 1), 2) | Ranges expressed as a sequence of individual ranges separated by semicolons. Each individual range is a sequence of intervals, one per key field, written one after another without any delimiter. A bracket (<) marks a margin that is included in the interval; a parenthesis marks one that is excluded. Example of Ranges: <1,9)(,31.12.2008);<1,9)<31.12.2008,);<9,)(,31.12.2008);<9,)<31.12.2008,). | |
| Partition key | 1), 2) | Key according to which input records are distributed among different Cluster nodes. Expressed as a sequence of individual input field names separated by semicolons. Example of Partition key: first_name;last_name. | |
| Load balancer queue size | | Size of the queue for the load balancer. | |
| Advanced | | | |
| Partition source charset | | Encoding of the external file defining the transformation. | ISO-8859-1 (default); other encoding |
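To make the interval notation of the Ranges attribute more concrete, the following Java sketch parses a single numeric interval such as <1,9) and tests whether a value falls inside it. It is an illustration only, not CloverETL code: the class `IntervalSketch` is hypothetical, it handles integer margins only (not dates), and it assumes that > is the closing counterpart of < for an included maximum, which the example above does not show.

```java
/** Illustrative only: parses one numeric interval of the Ranges notation. */
final class IntervalSketch {
    private final Long min;            // null means no lower margin
    private final Long max;            // null means no upper margin
    private final boolean minIncluded;
    private final boolean maxIncluded;

    private IntervalSketch(Long min, boolean minIncluded, Long max, boolean maxIncluded) {
        this.min = min;
        this.minIncluded = minIncluded;
        this.max = max;
        this.maxIncluded = maxIncluded;
    }

    static IntervalSketch parse(String text) {
        String t = text.trim();
        if (t.length() < 3) {
            throw new IllegalArgumentException("Interval too short: " + text);
        }
        char open = t.charAt(0);
        char close = t.charAt(t.length() - 1);
        // Assumption: '<' and '>' include the margin, '(' and ')' exclude it.
        if ((open != '<' && open != '(') || (close != '>' && close != ')')) {
            throw new IllegalArgumentException("Bad delimiters: " + text);
        }
        String[] parts = t.substring(1, t.length() - 1).split(",", -1);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected exactly one comma: " + text);
        }
        String lo = parts[0].trim();
        String hi = parts[1].trim();
        Long min = lo.isEmpty() ? null : Long.valueOf(lo);
        Long max = hi.isEmpty() ? null : Long.valueOf(hi);
        return new IntervalSketch(min, open == '<', max, close == '>');
    }

    boolean contains(long value) {
        if (min != null && (minIncluded ? value < min : value <= min)) {
            return false;
        }
        if (max != null && (maxIncluded ? value > max : value >= max)) {
            return false;
        }
        return true;
    }
}
```

For example, `IntervalSketch.parse("<1,9)")` contains 1 but not 9, while `IntervalSketch.parse("<9,)")` contains 9 and every larger value.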

Legend:

1): If one of these transformation attributes is specified, both Ranges and Partition key are ignored, since they have lower priority. Each of these transformation attributes must use the CTL template for Partition or implement the PartitionFunction interface.

See CTL Scripting Specifics or Java Interfaces for ClusterPartitioner for more information.

See also Defining Transformations for detailed information about transformations.

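2): If no transformation attribute is defined, Ranges and Partition key are used in one of the following three ways:

Both Ranges and Partition key are set: records are distributed according to the ranges specified in the Ranges attribute.

Only Partition key is set: hash values are calculated and used as the ranges.

Neither Ranges nor Partition key is set: the RoundRobin algorithm is used.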

CTL Scripting Specifics

Defining any of the three transformation attributes is optional. If you define one, it must specify a transformation that assigns the number of a Cluster node to each input record.
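As a rough illustration of what such a transformation decides, the Java sketch below maps a record to a node number with a custom rule. It deliberately does not implement the PartitionFunction interface and uses no CloverETL types: the class `CustomAssignmentSketch`, the `Map`-based record, and the field names `priority` and `customer_id` are all hypothetical.

```java
import java.util.Map;

/** Illustrative only: a custom rule that picks a node number for a record. */
final class CustomAssignmentSketch {

    /**
     * Sends high-priority records to node 0 and spreads all other records
     * over the remaining nodes by the hash of a (hypothetical) customer_id field.
     */
    static int nodeFor(Map<String, String> record, int nodeCount) {
        if (nodeCount < 2) {
            return 0; // only one node available, nothing to decide
        }
        if ("high".equals(record.get("priority"))) {
            return 0;
        }
        String id = record.get("customer_id");
        int hash = (id == null) ? 0 : id.hashCode();
        // Nodes 1..nodeCount-1 share the remaining records.
        return 1 + Math.floorMod(hash, nodeCount - 1);
    }
}
```

Whatever the rule, the returned value has to be a valid node number for every input record, which is why the sketch guards the single-node case and a missing key field.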

For detailed information about Clover Transformation Language see Part VIII, CTL - CloverETL Transformation Language. (CTL is a full-fledged, yet simple language that allows you to perform almost any imaginable transformation.)

CTL scripting allows you to specify a custom transformation using the simple CTL scripting language.

CTL Templates for ClusterPartitioner

ClusterPartitioner uses the same transformation template as Partition. See CTL Templates for Partition (or ClusterPartitioner).

Once you have written your transformation in CTL, you can also convert it to Java code by clicking the corresponding button in the upper right corner of the tab.

Java Interfaces for ClusterPartitioner

ClusterPartitioner uses the same Java interface as Partition. See Java Interfaces for Partition (and ClusterPartitioner).