Apache Flume Channel Selectors
The channel selector is the Flume component that decides which channel, or channels, an event is written to when a source is connected to more than one channel. It can send the event to a single channel or to several.
This selection happens inside the agent. Flume offers several channel selector types for managing multiple channels, and each is described below.
Replicating channel selector -
This is the default channel selector. If no selector is configured for a source, the replicating channel selector is used.
It writes a copy of each event to every channel configured for the source.
# Replicating channel selector
<Agent_name>.sources = <Source-name>
<Agent_name>.channels = <Channel1> <Channel2>……<Channeln>
<Agent_name>.sources.<Source-name>.selector.type = replicating
<Agent_name>.sources.<Source-name>.channels = <Channel1> <Channel2>……<Channeln>
<Agent_name>.sources.<Source-name>.selector.optional = <Optional-channel-names>
Example for an agent named agt1 and its source called src1 -
agt1.sources = src1
agt1.channels = chnl1 chnl2 chnl3
agt1.sources.src1.selector.type = replicating
agt1.sources.src1.channels = chnl1 chnl2 chnl3
agt1.sources.src1.selector.optional = chnl1
In this setup, chnl1 is an optional channel, so a failed write to it is simply ignored. If a write to chnl2 or chnl3 fails, the whole transaction fails, because those channels are required.
Below are the two properties for the replicating channel selector.
Property Name | Default | Description |
---|---|---|
selector.type | replicating | The component type name, needs to be replicating |
selector.optional | – | Set of channels to be marked as optional |
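Replication only pays off once each channel has a sink draining it. The following is a minimal sketch of a complete agent built around the agt1 example above, assuming a netcat source, memory channels, and logger sinks. The sink names snk1 to snk3 and the port number are hypothetical and chosen only for illustration.

```properties
# Hypothetical complete agent: one source replicated into three channels.
agt1.sources = src1
agt1.channels = chnl1 chnl2 chnl3
agt1.sinks = snk1 snk2 snk3

# Source: netcat listener (bind address and port are illustrative)
agt1.sources.src1.type = netcat
agt1.sources.src1.bind = localhost
agt1.sources.src1.port = 44444
agt1.sources.src1.channels = chnl1 chnl2 chnl3
agt1.sources.src1.selector.type = replicating
agt1.sources.src1.selector.optional = chnl1

# Channels: in-memory buffers with default settings
agt1.channels.chnl1.type = memory
agt1.channels.chnl2.type = memory
agt1.channels.chnl3.type = memory

# Sinks: each sink drains exactly one channel
agt1.sinks.snk1.type = logger
agt1.sinks.snk1.channel = chnl1
agt1.sinks.snk2.type = logger
agt1.sinks.snk2.channel = chnl2
agt1.sinks.snk3.type = logger
agt1.sinks.snk3.channel = chnl3
```

With this wiring, every event received by src1 should appear once in the output of each logger sink, since each channel holds its own copy.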
Multiplexing channel selector -
This channel selector can direct Flume events to various channels based on header information.
# Multiplexing channel selector
<Agent_name>.sources = <Source-name>
<Agent_name>.channels = <Channel1> <Channel2>……<Channeln>
<Agent_name>.sources.<Source-name>.selector.type = multiplexing
<Agent_name>.sources.<Source-name>.selector.header = <header-name>
<Agent_name>.sources.<Source-name>.selector.mapping.<header-category1> = <Channel1> <Channel2>……<Channeln>
<Agent_name>.sources.<Source-name>.selector.mapping.<header-category2> = <Channel1> <Channel2>……<Channeln>
<Agent_name>.sources.<Source-name>.selector.default = <Channel1> <Channel2>……<Channeln>
Example for agent named agt1 and source called src1 -
agt1.sources = src1
agt1.channels = chnl1 chnl2 chnl3 chnl4
agt1.sources.src1.selector.type = multiplexing
agt1.sources.src1.selector.header = grade
agt1.sources.src1.selector.mapping.grade1 = chnl1
agt1.sources.src1.selector.mapping.grade2 = chnl2
agt1.sources.src1.selector.mapping.grade3 = chnl3
agt1.sources.src1.selector.default = chnl4
In this setup, events are routed by the value of the grade header: events with the value grade1 go to chnl1, grade2 to chnl2, and grade3 to chnl3. Events whose grade header is missing or matches no mapping go to the default channel, chnl4.
Below are the properties for the multiplexing channel selector.
Property Name | Default | Description |
---|---|---|
selector.type | replicating | The component type name, needs to be multiplexing |
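selector.header | flume.selector.header | Name of the event header whose value is used for routing |
selector.default | – | Channels used when the header is missing or its value has no mapping |
selector.mapping.* | – | Channels for a given header value, one property per value |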
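The multiplexing selector can only route on headers that are already present on the event. One common way to set such a header is Flume's static interceptor. The sketch below assumes a hypothetical upstream agent named edge1 with a source named in1, and assumes its events reach agt1 over a transport that preserves headers, such as an Avro sink and source pair. It also shows that a single header value may map to more than one channel.

```properties
# Hypothetical upstream agent "edge1": stamps every event with grade=grade1
edge1.sources.in1.interceptors = tag
edge1.sources.in1.interceptors.tag.type = static
edge1.sources.in1.interceptors.tag.key = grade
edge1.sources.in1.interceptors.tag.value = grade1

# Downstream agent agt1 routes on that header.
# Here grade1 events are written to both chnl1 and chnl2;
# anything without a matching mapping falls through to chnl4.
agt1.sources.src1.selector.type = multiplexing
agt1.sources.src1.selector.header = grade
agt1.sources.src1.selector.mapping.grade1 = chnl1 chnl2
agt1.sources.src1.selector.default = chnl4
```

Setting the header on the upstream agent keeps the routing decision out of the event body, so the downstream selector never has to parse payloads.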
Custom channel selector -
A custom channel selector is the user’s own implementation of the ChannelSelector interface. When you create one, add its class and all of its dependencies to the agent’s classpath before starting Flume.
# Custom channel selector
<Agent_name>.sources = <Source-name>
<Agent_name>.channels = <Channel1>
<Agent_name>.sources.<Source-name>.selector.type = <FQCN-of-custom-selector>
Example for agent named agt1 and its source called src1 -
agt1.sources = src1
agt1.channels = chnl
agt1.sources.src1.selector.type = example.ChannelSelector
Below is the property for the custom channel selector.
Property Name | Default | Description |
---|---|---|
selector.type | – | The component type name, needs to be the fully qualified class name (FQCN) of your selector |
A selector is defined per source, using that source’s selector.type property. Channel selectors work between the source and its channels: they decide which channels each event is written to. Because every sink drains exactly one channel, that choice also determines which sink, and therefore which destination (for example a particular HDFS cluster or HBase system), receives the event.
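The link between channel choice and storage destination can be sketched in configuration. The fragment below extends the multiplexing example with two HDFS sinks; the sink names, host names, and paths are hypothetical placeholders, not values from a real deployment.

```properties
# Hypothetical sinks for the multiplexing example above.
agt1.sinks = snk1 snk2

# Events routed to chnl1 are written to the first cluster
agt1.sinks.snk1.type = hdfs
agt1.sinks.snk1.channel = chnl1
agt1.sinks.snk1.hdfs.path = hdfs://namenode-a.example.com/flume/grade1

# Events routed to chnl2 are written to the second cluster
agt1.sinks.snk2.type = hdfs
agt1.sinks.snk2.channel = chnl2
agt1.sinks.snk2.hdfs.path = hdfs://namenode-b.example.com/flume/grade2
```

The selector itself never refers to a sink or a cluster. It only picks channels, and the sink bound to each channel fixes where the data lands.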