Summary -

This topic covers Flume's optional components: interceptors, channel selectors and sink processors.

The components discussed in the architecture topic play a key role in Flume and are always required in a Flume process. Flume also has some additional, optional components that play a key role in transferring events from data generators to the centralized store.

Interceptors -

Interceptors inspect Flume events and can also alter them. They are used to modify or drop events in flight, and Flume gets this capability from interceptors.

An interceptor also decides what sort of data should pass through to the channel. It can modify or drop events based on any criteria chosen by the developer of the interceptor.

Flume supports chaining of interceptors, so several interceptors can be applied to the same event one after another. Refer to Interceptors for more details.
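
As a minimal sketch of attaching interceptors to a source, the configuration below assumes an agent a1 with a source r1; the interceptor names i1 and i2 are illustrative and do not come from the text above. It uses two of Flume's built-in interceptor types, timestamp and regex_filter, and the regular expression is only an example.

```properties
# Assumed agent a1 and source r1; interceptor names i1 and i2 are illustrative.
# Interceptors run in the order in which they are listed.
a1.sources.r1.interceptors = i1 i2
# i1: built-in timestamp interceptor, adds a timestamp header to each event
a1.sources.r1.interceptors.i1.type = timestamp
# i2: built-in regex filter; with excludeEvents = true, events whose body
# matches the regex are dropped instead of kept
a1.sources.r1.interceptors.i2.type = regex_filter
a1.sources.r1.interceptors.i2.regex = ^DEBUG.*
a1.sources.r1.interceptors.i2.excludeEvents = true
```

With this setup, i1 modifies every event in flight and i2 drops the events that match, which are the two roles described above.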

Flume Channel Selectors -

Channel selectors determine which channel or channels an event is written to when a source is connected to multiple channels.

Replicating Channel Selector -

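It replicates every event to each configured channel. This is the default channel selector: if no selector type is specified, the replicating selector is used. Channels can be marked as optional. A failure to write to an optional channel is ignored, whereas a failure to write to any other channel causes the transaction to fail.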

Property Name      Default      Description
selector.type      replicating  The component type name, needs to be replicating
selector.optional  -            Set of channels to be marked as optional

Example:

a1.sources = r1
a1.channels = c1 c2
a1.sources.r1.selector.type = replicating
a1.sources.r1.channels = c1 c2
a1.sources.r1.selector.optional = c2

In the above example configuration, c2 is an optional channel.

Multiplexing Channel Selector -

It selects the channel or channels used to transmit an event when multiple channels are available. The decision is based on the value of a configured header of the event.

Property Name       Default                Description
selector.type       replicating            The component type name, needs to be multiplexing
selector.header     flume.selector.header
selector.default    -
selector.mapping.*  -

Example:

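The example below routes events based on the value of the country header: events with country IN go to c1, events with country US go to c2, and all other events go to the default channel c3.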

a1.sources = r1
a1.channels = c1 c2 c3
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = country
a1.sources.r1.selector.mapping.IN = c1
a1.sources.r1.selector.mapping.US = c2
a1.sources.r1.selector.default = c3

Custom Channel Selector -

A custom channel selector is the developer’s own implementation of the ChannelSelector interface. The custom channel selector’s class and its dependencies must be included in the agent’s classpath before the Flume agent starts.

Property Name  Default  Description
selector.type  -        The component type name, needs to be your FQCN

Example:

a1.sources = r1
a1.channels = c1
a1.sources.r1.selector.type = org.example.MyChannelSelector

Sink Processors -

Sink groups allow users to group multiple sinks into one entity. A sink processor decides which sink in the group is invoked. Sink processors can be used to provide load balancing across the sinks in a group, or failover from one sink to another.

The types of sink processors are -

  1. Default Sink Processor
  2. Failover Sink Processor
  3. Load balancing Sink Processor
  4. Custom Sink Processor
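
As a rough sketch of how a sink group and its processor are configured, the example below assumes an agent a1 whose sinks k1 and k2 are already defined elsewhere; the group name g1 and the priority values are illustrative. It shows the failover processor.

```properties
# Assumed agent a1 with existing sinks k1 and k2; group name g1 is illustrative.
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
# Failover: the available sink with the highest priority value is used
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10
# Upper limit, in milliseconds, on the backoff period for a failed sink
a1.sinkgroups.g1.processor.maxpenalty = 10000
```

To spread events across the sinks instead, the processor type can be set to load_balance, with processor.selector set to round_robin or random. The priority and maxpenalty properties apply only to the failover processor.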