Install Apache Flink on a Multi-node Cluster: RHEL 8


  • Set up passwordless SSH between the nodes for easy communication
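
A minimal sketch of one way to do this with OpenSSH, assuming the same user account exists on every node. The function name distribute_key, the key path, and the hostnames in the usage note are hypothetical.

```shell
# Key path is an assumption; any key type accepted by your sshd works.
KEY_FILE="${KEY_FILE:-$HOME/.ssh/id_ed25519}"

# Create a key pair without a passphrase (only if none exists yet) and
# copy the public key to every host given as an argument.
distribute_key() {
    mkdir -p "$(dirname "$KEY_FILE")"
    if [ ! -f "$KEY_FILE" ]; then
        ssh-keygen -t ed25519 -N "" -f "$KEY_FILE" || return 1
    fi
    for host in "$@"; do
        # Prompts once for the remote password of the current user.
        ssh-copy-id -i "$KEY_FILE.pub" "$host" || return 1
    done
}
```

Run `distribute_key worker1 worker2` on the master node, then check with `ssh worker1 hostname` that no password prompt appears.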

Setting up the cluster nodes:

Install Java on all nodes in the cluster. This guide uses OpenJDK 8; Flink 1.16 runs on Java 8 or Java 11.

sudo yum install java-1.8.0-openjdk-devel

Install Apache ZooKeeper on all nodes in the cluster.

ZooKeeper is used for coordination between the nodes. 

Install ZooKeeper 3.6.2:
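
The download step might look like the sketch below. It assumes the 3.6.2 release is still hosted in the Apache release archive at the URL shown; that location is an assumption, so check the ZooKeeper releases page if the download fails. The function name is hypothetical.

```shell
ZK_VERSION="3.6.2"
ZK_TARBALL="apache-zookeeper-${ZK_VERSION}-bin.tar.gz"
# Assumed location in the Apache release archive.
ZK_URL="https://archive.apache.org/dist/zookeeper/zookeeper-${ZK_VERSION}/${ZK_TARBALL}"

# Download the tarball into the current directory.
download_zookeeper() {
    curl --fail --location --remote-name "$ZK_URL"
}
```

Run `download_zookeeper` on each node before unpacking the archive.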


tar -xvf apache-zookeeper-3.6.2-bin.tar.gz 

sudo mv apache-zookeeper-3.6.2-bin /usr/local/zookeeper

Installing Apache Flink:

Download Apache Flink (1.16.1 at the time of writing) from the official website:
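
A sketch of the download, assuming the Scala 2.12 build of 1.16.1 is available from the Apache release archive at the URL below. The URL and the function name are assumptions; the Flink downloads page lists current mirrors.

```shell
FLINK_VERSION="1.16.1"
FLINK_TARBALL="flink-${FLINK_VERSION}-bin-scala_2.12.tgz"
# Assumed location in the Apache release archive.
FLINK_URL="https://archive.apache.org/dist/flink/flink-${FLINK_VERSION}/${FLINK_TARBALL}"

# Download the tarball into the current directory.
download_flink() {
    curl --fail --location --remote-name "$FLINK_URL"
}
```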


Unpack the archive to a directory on all nodes in the cluster:

tar -xvf flink-1.16.1-bin-scala_2.12.tgz
sudo mv flink-1.16.1 /usr/local/flink

Configuring the Apache Flink cluster:

  • Back up the flink-conf.yaml configuration file before customizing it:
cd /usr/local/flink/conf
cp flink-conf.yaml flink-conf.yaml.orig
  • Configure the jobmanager.rpc.address setting to the hostname or IP address of the master node.
  • Configure the taskmanager.numberOfTaskSlots setting to the number of parallel tasks that each task manager should run.
  • Configure the taskmanager.memory.process.size setting to the amount of memory that each task manager should use.

For example:

taskmanager.memory.process.size: 4GB
taskmanager.numberOfTaskSlots: 30
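
One way to apply such settings on every node without editing the file by hand is a small helper that rewrites a key if it is present and appends it otherwise. This is only a sketch: the helper name set_flink_conf is hypothetical, it relies on GNU sed (as shipped with RHEL), and it assumes flat "key: value" lines as in the stock flink-conf.yaml.

```shell
# Set "key: value" in a flat YAML file. Commented-out lines are left alone.
# Values must not contain "|" or "&", which are special in the sed command.
set_flink_conf() {
    file="$1"; key="$2"; value="$3"
    # Escape dots so the key is matched literally by grep and sed.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^${pattern}:" "$file"; then
        sed -i "s|^${pattern}:.*|${key}: ${value}|" "$file"
    else
        printf '%s: %s\n' "$key" "$value" >> "$file"
    fi
}
```

For example, `set_flink_conf /usr/local/flink/conf/flink-conf.yaml jobmanager.rpc.address master-host`, where master-host stands in for your master node.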

Configure high availability using ZooKeeper:

zoo.cfg is the ZooKeeper configuration file (in /usr/local/zookeeper/conf for the installation above). It defines the ZooKeeper ensemble that Flink relies on for high availability.

The following details need to be added to this file:

  1. Data Directory (dataDir): Specify the directory where ZooKeeper will store its data.
  2. Client Port (clientPort): Specify the port that ZooKeeper will listen on for client connections.
  3. Server List (server.N): Specify each server in the ZooKeeper ensemble as server.N=hostname:peerPort:electionPort. These are the quorum and leader-election ports, not the client port. Each server also needs a myid file in its data directory containing its own N.
  4. Tick Time (tickTime): Specify the length of a single tick in milliseconds; the tick is the basic time unit used by ZooKeeper.
  5. Init Limit (initLimit): Specify the number of ticks that a follower may take to connect to the leader and complete its initial synchronization.
  6. Sync Limit (syncLimit): Specify the number of ticks that a follower can be behind the leader before it is dropped.
  7. Snapshot Counter (snapCount): Specify the number of transactions that can be processed before a snapshot of the ZooKeeper state is taken.

Here’s an example of a basic zoo.cfg file:
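
A minimal sketch for a three-node ensemble, written through a shell here-document so it can be scripted per node. The hostnames zk1 to zk3, the data directory, and every numeric value are illustrative assumptions, and write_zoo_cfg is a hypothetical helper name.

```shell
# Write a sample zoo.cfg into the directory given as $1.
write_zoo_cfg() {
    cat > "$1/zoo.cfg" <<'EOF'
# Basic time unit in milliseconds
tickTime=2000
# Ticks a follower may take to connect and sync to the leader
initLimit=10
# Ticks a follower may lag behind the leader
syncLimit=5
# Where snapshots and the myid file live
dataDir=/var/lib/zookeeper
# Port for client connections (Flink connects here)
clientPort=2181
# Transactions between snapshots
snapCount=100000
# Ensemble members: server.<myid>=<host>:<peer port>:<election port>
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
EOF
}
```

Call it as `write_zoo_cfg /usr/local/zookeeper/conf`, then write the node's own number into the data directory, for example `echo 1 > /var/lib/zookeeper/myid` on zk1.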




Add the following lines to the flink-conf.yaml file:

high-availability: zookeeper
high-availability.zookeeper.quorum: host1:port,host2:port,host3:port
high-availability.zookeeper.path.root: /flink

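Note: Replace host1:port, host2:port, and host3:port with the hostnames and client ports of the ZooKeeper nodes in your cluster. ZooKeeper high availability also requires high-availability.storageDir, set to a directory on storage that all nodes can reach, where Flink keeps the JobManager recovery metadata.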
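
The quorum value is a comma-separated list of host:port pairs with no spaces. A small sketch that builds it from a host list, assuming hypothetical hostnames zk1 to zk3 and ZooKeeper's customary client port 2181:

```shell
# Hypothetical ZooKeeper hosts and client port; replace with your own.
ZK_HOSTS="zk1 zk2 zk3"
ZK_PORT=2181

# Join the hosts as host:port pairs separated by commas.
quorum=""
for h in $ZK_HOSTS; do
    quorum="${quorum:+$quorum,}$h:$ZK_PORT"
done

echo "high-availability.zookeeper.quorum: $quorum"
```

The printed line can be pasted into flink-conf.yaml on every node so that all JobManagers and TaskManagers see the same ensemble.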

Starting the cluster:

Start ZooKeeper on all nodes in the cluster:

cd /usr/local/zookeeper/bin
./zkServer.sh start

Start the JobManager on the master node by running the following command:

cd /usr/local/flink/bin
./jobmanager.sh start

Start the TaskManagers on all other nodes by running the following command on each node:

cd /usr/local/flink/bin
./taskmanager.sh start
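
As an alternative to starting each process by hand, the Flink distribution includes bin/start-cluster.sh, which reads conf/masters and conf/workers and starts the daemons over SSH (this is where the passwordless SSH setup pays off). The sketch below writes those two files; the helper name and the hostnames in the usage note are hypothetical.

```shell
# Write conf/masters and conf/workers for bin/start-cluster.sh.
# $1 = Flink conf directory, $2 = JobManager host, rest = TaskManager hosts.
write_cluster_files() {
    conf_dir="$1"; master="$2"; shift 2
    # masters takes host:port; 8081 is Flink's default web UI/REST port.
    printf '%s:8081\n' "$master" > "$conf_dir/masters"
    # workers takes one TaskManager host per line.
    printf '%s\n' "$@" > "$conf_dir/workers"
}
```

For example, `write_cluster_files /usr/local/flink/conf master1 worker1 worker2`, followed by `/usr/local/flink/bin/start-cluster.sh` on the master node.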

Here are the additional steps for setting up TLS/SSL (HTTPS):

  1. Obtain a certificate
  2. Install the certificate: Copy the certificate and private key files to a location on each node in the cluster. The location should be accessible to the user that runs the Flink process.
  3. Install OpenSSL: If it’s not already installed, install the OpenSSL package on each node in the cluster. You can do this by running the following command:
sudo yum install openssl

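Configure Flink: Modify the flink-conf.yaml file on each node to enable SSL/TLS. Flink reads its certificates from Java keystores rather than from PEM files, so the certificate and private key must first be imported into a keystore, and the configuration points at that keystore. Here is an example configuration for the REST endpoint and web UI (the security.ssl.internal.* options secure traffic between Flink processes in the same way):

security.ssl.rest.enabled: true
security.ssl.rest.keystore: /path/to/rest.keystore
security.ssl.rest.keystore-password: <keystore_password>
security.ssl.rest.key-password: <key_password>
security.ssl.rest.truststore: /path/to/rest.truststore
security.ssl.rest.truststore-password: <truststore_password>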
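
Flink's SSL options take Java keystores, so a PEM certificate and key need converting before Flink can use them. The sketch below is one way to do that with openssl and the JDK's keytool; the function name, the alias "flink", and the STORE_PASS variable are hypothetical, and passing passwords on the command line exposes them to other local users, so treat this as a starting point only.

```shell
# Bundle a PEM certificate and key into a PKCS12 file, then import that
# into a Java keystore.
# $1 = cert.pem, $2 = key.pem, $3 = output keystore path.
# STORE_PASS must be set by the caller.
pem_to_keystore() {
    cert="$1"; key="$2"; out="$3"
    p12="$out.p12"
    openssl pkcs12 -export -in "$cert" -inkey "$key" \
        -name flink -out "$p12" -passout "pass:$STORE_PASS" || return 1
    keytool -importkeystore -noprompt \
        -srckeystore "$p12" -srcstoretype PKCS12 -srcstorepass "$STORE_PASS" \
        -destkeystore "$out" -deststorepass "$STORE_PASS"
}
```

For example, `STORE_PASS=changeit pem_to_keystore /path/to/cert.pem /path/to/key.pem /path/to/rest.keystore`.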

Restart the nodes: After making the changes to the configuration file, restart the Job Manager and Task Manager nodes.

Verify the configuration: You can verify that the configuration is working by accessing the Flink web UI using an HTTPS URL (e.g. https://<jobmanager_host>:8081). The browser should show that the connection is secure and that the certificate was issued by a trusted CA.
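
The same check can be made from a terminal. This sketch uses openssl to print who issued the certificate that the JobManager serves; the function name is hypothetical and 8081 is assumed to be the REST port.

```shell
# Print subject, issuer and expiry date of the certificate served on $1:$2.
show_server_cert() {
    host="$1"; port="${2:-8081}"
    openssl s_client -connect "$host:$port" -servername "$host" \
        </dev/null 2>/dev/null \
        | openssl x509 -noout -subject -issuer -enddate
}
```

Run `show_server_cert <jobmanager_host>` and compare the issuer line with the CA that signed your certificate.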


