Installing a Kafka Cluster

I have written about the challenges of configuring Kafka servers and topics and about how to create a Kafka topic efficiently, yet I never covered the basics: installing a Kafka cluster. This post corrects that omission.

Prerequisites

The operating system used throughout this post is CentOS 8. Kafka and Zookeeper run in Docker containers. This lets Kafka and Zookeeper run isolated from each other on the same server (no separate dedicated servers are needed for Zookeeper and Kafka) and also considerably speeds up upgrades of both, since an upgrade amounts to replacing the Docker image and editing the start script.

The Docker images I will use throughout this post are the Bitnami images. I use Bitnami primarily because its containers do not run with root privileges, and because Bitnami maintains its Docker images very diligently. A Bitnami image can also be started with root privileges, which I do not recommend at all; more information on the limitations and advantages of non-root and root containers can be found in the official Bitnami documentation.

This guide is written for a single-node Kafka instance. Bringing up a multi-node cluster requires only minor changes to the start scripts (plus, of course, installing the additional servers).

Network configuration

After installing CentOS 8, configure the network adapter and the host name. I will not go into detail here, since I have covered those steps in other articles. What matters is that the server has a static IP address and a correctly configured host name.

Updating and installing packages

Kafka and Zookeeper run in Docker containers, so Docker has to be installed. Before upgrading the packages, add the Docker repository.

$ dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

Update the packages

$ dnf -y update

Install the set of packages needed for basic operation and maintenance

$ dnf -y install nano net-tools wget git bind-utils bash-completion device-mapper-persistent-data lvm2 java-11-openjdk-devel

Install Docker

$ dnf -y install docker-ce docker-ce-cli containerd.io --nobest

Enable and start the Docker service, then check its status

$ systemctl enable docker
$ systemctl start docker
$ systemctl status docker

Firewall configuration

Kafka and Zookeeper need five ports in total for communication among themselves, between Kafka and Zookeeper, and with producers and consumers:

  • TCP/9092 – Kafka Broker
  • TCP/9696 – Kafka JMX (a different port can be used, as long as it matches the start script for the Kafka Docker container)
  • TCP/2888 – Zookeeper
  • TCP/3888 – Zookeeper
  • TCP/2181 – Kafka to Zookeeper communication

Two XML documents listing these ports need to be created, one for the Kafka service and one for the Zookeeper service, and then added to the firewall rules. Everything could go into a single XML document, but I prefer to keep the services separate.

First, create the XML document for the Kafka set of ports

$ nano /etc/firewalld/services/kafka.xml

Add the following content to the file

<?xml version="1.0" encoding="utf-8"?>
<service>
    <short>kafka</short>
    <description>Kafka and Kafka JMX ports</description>
    <port protocol="tcp" port="9092"/>
    <port protocol="tcp" port="9696"/>
</service>

Create the XML document for the Zookeeper set of ports

$ nano /etc/firewalld/services/zookeeper.xml

Add the following content to the file

<?xml version="1.0" encoding="utf-8"?>
<service>
    <short>zookeeper</short>
    <description>Zookeeper ports</description>
    <port protocol="tcp" port="2888"/>
    <port protocol="tcp" port="3888"/>
    <port protocol="tcp" port="2181"/>
</service>

Load both services into the firewall

$ firewall-cmd --permanent --add-service=kafka
$ firewall-cmd --permanent --add-service=zookeeper
$ firewall-cmd --reload
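
The two service files above follow the same pattern, so they could also be generated by a small script instead of being typed by hand. The following is only a sketch: the function name make_service_xml is my own and not part of firewalld, and the demo just prints to standard output rather than writing into /etc/firewalld/services/.

```shell
#!/bin/sh
# Print a firewalld service definition for the given TCP ports.
# make_service_xml is a hypothetical helper name, not a firewalld command.
# Usage: make_service_xml <short-name> <description> <port>...
make_service_xml() {
    local short description port
    short="$1"
    description="$2"
    shift 2
    printf '<?xml version="1.0" encoding="utf-8"?>\n'
    printf '<service>\n'
    printf '    <short>%s</short>\n' "$short"
    printf '    <description>%s</description>\n' "$description"
    for port in "$@"; do
        printf '    <port protocol="tcp" port="%s"/>\n' "$port"
    done
    printf '</service>\n'
}

# Demo: print both definitions. To install one, redirect the output of a
# single call to /etc/firewalld/services/<name>.xml as root.
make_service_xml kafka "Kafka and Kafka JMX ports" 9092 9696
make_service_xml zookeeper "Zookeeper ports" 2888 3888 2181
```

After the reload, running firewall-cmd --list-services should show both kafka and zookeeper among the active services.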

Creating the user account and group

Bitnami Docker images run with non-root user privileges and with a hard-coded ID: 1001. A user account and a group with that ID therefore have to be created so that the containers can work properly and read and write data on the disk subsystem.

Create a group named “kafka” with ID “1001”

$ groupadd kafka -g 1001

Create a user account named “svckafkalab” with ID “1001”

$ useradd svckafkalab -u 1001 -g 1001
$ passwd svckafkalab
$ usermod -aG wheel svckafkalab
$ chage -I -1 -m 0 -M 99999 -E -1 svckafkalab
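
Because the containers depend on the numeric IDs rather than on the account name, it can be worth confirming that the account really received UID and GID 1001. A minimal sketch follows; the helper name has_ids is my own, and it only wraps the standard id command.

```shell
#!/bin/sh
# Succeed when <user> exists and has the expected numeric UID and primary GID.
# has_ids is a hypothetical helper name used only for this illustration.
# Usage: has_ids <user> <uid> <gid>
has_ids() {
    local got_uid got_gid
    got_uid="$(id -u "$1" 2>/dev/null)" || return 1
    got_gid="$(id -g "$1" 2>/dev/null)" || return 1
    [ "$got_uid" = "$2" ] && [ "$got_gid" = "$3" ]
}

# Demo: report on the account created above.
if has_ids svckafkalab 1001 1001; then
    echo "svckafkalab has UID/GID 1001"
else
    echo "svckafkalab is missing or has different IDs"
fi
```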

Creating directories

Directories are needed to store the data of the Kafka broker and Zookeeper. A directory for the truststore and keystore is needed as well.

$ mkdir -p /opt/data/kafka/
$ mkdir -p /opt/data/zookeeper/
$ mkdir -p /opt/scripts/
$ mkdir -p /opt/certificates/

Set permissions on the directories

$ chmod 775 -R /opt/data/
$ chmod 775 -R /opt/scripts/
$ chmod 775 -R /opt/certificates/

SSL configuration

To secure communication between the Kafka brokers and the producers and consumers, certificates have to be created and stored in a keystore and a truststore. The Confluent shell helper script will be used to generate them.

I recommend generating a wildcard certificate. In that case this step only has to be done on the first broker, and the truststore and keystore are then copied to the other brokers. Otherwise, certificates, a keystore, and a truststore have to be generated on every broker and then packed into a single truststore and keystore that is used on all brokers and by all producers and consumers.

In the example below, pay attention to lines 35 and 112 of the script output, that is, the Common Name entered for the CA certificate and the "first and last name" entered for the keystore. The two must match; otherwise the broker will not trust the certificate and Kafka will fail to start.

Download the shell helper script and make it executable.

$ cd /opt/certificates/
$ wget https://raw.githubusercontent.com/confluentinc/confluent-platform-security-tools/master/kafka-generate-ssl.sh
$ chmod +x kafka-generate-ssl.sh

Run the helper script and follow the prompts.

$ ./kafka-generate-ssl.sh

An example run of the script is shown below. Once again, note the use of a wildcard CN and make sure lines 35 and 112 (the Common Name and the "first and last name" answers, both *.lab.local here) are identical. Be sure to write down the passwords you use, since they will be needed later to access the keystore.

Welcome to the Kafka SSL keystore and truststore generator script.
 
First, do you need to generate a trust store and associated private key,
or do you already have a trust store file and private key?
 
Do you need to generate a trust store and associated private key? [yn] y
 
OK, we'll generate a trust store and associated private key.
 
First, the private key.
 
You will be prompted for:
 - A password for the private key. Remember this.
 - Information about you and your company.
 - NOTE that the Common Name (CN) is currently not important.
Generating a RSA private key
........................................................+++++
......+++++
writing new private key to 'truststore/ca-key'
Enter PEM pass phrase:
Verifying - Enter PEM pass phrase:
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [XX]:HR
State or Province Name (full name) []:Grad Zagreb
Locality Name (eg, city) [Default City]:Zagreb
Organization Name (eg, company) [Default Company Ltd]:Lab
Organizational Unit Name (eg, section) []:OJ Lab
Common Name (eg, your name or your server's hostname) []:*.lab.local
Email Address []:[email protected]
 
Two files were created:
 - truststore/ca-key -- the private key used later to
   sign certificates
 - truststore/ca-cert -- the certificate that will be
   stored in the trust store in a moment and serve as the certificate
   authority (CA). Once this certificate has been stored in the trust
   store, it will be deleted. It can be retrieved from the trust store via:
   $ keytool -keystore <trust-store-file> -export -alias CARoot -rfc
 
Now the trust store will be generated from the certificate.
 
You will be prompted for:
 - the trust store's password (labeled 'keystore'). Remember this
 - a confirmation that you want to import the certificate
Enter keystore password:
Re-enter new password:
Owner: [email protected], CN=*.lab.local, OU=OJ Lab, O=Lab, L=Zagreb, ST=Grad Zagreb, C=HR
Issuer: [email protected], CN=*.lab.local, OU=OJ Lab, O=Lab, L=Zagreb, ST=Grad Zagreb, C=HR
Serial number: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Valid from: Mon Dec 16 15:39:07 CET 2019 until: Thu Dec 13 15:39:07 CET 2029
Certificate fingerprints:
         SHA1: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
         SHA256: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Signature algorithm name: SHA256withRSA
Subject Public Key Algorithm: 2048-bit RSA key
Version: 3
 
Extensions:
 
#1: ObjectId: 2.5.29.35 Criticality=false
AuthorityKeyIdentifier [
KeyIdentifier [
XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
]
]
 
#2: ObjectId: 2.5.29.19 Criticality=true
BasicConstraints:[
  CA:true
  PathLen:2147483647
]
 
#3: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
]
]
 
Trust this certificate? [no]:  yes
Certificate was added to keystore
 
truststore/kafka.truststore.jks was created.
 
Continuing with:
 - trust store file:        truststore/kafka.truststore.jks
 - trust store private key: truststore/ca-key
 
Now, a keystore will be generated. Each broker and logical client needs its own
keystore. This script will create only one keystore. Run this script multiple
times for multiple keystores.
 
You will be prompted for the following:
 - A keystore password. Remember it.
 - Personal information, such as your name.
     NOTE: currently in Kafka, the Common Name (CN) does not need to be the FQDN of
           this host. However, at some point, this may change. As such, make the CN
           the FQDN. Some operating systems call the CN prompt 'first / last name'
 - A key password, for the key being generated within the keystore. Remember this.
Enter keystore password:
Re-enter new password:
What is your first and last name?
  [Unknown]:  *.lab.local
What is the name of your organizational unit?
  [Unknown]:  OJ Lab
What is the name of your organization?
  [Unknown]:  Lab
What is the name of your City or Locality?
  [Unknown]:  Zagreb
What is the name of your State or Province?
  [Unknown]:  Grad Zagreb
What is the two-letter country code for this unit?
  [Unknown]:  HR
Is CN=*.lab.local, OU=OJ Lab, O=Lab, L=Zagreb, ST=Grad Zagreb, C=HR correct?
  [no]:  yes
 
 
'keystore/kafka.keystore.jks' now contains a key pair and a
self-signed certificate. Again, this keystore can only be used for one broker or
one logical client. Other brokers or clients need to generate their own keystores.
 
Fetching the certificate from the trust store and storing in ca-cert.
 
You will be prompted for the trust store's password (labeled 'keystore')
Enter keystore password:
Certificate stored in file <ca-cert>
 
Now a certificate signing request will be made to the keystore.
 
You will be prompted for the keystore's password.
Enter keystore password:
 
Now the trust store's private key (CA) will sign the keystore's certificate.
 
You will be prompted for the trust store's private key password.
Signature ok
subject=C = HR, ST = Grad Zagreb, L = Zagreb, O = Lab, OU = OJ Lab, CN = *.lab.local
Getting CA Private Key
Enter pass phrase for truststore/ca-key:
 
Now the CA will be imported into the keystore.
 
You will be prompted for the keystore's password and a confirmation that you want to
import the certificate.
Enter keystore password:
Owner: [email protected], CN=*.lab.local, OU=OJ Lab, O=Lab, L=Zagreb, ST=Grad Zagreb, C=HR
Issuer: [email protected], CN=*.lab.local, OU=OJ Lab, O=Lab, L=Zagreb, ST=Grad Zagreb, C=HR
Serial number: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Valid from: Mon Dec 16 15:39:07 CET 2019 until: Thu Dec 13 15:39:07 CET 2029
Certificate fingerprints:
         SHA1: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
         SHA256: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Signature algorithm name: SHA256withRSA
Subject Public Key Algorithm: 2048-bit RSA key
Version: 3
 
Extensions:
 
#1: ObjectId: 2.5.29.35 Criticality=false
AuthorityKeyIdentifier [
KeyIdentifier [
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
]
]
 
#2: ObjectId: 2.5.29.19 Criticality=true
BasicConstraints:[
  CA:true
  PathLen:2147483647
]
 
#3: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
]
]
 
Trust this certificate? [no]:  yes
Certificate was added to keystore
 
Now the keystore's signed certificate will be imported back into the keystore.
 
You will be prompted for the keystore's password.
Enter keystore password:
Certificate reply was installed in keystore
 
All done!
 
Delete intermediate files? They are:
 - 'ca-cert.srl': CA serial number
 - 'cert-file': the keystore's certificate signing request
   (that was fulfilled)
 - 'cert-signed': the keystore's certificate, signed by the CA, and stored back
    into the keystore
Delete? [yn] y
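
The requirement that both Common Name values be identical can be checked mechanically once the distinguished names are at hand. The sketch below is my own illustration, not part of the Confluent script: the helper names extract_cn and same_cn are hypothetical, and the two sample strings are copied from the output above.

```shell
#!/bin/sh
# Print the CN component of a distinguished name, accepting both the
# keytool style ("CN=x, OU=y") and the openssl style ("OU = y, CN = x").
# extract_cn and same_cn are hypothetical helper names.
extract_cn() {
    printf '%s\n' "$1" | sed -n 's/.*CN *= *\([^,]*\).*/\1/p'
}

# Succeed when both distinguished names carry the same, non-empty CN.
same_cn() {
    [ -n "$(extract_cn "$1")" ] && [ "$(extract_cn "$1")" = "$(extract_cn "$2")" ]
}

# Demo with the two names that appear in the script output above.
ca_dn='CN=*.lab.local, OU=OJ Lab, O=Lab, L=Zagreb, ST=Grad Zagreb, C=HR'
broker_dn='C = HR, ST = Grad Zagreb, L = Zagreb, O = Lab, OU = OJ Lab, CN = *.lab.local'
if same_cn "$ca_dn" "$broker_dn"; then
    echo "CN values match: $(extract_cn "$ca_dn")"
else
    echo "CN values differ"
fi
```

For an existing keystore, the names can be read from the Owner and Issuer lines printed by keytool -list -v -keystore keystore/kafka.keystore.jks.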

After generating the certificates, give the Kafka user account (svckafkalab) and group (kafka) ownership of the directories that Kafka and Zookeeper need.

$ chown -R svckafkalab:kafka /opt/data/
$ chown -R svckafkalab:kafka /opt/scripts/
$ chown -R svckafkalab:kafka /opt/certificates/

Starting Kafka and Zookeeper

Before starting Kafka and Zookeeper it is best to pull the Docker images from Docker Hub. It is wise to use explicit tags instead of the “latest” tag. With the “latest” tag, a restart of Kafka or Zookeeper can pull a new version that then differs from the one running on the other nodes of the cluster. Explicit tags keep the Kafka and Zookeeper versions under control on all nodes.

$ docker pull bitnami/kafka:2.4.0-debian-9-r16
$ docker pull bitnami/zookeeper:3.5.6-debian-9-r59
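
To guard against a “latest” tag slipping into a start script, an image reference can be checked before it is used. This is a sketch only; is_pinned_image is a name I made up, and the rule it applies (an explicit tag other than latest, or a sha256 digest) is my reading of the advice above rather than anything Docker enforces.

```shell
#!/bin/sh
# Succeed when an image reference carries an explicit tag other than
# "latest", or a sha256 digest. is_pinned_image is a hypothetical helper.
is_pinned_image() {
    local name
    name="${1##*/}"    # drop registry and namespace, keep e.g. "kafka:2.4.0"
    case "$name" in
        *@sha256:*) return 0 ;;   # pinned by digest
        *:latest)   return 1 ;;   # floating tag
        *:?*)       return 0 ;;   # any other explicit tag
        *)          return 1 ;;   # no tag at all means "latest"
    esac
}

# Demo: classify the two images used in this post plus a floating reference.
for image in bitnami/kafka:2.4.0-debian-9-r16 \
             bitnami/zookeeper:3.5.6-debian-9-r59 \
             bitnami/kafka:latest; do
    if is_pinned_image "$image"; then
        echo "pinned:   $image"
    else
        echo "floating: $image"
    fi
done
```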

To make starting the Kafka and Zookeeper Docker containers easier, create two shell scripts: kafka.sh and zookeeper.sh

Create the zookeeper.sh script

$ nano /opt/scripts/zookeeper.sh

The content of the script should look similar to the following

#!/bin/bash
DATA_DIR=/opt/data/zookeeper
docker run -d --restart unless-stopped --network host \
         -e ZOO_SERVER_ID=1 \
         -e ZOO_SERVERS=0.0.0.0:2888:3888 \
         -e ZOO_ENABLE_AUTH=yes \
         -e ZOO_SERVER_USERS=user \
         -e ZOO_SERVER_PASSWORDS=password \
         -p 2181:2181 \
         -p 2888:2888 \
         -p 3888:3888 \
         -v $DATA_DIR:/bitnami/zookeeper \
         --name zookeeper_`date '+%d.%m.%Y'` \
         bitnami/zookeeper:3.5.6-debian-9-r59

If the cluster consists of several nodes, all of them have to be listed in the ZOO_SERVERS variable. Note that in the script on each server, that server's own entry is written with the IP address 0.0.0.0. ZOO_SERVER_ID also has to be changed according to the node's position in the cluster. ZOO_SERVER_USERS and ZOO_SERVER_PASSWORDS are used for Kafka -> Zookeeper communication.

Set appropriate permissions on the script

$ chmod +x /opt/scripts/zookeeper.sh
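
For a multi-node cluster, the ZOO_SERVERS value can be derived from the list of nodes so that each server gets 0.0.0.0 in its own position. The sketch below rests on several assumptions of mine: the host names are invented, the entries are joined with commas as I recall the Bitnami documentation describing, and server IDs are taken to follow the order of the list starting at 1.

```shell
#!/bin/sh
# Build a ZOO_SERVERS value in which the local node is written as 0.0.0.0.
# build_zoo_servers is a hypothetical helper; the comma separator and the
# ID-equals-list-position rule are assumptions, not taken from this post.
# Usage: build_zoo_servers <local-server-id> <host>...
build_zoo_servers() {
    local local_id index entry out
    local_id="$1"
    shift
    index=1
    out=""
    for entry in "$@"; do
        if [ "$index" -eq "$local_id" ]; then
            entry="0.0.0.0"    # this server's own slot
        fi
        out="${out:+$out,}${entry}:2888:3888"
        index=$((index + 1))
    done
    printf '%s\n' "$out"
}

# Demo: the value for the second of three (invented) nodes.
build_zoo_servers 2 kafka1.lab.local kafka2.lab.local kafka3.lab.local
```

On that second node the script would then pass ZOO_SERVER_ID=2 together with the printed value.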
$ chmod 755 /opt/scripts/zookeeper.sh

Finally, start the Zookeeper Docker container

$ /opt/scripts/zookeeper.sh

Check that Zookeeper started successfully (docker logs)

If Zookeeper started successfully, continue with starting Kafka. Create the script that starts the Kafka Docker container.

$ nano /opt/scripts/kafka.sh

The content of the script should look similar to the following

#!/bin/bash
DATA_DIR=/opt/data/kafka
docker run -d --restart unless-stopped --network host \
         -e ALLOW_PLAINTEXT_LISTENER=no \
         -e KAFKA_CERTIFICATE_PASSWORD=storepassword \
         -e KAFKA_INTER_BROKER_USER=admin \
         -e KAFKA_ZOOKEEPER_USER=user \
         -e KAFKA_ZOOKEEPER_PASSWORD=password \
         -e KAFKA_CFG_AUTO_CREATE_TOPICS_ENABLE=false \
         -e KAFKA_CFG_ADVERTISED_LISTENERS=SASL_SSL://:9092 \
         -e KAFKA_CFG_DELETE_TOPIC_ENABLE=true \
         -e KAFKA_CFG_DEFAULT_REPLICATION_FACTOR=1 \
         -e KAFKA_CFG_INTER_BROKER_PROTOCOL_VERSION=2.2.2 \
         -e KAFKA_CFG_LISTENERS=SASL_SSL://:9092 \
         -e KAFKA_CFG_ZOOKEEPER_CONNECT=kafka.lab.local:2181 \
         -e JMX_PORT=9696 \
         -p 9092:9092 \
         -p 9696:9696 \
         -v $DATA_DIR:/bitnami/kafka \
         -v '/opt/certificates/keystore/kafka.keystore.jks:/opt/bitnami/kafka/conf/certs/kafka.keystore.jks:ro' \
         -v '/opt/certificates/truststore/kafka.truststore.jks:/opt/bitnami/kafka/conf/certs/kafka.truststore.jks:ro' \
         --name kafka_`date '+%d.%m.%Y'` \
         bitnami/kafka:2.4.0-debian-9-r16

Make sure the KAFKA_ZOOKEEPER_USER and KAFKA_ZOOKEEPER_PASSWORD variables match the Zookeeper configuration. When more than one node runs in the cluster, change KAFKA_CFG_DEFAULT_REPLICATION_FACTOR to a value appropriate for the number of nodes and list all Zookeeper instances in KAFKA_CFG_ZOOKEEPER_CONNECT.

Set appropriate permissions on the script and start it

$ chmod +x /opt/scripts/kafka.sh
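
The two multi-node values can be computed from the list of nodes. This is a sketch under my own assumptions: the host names are invented, the helper names are hypothetical, and capping the replication factor at 3 is a common choice rather than something this post prescribes.

```shell
#!/bin/sh
# Join Zookeeper hosts into the comma-separated host:port list that
# Kafka's zookeeper.connect setting expects. Hypothetical helper name.
build_zookeeper_connect() {
    local host out
    out=""
    for host in "$@"; do
        out="${out:+$out,}${host}:2181"
    done
    printf '%s\n' "$out"
}

# Suggest a replication factor for <broker-count>: one copy per broker,
# capped at 3. The cap is an assumption, adjust it to your own needs.
suggest_replication_factor() {
    if [ "$1" -lt 3 ]; then
        printf '%s\n' "$1"
    else
        printf '3\n'
    fi
}

# Demo for three (invented) nodes.
build_zookeeper_connect kafka1.lab.local kafka2.lab.local kafka3.lab.local
suggest_replication_factor 3
```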
$ chmod 755 /opt/scripts/kafka.sh
$ /opt/scripts/kafka.sh

Check that Kafka started properly (docker logs).

After all nodes in the cluster are running, I recommend checking the settings of the __consumer_offsets topic and the other topic parameters, as described in this article.

Is there anything you would do differently, or a more efficient way to bring up a Kafka cluster? Do you think Kafka is a poor candidate for Dockerization because it is stateful? I would love to hear your opinions in the comments.