Strimzi Operator and Kafka Cluster Provisioning
This note describes how to provision a Kafka cluster using the Strimzi operators, and how to provision MirrorMaker 2 on Kubernetes or on a VM.
Strimzi uses the Cluster Operator to deploy and manage Kafka (including Zookeeper) and Kafka Connect clusters. When the Strimzi Cluster Operator is up, it starts watching for certain OpenShift or Kubernetes resources containing the desired Kafka and/or Kafka Connect cluster configuration. The foundation of Strimzi is a set of Kubernetes operators and custom resource definitions for the different elements of Kafka.
We recommend going over the product overview page.
The service account and role bindings do not need to be reinstalled if you created them previously.
Concept summary
The Cluster Operator is a pod that deploys and manages Apache Kafka clusters, Kafka Connect, Kafka MirrorMaker (1 and 2), Kafka Bridge, Kafka Exporter, and the Entity Operator. Once it is deployed, the following commands are handled by the Cluster Operator:
# Get the current cluster list
oc get kafka
# Get the list of topics
oc get kafkatopics
We address how to create a topic using the operator in a section below.
CRDs act as configuration instructions to describe the custom resources in a Kubernetes cluster, and are provided with Strimzi for each Kafka component used in a deployment.
Strimzi Operators Deployment
The Strimzi operator deployment is done in two phases:
- Deploy the Custom Resource Definitions (CRDs), which act as specifications of the custom resources to deploy.
- Deploy one or more instances of those custom resources.
In a custom resource YAML file, the kind
attribute specifies which CRD the resource conforms to.
Each CRD shares common configuration properties such as bootstrap servers, CPU resources, logging, and health checks.
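To make this concrete, here is a hedged, illustrative fragment of a Kafka custom resource showing such common blocks; all values below are examples, not recommendations:

```yaml
# Illustrative fragment of a Kafka custom resource showing
# common configuration blocks (values are examples only)
spec:
  kafka:
    resources:            # CPU and memory requests for the broker pods
      requests:
        cpu: "1"
        memory: 2Gi
    logging:              # inline logging configuration
      type: inline
      loggers:
        kafka.root.logger.level: "INFO"
    livenessProbe:        # health check tuning
      initialDelaySeconds: 15
      timeoutSeconds: 5
    readinessProbe:
      initialDelaySeconds: 15
      timeoutSeconds: 5
```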
If you want to deploy a Kafka cluster using Strimzi on an OpenShift cluster, follow the steps in the next sections.
Create a namespace or openshift project
kubectl create namespace eda-strimzi-kafka24
# Or using Openshift CLI
oc new-project eda-strimzi-kafka24
Download the strimzi artefacts
We have already created the configuration from the Strimzi github source in the folder openshift-strimzi/eda-strimzi-kafka24/cluster-operator. So you do not need to do the following steps if you use the same project name: eda-strimzi-kafka24.
In case you want to do it on your own, get the latest Strimzi release from this github page. Then modify the RoleBinding YAML files with the namespace set in the previous step.
sed -i '' 's/namespace: .*/namespace: eda-strimzi-kafka24/' "$STRIMZI_HOME"/install/cluster-operator/*RoleBinding*.yaml
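As a quick sanity check of the substitution, you can run the same `sed` expression against a throwaway file (the sample content and path below are hypothetical):

```shell
# Create a sample RoleBinding fragment with a placeholder namespace
printf 'kind: RoleBinding\nsubjects:\n- kind: ServiceAccount\n  namespace: myproject\n' > /tmp/sample-rb.yaml
# Apply the same substitution used for the Strimzi RoleBinding files
# (on macOS/BSD sed, add '' after -i as shown above)
sed -i 's/namespace: .*/namespace: eda-strimzi-kafka24/' /tmp/sample-rb.yaml
cat /tmp/sample-rb.yaml
```

The `namespace:` line should now read `namespace: eda-strimzi-kafka24` while the rest of the file is unchanged.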
Deploy the Custom Resource Definitions for kafka
Custom resource definitions are defined within the Kubernetes cluster. The following commands deploy them and then list the installed CRDs:
oc apply -f openshift-strimzi/eda-strimzi-kafka24/cluster-operator/
oc get crd
In case the Strimzi cluster operator fails with an error like "kafkas.kafka.strimzi.io is forbidden: User "system:serviceaccount:eda-strimzi-kafka24:strimzi-cluster-operator" cannot watch resource "kafkas" in API group "kafka.strimzi.io" in the namespace "eda-strimzi-kafka24"", you need to add the cluster roles to the Strimzi operator service account with the following commands:
oc adm policy add-cluster-role-to-user strimzi-cluster-operator-namespaced --serviceaccount strimzi-cluster-operator -n eda-strimzi-kafka24
oc adm policy add-cluster-role-to-user strimzi-entity-operator --serviceaccount strimzi-cluster-operator -n eda-strimzi-kafka24
The commands above should create the following service account, resource definitions, roles, and role bindings:
Names | Resource | Command |
---|---|---|
strimzi-cluster-operator | A service account provides an identity for processes that run in a Pod. | oc get sa -l app=strimzi |
strimzi-cluster-operator-global, strimzi-cluster-operator-namespaced, strimzi-entity-operator, strimzi-kafka-broker, strimzi-topic-operator | Cluster Roles | oc get clusterrole |
strimzi-cluster-operator-entity-operator-delegation, strimzi-cluster-operator, strimzi-cluster-operator-topic-operator-delegation | Role binding | oc get rolebinding |
strimzi-cluster-operator, strimzi-cluster-operator-kafka-broker-delegation | Cluster Role Binding | oc get clusterrolebinding -l app=strimzi |
kafkabridges, kafkaconnectors, kafkaconnects, kafkamirrormaker2s, kafkas, kafkatopics, kafkausers | Custom Resource Definitions | oc get customresourcedefinition |
Note
All those resources carry the app=strimzi label.
Add Strimzi Admin Role
If you want to allow non-Kubernetes cluster administrators to manage Strimzi resources, you must assign them the Strimzi Administrator role.
First deploy the role definition using the following command:
oc apply -f openshift-strimzi/eda-strimzi-kafka24/010-ClusterRole-strimzi-admin.yaml
Then assign the strimzi-admin ClusterRole to one or more existing users in the Kubernetes cluster.
kubectl create clusterrolebinding strimzi-admin --clusterrole=strimzi-admin --user=<user-your-username-here>
Deploy instances
Deploy Kafka cluster
The CRD for the Kafka cluster resource is here; we recommend studying it before defining your own cluster.
Change the cluster name in one of the YAML files in the examples/kafka
folder, or use our openshift-strimzi/kafka-cluster.yml
file in this project as a starting point. This file sets a default replication factor of 3 and in-sync replicas of 2. For development purposes we have set a plain (unencrypted) listener on port 9092 without TLS authentication.
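A minimal sketch of those development settings inside the Kafka resource, assuming the same defaults described above, would look like:

```yaml
# Sketch of the development settings described above
spec:
  kafka:
    replicas: 3
    listeners:
      plain: {}                     # unencrypted listener on port 9092
    config:
      default.replication.factor: 3
      min.insync.replicas: 2
```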
For external access to the Kubernetes cluster, we need external listeners. As OpenShift uses routes for external access, we need to add an external listener of type route
in the YAML file. When exposing Kafka using OpenShift Routes and the HAProxy router, a dedicated Route is created for every Kafka broker pod, plus an additional Route that serves as the Kafka bootstrap address. Kafka clients use the bootstrap route to connect to Kafka on port 443.
Even for development, we added the metrics rules in the metrics
stanza within the kafka-cluster.yml
file to expose Kafka and Zookeeper metrics to Prometheus.
For production, we need to use persistent storage for the Kafka log, an ingress or load balancer external listener, and rack awareness policies. Production clusters should also use mutual TLS authentication; with Strimzi, the User Operator manages cluster users. Mutual (two-way) authentication means both the server and the client present certificates.
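As a sketch, a TLS listener with mutual (client certificate) authentication and rack awareness can be declared as follows; the topology key shown is an assumption that depends on your Kubernetes version:

```yaml
# Sketch of production-oriented settings: mutual TLS and rack awareness
spec:
  kafka:
    listeners:
      tls:
        authentication:
          type: tls        # clients must present a certificate
    rack:
      # assumption: zone label used by this Kubernetes version
      topologyKey: failure-domain.beta.kubernetes.io/zone
```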
Using non-persistent (ephemeral) storage:
oc apply -f openshift-strimzi/kafka-cluster.yaml
oc get kafka
# NAME         DESIRED KAFKA REPLICAS   DESIRED ZK REPLICAS
# my-cluster   3                        3
Looking at the running pods, we can see the three Kafka and Zookeeper nodes as pods, the Strimzi entity operator pod, and the Strimzi cluster operator.
$ oc get pods
my-cluster-entity-operator-645fdbc4cb-m29nk   3/3   Running   0   18d
my-cluster-kafka-0                            2/2   Running   0   3d
my-cluster-kafka-1                            2/2   Running   0   3d
my-cluster-kafka-2                            2/2   Running   0   3d
my-cluster-zookeeper-0                        2/2   Running   0   3d
my-cluster-zookeeper-1                        2/2   Running   0   3d
my-cluster-zookeeper-2                        2/2   Running   0   3d
strimzi-cluster-operator-58cbbcb7d-bcqhm      1/1   Running   2   18d
strimzi-topic-operator-564654cb86-nbt58       1/1   Running   1   18d
To use persistence, add a persistent volume claim declaration in the YAML file and then reapply:
oc apply -f strimzi/kafka-cluster.yaml
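A minimal sketch of a persistent-claim storage declaration for both Kafka and Zookeeper that could be added before reapplying (the sizes are illustrative assumptions):

```yaml
# Illustrative persistent storage declaration; sizes are examples
spec:
  kafka:
    storage:
      type: persistent-claim
      size: 100Gi
      deleteClaim: false
  zookeeper:
    storage:
      type: persistent-claim
      size: 10Gi
      deleteClaim: false
```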
Add Topic CRDs and operator
This step is optional. The Topic Operator manages Kafka topics via YAML configuration and maps topics to Kubernetes resources, so a command like oc get kafkatopics
returns the list of topics. The operator keeps the resources and the Kafka topics in sync. This allows you to declare a KafkaTopic as part of your application's deployment YAML.
To manage Kafka topics with operators, first modify the file 05-Deployment-strimzi-topic-operator.yaml to reflect your cluster name
env:
  - name: STRIMZI_RESOURCE_LABELS
    value: "strimzi.io/cluster=eda-demo-24-cluster"
  - name: STRIMZI_KAFKA_BOOTSTRAP_SERVERS
    value: eda-demo-24-cluster-kafka-bootstrap:9092
  - name: STRIMZI_ZOOKEEPER_CONNECT
    value: eda-demo-24-cluster-zookeeper-client:2181
and then deploy the topic-operator. This operation will fail if no Kafka broker and Zookeeper are available:
oc apply -f openshift-strimzi/install/topic-operator
oc adm policy add-cluster-role-to-user strimzi-topic-operator --serviceaccount strimzi-cluster-operator -n eda-strimzi-kafka24
This will add the following resources:
Names | Resource | Command |
---|---|---|
strimzi-topic-operator | Service account | oc get sa |
strimzi-topic-operator | Role binding | oc get rolebinding |
kafkatopics | Custom Resource Definition | oc get customresourcedefinition |
Create a topic
Edit a yaml file like the following:
apiVersion: kafka.strimzi.io/v1beta1
kind: KafkaTopic
metadata:
  name: accounts
  labels:
    strimzi.io/cluster: eda-demo-24-cluster
spec:
  partitions: 1
  replicas: 3
  config:
    retention.ms: 7200000
    segment.bytes: 1073741824
    message.timestamp.type: LogAppendTime
oc apply -f test.yaml
oc get kafkatopics
This creates the topic named in the metadata.name field (accounts in the example above) in your Kafka cluster.
Add User CRDs and operator
This step is optional. To manage Kafka users with operators, modify the file 05-Deployment-strimzi-user-operator.yaml
to reflect your cluster name and then deploy the user-operator:
oc apply -f openshift-strimzi/install/user-operator
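Once the User Operator is running, users can be declared as KafkaUser resources. A minimal sketch follows; the user name my-user is hypothetical:

```yaml
apiVersion: kafka.strimzi.io/v1beta1
kind: KafkaUser
metadata:
  name: my-user                     # hypothetical user name
  labels:
    strimzi.io/cluster: eda-demo-24-cluster
spec:
  authentication:
    type: tls                       # TLS client authentication
```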
Test with producer and consumer pods
Use the kafka-console-consumer and kafka-console-producer tools from the Kafka distribution. Check Docker Hub under the Strimzi account to get the latest image tag (below we use the latest-kafka-2.4.0 tag).
# Start a consumer on the test topic
oc run kafka-consumer -ti --image=strimzi/kafka:latest-kafka-2.4.0 --rm=true --restart=Never -- bin/kafka-console-consumer.sh --bootstrap-server eda-demo-24-cluster-kafka-bootstrap:9092 --topic test --from-beginning
# Start a text producer
oc run kafka-producer -ti --image=strimzi/kafka:latest-kafka-2.4.0 --rm=true --restart=Never -- bin/kafka-console-producer.sh --broker-list eda-demo-24-cluster-kafka-bootstrap:9092 --topic test
# enter text
If you want to use the Strimzi Kafka Docker image to run the above scripts on your local computer and connect remotely to a Kafka cluster, you need to do the following:
- Be sure the Kafka cluster definition YAML file includes the external route stanza:
spec:
  kafka:
    version: 2.4.0
    replicas: 3
    listeners:
      plain: {}
      tls: {}
      external:
        type: route
- Get the host ip address from the Route resource
oc get routes eda-demo-24-cluster-kafka-bootstrap -o=jsonpath='{.status.ingress[0].host}{"\n"}'
- Get the TLS certificate from the broker
oc get secrets
oc extract secret/eda-demo-24-cluster-cluster-ca-cert --keys=ca.crt --to=- > ca.crt
oc extract secret/eda-demo-24-cluster-clients-ca-cert --keys=ca.crt --to=- >> ca.crt
# transform it to a java truststore
keytool -import -trustcacerts -alias root -file ca.crt -keystore truststore.jks -storepass password -noprompt
The alias is used to access keystore entries (key and trusted certificate entries).
- Start the Docker container, mounting the local folder containing truststore.jks to /home:
docker run -ti -v $(pwd):/home strimzi/kafka:latest-kafka-2.4.0 bash
# inside the container use the consumer tool
bash-4.2$ cd /opt/kafka/bin
bash-4.2$ ./kafka-console-consumer.sh --bootstrap-server eda-demo-24-cluster-kafka-bootstrap-eda-strimzi-kafka24.gse-eda-demo-43-fa9ee67c9ab6a7791435450358e564cc-0000.us-east.containers.appdomain.cloud:443 --consumer-property security.protocol=SSL --consumer-property ssl.truststore.password=password --consumer-property ssl.truststore.location=/home/truststore.jks --topic test --from-beginning
- For a producer the approach is the same but using the producer properties:
./kafka-console-producer.sh --broker-list eda-demo-24-cluster-kafka-bootstrap-eda-strimzi-kafka24.gse-eda-demo-43-fa9ee67c9ab6a7791435450358e564cc-0000.us-east.containers.appdomain.cloud:443 --producer-property security.protocol=SSL --producer-property ssl.truststore.password=password --producer-property ssl.truststore.location=/home/truststore.jks --topic test
Those properties can also be put in a properties file:
bootstrap.servers=eda-demo-24-cluster-kafka-bootstrap-eda-strimzi-kafka24.gse-eda-demos-fa9ee67c9ab6a7791435450358e564cc-0001.us-east.containers.appdomain.cloud
security.protocol=SSL
ssl.truststore.password=password
ssl.truststore.location=/home/truststore.jks
and then use the following parameters in the command line:
./kafka-console-producer.sh --broker-list eda-demo-24-cluster-kafka-bootstrap-eda-strimzi-kafka24.gse-eda-demos-fa9ee67c9ab6a7791435450358e564cc-0001.us-east.containers.appdomain.cloud:443 --producer.config /home/strimzi.properties --topic test
./kafka-console-consumer.sh --bootstrap-server eda-demo-24-cluster-kafka-bootstrap-eda-strimzi-kafka24.gse-eda-demos-fa9ee67c9ab6a7791435450358e564cc-0001.us-east.containers.appdomain.cloud:443 --topic test --consumer.config /home/strimzi.properties --from-beginning