# Azkaban on Kubernetes

Azkaban (https://azkaban.readthedocs.io/en/latest/) is a popular workflow engine which, in my experience, is often used to run jobs in data lake environments. This guide shows how to run Azkaban on Kubernetes.

## Build Azkaban from Source (Optional)

You can build Azkaban from source and package it as tar files. This step is optional; you can skip this section and move on to the next one.

```
cd ~; git clone https://github.com/azkaban/azkaban.git
cd azkaban;
git checkout tags/3.90.0;

# build and install distributions.
./gradlew installDist

# package azkaban as tar files.
## db.
cd ~/azkaban/azkaban-db/build/install;
tar -zcf azkaban-db-3.90.0.tar.gz azkaban-db;

## executor.
cd ~/azkaban/azkaban-exec-server/build/install;
tar -zcf azkaban-exec-server-3.90.0.tar.gz azkaban-exec-server;

## web.
cd ~/azkaban/azkaban-web-server/build/install;
tar -zcf azkaban-web-server-3.90.0.tar.gz azkaban-web-server;
```

You can then upload these packages somewhere accessible, for instance to Google Drive. The Azkaban packages used in the next section are already uploaded to Google Drive, but you can change this to suit your needs.

## Create Azkaban Images

The Azkaban packages are downloaded from Google Drive, and the Azkaban Docker images are built from them. Three images are built in the following steps: azkaban db, azkaban executor server, and azkaban web server.

```
# remove existing azkaban docker images.
docker rmi -f $(docker images -a | grep azkaban | awk '{print $3}')

# azkaban db docker image.
cd docker/db;
docker build . -t yourrepo/azkaban-db:3.90.0;
## push.
docker push yourrepo/azkaban-db:3.90.0;

# azkaban executor image.
cd docker/executor;
docker build . -t yourrepo/azkaban-exec-server:3.90.0;
## push.
docker push yourrepo/azkaban-exec-server:3.90.0;

# azkaban web image.
cd docker/web;
docker build . -t yourrepo/azkaban-web-server:3.90.0;
## push.
docker push yourrepo/azkaban-web-server:3.90.0;
```

Note that you should replace `yourrepo` above with your own Docker repository name.

## Run Azkaban on Kubernetes

The storage class of the PVC for MySQL in `mysql.yaml` looks like this:

```
storageClassName: direct.csi.min.io
```

You can change it to suit your environment.
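For reference, `mysql.yaml` is not reproduced in full here. The following is a minimal sketch of what such a manifest might look like, reusing the names that appear later in this guide (`mysql-service`, `mysql-statefulset`, and the `app=mysql` label); the MySQL image tag, root password, and storage size are illustrative assumptions, not the actual values in this repository.

```
# A minimal sketch of a mysql.yaml -- the image tag, password, and storage
# size below are illustrative assumptions.
apiVersion: v1
kind: Service
metadata:
  name: mysql-service
  namespace: azkaban
spec:
  ports:
  - port: 3306
  selector:
    app: mysql
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql-statefulset
  namespace: azkaban
spec:
  serviceName: mysql-service
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:5.7
        ports:
        - containerPort: 3306
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: changeme   # use a Secret in practice.
        volumeMounts:
        - name: mysql-data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: mysql-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: direct.csi.min.io
      resources:
        requests:
          storage: 10Gi
```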
Note that you also have to replace `yourrepo` in the image names in the manifest yaml files.

Now, run the azkaban executors and the web server on Kubernetes:

```
## ---- init.
# create the mysql server.
kubectl apply -f mysql.yaml;

# wait for the mysql pod to become ready.
while [[ $(kubectl get pods -n azkaban -l app=mysql -o 'jsonpath={..status.conditions[?(@.type=="Ready")].status}') != "True" ]]; do echo "waiting for mysql pod being ready" && sleep 1; done

# configmaps.
kubectl create configmap azkaban-cfg --dry-run=client --from-file=azkaban-executor.properties --from-file=azkaban-web.properties -o yaml -n azkaban | kubectl apply -f -

# create the db and tables.
kubectl apply -f init-schema.yaml;

# wait for the job to complete.
while [[ $(kubectl get pods -n azkaban -l job-name=azakban-initschema -o jsonpath={..status.phase}) != *"Succeeded"* ]]; do echo "waiting for finishing init schema job" && sleep 2; done

## ---- azkaban.
# create the azkaban executors.
kubectl apply -f azkaban-executor.yaml;

# wait for the azkaban executors to run.
while [[ $(kubectl get pods -n azkaban -l app=azkaban-executor -o jsonpath={..status.phase}) != *"Running"* ]]; do echo "waiting for executor being run" && sleep 2; done

# create azkaban web.
kubectl apply -f azkaban-web.yaml;
```

Let's look at the pods in the `azkaban` namespace:

```
kubectl get po -n azkaban
NAME                           READY   STATUS       RESTARTS   AGE
azakban-initschema-hr4bn       0/1     Init:Error   0          4h3m
azakban-initschema-kg75t       0/1     Completed    0          4h3m
azakban-initschema-ppngd       0/1     Init:Error   0          4h3m
azkaban-executor-0             1/1     Running      0          3h19m
azkaban-executor-1             1/1     Running      0          3h18m
azkaban-executor-2             1/1     Running      0          3h18m
azkaban-web-664967cb99-xhmrf   1/1     Running      0          3h9m
mysql-statefulset-0            1/1     Running      0          4h3m
```

As seen here, a MySQL server, three executor servers, and one web server are running on Kubernetes.

## Access UI

To access the UI, look at the services in the `azkaban` namespace:

```
kubectl get svc -n azkaban
NAME               TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)          AGE
azkaban-executor   ClusterIP      None                                             3h20m
azkaban-web        LoadBalancer   10.233.49.152   52.231.165.73   8081:31538/TCP   3h9m
mysql-service      ClusterIP      10.233.53.51                    3306/TCP         4h4m
```

With the external IP of the `azkaban-web` service, you can access the UI in a browser:

```
http://52.231.165.73:8081/
```

## Azkaban Smoke Test

You can test azkaban by running the example projects:

```
# install the azkaban cli.
sudo pip install --upgrade "urllib3==1.22" azkaban;

# download sample projects and create the projects with the azkaban cli.
wget https://github.com/azkaban/azkaban/raw/master/az-examples/flow20-projects/basicFlow20Project.zip;
wget https://github.com/azkaban/azkaban/raw/master/az-examples/flow20-projects/embeddedFlow20Project.zip;
azkaban upload -c -p basicFlow20Project -u azkaban@http://52.231.165.73:8081 ./basicFlow20Project.zip;
azkaban upload -c -p embeddedFlow20Project -u azkaban@http://52.231.165.73:8081 ./embeddedFlow20Project.zip;
```
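After the upload, you can run the flows from the UI. If you prefer the command line, Azkaban's AJAX API can trigger an execution as well. A rough sketch follows; the default `azkaban`/`azkaban` account is assumed, as is the flow name `basicFlow` inside basicFlow20Project.

```
# log in to get a session id (default azkaban/azkaban account assumed;
# adjust to your user manager settings).
curl -s -X POST \
  --data "action=login&username=azkaban&password=azkaban" \
  http://52.231.165.73:8081
# the JSON response contains a "session.id" value.

# trigger an execution with that session id
# (the flow inside basicFlow20Project is assumed to be named basicFlow).
curl -s --get \
  --data "session.id=<session-id>&ajax=executeFlow&project=basicFlow20Project&flow=basicFlow" \
  http://52.231.165.73:8081/executor
```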
## A little bit real example: run a shell on a remote machine from an azkaban executor

In this example, an azkaban executor calls a shell script on a remote machine to run a Spark job. Because Spark and kubectl are installed on that remote machine, Spark jobs can be submitted to Kubernetes from there. To make this work, SSH access from the azkaban executor to the remote machine must be enabled.

```
# list pods.
kubectl get po -n azkaban
NAME                           READY   STATUS       RESTARTS   AGE
azakban-initschema-9bgbh       0/1     Completed    0          16h
azakban-initschema-dtgg7       0/1     Init:Error   0          16h
azakban-initschema-fw7gt       0/1     Init:Error   0          16h
azkaban-executor-0             1/1     Running      0          16h
azkaban-executor-1             1/1     Running      0          16h
azkaban-executor-2             1/1     Running      0          16h
azkaban-web-664967cb99-z8dzn   1/1     Running      0          16h
mysql-statefulset-0            1/1     Running      0          16h

# access the executor pod to get its public key.
kubectl exec -it azkaban-executor-0 -n azkaban -- cat .ssh/id_rsa.pub;
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC0vuKKMz4dD0aBrJKtlVU8fDmYgqkwpkDXTzoUTqm57CqEmzHa5EDS90xGch1rAN4HucOR6dzUGvb2VlATBGIi5VZ6w0OuRR+r50KHqiC0TLdEXzX1/TRO/uHftI/xdUMFDHOWTuZnsYS5V7DCrw1yJnPzHTHktgXDyycM/iEspdfslzgZuIV4zT3HNVAYIplQPyy8TKRy7gojm7OYw5W2S14hqiY5/HL/CZ9CQpKV37qJvd3E4u/pOZCHH7r1Tm5E3bnUX9U8z7Nj0Fb+TZSkxiEbwoKB/Ib07Urc0il2f4mug2bKazZRsU+/bb1+VjoMW0ek+9Rvk1JTkaXIu8k/ executor@33842653d6db

# copy this executor public key and paste it into the authorized_keys file
# on the remote machine.
## on the remote machine.
vi ~/.ssh/authorized_keys;
... paste the public key.
# chmod 600.
chmod 600 ~/.ssh/authorized_keys;
```

Then log in to the remote machine via ssh from each individual azkaban executor:

```
kubectl exec -it azkaban-executor-0 -n azkaban -- sh;
ssh pcp@x.x.x.x;
...
exit;
```

Now create the shell script that runs the Spark job on the remote machine:

```
# spark job run shell.
vi run-spark-example.sh;
...
############## spark job: create delta table
# submit the spark job onto kubernetes.
export MASTER=k8s://https://xxxx:6443;
export NAMESPACE=ai-developer;
export ENDPOINT=http://$(kubectl get svc s3g-service -n ai-developer -o jsonpath={.status.loadBalancer.ingress[0].ip}):9898;
export HIVE_METASTORE=metastore.ai-developer:9083;

spark-submit \
--master ${MASTER} \
--deploy-mode cluster \
--name spark-delta-example \
--class io.spongebob.spark.examples.DeltaLakeExample \
--packages com.amazonaws:aws-java-sdk-s3:1.11.375,org.apache.hadoop:hadoop-aws:3.2.0 \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.checkpointpvc.mount.path=/checkpoint \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.checkpointpvc.mount.subPath=checkpoint \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.checkpointpvc.mount.readOnly=false \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.checkpointpvc.options.claimName=spark-driver-pvc \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.checkpointpvc.mount.path=/checkpoint \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.checkpointpvc.mount.subPath=checkpoint \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.checkpointpvc.mount.readOnly=false \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.checkpointpvc.options.claimName=spark-exec-pvc \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-local-dir-localdirpvc.mount.path=/localdir \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-local-dir-localdirpvc.mount.readOnly=false \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-local-dir-localdirpvc.options.claimName=spark-driver-localdir-pvc \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-localdirpvc.mount.path=/localdir \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-localdirpvc.mount.readOnly=false \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-localdirpvc.options.claimName=spark-exec-localdir-pvc \
--conf spark.kubernetes.file.upload.path=s3a://mykidong/spark-examples \
--conf spark.kubernetes.container.image.pullPolicy=Always \
--conf spark.kubernetes.namespace=$NAMESPACE \
--conf spark.kubernetes.container.image=xxx/spark:v3.0.0 \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.hadoop.hive.metastore.client.connect.retry.delay=5 \
--conf spark.hadoop.hive.metastore.client.socket.timeout=1800 \
--conf spark.hadoop.hive.metastore.uris=thrift://$HIVE_METASTORE \
--conf spark.hadoop.hive.server2.enable.doAs=false \
--conf spark.hadoop.hive.server2.thrift.http.port=10002 \
--conf spark.hadoop.hive.server2.thrift.port=10016 \
--conf spark.hadoop.hive.server2.transport.mode=binary \
--conf spark.hadoop.metastore.catalog.default=spark \
--conf spark.hadoop.hive.execution.engine=spark \
--conf spark.hadoop.hive.input.format=io.delta.hive.HiveInputFormat \
--conf spark.hadoop.hive.tez.input.format=io.delta.hive.HiveInputFormat \
--conf spark.sql.warehouse.dir=s3a://mykidong/apps/spark/warehouse \
--conf spark.hadoop.fs.defaultFS=s3a://mykidong \
--conf spark.hadoop.fs.s3a.access.key=any-access-key \
--conf spark.hadoop.fs.s3a.secret.key=any-secret-key \
--conf spark.hadoop.fs.s3a.connection.ssl.enabled=true \
--conf spark.hadoop.fs.s3a.endpoint=$ENDPOINT \
--conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
--conf spark.hadoop.fs.s3a.fast.upload=true \
--conf spark.hadoop.fs.s3a.path.style.access=true \
--conf spark.driver.extraJavaOptions="-Divy.cache.dir=/tmp -Divy.home=/tmp" \
--conf spark.executor.instances=3 \
--conf spark.executor.memory=2G \
--conf spark.executor.cores=1 \
--conf spark.driver.memory=1G \
file:///home/pcp/xxx/examples/spark/target/spark-example-1.0.0-SNAPSHOT-spark-job.jar \
--master ${MASTER};
...

# make the script executable.
chmod a+x run-spark-example.sh;
```
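Because the azkaban job will run ssh non-interactively, the remote host's key must already be trusted inside the executor pod; the interactive login above takes care of that. If you want to script this step instead, something like the following should work, assuming `ssh-keyscan` is available in the executor image (`x.x.x.x` stands for your remote machine, as above):

```
# pre-trust the remote host's key so the non-interactive ssh run by the
# azkaban job does not stall on the host-key prompt.
kubectl exec -it azkaban-executor-0 -n azkaban -- sh -c 'ssh-keyscan x.x.x.x >> ~/.ssh/known_hosts'
```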
Now, create an azkaban flow like this:

```
---
config:
  failure.emails: mykidong@gmail.com

nodes:
  - name: Start
    type: noop

  - name: RunSparkJob
    type: command
    config:
      command: ssh pcp@x.x.x.x "/home/pcp/run-spark-example.sh"
    dependsOn:
      - Start

  - name: End
    type: noop
    dependsOn:
      - RunSparkJob
```

After zipping this flow into a new project (a Flow 2.0 project zip also needs a `.project` file containing `azkaban-flow-version: 2.0`) and deploying it to azkaban web, you can execute the job from the azkaban UI.

## Embedded Pod Runner in Azkaban Executor to create a job pod

There is another scenario for running jobs: you can write a pod runner embedded in the azkaban executor that runs a job as a pod. See this post for more details: https://mykidong.medium.com/azkaban-on-kubernetes-149fde21aa50
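The linked post covers the embedded runner itself. As a rough sketch of the underlying idea, an azkaban `command` job can also launch a Kubernetes Job directly from the executor pod; this assumes kubectl is available in the executor image and that the executor's service account has RBAC permission to create Jobs in the `azkaban` namespace. The job name and image below are illustrative.

```
# a minimal sketch: launch a job pod from an azkaban command job.
cat <<EOF | kubectl apply -f -
apiVersion: batch/v1
kind: Job
metadata:
  name: my-azkaban-job        # illustrative name.
  namespace: azkaban
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: runner
        image: busybox        # replace with your job image.
        command: ["sh", "-c", "echo running my job"]
EOF

# wait for the job to complete.
kubectl wait --for=condition=complete job/my-azkaban-job -n azkaban --timeout=600s
```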