Guide: Schema Registry

This guide covers how to run Schema Registry on your own server in AWS, using a hosted Kafka cluster at CloudKarafka.
You need a server running Ubuntu in your AWS account that you can access over SSH. To run Schema Registry without memory issues, the server needs at least 1 GB of memory.
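
To check the memory requirement before installing anything, a small sketch like the following can help. It reads MemTotal from /proc/meminfo; the function names are made up for this guide, and the optional file argument exists only so the functions can be tried against a sample file.

```shell
# mem_total_mb [MEMINFO_FILE]: print total memory in MB (defaults to /proc/meminfo).
mem_total_mb() {
  awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# has_enough_memory REQUIRED_MB [MEMINFO_FILE]: succeed if total memory >= REQUIRED_MB.
has_enough_memory() {
  [ "$(mem_total_mb "${2:-/proc/meminfo}")" -ge "$1" ]
}
```

Run it as `has_enough_memory 900 && echo "enough memory"`. The threshold of 900 is an assumption: MemTotal is usually a little lower than the nominal instance size, so a 1 GB server reports somewhat less than 1024 MB.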

Create a Kafka cluster

Create the Kafka cluster at cloudkarafka.com. Make sure to select a subnet that doesn't conflict with the subnet that the machines in your account are using.
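
Two subnets conflict when their CIDR ranges overlap, which is the case when both share the same network bits under the shorter of the two prefixes. The sketch below checks that for IPv4 ranges. The function names and the example ranges are assumptions for illustration only.

```shell
# ip_to_int A.B.C.D: print the IPv4 address as a 32-bit integer.
ip_to_int() {
  local old_ifs="$IFS"
  IFS=.
  set -- $1            # split the address on dots into $1..$4
  IFS="$old_ifs"
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# cidrs_overlap CIDR1 CIDR2: succeed (exit 0) if the two IPv4 ranges overlap.
cidrs_overlap() {
  local len1="${1#*/}" len2="${2#*/}"
  local min mask a b
  # Compare both networks under the shorter prefix length.
  min=$(( len1 < len2 ? len1 : len2 ))
  mask=$(( (0xFFFFFFFF << (32 - min)) & 0xFFFFFFFF ))
  a=$(ip_to_int "${1%/*}")
  b=$(ip_to_int "${2%/*}")
  [ $(( a & mask )) -eq $(( b & mask )) ]
}
```

For example, `cidrs_overlap 10.0.0.0/16 10.56.72.0/24 && echo "conflict" || echo "ok"` reports that a VPC on 10.0.0.0/16 does not collide with a Kafka subnet on 10.56.72.0/24.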

Set up VPC peering

See the separate guide "Guide: VPC Peering" for how to set up VPC peering connections.

Download

Schema Registry is part of the Confluent Platform and is not available as a standalone download, so we'll download the Confluent Platform instead. This guide uses version 5.5.1, the latest version at the time of writing. Download the tarball and extract it into /opt:

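wget https://packages.confluent.io/archive/5.5/confluent-5.5.1-2.12.tar.gz
# /opt is owned by root by default, so extract with sudo and hand the files to the ubuntu user
sudo tar -xzvf confluent-5.5.1-2.12.tar.gz -C /opt
sudo chown -R ubuntu:ubuntu /opt/confluent-5.5.1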

Configure

cd /opt/confluent-5.5.1
vim etc/schema-registry/schema-registry.properties
# /opt/confluent-5.5.1/etc/schema-registry/schema-registry.properties

listeners=http://0.0.0.0:8081
kafkastore.bootstrap.servers=PLAINTEXT://10.56.72.161:9092,PLAINTEXT://10.56.72.51:9092,PLAINTEXT://10.56.72.225:9092
kafkastore.topic=_schemas
debug=true
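
The addresses in kafkastore.bootstrap.servers above are examples; use the private IPs of your own brokers. The value is a comma-separated list of PLAINTEXT://host:port entries. As a rough sketch, assuming every broker listens for plaintext traffic on port 9092 as in the example, the line can be generated from a list of addresses (the function name is made up for this guide):

```shell
# bootstrap_servers_line IP...: print the kafkastore.bootstrap.servers property
# for the given broker addresses, assuming PLAINTEXT listeners on port 9092.
bootstrap_servers_line() {
  local list="" ip
  for ip in "$@"; do
    # Prepend a comma only when the list already has an entry.
    list="${list:+$list,}PLAINTEXT://${ip}:9092"
  done
  printf 'kafkastore.bootstrap.servers=%s\n' "$list"
}
```

Calling `bootstrap_servers_line 10.56.72.161 10.56.72.51 10.56.72.225` prints the same line as in the properties file above.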

Run

/opt/confluent-5.5.1/bin/schema-registry-start /opt/confluent-5.5.1/etc/schema-registry/schema-registry.properties
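
Once the process is running, you can confirm that it answers requests by calling its REST API from a second terminal. The sketch below queries the /subjects endpoint, which lists the registered subjects. The function name is made up for this guide, and the default URL assumes the listener configured above.

```shell
# registry_subjects [URL]: print the subjects registered in Schema Registry.
# Fails with a non-zero exit code if the service does not answer within 5 seconds.
registry_subjects() {
  curl --silent --fail --max-time 5 "${1:-http://localhost:8081}/subjects"
}
```

On a fresh installation, `registry_subjects` should print an empty JSON list, `[]`.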

Run with systemd

Run Schema Registry as a systemd service so that it is restarted if it fails and started when the server boots.

# /etc/systemd/system/schemaregistry.service

[Unit]
Description=Schema Registry

[Service]
Type=simple
User=ubuntu
Group=ubuntu
ExecStart=/opt/confluent-5.5.1/bin/schema-registry-start /opt/confluent-5.5.1/etc/schema-registry/schema-registry.properties
ExecStop=/opt/confluent-5.5.1/bin/schema-registry-stop
Restart=on-failure
SyslogIdentifier=schemaregistry

[Install]
WantedBy=multi-user.target

Now enable the service and start it:

sudo systemctl enable schemaregistry
sudo systemctl start schemaregistry

The service now starts automatically every time the server is rebooted.

To check the status of the service:

sudo systemctl status schemaregistry
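
If the service does not come up, its output is the first place to look. Because the unit runs under systemd, that output ends up in the journal. A minimal sketch, assuming the unit name schemaregistry from above (the function name is made up for this guide):

```shell
# service_report [UNIT]: print whether the unit is active and show its latest log lines.
service_report() {
  local unit="${1:-schemaregistry}"
  if systemctl is-active --quiet "$unit"; then
    echo "$unit is running"
  else
    echo "$unit is not running"
  fi
  # Show the 20 most recent journal entries for the unit.
  journalctl --unit="$unit" --lines=20 --no-pager
}
```

Run `service_report` without arguments to inspect the schemaregistry unit. Depending on how the server is set up, reading the journal may require sudo.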

Use nginx as a proxy

Instead of having Schema Registry listen on http://0.0.0.0:8081, you can change the listener to http://127.0.0.1:8081 and put nginx in front of it. This lets you serve Schema Registry over an encrypted connection and expose it on a different port, a subdomain, or a custom path. You can also secure the endpoint by configuring nginx to require HTTP basic authentication, which forces clients to supply a username and password to access the HTTP service. More about that here: https://docs.nginx.com/nginx/admin-guide/security-controls/configuring-http-basic-authentication/
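
As a rough sketch of such a setup, the function below writes an nginx server block that terminates TLS, requires basic auth, and forwards requests to Schema Registry on 127.0.0.1:8081. The function name, the certificate paths, and the password file location are assumptions; adjust them to your own environment.

```shell
# write_nginx_proxy_conf FILE SERVER_NAME: write an nginx server block that
# terminates TLS, enforces basic auth and proxies to Schema Registry on localhost.
write_nginx_proxy_conf() {
  local file="$1" server_name="$2"
  cat > "$file" <<EOF
server {
    listen 443 ssl;
    server_name ${server_name};

    # Hypothetical certificate paths; point these at your own files.
    ssl_certificate     /etc/nginx/ssl/${server_name}.crt;
    ssl_certificate_key /etc/nginx/ssl/${server_name}.key;

    location / {
        # Hypothetical password file, created as described in the nginx guide linked above.
        auth_basic           "Schema Registry";
        auth_basic_user_file /etc/nginx/.htpasswd;
        proxy_pass           http://127.0.0.1:8081;
    }
}
EOF
}
```

For example, `write_nginx_proxy_conf /tmp/schema-registry.conf registry.example.com` writes the file to /tmp, from where you can copy it into your nginx configuration directory with sudo, check it with `sudo nginx -t`, and reload nginx. The hostname registry.example.com is a placeholder.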