Module 2: Deploy Hybrid |cp| and |ccloud| Environment

Hybrid Deployment to |ccloud|

In a hybrid |ak-tm| deployment scenario, you run both an on-premises |cp| deployment and a |ccloud| deployment. In this module, you will use Cluster Linking and Schema Linking to send data and schemas to |ccloud|, and monitor both deployments with Confluent Health+ and the |ccloud| Metrics API.

image

Before you begin this module, make sure the cp-demo start.sh script successfully completed and |cp| is already running :ref:`(see the on-prem module) <cp-demo-run>`.

Cost to Run

Caution

|ccloud| Promo Code

To receive an additional $50 free usage in |ccloud|, enter promo code CPDEMO50 in the |ccloud| Console's Billing and payment section (details). This promo code should sufficiently cover up to one day of running this |ccloud| example, beyond which you may be billed for the services that have an hourly charge until you destroy the |ccloud| resources created by this example.

Set Up |ccloud|

  1. Create a |ccloud| account at https://confluent.cloud.

  2. Enter the promo code CPDEMO50 in the |ccloud| UI Billing and payment section to receive an additional $50 free usage.

  3. Go to https://confluent.cloud/environments and click "+ Add cloud environment". Name the environment cp-demo-env.

  4. Inside the "cp-demo-env" environment, create a Dedicated |ccloud| cluster named cp-demo-cluster in the cloud provider and region of your choice with default configurations. Wait until your cluster is in a running state before proceeding.

    images/running-state-cluster.png

    Note

    Cluster Linking requires a Dedicated cluster.

  5. Create a Schema Registry for the "cp-demo-env" environment in the same region as your cluster.

Set Up Confluent CLI and variables

  1. Install the Confluent CLI locally, v3.28.0 or later. If you already have it installed, run confluent update to upgrade.

    curl -sL --http1.1 https://cnfl.io/cli | sh -s -- latest \
    && alias confluent="$PWD/bin/confluent"
    

    Verify the installation was successful.

    confluent version
    
  2. Using the CLI, log in to |ccloud| with your |ccloud| username and password. The --save argument saves your login credentials for future use.

    confluent login --save
  3. Use the demo |ccloud| environment.

    CC_ENV=$(confluent environment list -o json \
             | jq -r '.[] | select(.name | contains("cp-demo")) | .id') \
    && echo "Your Confluent Cloud environment: $CC_ENV" \
    && confluent environment use $CC_ENV
  4. Get the |ccloud| cluster ID and use the cluster.

    CCLOUD_CLUSTER_ID=$(confluent kafka cluster list -o json \
                      | jq -r '.[] | select(.name | contains("cp-demo")) | .id') \
    && echo "Your Confluent Cloud cluster ID: $CCLOUD_CLUSTER_ID" \
    && confluent kafka cluster use $CCLOUD_CLUSTER_ID
  5. Get the bootstrap endpoint for the |ccloud| cluster.

    CC_BOOTSTRAP_ENDPOINT=$(confluent kafka cluster describe -o json | jq -r .endpoint) \
    && echo "Your Cluster's endpoint: $CC_BOOTSTRAP_ENDPOINT"
  6. Create a |ccloud| service account for CP Demo and get its ID.

    confluent iam service-account create cp-demo-sa --description "service account for cp-demo" \
    && SERVICE_ACCOUNT_ID=$(confluent iam service-account list -o json \
                         | jq -r '.[] | select(.name | contains("cp-demo")) | .id') \
    && echo "Your cp-demo service account ID: $SERVICE_ACCOUNT_ID"
  7. Enable Schema Registry in your |ccloud| environment, if you have not already done so.

    confluent schema-registry cluster enable --cloud <cloud> --geo <geo>
  8. Get the ID and endpoint URL for your Schema Registry cluster.

    CC_SR_CLUSTER_ID=$(confluent schema-registry cluster describe -o json | jq -r .cluster_id) \
    && CC_SR_ENDPOINT=$(confluent schema-registry cluster describe -o json | jq -r .endpoint_url) \
    && echo "Schema Registry Cluster ID: $CC_SR_CLUSTER_ID" \
    && echo "Schema Registry Endpoint: $CC_SR_ENDPOINT"
  9. Create a Schema Registry API key for the cp-demo service account.

    confluent api-key create \
       --service-account $SERVICE_ACCOUNT_ID \
       --resource $CC_SR_CLUSTER_ID \
       --description "SR key for cp-demo schema link"

    Verify that your output resembles the following:

    It may take a couple of minutes for the API key to be ready.
    Save the API key and secret. The secret is not retrievable later.
    +---------+------------------------------------------------------------------+
    | API Key | SZBKJLD67XK5NZNZ                                                 |
    | Secret  | NTqs/A3Mt0Ohkk4fkaIsC0oLQ5Q/F0lLowYo/UrsTrEAM5ozxY7fjqxDdVwMJz99 |
    +---------+------------------------------------------------------------------+
    

    Set variables to reference the Schema Registry credentials returned in the previous step.

    SR_API_KEY=SZBKJLD67XK5NZNZ
    SR_API_SECRET=NTqs/A3Mt0Ohkk4fkaIsC0oLQ5Q/F0lLowYo/UrsTrEAM5ozxY7fjqxDdVwMJz99
  10. Create a Kafka cluster API key for the cp-demo service account.

    confluent api-key create \
       --service-account $SERVICE_ACCOUNT_ID \
       --resource $CCLOUD_CLUSTER_ID \
       --description "Kafka key for cp-demo cluster link"

    Verify that your output resembles the following:

    It may take a couple of minutes for the API key to be ready.
    Save the API key and secret. The secret is not retrievable later.
    +---------+-------------------------------------------------------------------+
    | API Key | SZBKLMG61XK9NZAB                                                  |
    | Secret  | QTpi/A3Mt0Ohkk4fkaIsGR3ATQ5Q/F0lLowYo/UrsTr3AMsozxY7fjqxDdVwMJz02 |
    +---------+-------------------------------------------------------------------+
    

    Set variables to reference the Kafka credentials returned in the previous step.

    CCLOUD_CLUSTER_API_KEY=SZBKLMG61XK9NZAB
    CCLOUD_CLUSTER_API_SECRET=QTpi/A3Mt0Ohkk4fkaIsGR3ATQ5Q/F0lLowYo/UrsTr3AMsozxY7fjqxDdVwMJz02
  11. Get the cluster ID for the on-premises |cp| cluster.

    CP_CLUSTER_ID=$(curl -s https://localhost:8091/v1/metadata/id \
                   --tlsv1.2 --cacert ./scripts/security/snakeoil-ca-1.crt \
                   | jq -r ".id") \
    && echo "Your on-prem Confluent Platform cluster ID: $CP_CLUSTER_ID"

Note

For security purposes, you may be automatically logged out of the confluent CLI at some point. If this happens, run the following command:

confluent login && \
confluent environment use $CC_ENV && \
confluent kafka cluster use $CCLOUD_CLUSTER_ID
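
The later sections rely on the variables set in the steps above. The following is a minimal sketch, not part of cp-demo, of a check that each one is non-empty before you continue. The helper name require_vars is hypothetical.

```shell
# Report every named variable that is unset or empty.
# Returns 0 if all are set, 1 otherwise.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup that works in POSIX sh as well as bash.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Variable names taken from the steps in this section.
require_vars CC_ENV CCLOUD_CLUSTER_ID CC_BOOTSTRAP_ENDPOINT SERVICE_ACCOUNT_ID \
  CC_SR_CLUSTER_ID CC_SR_ENDPOINT SR_API_KEY SR_API_SECRET \
  CCLOUD_CLUSTER_API_KEY CCLOUD_CLUSTER_API_SECRET CP_CLUSTER_ID \
  || echo "Set the variables listed above before continuing."
```

The check only prints variable names, never values, so it is safe to run in a shared terminal.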

Export Schemas to |ccloud| with Schema Linking

Confluent Schema Registry is critical for evolving schemas alongside your business needs and ensuring high data quality. With Schema Linking, you can easily export your schemas from your on-premises Schema Registry to |ccloud|. In this section, you will export the schema subjects wikipedia.parsed-value and wikipedia.parsed.count-by-domain-value from |cp| to |ccloud| with Schema Linking. These schema subjects will be exported to a new schema context called "cp-demo", so their qualified subject names in |ccloud| will be :.cp-demo:wikipedia.parsed-value and :.cp-demo:wikipedia.parsed.count-by-domain-value.

  1. From here, we will switch back and forth between using |ccloud| and |cp|. We can streamline this "context switching" with the confluent context CLI subcommand. Here let's create a context called "ccloud" from the current context.

    confluent context update --name ccloud
  2. Next, log into |cp| and create a context called "cp". To create a cluster link, the CLI user must have ClusterAdmin privileges. For simplicity, sign in as a super user with the username superUser and password superUser.

    confluent login --save --url https://localhost:8091 \
       --ca-cert-path scripts/security/snakeoil-ca-1.crt

    and create a CLI context called "cp".

    confluent context update --name cp
  3. Inspect the schema exporter configuration file at scripts/ccloud/schema-link-example.properties.

    .. literalinclude:: ../scripts/ccloud/schema-link-example.properties
    
    
  4. Run the following command to copy the contents of the configuration file to a new file called schema-link.properties that includes your |sr| credentials.

    sed -e "s|<destination sr url>|${CC_SR_ENDPOINT}|g" \
    -e "s|<destination api key>|${SR_API_KEY}|g" \
    -e "s|<destination api secret>|${SR_API_SECRET}|g" \
    scripts/ccloud/schema-link-example.properties > scripts/ccloud/schema-link.properties
  5. Create a schema exporter called "cp-cc-schema-exporter" for the on-premises Schema Registry.

    confluent schema-registry exporter create cp-cc-schema-exporter \
       --subjects "wikipedia.parsed*" \
       --context-name cp-demo \
       --context-type CUSTOM \
       --schema-registry-endpoint https://localhost:8085 \
       --ca-location scripts/security/snakeoil-ca-1.crt \
       --config scripts/ccloud/schema-link.properties

    Notice we can use a wildcard * to export multiple subjects.

    Note

    Whether using the REST API or the CLI, the user making the request needs permission to create the schema exporter and to read the schema subjects.

    For educational purposes, here is an equivalent command that uses curl against the Schema Registry REST API with the schemaregistryUser principal:

    curl -X POST -H "Content-Type: application/json" \
       -d @<(cat <<-EOF
    {
       "name": "cp-cc-schema-exporter",
       "contextType": "CUSTOM",
       "context": "cp-demo",
       "subjects": ["wikipedia.parsed*"],
       "config": {
          "schema.registry.url": "${CC_SR_ENDPOINT}",
          "basic.auth.credentials.source": "USER_INFO",
          "basic.auth.user.info": "${SR_API_KEY}:${SR_API_SECRET}"
       }
    }
    EOF
    ) \
       --user schemaregistryUser:schemaregistryUser \
       --cacert scripts/security/snakeoil-ca-1.crt \
       https://localhost:8085/exporters
  6. Verify the schema exporter is running.

    confluent schema-registry exporter get-status cp-cc-schema-exporter \
       --schema-registry-endpoint https://localhost:8085 \
       --ca-location scripts/security/snakeoil-ca-1.crt
  7. Switch back to the ccloud CLI context (not to be confused with Schema Registry context!).

    confluent context use ccloud
    
  8. Verify that the schema subjects are being exported to |ccloud|.

    confluent schema-registry subject list --prefix ":.cp-demo:"

    The output should resemble the following:

                               Subject
    ----------------------------------------------------
    :.cp-demo:wikipedia.parsed-value
    :.cp-demo:wikipedia.parsed.count-by-domain-value
    

Schema subjects have been successfully exported from |cp| to |ccloud| with schema linking! As schemas evolve on-premises, those changes will automatically propagate to |ccloud| as long as the exporter is running.
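
The exporter may take a short while to start, so a single status check can run too early. The following is a minimal sketch of a retry loop around the status command from step 6. The helper names retry and exporter_running are hypothetical, and the sketch assumes that the status output contains the word RUNNING once the exporter is active and that you run it from the cp CLI context.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    n=$((n + 1))
    # Pause between attempts, but not after the last one.
    if [ "$n" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Succeeds when the exporter status output mentions RUNNING (an assumption
# about the output format; adjust the pattern to what your CLI prints).
exporter_running() {
  confluent schema-registry exporter get-status cp-cc-schema-exporter \
     --schema-registry-endpoint https://localhost:8085 \
     --ca-location scripts/security/snakeoil-ca-1.crt 2>/dev/null \
     | grep -q RUNNING
}

# Example: up to 10 checks, 6 seconds apart.
# retry 10 6 exporter_running && echo "exporter is running"
```

The same retry helper can wrap the subject list check in step 8, because it accepts any command that exits with a non-zero status on failure.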

Mirror Data to |ccloud| with Cluster Linking

In this section, you will create a source-initiated cluster link to mirror the topic wikipedia.parsed from |cp| to |ccloud|. For security reasons, most on-premises datacenters don't allow inbound connections, so Confluent recommends source-initiated cluster linking to easily and securely mirror Kafka topics from your on-premises cluster to |ccloud|.

  1. Verify that you're still using the ccloud CLI context.

    confluent context list
    
  2. Give the cp-demo service account the CloudClusterAdmin role in |ccloud| to authorize it to create cluster links and mirror topics in |ccloud|.

    confluent iam rbac role-binding create \
       --principal User:$SERVICE_ACCOUNT_ID \
       --role CloudClusterAdmin \
       --cloud-cluster $CCLOUD_CLUSTER_ID --environment $CC_ENV

    Verify that the role binding was created. The output should list the CloudClusterAdmin role for the service account.

    confluent iam rbac role-binding list \
       --principal User:$SERVICE_ACCOUNT_ID \
       -o json | jq
  3. Inspect the file scripts/ccloud/cluster-link-ccloud.properties.

    .. literalinclude:: ../scripts/ccloud/cluster-link-ccloud.properties
    
    
  4. Create the |ccloud| half of the cluster link with the name cp-cc-cluster-link.

    confluent kafka link create cp-cc-cluster-link \
       --cluster $CCLOUD_CLUSTER_ID \
       --source-cluster $CP_CLUSTER_ID \
       --config-file ./scripts/ccloud/cluster-link-ccloud.properties
  5. Inspect the file scripts/ccloud/cluster-link-cp-example.properties and read the comments to understand what each property does.

    .. literalinclude:: ../scripts/ccloud/cluster-link-cp-example.properties
    
    
  6. Run the following command to copy the file to scripts/ccloud/cluster-link-cp.properties with credentials and bootstrap endpoint for your own |ccloud| cluster.

    sed -e "s|<confluent cloud cluster link api key>|${CCLOUD_CLUSTER_API_KEY}|g" \
       -e "s|<confluent cloud cluster link api secret>|${CCLOUD_CLUSTER_API_SECRET}|g" \
       -e "s|<confluent cloud bootstrap endpoint>|${CC_BOOTSTRAP_ENDPOINT}|g" \
          scripts/ccloud/cluster-link-cp-example.properties > scripts/ccloud/cluster-link-cp.properties
  7. Next, use the cp CLI context to log into |cp|. To create a cluster link, the CLI user must have ClusterAdmin privileges. For simplicity, we are continuing to use a super user instead of a ClusterAdmin.

    confluent context use cp
  8. The cluster link itself needs the DeveloperRead and DeveloperManage roles for any topics it plans to mirror, as well as the ClusterAdmin role for the Kafka cluster. Our cluster link uses the connectorSA principal, which already has ResourceOwner permissions on the wikipedia.parsed topic, so we just need to add the ClusterAdmin role.

    confluent iam rbac role-binding create \
       --principal User:connectorSA \
       --role ClusterAdmin \
       --kafka-cluster $CP_CLUSTER_ID
  9. Create the |cp| half of the cluster link, still called cp-cc-cluster-link.

    confluent kafka link create cp-cc-cluster-link \
       --destination-bootstrap-server $CC_BOOTSTRAP_ENDPOINT \
       --destination-cluster $CCLOUD_CLUSTER_ID \
       --config ./scripts/ccloud/cluster-link-cp.properties \
       --url https://localhost:8091/kafka \
       --ca-cert-path scripts/security/snakeoil-ca-1.crt
  10. Switch contexts back to "ccloud" and create the mirror topic for wikipedia.parsed in |ccloud|.

    confluent context use ccloud \
    && confluent kafka mirror create wikipedia.parsed --link cp-cc-cluster-link
  11. Consume records from the mirror topic using the schema context "cp-demo". Press Ctrl+C to stop the consumer when you are ready.

    confluent kafka topic consume \
       --api-key $CCLOUD_CLUSTER_API_KEY \
       --api-secret $CCLOUD_CLUSTER_API_SECRET \
       --schema-registry-endpoint $CC_SR_ENDPOINT/contexts/:.cp-demo: \
       --schema-registry-api-key $SR_API_KEY \
       --schema-registry-api-secret $SR_API_SECRET \
       --value-format avro \
          wikipedia.parsed | jq

You successfully created a source-initiated cluster link to seamlessly move data from on-premises to cloud in real time. Cluster linking opens up real-time hybrid cloud, multi-cloud, and disaster recovery use cases. See the Cluster Linking documentation for more information.
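
Both the schema link and the cluster link depend on properties files generated with sed. If one of the shell variables was empty or a placeholder was misspelled, the generated file can still contain a template placeholder such as <destination api key>. The following is a minimal sketch of a check for leftover placeholders. The helper name has_placeholders is hypothetical, and the pattern assumes placeholders are written as lowercase words in angle brackets, as in the sed commands in this module.

```shell
# Succeeds (exit 0) if FILE still contains a "<...>" template placeholder.
has_placeholders() {
  grep -q '<[a-z][a-z ]*>' "$1"
}

# Check the two generated files without printing their contents,
# because they hold API secrets.
for f in scripts/ccloud/schema-link.properties \
         scripts/ccloud/cluster-link-cp.properties; do
  if [ ! -f "$f" ]; then
    echo "not found: $f"
  elif has_placeholders "$f"; then
    echo "unfilled placeholder in $f"
  else
    echo "ok: $f"
  fi
done
```

Note that this does not catch a placeholder that was replaced with an empty string. Checking that the variables are set before running sed covers that case.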

|ccloud| ksqlDB

In this section, you will create a |ccloud| ksqlDB cluster to process data from the wikipedia.parsed mirror topic.

  1. Log into the |ccloud| Console at https://confluent.cloud and navigate to the cp-demo-env environment and then to the cp-demo-cluster cluster within that environment.

  2. Select "ksqlDB" from the left side menu and click "Create cluster myself". Select "Global access". Name the cluster cp-demo-ksql and choose a cluster size of 1 CKU. It will take a minute or so to provision.

  3. Once the ksqlDB cluster is provisioned, click into it and enter these query statements into the editor:

    CREATE STREAM wikipedia WITH (kafka_topic='wikipedia.parsed', value_format='AVRO');
    CREATE STREAM wikipedianobot AS
       SELECT *, (length->new - length->old) AS BYTECHANGE
       FROM wikipedia
          WHERE bot = false
             AND length IS NOT NULL
             AND length->new IS NOT NULL
             AND length->old IS NOT NULL;
  4. Click the "Flow" tab to see the stream processing topology.

    images/ccloud_ksqldb_flow.png
  5. View the events in the ksqlDB streams in |ccloud| by pasting in SELECT * FROM WIKIPEDIANOBOT EMIT CHANGES; and clicking "Run query". Stop the query when you are finished.

    images/ccloud_ksqldb_stream.png

Important

The ksqlDB cluster in |ccloud| has hourly charges even if you are not actively using it. Make sure to go to :ref:`cp-demo-ccloud-cleanup` in the Teardown module to destroy all cloud resources when you are finished.

Metrics API

Configure Confluent Health+ with the Telemetry Reporter

  1. Verify that you're still using the ccloud CLI context.

    confluent context list
    
  2. Create a new Cloud API key and secret to authenticate to |ccloud|. These credentials will be used to configure the Telemetry Reporter in |cp| for Health+, as well as to access the |ccloud| Metrics API directly.

    confluent api-key create --resource cloud -o json \
       --service-account $SERVICE_ACCOUNT_ID \
       --description "cloud api key for cp-demo"

    Verify your output resembles:

    {
       "key": "QX7X4VA4DFJTTOIA",
       "secret": "fjcDDyr0Nm84zZr77ku/AQqCKQOOmb35Ql68HQnb60VuU+xLKiu/n2UNQ0WYXp/D"
    }
    

    The value of the API key, in this case QX7X4VA4DFJTTOIA, and API secret, in this case fjcDDyr0Nm84zZr77ku/AQqCKQOOmb35Ql68HQnb60VuU+xLKiu/n2UNQ0WYXp/D, will differ in your output.

  3. Set variables to reference these credentials returned in the previous step.

    METRICS_API_KEY=QX7X4VA4DFJTTOIA
    METRICS_API_SECRET=fjcDDyr0Nm84zZr77ku/AQqCKQOOmb35Ql68HQnb60VuU+xLKiu/n2UNQ0WYXp/D
    
  4. :ref:`Dynamically configure <kafka-dynamic-configurations>` the on-premises cp-demo cluster to use the Telemetry Reporter, which sends metrics to |ccloud|. This requires setting 3 configuration parameters: confluent.telemetry.enabled=true, confluent.telemetry.api.key, and confluent.telemetry.api.secret.

    docker-compose exec kafka1 kafka-configs \
      --bootstrap-server kafka1:12091 \
      --alter \
      --entity-type brokers \
      --entity-default \
      --add-config confluent.telemetry.enabled=true,confluent.telemetry.api.key=${METRICS_API_KEY},confluent.telemetry.api.secret=${METRICS_API_SECRET}
    
  5. Check the broker logs to verify the brokers were dynamically configured.

    docker logs --since=5m kafka1 | grep confluent.telemetry.api

    Your output should resemble the following, but the confluent.telemetry.api.key value will be different in your environment.

    ...
    confluent.telemetry.api.key = QX7X4VA4DFJTTOIA
    confluent.telemetry.api.secret = [hidden]
    ...
    
  6. Navigate to the Health+ section of the |ccloud| Console at https://confluent.cloud/health-plus and verify you see your cluster's Health+ dashboard.

    images/hosted-monitoring.png
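
Steps 2 and 3 above copy the API key and secret by hand. As an alternative, the JSON output can be captured and parsed with jq. The following is a minimal sketch that assumes the {"key": ..., "secret": ...} shape shown in the sample output of step 2; field names may differ in other CLI versions. The helper name json_field and the variable CLOUD_KEY_JSON are hypothetical.

```shell
# json_field NAME: read a JSON object on stdin and print the value of NAME,
# or nothing if the field is absent.
json_field() {
  jq -r --arg f "$1" '.[$f] // empty'
}

# Hypothetical usage, replacing the copy-and-paste in steps 2 and 3.
# Each run of "api-key create" makes a new key, so run it only once.
# CLOUD_KEY_JSON=$(confluent api-key create --resource cloud -o json \
#    --service-account $SERVICE_ACCOUNT_ID \
#    --description "cloud api key for cp-demo")
# METRICS_API_KEY=$(printf '%s' "$CLOUD_KEY_JSON" | json_field key)
# METRICS_API_SECRET=$(printf '%s' "$CLOUD_KEY_JSON" | json_field secret)
```

Because the secret is not retrievable later, store the captured JSON somewhere safe if you need it beyond the current shell session.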

Query Metrics

  1. First, query the Metrics API for on-premises metrics. Here are the contents of the :devx-cp-demo:`metrics query file|scripts/ccloud/metrics_query_onprem.json`, which requests io.confluent.kafka.server/received_bytes for the topic wikipedia.parsed in the on-premises cluster (for all queryable metrics examples, see Metrics API):

    .. literalinclude:: ../scripts/ccloud/metrics_query_onprem.json
    
    
  2. Send this query to the Metrics API endpoint at https://api.telemetry.confluent.cloud/v2/metrics/hosted-monitoring/query.

    curl -s -u ${METRICS_API_KEY}:${METRICS_API_SECRET} \
         --header 'content-type: application/json' \
         --data @scripts/ccloud/metrics_query_onprem.json \
         https://api.telemetry.confluent.cloud/v2/metrics/hosted-monitoring/query \
            | jq .
    
  3. Your output should resemble the output below, showing metrics for the on-premises topic wikipedia.parsed:

    {
      "data": [
        {
          "timestamp": "2020-12-14T20:52:00Z",
          "value": 1744066,
          "metric.topic": "wikipedia.parsed"
        },
        {
          "timestamp": "2020-12-14T20:53:00Z",
          "value": 1847596,
          "metric.topic": "wikipedia.parsed"
        }
      ]
    }
    
  4. Next, query the Metrics API for |ccloud| metrics. View the :devx-cp-demo:`metrics query file|scripts/ccloud/metrics_query_ccloud.json`, which requests io.confluent.kafka.server/cluster_link_mirror_topic_bytes for the cluster link cp-cc-cluster-link in |ccloud|. The result includes metrics for the wikipedia.parsed mirror topic.

    .. literalinclude:: ../scripts/ccloud/metrics_query_ccloud.json
    
    
    
  5. Send this query to the Metrics API endpoint at https://api.telemetry.confluent.cloud/v2/metrics/cloud/query.

    sed "s/<CCLOUD_CLUSTER_ID>/${CCLOUD_CLUSTER_ID}/g" scripts/ccloud/metrics_query_ccloud.json \
    | curl -s -u ${METRICS_API_KEY}:${METRICS_API_SECRET} \
         --header 'content-type: application/json' \
         --data @- \
         https://api.telemetry.confluent.cloud/v2/metrics/cloud/query \
            | jq .
    
  6. Your output should resemble the output below, showing metrics for the cluster link cp-cc-cluster-link, including the |ccloud| mirror topic wikipedia.parsed:

    {
      "data": [
        {
          "timestamp": "2020-12-14T20:00:00Z",
          "value": 1690522,
          "metric.topic": "wikipedia.parsed"
        }
      ]
    }
    

Tip

See Metrics and Monitoring for Cluster Linking for more information about monitoring cluster links, including how to monitor mirror lag.
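
Each Metrics API response in this section is a JSON object with a data array of data points. To compare totals between the on-premises topic and the mirror topic, the value fields can be summed with jq. The following is a minimal sketch that assumes the response shape shown in the sample outputs above; the helper name sum_metric_values is hypothetical.

```shell
# Sum the "value" field of every data point in a Metrics API response
# read from stdin. Prints 0 when the data array is empty.
sum_metric_values() {
  jq '[.data[].value] | add // 0'
}

# With the two sample data points from the on-premises output above:
printf '%s' '{"data": [{"value": 1744066}, {"value": 1847596}]}' \
  | sum_metric_values
# prints 3591662

# Hypothetical usage against the live API, reusing the query from this section:
# curl -s -u ${METRICS_API_KEY}:${METRICS_API_SECRET} \
#      --header 'content-type: application/json' \
#      --data @scripts/ccloud/metrics_query_onprem.json \
#      https://api.telemetry.confluent.cloud/v2/metrics/hosted-monitoring/query \
#   | sum_metric_values
```

The two queries may use different time intervals and granularities, so compare totals only after checking the query files.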

Cleanup

Follow the cleanup procedure in :ref:`cp-demo-ccloud-cleanup` to avoid unexpected |ccloud| charges.