title | summary | toc | docs_area |
---|---|---|---|
Changefeeds on Tables with Column Families |
Understand how changefeeds work on tables with column families. |
true |
stream_data |
You can create changefeeds on tables with more than one column family. Changefeeds will emit individual messages per column family on a table.
For further detail, see the following sections:
To target a table with multiple column families, set the split_column_families
option when creating a changefeed:
CREATE CHANGEFEED FOR TABLE {table} INTO {sink} WITH split_column_families;
To emit messages for a specific column family, use the FAMILY
keyword:
CREATE CHANGEFEED FOR TABLE {table} FAMILY {family} INTO {sink};
{{site.data.alerts.callout_info}}
You can also use Core changefeeds on tables with column families by using the EXPERIMENTAL CHANGEFEED FOR
statement with split_column_families
or the FAMILY
keyword.
{{site.data.alerts.end}}
If a table has multiple column families, the FAMILY
keyword will ensure the changefeed emits messages for each column family you define with FAMILY
in the CREATE CHANGEFEED
statement. If you do not specify FAMILY
, then the changefeed will emit messages for all the table's column families.
To specify multiple families on the same table, it is necessary to define the table and family in both instances:
CREATE CHANGEFEED FOR TABLE tbl FAMILY f_1, TABLE tbl FAMILY f_2;
The response will follow a typical changefeed message format, but with the family name appended to the table name with a .
, in the format table.family
:
{"after":{"column":"value"},"key":[1],"topic":"table.family"}
For cloud storage sinks, the filename will include the family name appended to the table name with a +
, in the format table+primary
.
Avro schema names will include the family name concatenated to the table name.
The primary key columns will appear in the key
for all column families, and will also appear in the value only for the families that they are a member of.
For example, if the table office_dogs
has a column family primary
, containing the primary key and a STRING
column, and a secondary
column family containing a different STRING
column, then you'll receive two messages for an insert.
CREATE TABLE office_dogs (
id INT PRIMARY KEY,
name STRING,
owner STRING,
FAMILY primary (id, name),
FAMILY secondary (owner)
);
The changefeed targeting this table (started with split_column_families
) will emit the following when there are inserts to the table:
{"after":{"id":4,"name":"Toby"},"key":[4],"topic":"office_dogs.primary"}],"length":1}
{"after":{"owner":"Ashley"},"key":[4],"topic":"office_dogs.secondary"}],"length":1}
The output shows the primary
column family with 4
in the value ({"id":4,"name":"Toby"}
) and the key ("key":[4]
). The secondary
family doesn't contain the id
column, so the primary key 4
is only in the key and not the value. For an update that only affects data in one column family, the changefeed will send one message for that update relating to the family.
- If you create a table without column families and then start a changefeed with the
split_column_families
option, it is not possible to add column families. A subsequentALTER TABLE
statement adding a column family to the table will cause the changefeed to fail. - When you do not specify column family names in the
CREATE
orALTER TABLE
statement, the family names will default to either of the following:primary
: Sinceprimary
is a key word, you'll receive a syntax error if you runCREATE CHANGEFEED FOR table FAMILY primary
. To avoid this syntax error, use double quotes:CREATE CHANGEFEED FOR table FAMILY "primary"
. You'll receive output from the changefeed like:table.primary
.fam_<zero-indexed family id>_<delimited list of columns>
: For a table that does not include a name for the family:FAMILY (id, name)
, you'll receive output from the changefeed containing:table.fam_0_id_name
. This references the table, the family ID and the two columns that this column family includes.
For examples of starting changefeeds on tables with column families, see the following examples for Enterprise and Core changefeeds.
{{site.data.alerts.callout_info}}
CREATE CHANGEFEED
is an Enterprise-only feature. For the Core version, see the CHANGEFEED FOR
example.
{{site.data.alerts.end}}
In this example, you'll set up changefeeds on two tables that have column families. You'll use a single-node cluster sending changes to a webhook sink for this example, but you can use any changefeed sink to work with tables that include column families.
-
If you do not already have one, request a trial {{ site.data.products.enterprise }} license.
-
Use the
cockroach start-single-node
command to start a single-node cluster:{% include_cached copy-clipboard.html %}
cockroach start-single-node --insecure --listen-addr=localhost --background
-
As the
root
user, open the built-in SQL client:{% include_cached copy-clipboard.html %}
cockroach sql --insecure
-
Set your organization and Enterprise license key that you received via email:
{% include_cached copy-clipboard.html %}
SET CLUSTER SETTING cluster.organization = '<organization name>';
{% include_cached copy-clipboard.html %}
SET CLUSTER SETTING enterprise.license = '<secret>';
-
Enable the
kv.rangefeed.enabled
cluster setting:{% include_cached copy-clipboard.html %}
SET CLUSTER SETTING kv.rangefeed.enabled = true;
-
In a separate terminal window, set up your HTTP server. Clone the test repository:
{% include_cached copy-clipboard.html %}
git clone https://github.com/cockroachlabs/cdc-webhook-sink-test-server.git
{% include_cached copy-clipboard.html %}
cd cdc-webhook-sink-test-server/go-https-server
-
Next make the script executable and then run the server (passing a specific port if preferred, otherwise it will default to
:3000
):{% include_cached copy-clipboard.html %}
chmod +x ./server.sh
{% include_cached copy-clipboard.html %}
./server.sh <port>
-
Back in your SQL shell, create a database called
cdc_demo
:{% include_cached copy-clipboard.html %}
CREATE DATABASE cdc_demo;
-
Set the database as the default:
{% include_cached copy-clipboard.html %}
USE cdc_demo;
-
Create a table with two column families:
{% include_cached copy-clipboard.html %}
CREATE TABLE office_dogs ( id INT PRIMARY KEY, name STRING, dog_owner STRING, FAMILY dogs (id, name), FAMILY employee (dog_owner) );
-
Insert some data into the table:
{% include_cached copy-clipboard.html %}
INSERT INTO office_dogs (id, name, dog_owner) VALUES (1, 'Petee', 'Lauren'), (2, 'Max', 'Taylor'), (3, 'Patch', 'Sammy'), (4, 'Roach', 'Ashley');
-
Create a second table that also defines column families:
{% include_cached copy-clipboard.html %}
CREATE TABLE office_plants ( id INT PRIMARY KEY, plant_name STRING, office_floor INT, safe_for_dogs BOOL, FAMILY dog_friendly (office_floor, safe_for_dogs), FAMILY plant (id, plant_name) );
-
Insert some data into
office_plants
:{% include_cached copy-clipboard.html %}
INSERT INTO office_plants (id, plant_name, office_floor, safe_for_dogs) VALUES (1, 'Sansevieria', 11, false), (2, 'Monstera', 11, false), (3, 'Peperomia', 10, true), (4, 'Jade', 9, true);
-
Create a changefeed on the
office_dogs
table targeting one of the column families. Use theFAMILY
keyword in theCREATE
statement:{% include_cached copy-clipboard.html %}
CREATE CHANGEFEED FOR TABLE office_dogs FAMILY employee INTO 'webhook-https://localhost:3000?insecure_tls_skip_verify=true';
You'll receive one message for each of the inserts that affects the specified column family:
{"payload":[{"after":{"dog_owner":"Lauren"},"key":[1],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"dog_owner":"Sammy"},"key":[3],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"dog_owner":"Taylor"},"key":[2],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"dog_owner":"Ashley"},"key":[4],"topic":"office_dogs.employee"}],"length":1}
{{site.data.alerts.callout_info}} The ordering of messages is not guaranteed. That is, you may not always receive messages for the same row, or even the same change to the same row, next to each other. {{site.data.alerts.end}}
Alternatively, create a changefeed using the
FAMILY
keyword across two tables:{% include_cached copy-clipboard.html %}
CREATE CHANGEFEED FOR TABLE office_dogs FAMILY employee, TABLE office_plants FAMILY dog_friendly INTO 'webhook-https://localhost:3000?insecure_tls_skip_verify=true';
You'll receive one message for each insert that affects the specified column families:
{"payload":[{"after":{"dog_owner":"Lauren"},"key":[1],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"office_floor":11,"safe_for_dogs":false},"key":[1],"topic":"office_plants.dog_friendly"}],"length":1} {"payload":[{"after":{"office_floor":9,"safe_for_dogs":true},"key":[4],"topic":"office_plants.dog_friendly"}],"length":1} {"payload":[{"after":{"dog_owner":"Taylor"},"key":[2],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"office_floor":11,"safe_for_dogs":false},"key":[2],"topic":"office_plants.dog_friendly"}],"length":1} {"payload":[{"after":{"office_floor":10,"safe_for_dogs":true},"key":[3],"topic":"office_plants.dog_friendly"}],"length":1} {"payload":[{"after":{"dog_owner":"Ashley"},"key":[4],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"dog_owner":"Sammy"},"key":[3],"topic":"office_dogs.employee"}],"length":1}
This allows you to define particular column families for the changefeed to target, without necessarily specifying every family in a table.
{{site.data.alerts.callout_info}} To create a changefeed specifying two families on one table, ensure that you define the table and family in both instances:
CREATE CHANGEFEED FOR TABLE office_dogs FAMILY employee, TABLE office_dogs FAMILY dogs INTO {sink};
{{site.data.alerts.end}} -
To create a changefeed that emits messages for all column families in a table, use the
split_column_families
option:{% include_cached copy-clipboard.html %}
CREATE CHANGEFEED FOR TABLE office_dogs INTO 'webhook-https://localhost:3000?insecure_tls_skip_verify=true' with split_column_families;
You'll receive output for both of the column families in the
office_dogs
table:{"payload":[{"after":{"id":1,"name":"Petee"},"key":[1],"topic":"office_dogs.dogs"}],"length":1} {"payload":[{"after":{"dog_owner":"Lauren"},"key":[1],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"id":2,"name":"Max"},"key":[2],"topic":"office_dogs.dogs"}],"length":1} {"payload":[{"after":{"dog_owner":"Taylor"},"key":[2],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"id":3,"name":"Patch"},"key":[3],"topic":"office_dogs.dogs"}],"length":1} {"payload":[{"after":{"dog_owner":"Sammy"},"key":[3],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"id":4,"name":"Roach"},"key":[4],"topic":"office_dogs.dogs"}],"length":1} {"payload":[{"after":{"dog_owner":"Ashley"},"key":[4],"topic":"office_dogs.employee"}],"length":1}
{{site.data.alerts.callout_info}} You can find details of your changefeed job using
SHOW CHANGEFEED JOBS
. Changefeeds streaming to Kafka or Google Cloud Pub/Sub will populate thetopics
field in theSHOW CHANGEFEED JOBS
output.When using the
FAMILY
keyword, thetopics
field will display in the formattopic.family
, e.g.,office_dogs.employee,office_dogs.dogs
. With thesplit_column_families
option set,topics
will show the topic name and a family placeholdertopic.{family}
, e.g.,office_dogs.{family}
. {{site.data.alerts.end}} -
Update one of the values in the table:
{% include_cached copy-clipboard.html %}
UPDATE office_dogs SET name = 'Izzy' WHERE id = 4;
This only affects one column family, which means you'll receive one message:
{"payload":[{"after":{"id":4,"name":"Izzy"},"key":[4],"topic":"office_dogs.dogs"}],"length":1}
In this example, you'll set up Core changefeeds on two tables that have column families. You'll use a single-node cluster with the Core changefeed sending changes to the client.
-
Use the
cockroach start-single-node
command to start a single-node cluster:{% include_cached copy-clipboard.html %}
cockroach start-single-node --insecure --listen-addr=localhost --background
-
As the
root
user, open the built-in SQL client:{% include_cached copy-clipboard.html %}
cockroach sql --url="postgresql://[email protected]:26257?sslmode=disable" --format=csv
-
Enable the
kv.rangefeed.enabled
cluster setting:{% include_cached copy-clipboard.html %}
SET CLUSTER SETTING kv.rangefeed.enabled = true;
-
Create a database called
cdc_demo
:{% include_cached copy-clipboard.html %}
CREATE DATABASE cdc_demo;
-
Set the database as the default:
{% include_cached copy-clipboard.html %}
USE cdc_demo;
-
Create a table with two column families:
{% include_cached copy-clipboard.html %}
CREATE TABLE office_dogs ( id INT PRIMARY KEY, name STRING, dog_owner STRING, FAMILY dogs (id, name), FAMILY employee (dog_owner) );
-
Insert some data into the table:
{% include_cached copy-clipboard.html %}
INSERT INTO office_dogs (id, name, dog_owner) VALUES (1, 'Petee', 'Lauren'), (2, 'Max', 'Taylor'), (3, 'Patch', 'Sammy'), (4, 'Roach', 'Ashley');
-
Create another table that also defines two column families:
{% include_cached copy-clipboard.html %}
CREATE TABLE office_plants ( id INT PRIMARY KEY, plant_name STRING, office_floor INT, safe_for_dogs BOOL, FAMILY dog_friendly (office_floor, safe_for_dogs), FAMILY plant (id, plant_name) );
-
Insert some data into
office_plants
:{% include_cached copy-clipboard.html %}
INSERT INTO office_plants (id, plant_name, office_floor, safe_for_dogs) VALUES (1, 'Sansevieria', 11, false), (2, 'Monstera', 11, false), (3, 'Peperomia', 10, true), (4, 'Jade', 9, true);
-
Create a changefeed on the
office_dogs
table targeting one of the column families. Use theFAMILY
keyword in the statement:{% include_cached copy-clipboard.html %}
EXPERIMENTAL CHANGEFEED FOR TABLE office_dogs FAMILY employee;
You'll receive one message for each of the inserts that affects the specified column family:
table,key,value office_dogs.employee,[1],"{""after"": {""owner"": ""Lauren""}}" office_dogs.employee,[2],"{""after"": {""owner"": ""Taylor""}}" office_dogs.employee,[3],"{""after"": {""owner"": ""Sammy""}}" office_dogs.employee,[4],"{""after"": {""owner"": ""Ashley""}}"
{{site.data.alerts.callout_info}} The ordering of messages is not guaranteed. That is, you may not always receive messages for the same row, or even the same change to the same row, next to each other. {{site.data.alerts.end}}
Alternatively, create a changefeed using the
FAMILY
keyword across two tables:{% include_cached copy-clipboard.html %}
EXPERIMENTAL CHANGEFEED FOR TABLE office_dogs FAMILY employee, TABLE office_plants FAMILY dog_friendly;
You'll receive one message for each insert that affects the specified column families:
table,key,value office_plants.dog_friendly,[1],"{""after"": {""office_floor"": 11, ""safe_for_dogs"": false}}" office_plants.dog_friendly,[2],"{""after"": {""office_floor"": 11, ""safe_for_dogs"": false}}" office_plants.dog_friendly,[3],"{""after"": {""office_floor"": 10, ""safe_for_dogs"": true}}" office_plants.dog_friendly,[4],"{""after"": {""office_floor"": 9, ""safe_for_dogs"": true}}" office_dogs.employee,[1],"{""after"": {""dog_owner"": ""Lauren""}}" office_dogs.employee,[2],"{""after"": {""dog_owner"": ""Taylor""}}" office_dogs.employee,[3],"{""after"": {""dog_owner"": ""Sammy""}}" office_dogs.employee,[4],"{""after"": {""dog_owner"": ""Ashley""}}"
This allows you to define particular column families for the changefeed to target, without necessarily specifying every family in a table.
{{site.data.alerts.callout_info}} To create a changefeed specifying two families on one table, ensure that you define the table and family in both instances:
EXPERIMENTAL CHANGEFEED FOR TABLE office_dogs FAMILY employee, TABLE office_dogs FAMILY dogs;
{{site.data.alerts.end}} -
To create a changefeed that emits messages for all column families in a table, use the
split_column_families
option:{% include_cached copy-clipboard.html %}
EXPERIMENTAL CHANGEFEED FOR TABLE office_dogs WITH split_column_families;
In your other terminal window, insert some more values:
{% include_cached copy-clipboard.html %}
cockroach sql --insecure -e "INSERT INTO cdc_demo.office_dogs (id, name, dog_owner) VALUES (5, 'Daisy', 'Cameron'), (6, 'Sage', 'Blair'), (7, 'Bella', 'Ellis');"
Your changefeed will output the following:
table,key,value office_dogs.dogs,[1],"{""after"": {""id"": 1, ""name"": ""Petee""}}" office_dogs.employee,[1],"{""after"": {""owner"": ""Lauren""}}" office_dogs.dogs,[2],"{""after"": {""id"": 2, ""name"": ""Max""}}" office_dogs.employee,[2],"{""after"": {""owner"": ""Taylor""}}" office_dogs.dogs,[3],"{""after"": {""id"": 3, ""name"": ""Patch""}}" office_dogs.employee,[3],"{""after"": {""owner"": ""Sammy""}}" office_dogs.dogs,[4],"{""after"": {""id"": 4, ""name"": ""Roach""}}" office_dogs.employee,[4],"{""after"": {""owner"": ""Ashley""}}" office_dogs.dogs,[5],"{""after"": {""id"": 5, ""name"": ""Daisy""}}" office_dogs.employee,[5],"{""after"": {""owner"": ""Cameron""}}" office_dogs.dogs,[6],"{""after"": {""id"": 6, ""name"": ""Sage""}}" office_dogs.employee,[6],"{""after"": {""owner"": ""Blair""}}" office_dogs.dogs,[7],"{""after"": {""id"": 7, ""name"": ""Bella""}}" office_dogs.employee,[7],"{""after"": {""owner"": ""Ellis""}}"
-
In your other terminal window, update one of the values in the table:
{% include_cached copy-clipboard.html %}
cockroach sql --insecure -e "UPDATE cdc_demo.office_dogs SET name = 'Izzy' WHERE id = 4;"
This only affects one column family, which means you'll receive one message:
office_dogs.dogs,[4],"{""after"": {""id"": 4, ""name"": ""Izzy""}}"