POSTINSTALL.md

See it in action

You can test out this extension right away!

  1. Go to your Cloud Firestore dashboard in the Firebase console.

  2. If it doesn't already exist, create the collection you specified during installation: ${param:COLLECTION_PATH}

  3. In that collection, create a document named bigquery-mirror-test with any fields and values you like.

  4. Go to the BigQuery web UI in the Google Cloud Platform console.

  5. Query your raw changelog table, which should contain a single entry recording the creation of the bigquery-mirror-test document.

    SELECT *
    FROM `${param:BIGQUERY_PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_changelog`
    
  6. Query your latest view, which should return the latest change event for the only document present -- bigquery-mirror-test.

    SELECT *
    FROM `${param:BIGQUERY_PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_latest`
    
  7. Delete the bigquery-mirror-test document from Cloud Firestore. The bigquery-mirror-test document will disappear from the latest view and a DELETE event will be added to the raw changelog table.

  8. You can check the changelogs of a single document with this query:

    SELECT *
    FROM `${param:BIGQUERY_PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_changelog`
    WHERE ENDS_WITH(document_name, "bigquery-mirror-test")
    ORDER BY timestamp ASC
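
To confirm the sequence from steps 3 and 7 at a glance, a query along these lines counts the test document's changelog rows per operation. It is only a sketch: it assumes the changelog table has an operation column (check the Schema tab), and it matches on the end of document_name in case that column stores a full document path rather than the bare ID.

```sql
-- Sketch: count changelog rows for the test document, grouped by operation.
-- Assumption: an operation column exists and holds values such as DELETE.
SELECT
  operation,
  COUNT(*) AS event_count
FROM `${param:BIGQUERY_PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_changelog`
WHERE ENDS_WITH(document_name, 'bigquery-mirror-test')
GROUP BY operation
ORDER BY operation
```

After step 7 you would expect to see one row for the creation event and one for the DELETE event.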
    

Using the extension

Whenever a document is created, updated, imported, or deleted in the specified collection, this extension sends that update to BigQuery. You can then run queries on this mirrored dataset, which contains the following resources:

  - raw changelog table: ${param:TABLE_ID}_raw_changelog
  - latest view: ${param:TABLE_ID}_raw_latest

To review the schema for these two resources, click the Schema tab for each resource in BigQuery.

Note that this extension listens only for document changes in the collection itself, not for changes in its subcollections. You can, however, install additional instances of this extension to listen to a specific subcollection or to other collections in your database. If the same subcollection appears under every document in a given collection, you can use {wildcard} notation to listen to all of those subcollections at once (for example: chats/{chatid}/posts).

Enabling wildcard references adds a STRING column to the output. Its value is a JSON object that holds the value of each wildcard included in ${param:COLLECTION_PATH}. You can extract individual values using JSON_EXTRACT_SCALAR.
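
As a rough sketch of that extraction, the query below pulls one wildcard value out of the JSON column. It assumes the column is named path_params and that the collection path is chats/{chatid}/posts, so the wildcard key is chatid. Both are assumptions for illustration; check the Schema tab for the actual column name in your table.

```sql
-- Assumption: the wildcard column is named path_params and the
-- collection path contains a {chatid} wildcard. Adjust both to your setup.
SELECT
  document_name,
  JSON_EXTRACT_SCALAR(path_params, '$.chatid') AS chat_id
FROM `${param:BIGQUERY_PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_changelog`
WHERE JSON_EXTRACT_SCALAR(path_params, '$.chatid') IS NOT NULL
```

Because the column is a plain STRING, JSON_EXTRACT_SCALAR returns NULL when the key is absent, which is why the filter above drops rows without the wildcard.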

Partition settings cannot be updated on an existing table. If you need to change these options, you must create a new table.

Adding clustering options does not require you to create or modify a table; the extension updates the clustering automatically.

Configuring Cross-Platform BigQuery Setup

If you specified a BigQuery project ID during installation, you must set up permissions manually:

  1. Navigate to https://console.cloud.google.com/iam-admin/iam?project=${param:BIGQUERY_PROJECT_ID}
  2. Add the BigQuery Data Editor role to the following service account: ext-${param:EXT_INSTANCE_ID}@${param:PROJECT_ID}.iam.gserviceaccount.com.

(Optional) Import existing documents

If you chose not to automatically import existing documents when you installed this extension, you can backfill your BigQuery dataset with all the documents in your collection using the import script.

If you neither enable automatic import nor run the import script, the extension only exports the content of documents that are created or changed after installation.

The import script reads all existing documents in a Cloud Firestore collection and inserts them into the raw changelog table created by this extension. For each document, the script adds a special changelog entry with the operation IMPORT and a timestamp set to the epoch. This ensures that any later operation on an imported document supersedes the IMPORT.
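
The superseding behavior can be illustrated with a query that keeps only the newest changelog row per document. This is a sketch, not the definition of the extension's latest view; it assumes the changelog table exposes document_name, operation, and timestamp columns.

```sql
-- Sketch: pick the most recent changelog row for each document.
-- IMPORT rows carry the epoch timestamp, so any later operation on the
-- same document has a newer timestamp and ranks ahead of the IMPORT.
SELECT
  document_name,
  operation,
  timestamp
FROM (
  SELECT
    document_name,
    operation,
    timestamp,
    ROW_NUMBER() OVER (
      PARTITION BY document_name
      ORDER BY timestamp DESC
    ) AS recency_rank
  FROM `${param:BIGQUERY_PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_changelog`
)
WHERE recency_rank = 1
```

A document that has only been imported shows IMPORT as its newest operation; once it is updated or deleted, the newer event takes rank 1 instead.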

Warning: Do not run the import script if you enabled automatic backfill during the extension installation, as it might result in data loss.

Important: Run the import script over the entire collection after installing this extension; otherwise, all writes to your database during the import might be lost.

Learn more about using the import script to backfill your existing collection.

(Optional) Generate schema views

After your data is in BigQuery, you can use the schema-views script (provided by this extension) to create views that make it easier to query relevant data. You only need to provide a JSON schema file that describes your data structure, and the schema-views script will create the views.

Learn more about using the schema-views script to generate schema views.

Monitoring

As a best practice, you can monitor the activity of your installed extension, including checks on its health, usage, and logs.