MOLT Verify transformations and filter rules #22700

Open

taroface wants to merge 1 commit into main from molt-verify-transformations

Contributor

taroface commented Feb 18, 2026


          MOLT Verify transformations and filter rules

fd4dd4c

taroface requested review from Jeremyyang920, KeithCh and ryanluu12345

February 18, 2026 22:14

github-actions bot commented Feb 18, 2026

Files changed:

src/current/molt/molt-verify.md

netlify bot commented Feb 18, 2026

✅ Netlify Preview

Name	Link
🔨 Latest commit	`fd4dd4c`
🔍 Latest deploy log	https://app.netlify.com/projects/cockroachdb-docs/deploys/699639d3d7396d00082a9ca7
😎 Deploy Preview	https://deploy-preview-22700--cockroachdb-docs.netlify.app

KeithCh approved these changes

KeithCh left a comment

LGTM!

bsanchez-the-roach reviewed

Contributor

bsanchez-the-roach left a comment

Some comments are minor nits about phrasing that you can take or leave, others are slightly more substantial.

src/current/molt/molt-verify.md

               `--row-batch-size` | Number of rows to get from a table at a time. <br>**Default:** 20000
               `--schema-filter` | Verify schemas that match a specified [regular expression](https://wikipedia.org/wiki/Regular_expression).<br><br>**Default:** `'.*'`
               `--table-filter` | Verify tables that match a specified [regular expression](https://wikipedia.org/wiki/Regular_expression).<br><br>**Default:** `'.*'`
+              `--transformations-file` | Path to a JSON file that defines transformation rules applied during comparison to verify data that was transformed during [fetch]({% link molt/molt-fetch.md %}#transformations). Use the same transformation file from `molt fetch`. Refer to [Verify transformed data](#verify-transformed-data).

Contributor

bsanchez-the-roach Feb 20, 2026

Suggested change

-              `--transformations-file` | Path to a JSON file that defines transformation rules applied during comparison to verify data that was transformed during [fetch]({% link molt/molt-fetch.md %}#transformations). Use the same transformation file from `molt fetch`. Refer to [Verify transformed data](#verify-transformed-data).
+              `--transformations-file` | Path to a JSON file that defines transformation rules to be applied during comparison. If verifying data that was [transformed during a bulk load with MOLT Fetch]({% link molt/molt-fetch.md %}#transformations), use the same transformation file from that `molt fetch` run. Refer to [Verify transformed data](#verify-transformed-data).

src/current/molt/molt-verify.md

+              Filter rules apply `WHERE` clauses to specified tables during verification. Columns referenced in filter expressions **must** be indexed.
+              {{site.data.alerts.callout_info}}
+              Only PostgreSQL and MySQL sources are supported for selective data verification.

Contributor

bsanchez-the-roach Feb 20, 2026

Suggested change

-              Only PostgreSQL and MySQL sources are supported for selective data verification.
+              Selective data verification is only supported for PostgreSQL and MySQL sources.

src/current/molt/molt-verify.md

+              - `resource_specifier`: Identifies which schemas and tables to filter. Schema and table names are case-insensitive.
+              	- `schema`: Schema name containing the table.
+              	- `table`: Table name to apply the filter to.
+              - `expr`: SQL expression that applies to both source and target databases. The expression must be valid for both database dialects.

Contributor

bsanchez-the-roach Feb 20, 2026

Does "for both database dialects" mean both the source and the target database dialects? I assume so, but might be good to say that explicitly to avoid ambiguity. (Especially because this comes not long after the callout about both Postgres and MySQL so that's where my head jumped to when considering "two dialects").

src/current/molt/molt-verify.md


		#### Step 1. Create a filter rules file

		Create a JSON file that defines the filter rules. The following example defines filter rules on two tables, `public.filtertbl` and `public.filtertbl2`:

Contributor

bsanchez-the-roach Feb 20, 2026

Are those the names of the schemas/tables on the source or on the target? Or must they match? (Maybe that latter question relates to the transformation content that I haven't yet read).

src/current/molt/molt-verify.md

+              	- `schema`: Schema name containing the table.
+              	- `table`: Table name to apply the filter to.
+              - `expr`: SQL expression that applies to both source and target databases. The expression must be valid for both database dialects.
+              - `source_expr` and `target_expr`: SQL expressions that apply to the source and target databases, respectively. These must be defined together, and cannot be used with `expr`.

Contributor

bsanchez-the-roach Feb 20, 2026

I think I might want a bit more understanding about how this works. So are there three total options: source_expr, target_expr, and expr? Which are optional, and which are mutually exclusive? Are these filter expressions applied sequentially, like first the source_expr does one round of filtering then the target_expr does another? Does the expr filter get applied in between?

I also wonder if elaborating on all of that info is best done in a conceptual doc as opposed to a how-to (though I don't have a strong opinion about that at the moment).
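
For reference, the constraint quoted in the diff above ("These must be defined together, and cannot be used with `expr`") can be read as a small validation rule. The following Python sketch is illustrative only and is not MOLT code: the function name `check_filter_rule`, the `"shared"`/`"split"` labels, and the requirement that at least one form be present are assumptions. It does not address whether or in what order the expressions are applied, which the diff does not state.

```python
from typing import Any, Mapping


def check_filter_rule(rule: Mapping[str, Any]) -> str:
    """Classify a filter rule as 'shared' or 'split', or raise ValueError.

    Encodes only the wording quoted in the diff: `source_expr` and
    `target_expr` must be defined together and cannot be combined with `expr`.
    """
    has_expr = "expr" in rule
    has_source = "source_expr" in rule
    has_target = "target_expr" in rule

    # The two per-database expressions must appear as a pair.
    if has_source != has_target:
        raise ValueError("source_expr and target_expr must be defined together")
    # The shared expression and the per-database pair are mutually exclusive.
    if has_expr and has_source:
        raise ValueError("expr cannot be combined with source_expr/target_expr")
    # Assumption (not stated in the diff): a rule needs one of the two forms.
    if not has_expr and not has_source:
        raise ValueError("rule needs expr, or both source_expr and target_expr")
    return "shared" if has_expr else "split"


# Hypothetical rule; the column name `id` is made up for illustration.
rule = {
    "resource_specifier": {"schema": "public", "table": "filtertbl"},
    "expr": "id > 5",
}
print(check_filter_rule(rule))
```

If this reading is right, it may be enough for the doc to state the two allowed forms explicitly: `expr` alone, or `source_expr` and `target_expr` as a pair.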

src/current/molt/molt-verify.md

+              - `resource_specifier`: Identifies which schemas and tables to filter. Schema and table names are case-insensitive.
+              	- `schema`: Schema name containing the table.
+              	- `table`: Table name to apply the filter to.

Contributor

bsanchez-the-roach Feb 20, 2026

Suggested change

-              	- `table`: Table name to apply the filter to.
+              	- `table`: Name of the table to apply the filter to.

src/current/molt/molt-verify.md

+              ~~~
+              - `resource_specifier`: Identifies which schemas and tables to transform. Schema and table names are case-insensitive.
+              	- `schema`: Schema name containing the table.

Contributor

bsanchez-the-roach Feb 20, 2026

Suggested change

-              	- `schema`: Schema name containing the table.
+              	- `schema`: Name of the schema containing the table.

src/current/molt/molt-verify.md

+              }
+              ~~~
+              - `resource_specifier`: Identifies which schemas and tables to transform. Schema and table names are case-insensitive.

Contributor

bsanchez-the-roach Feb 20, 2026

For both schema and table: these are the names on the source, I assume? Should probably state that for clarity.

src/current/molt/molt-verify.md

+              - `table_rename_opts`: Rename the table on the target database.
+              	- `value`: The target table name to compare against.
+              - `schema_rename_opts`: Rename the schema on the target database.
+              	- `value`: The target schema name to compare against.

Contributor

bsanchez-the-roach Feb 20, 2026

Similar question as above re: which are optional, which are mutually exclusive. If I only wanted to rename the schema, would I even need to include the "table" item?

src/current/molt/molt-verify.md

               ## Known limitations
               - MOLT Verify compares 20,000 rows at a time by default, and row values can change between batches, potentially resulting in temporary inconsistencies in data. To configure the row batch size, use the `--row_batch_size` [flag](#flags).
+              - MOLT Verify only supports comparing one MySQL database to a whole CockroachDB schema (which is assumed to be `public`).

Contributor

bsanchez-the-roach Feb 20, 2026

I'm confused by this. Does this not contradict the fact that MySQL sources are supported for selective data verification?

