bsanchez-the-roach left a comment:
Some comments are minor nits about phrasing that you can take or leave; others are slightly more substantial.
| `--row-batch-size` | Number of rows to get from a table at a time.<br><br>**Default:** 20000 |
| `--schema-filter` | Verify schemas that match a specified [regular expression](https://wikipedia.org/wiki/Regular_expression).<br><br>**Default:** `'.*'` |
| `--table-filter` | Verify tables that match a specified [regular expression](https://wikipedia.org/wiki/Regular_expression).<br><br>**Default:** `'.*'` |
| `--transformations-file` | Path to a JSON file that defines transformation rules applied during comparison to verify data that was transformed during [fetch]({% link molt/molt-fetch.md %}#transformations). Use the same transformation file from `molt fetch`. Refer to [Verify transformed data](#verify-transformed-data). |
Suggested change:

Original:
| `--transformations-file` | Path to a JSON file that defines transformation rules applied during comparison to verify data that was transformed during [fetch]({% link molt/molt-fetch.md %}#transformations). Use the same transformation file from `molt fetch`. Refer to [Verify transformed data](#verify-transformed-data). |

Suggested:
| `--transformations-file` | Path to a JSON file that defines transformation rules to be applied during comparison. If verifying data that was [transformed during a bulk load with MOLT Fetch]({% link molt/molt-fetch.md %}#transformations), use the same transformation file from that `molt fetch` run. Refer to [Verify transformed data](#verify-transformed-data). |
Filter rules apply `WHERE` clauses to specified tables during verification. Columns referenced in filter expressions **must** be indexed.

{{site.data.alerts.callout_info}}
Only PostgreSQL and MySQL sources are supported for selective data verification.
Suggested change:

Original:
Only PostgreSQL and MySQL sources are supported for selective data verification.

Suggested:
Selective data verification is only supported for PostgreSQL and MySQL sources.
- `resource_specifier`: Identifies which schemas and tables to filter. Schema and table names are case-insensitive.
  - `schema`: Schema name containing the table.
  - `table`: Table name to apply the filter to.
- `expr`: SQL expression that applies to both source and target databases. The expression must be valid for both database dialects.
Does "for both database dialects" mean both the source and the target database dialects? I assume so, but might be good to say that explicitly to avoid ambiguity. (Especially because this comes not long after the callout about both Postgres and MySQL so that's where my head jumped to when considering "two dialects").
#### Step 1. Create a filter rules file
Create a JSON file that defines the filter rules. The following example defines filter rules on two tables, `public.filtertbl` and `public.filtertbl2`:
Are those the names of the schemas/tables on the source or on the target? Or must they match? (Maybe that latter question relates to the transformation content that I haven't yet read).
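To make the question concrete, here is a rough sketch of what such a file might look like. Only `resource_specifier`, `schema`, `table`, and `expr` come from the surrounding text; the top-level `filter_rules` key, the column names, and the expressions are hypothetical, and the actual file schema may differ:

```json
{
  "filter_rules": [
    {
      "resource_specifier": { "schema": "public", "table": "filtertbl" },
      "expr": "id < 1000"
    },
    {
      "resource_specifier": { "schema": "public", "table": "filtertbl2" },
      "expr": "region = 'us-east-1'"
    }
  ]
}
```

Each `expr` presumably becomes a `WHERE` clause on that table during verification, so columns like `id` and `region` would need to be indexed per the note above.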
- `schema`: Schema name containing the table.
- `table`: Table name to apply the filter to.
- `expr`: SQL expression that applies to both source and target databases. The expression must be valid for both database dialects.
- `source_expr` and `target_expr`: SQL expressions that apply to the source and target databases, respectively. These must be defined together, and cannot be used with `expr`.
I think I might want a bit more understanding about how this works. So are there three total options: source_expr, target_expr, and expr? Which are optional, and which are mutually exclusive? Are these filter expressions applied sequentially, like first the source_expr does one round of filtering then the target_expr does another? Does the expr filter get applied in between?
I also wonder if elaborating on all of that info is best done in a conceptual doc as opposed to a how-to (though I don't have a strong opinion about that at the moment).
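One reading, sketched below with hypothetical table names and expressions: `expr` is shorthand for a single predicate that is valid on both sides, while `source_expr`/`target_expr` exist for dialect differences (for example, a MySQL `TINYINT(1)` flag versus a CockroachDB `BOOL`). If that reading is right, the doc could confirm it with a pair of rules like:

```json
[
  {
    "resource_specifier": { "schema": "public", "table": "orders" },
    "expr": "status = 'active'"
  },
  {
    "resource_specifier": { "schema": "public", "table": "users" },
    "source_expr": "is_deleted = 0",
    "target_expr": "is_deleted = false"
  }
]
```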
- `resource_specifier`: Identifies which schemas and tables to filter. Schema and table names are case-insensitive.
  - `schema`: Schema name containing the table.
  - `table`: Table name to apply the filter to.
Suggested change:

Original:
- `table`: Table name to apply the filter to.

Suggested:
- `table`: Name of the table to apply the filter to.
~~~

- `resource_specifier`: Identifies which schemas and tables to transform. Schema and table names are case-insensitive.
  - `schema`: Schema name containing the table.
Suggested change:

Original:
- `schema`: Schema name containing the table.

Suggested:
- `schema`: Name of the schema containing the table.
}
~~~

- `resource_specifier`: Identifies which schemas and tables to transform. Schema and table names are case-insensitive.
For both schema and table: these are the names on the source, I assume? Should probably state that for clarity.
- `table_rename_opts`: Rename the table on the target database.
  - `value`: The target table name to compare against.
- `schema_rename_opts`: Rename the schema on the target database.
  - `value`: The target schema name to compare against.
Similar question as above re: which are optional, which are mutually exclusive. If I only wanted to rename the schema, would I even need to include the "table" item?
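For instance, a schema-only rename might look like the following sketch, with `table_rename_opts` omitted entirely. The `transforms` top-level key and all names here are hypothetical; only `resource_specifier`, `schema`, `table`, `schema_rename_opts`, and `value` come from the surrounding text, and whether the omission is actually allowed is the open question:

```json
{
  "transforms": [
    {
      "resource_specifier": { "schema": "migration_schema", "table": "employees" },
      "schema_rename_opts": { "value": "public" }
    }
  ]
}
```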
## Known limitations

- MOLT Verify compares 20,000 rows at a time by default, and row values can change between batches, potentially resulting in temporary inconsistencies in data. To configure the row batch size, use the `--row-batch-size` [flag](#flags).
- MOLT Verify only supports comparing one MySQL database to a whole CockroachDB schema (which is assumed to be `public`).
I'm confused by this. Does this not contradict the fact that MySQL sources are supported for selective data verification?
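For reference, adjusting the batch size would look something like the following. The connection strings are placeholders, and `--source`/`--target` are assumed from the standard MOLT flags; only `--row-batch-size` and its default are stated in the text above:

```shell
molt verify \
  --source 'postgres://migration_user:password@source-host:5432/source_db' \
  --target 'postgres://root@crdb-host:26257/defaultdb?sslmode=verify-full' \
  --row-batch-size 50000
```

A larger batch means fewer batch boundaries at which row values can drift, presumably at the cost of more memory per comparison query.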
DOC-15554