Allow arbitrary JSON in google.protobuf.Struct fields during validation#2119
Allow arbitrary JSON in google.protobuf.Struct fields during validation#2119
Conversation
kolina
left a comment
There was a problem hiding this comment.
Let's add compilation test cases
kolina
left a comment
There was a problem hiding this comment.
Some tests seem to be failing?
core/actions/table.ts
Outdated
| return; | ||
| } | ||
| const config = this.verifyConfig(unverifiedConfig); | ||
| if (this.unverifiedConfig.metadata?.extraProperties) { |
There was a problem hiding this comment.
Users of JavaScript API may theoretically pass a protobuf object directly: see here.
I think, there should be a check that if extraProperties is already an instance of protobuf struct object, that this conversion is skipped.
There was a problem hiding this comment.
Can we add a test for this case?
common/protos/index.ts
Outdated
| export function jsonToValue(value: any): any { | ||
| if (value === null || typeof value === "undefined") { | ||
| return { nullValue: 0 }; | ||
| } else if (typeof value === "number") { |
There was a problem hiding this comment.
Else statement are redundant in these chains.
common/protos/index.ts
Outdated
| return { fields }; | ||
| } | ||
|
|
||
| export function jsonToValue(value: any): any { |
There was a problem hiding this comment.
Please don't omit the returned type
common/protos/index.ts
Outdated
| return buf; | ||
| } | ||
|
|
||
| export function jsonToStruct(obj: any): any { |
There was a problem hiding this comment.
Please don't omit the return type
core/session.ts
Outdated
| } | ||
|
|
||
| this.jitContextData.fields[key] = unknownToValue(data); | ||
| this.jitContextData.fields[key] = google.protobuf.Value.create(jsonToValue(data)); |
There was a problem hiding this comment.
We don't require JiT data to be a JSON object - it can be any JS object. Please consider using a better name for a conversion function.
…and session.ts" This reverts commit 6860230.
common/protos/index.ts
Outdated
| return buf; | ||
| } | ||
|
|
||
| export interface IValue { |
There was a problem hiding this comment.
Why do we need this interface? google.protobuf.IValue should already cover this
| const CONFIGS_PROTO_DOCUMENTATION_URL = | ||
| "https://dataform-co.github.io/dataform/docs/configs-reference"; | ||
| const REPORT_ISSUE_URL = "https://github.com/dataform-co/dataform/issues"; | ||
| const STRUCT_FIELD_NAMES = new Set(["extraProperties", "contexts", "glossary_terms"]); |
There was a problem hiding this comment.
we don't want to parametrize this set on all the possible keys that we can store inside extraProperties - we'd need to update this open source project every time we want to support another key. Let's generalize it so we don't need to store glossary_terms or contacts here.
There was a problem hiding this comment.
Why do we have other fields except extraProperties here?
| glossary_terms: [ | ||
| { | ||
| column_name: "trip_id", | ||
| project: "bq-dataworkeragent-test", |
There was a problem hiding this comment.
let's not use the name of our internal project here - rename this to project_identifier or something else
| } | ||
| ], | ||
| generic: { | ||
| system: "yo", |
There was a problem hiding this comment.
let's replace these with better values (ie. "my custom system value"
| const CONFIGS_PROTO_DOCUMENTATION_URL = | ||
| "https://dataform-co.github.io/dataform/docs/configs-reference"; | ||
| const REPORT_ISSUE_URL = "https://github.com/dataform-co/dataform/issues"; | ||
| const STRUCT_FIELD_NAMES = new Set(["extraProperties", "contexts", "glossary_terms"]); |
There was a problem hiding this comment.
Why do we have other fields except extraProperties here?
| return { | ||
| structValue: { | ||
| fields: Object.fromEntries( | ||
| Object.entries(raw as object).map(([key, value]) => [key, unknownToValue(value)]) |
There was a problem hiding this comment.
is cast raw as object needed here?
| if (structFieldNames.has(presentKey)) { | ||
| return; | ||
| } |
There was a problem hiding this comment.
why is this check needed here?
| if (Array.isArray(value)) { | ||
| result[key] = { | ||
| listValue: { | ||
| values: value.map(item => unknownToValue(item)) | ||
| } | ||
| }; | ||
| } else if (typeof value === "object" && !value.fields) { | ||
| const converted = unknownToValue(value); | ||
| result[key] = converted.structValue; | ||
| } |
There was a problem hiding this comment.
This logic is a bit unclear to me, I think you only need a check for top-level field that is always a struct at the moment?
core/actions/table.ts
Outdated
| return; | ||
| } | ||
| const config = this.verifyConfig(unverifiedConfig); | ||
| if (this.unverifiedConfig.metadata?.extraProperties) { |
There was a problem hiding this comment.
Can we add a test for this case?
The current verifyObjectMatchesProto helper is very strict. When users provide arbitrary JSON metadata in a Struct field (like extra_properties), the validator throws an "unexpected property" error because those custom keys are not explicitly defined in the protobuf message.
This PR introduces a manual conversion from plain JavaScript objects to the standard google.protobuf.Struct format during the validation phase.