Introduce bq util to get latest table name given prefix. #516

Open
svij-sc wants to merge 3 commits into main from svij/intro_get_latest_table_bq_util

Conversation

@svij-sc
Collaborator

@svij-sc svij-sc commented Feb 26, 2026

Scope of work done

Introduces functionality to get the latest table name for a given prefix among datetime-suffixed tables.
We are duplicating this functionality in downstream use cases, so I figured I'd introduce a utility.

@svij-sc
Collaborator Author

svij-sc commented Feb 26, 2026

/unit_test

@github-actions
Contributor

github-actions bot commented Feb 26, 2026

GiGL Automation

@ 19:37:29UTC : 🔄 Python Unit Test started.

@ 19:43:51UTC : ❌ Workflow failed.
Please check the logs for more details.

@github-actions
Contributor

github-actions bot commented Feb 26, 2026

GiGL Automation

@ 19:37:30UTC : 🔄 Scala Unit Test started.

@ 19:45:04UTC : ✅ Workflow completed successfully.

@kmontemayor2-sc
Collaborator

Also let's fix the unit tests?

Comment on lines +294 to +297
*table_partition_suffix*. All supported GCP partition suffixes (``YYYY``,
``YYYYMM``, ``YYYYMMDD``, ``YYYYMMDDHH``, integer ranges) are
lexicographically sortable, so the latest table is the
lexicographic maximum.

Are there GCP docs we can link to here?
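
As a quick illustration of the docstring's claim (not taken from the PR itself): fixed-width, zero-padded date suffixes order the same way as strings and as dates, which is why the lexicographic maximum can be treated as the latest table. The table names below are made up for the example, and the last assertion shows the caveat that the property only holds when all suffixes have the same width.

```python
from datetime import datetime

# Hypothetical table names sharing one prefix and a YYYYMMDD suffix.
tables = [
    "project.dataset.events_20240131",
    "project.dataset.events_20231231",
    "project.dataset.events_20240201",
]

suffix_len = len("YYYYMMDD")

# Lexicographic maximum over the full names (the prefix is identical).
latest_by_string = max(tables)

# Chronological maximum, parsing each suffix as a real date.
latest_by_date = max(
    tables, key=lambda name: datetime.strptime(name[-suffix_len:], "%Y%m%d")
)

assert latest_by_string == latest_by_date

# Caveat: unpadded integers of different widths do not sort numerically
# when compared as strings, so equal-width suffixes are required.
assert max(["9", "10"]) == "9"
```

This is likely why the loop under review asserts that every matched table name has exactly the prefix length plus the suffix length.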

    bq_dataset_path=bq_dataset_path, table_match_string=table_prefix
)
suffix_len = len(table_partition_suffix)
candidates = []

Nit: type empty collections.

Suggested change
- candidates = []
+ candidates: list[str] = []

Comment on lines +325 to +338
for table_name in matched_full_table_paths:
    assert (
        len(table_name) == len(bq_table_path_prefix) + suffix_len
    ), f"Table name {table_name} does not end with a suffix of format {table_partition_suffix}"
    if cap_date is None or table_name[-suffix_len:] <= cap_date:
        candidates.append(table_name)
if not candidates:
    raise ValueError(
        f"No tables found with prefix {bq_table_path_prefix} and cap date {cap_date}"
    )
candidates.sort()
return candidates[
    -1
]  # Get the latest table @ last index (since sorted Ascending)

Double nit: we don't need to sort. We can keep a latest_table that we update as we go through validating the tables.
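
A minimal sketch of what that single-pass version might look like, assuming the same inputs as the loop under review. The function name `pick_latest_table` and its parameters are hypothetical (not from the PR), and the length check raises `ValueError` here instead of using `assert`:

```python
from typing import Iterable, Optional


def pick_latest_table(
    table_names: Iterable[str],
    table_prefix: str,
    suffix_len: int,
    cap_date: Optional[str] = None,
) -> str:
    """Return the lexicographically largest table name at or below cap_date."""
    latest_table: Optional[str] = None
    for table_name in table_names:
        # Length validation mirroring the reviewed loop (ValueError, not assert).
        if len(table_name) != len(table_prefix) + suffix_len:
            raise ValueError(
                f"Table name {table_name} has an unexpected suffix length"
            )
        # Skip tables whose suffix is past the cap.
        if cap_date is not None and table_name[-suffix_len:] > cap_date:
            continue
        # Track the running maximum instead of collecting and sorting.
        if latest_table is None or table_name > latest_table:
            latest_table = table_name
    if latest_table is None:
        raise ValueError(
            f"No tables found with prefix {table_prefix} and cap date {cap_date}"
        )
    return latest_table


# Made-up table names for illustration.
names = ["ds.events_20240101", "ds.events_20240301", "ds.events_20240201"]
uncapped = pick_latest_table(names, "ds.events_", 8)
capped = pick_latest_table(names, "ds.events_", 8, cap_date="20240215")
```

This keeps the work at one pass with constant extra memory, versus building a list and sorting it. For the handful of tables a prefix usually matches, the difference is mostly about readability.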


3 participants