Add script to retry failed ingestion job #1927
vish-cs wants to merge 1 commit into datacommonsorg:master
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces an automation script intended to make Dataflow ingestion pipelines more resilient. The script detects failed imports, rolls back their state in Spanner to a consistent previous version, and resets pending jobs, which reduces the manual intervention needed for recovery. Ingestion workflows can then resume processing after transient failures without an operator stepping in.
Code Review
This pull request introduces a new script to automate the process of retrying failed Dataflow ingestion jobs. The script is well-structured, using functions to isolate responsibilities like identifying failed imports, reverting them in Spanner, and retriggering the workflow.
My review focuses on a critical issue in the revert_import function where it fails to return the transaction's status, which disables the infinite-loop protection mechanism.
I've provided a detailed comment with a code suggestion for this point.
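The script's code and the suggested fix are not reproduced in this conversation, so the following is only a minimal sketch of the kind of problem described. It assumes a hypothetical guard, should_retrigger, that skips retriggering when the revert explicitly reports failure; the function names other than revert_import's role and the lambda standing in for the Spanner transaction are illustrative assumptions, not the PR's actual code.

```python
from typing import Callable, Optional


def revert_without_return(run_txn: Callable[[], bool]) -> Optional[bool]:
    """Buggy shape: runs the transaction but drops its result."""
    run_txn()  # result discarded, so the function implicitly returns None


def revert_with_return(run_txn: Callable[[], bool]) -> bool:
    """Fixed shape: propagates whether the revert actually succeeded."""
    return run_txn()


def should_retrigger(revert_status: Optional[bool]) -> bool:
    """Hypothetical guard: do not retrigger when the revert reported failure.

    A job that cannot be reverted would otherwise fail and be retried forever.
    """
    return revert_status is not False


# Stand-in for a Spanner transaction that fails (assumption for illustration).
failing_txn = lambda: False

# With the buggy shape the guard sees None and lets the retry proceed.
print(should_retrigger(revert_without_return(failing_txn)))
# With the fixed shape the guard sees False and stops the retry.
print(should_retrigger(revert_with_return(failing_txn)))
```

Under these assumptions, the guard can only do its job if the revert function hands the transaction's outcome back to the caller.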
Added a script that identifies failed imports within a specific Dataflow job,
reverts the failed imports in the Spanner database to their last known
good version, resets any 'PENDING' imports back to 'STAGING', and
optionally retriggers the Spanner ingestion workflow.
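
The four steps in this description could be sketched as follows. This is an in-memory illustration only: it does not call Dataflow or Spanner, and the ImportRow fields, the job_results mapping, and the retrigger callback are hypothetical stand-ins for whatever the actual script uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ImportRow:
    """Hypothetical stand-in for one import's row in the Spanner table."""
    version: str
    last_good_version: Optional[str]
    state: str  # 'PENDING' and 'STAGING' are the states named in the description


def revert_import(table: Dict[str, ImportRow], name: str) -> bool:
    """Restore an import to its last known good version; report success."""
    row = table.get(name)
    if row is None or row.last_good_version is None:
        return False  # nothing to revert to
    row.version = row.last_good_version
    return True


def reset_pending(table: Dict[str, ImportRow]) -> List[str]:
    """Move every 'PENDING' import back to 'STAGING'; return their names."""
    reset = []
    for name, row in table.items():
        if row.state == "PENDING":
            row.state = "STAGING"
            reset.append(name)
    return reset


def retry_failed_job(
    job_results: Dict[str, bool],
    table: Dict[str, ImportRow],
    retrigger: Optional[Callable[[], None]] = None,
) -> Tuple[List[str], List[str]]:
    """Identify failed imports, revert them, reset pending ones, optionally retrigger."""
    failed = [name for name, ok in job_results.items() if not ok]
    reverted = [name for name in failed if revert_import(table, name)]
    reset = reset_pending(table)
    if retrigger is not None:
        retrigger()  # stand-in for restarting the ingestion workflow
    return reverted, reset
```

Passing retrigger=None corresponds to the "optionally" in the description: the state is repaired but the workflow is left for someone to restart by hand.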