Conversation
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
📝 WalkthroughWalkthroughThree new Jupyter Notebooks added to the materials_designer/workflows directory. Introduction.ipynb provides a navigable index to calculation notebooks across multiple categories. band_structure.ipynb and band_structure_hse.ipynb implement end-to-end workflows for band structure calculations (standard and HSE-based) on the Mat3ra platform, orchestrating material setup, workflow configuration, compute parameters, job submission, and results retrieval. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~70 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 Ruff (0.15.5)other/materials_designer/workflows/band_structure.ipynbUnexpected end of JSON input other/materials_designer/workflows/band_structure_hse.ipynbUnexpected end of JSON input other/materials_designer/workflows/Introduction.ipynbUnexpected end of JSON input Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| "#### [4.1.1. Band Gap / Band Gap+DOS (HSE).](band_gap.ipynb)\n", | ||
| "\n", | ||
| "### 4.2. Band Structure\n", | ||
| "#### [4.2.1. Band Structure / Band Structure+DOS / HSE / Magnetic / Spin-Orbit Coupling / DOS.](band_structure.ipynb)\n", |
There was a problem hiding this comment.
Should be split into multiple
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (2)
other/materials_designer/workflows/band_structure.ipynb (1)
401-424: Don't rely on a positional subworkflow here.This notebook supports both
band_structureandband_structure_dos, butworkflow.subworkflows[1 if ADD_RELAXATION else 0]assumes the target units always live in a fixed slot. If the selected Standata template changes shape, the laterpw_scf/pw_bands/pw_nscfupdates quietly no-op.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/workflows/band_structure.ipynb` around lines 401 - 424, The code assumes bs_subworkflow = workflow.subworkflows[1 if ADD_RELAXATION else 0], which is brittle; instead locate the subworkflow that actually contains the target units (e.g. "pw_scf", "pw_bands", "pw_nscf") by scanning workflow.subworkflows and using get_unit_by_name to find a matching unit, and use that subworkflow for SCF/KPATH/NSCF edits; update all places that use bs_subworkflow (and the initial assignment) to compute it dynamically (e.g. choose the first subworkflow where get_unit_by_name(name="pw_scf") or "pw_bands" or "pw_nscf" returns non-null) so changes won't silently no-op when templates change.other/materials_designer/workflows/band_structure_hse.ipynb (1)
633-634: Make the property selection explicit here too.
other/materials_designer/workflows/band_structure.ipynbat Lines 620-621 passesPropertyName.non_scalar.band_structure.valueto the same helper. Leaving it implicit here couples this cell toutils.api.get_properties_for_job()defaults.Proposed fix
-band_structure_data = get_properties_for_job(client, job_id) +band_structure_data = get_properties_for_job( + client, + job_id, + property_name=PropertyName.non_scalar.band_structure.value, +) visualize_properties(band_structure_data, title="Band Structure (HSE)")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/workflows/band_structure_hse.ipynb` around lines 633 - 634, The call to get_properties_for_job is relying on its defaults; update the call where band_structure_data is assigned to explicitly pass the property selector (use PropertyName.non_scalar.band_structure.value) so it matches the pattern in band_structure.ipynb and avoids coupling to default behavior; locate the call to get_properties_for_job and add the property argument, then leave visualize_properties(band_structure_data, title="Band Structure (HSE)") unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@other/materials_designer/workflows/band_structure_hse.ipynb`:
- Around line 421-425: The preliminary SCF unit "pw_scf" is not receiving the
user-specified cutoffs (ECUTWFC/ECUTRHO) because the cutoff-application loop
only targets "pw_scf_bands_hse" and relaxation units; modify the logic that
applies ECUTWFC and ECUTRHO so it also locates and updates the unit returned by
preliminary_scf_subworkflow.get_unit_by_name(name="pw_scf") (same place you add
the SCF_KGRID) — add the PointsGridDataProvider-style context for cutoffs or the
existing cutoff data provider to that unit and call
preliminary_scf_subworkflow.set_unit(unit); repeat the same fix for the second
occurrence around the 439-446 region so both preliminary SCF instances receive
the user cutoffs.
- Around line 527-537: Validate the selected cluster before constructing
Compute: after resolving cluster (the variable computed via CLUSTER_NAME lookup
against clusters), check that clusters is non-empty and that cluster is not
None; if clusters is empty raise/log a clear error and stop, and if CLUSTER_NAME
did not match any hostname either pick a sensible default (e.g., clusters[0])
only if clusters is non-empty or else raise/log a descriptive error; ensure the
error message references CLUSTER_NAME and the clusters list and occurs before
calling Compute(...), so Compute(cluster=cluster, ...) never receives None.
- Around line 230-232: The lookup of the default project using
client.projects.list and then indexing projects[0] is unsafe; modify the block
that assigns projects, project_id, and the print call to first check if the
returned list (projects) is non-empty and handle the empty case explicitly
(e.g., raise a clear RuntimeError/ValueError or print a user-friendly message
and exit), otherwise proceed to set project_id = projects[0]["_id"] and print
the selection; this change should touch the code around the client.projects.list
call, the projects variable, project_id assignment, and the print statement.
In `@other/materials_designer/workflows/band_structure.ipynb`:
- Around line 229-231: The code assumes client.projects.list(...) returns at
least one project and directly uses projects[0], which will raise IndexError if
empty; modify the block around client.projects.list(...) / projects / project_id
so it checks whether projects is non-empty (e.g., if not projects or
len(projects) == 0) and handle that case by either raising a clear error
message, prompting the user, or creating/selecting a fallback project before
accessing projects[0], then proceed to set project_id and call print only when a
valid project is available.
- Around line 514-524: The code may pass a None cluster into Compute because
cluster is set to next(...) or clusters[0] without validating a match; before
constructing Compute (where Compute(..., cluster=cluster) is called) add a
validation that cluster is not None (and that clusters is non-empty), and if
validation fails raise/throw a clear error or choose a safe fallback; update the
logic around CLUSTER_NAME, clusters, and the cluster variable to either select a
default cluster explicitly or raise an informative exception (including
CLUSTER_NAME and available hostnames) so Compute is never constructed with
cluster == None.
---
Nitpick comments:
In `@other/materials_designer/workflows/band_structure_hse.ipynb`:
- Around line 633-634: The call to get_properties_for_job is relying on its
defaults; update the call where band_structure_data is assigned to explicitly
pass the property selector (use PropertyName.non_scalar.band_structure.value) so
it matches the pattern in band_structure.ipynb and avoids coupling to default
behavior; locate the call to get_properties_for_job and add the property
argument, then leave visualize_properties(band_structure_data, title="Band
Structure (HSE)") unchanged.
In `@other/materials_designer/workflows/band_structure.ipynb`:
- Around line 401-424: The code assumes bs_subworkflow = workflow.subworkflows[1
if ADD_RELAXATION else 0], which is brittle; instead locate the subworkflow that
actually contains the target units (e.g. "pw_scf", "pw_bands", "pw_nscf") by
scanning workflow.subworkflows and using get_unit_by_name to find a matching
unit, and use that subworkflow for SCF/KPATH/NSCF edits; update all places that
use bs_subworkflow (and the initial assignment) to compute it dynamically (e.g.
choose the first subworkflow where get_unit_by_name(name="pw_scf") or "pw_bands"
or "pw_nscf" returns non-null) so changes won't silently no-op when templates
change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 249959c9-833e-41f7-9705-3bb3a24b65f1
📒 Files selected for processing (3)
other/materials_designer/workflows/Introduction.ipynbother/materials_designer/workflows/band_structure.ipynbother/materials_designer/workflows/band_structure_hse.ipynb
| "projects = client.projects.list({\"isDefault\": True, \"owner._id\": ACCOUNT_ID})\n", | ||
| "project_id = projects[0][\"_id\"]\n", | ||
| "print(f\"\\u2705 Using project: {projects[0]['name']} ({project_id})\")" |
There was a problem hiding this comment.
Guard the default-project lookup.
If client.projects.list(...) returns no default project for the selected account, projects[0] raises IndexError and the notebook aborts. Please handle the empty result explicitly.
Proposed fix
projects = client.projects.list({"isDefault": True, "owner._id": ACCOUNT_ID})
+if not projects:
+ raise RuntimeError(
+ f"No default project found for account {ACCOUNT_ID}. Create one first or select a project explicitly."
+ )
project_id = projects[0]["_id"]
print(f"\u2705 Using project: {projects[0]['name']} ({project_id})")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@other/materials_designer/workflows/band_structure_hse.ipynb` around lines 230
- 232, The lookup of the default project using client.projects.list and then
indexing projects[0] is unsafe; modify the block that assigns projects,
project_id, and the print call to first check if the returned list (projects) is
non-empty and handle the empty case explicitly (e.g., raise a clear
RuntimeError/ValueError or print a user-friendly message and exit), otherwise
proceed to set project_id = projects[0]["_id"] and print the selection; this
change should touch the code around the client.projects.list call, the projects
variable, project_id assignment, and the print statement.
| "if SCF_KGRID is not None:\n", | ||
| " unit = preliminary_scf_subworkflow.get_unit_by_name(name=\"pw_scf\")\n", | ||
| " if unit:\n", | ||
| " unit.add_context(PointsGridDataProvider(dimensions=SCF_KGRID, isEdited=True).yield_data())\n", | ||
| " preliminary_scf_subworkflow.set_unit(unit)\n", |
There was a problem hiding this comment.
Apply the user-specified cutoffs to pw_scf too.
This notebook updates the preliminary SCF through pw_scf, but the cutoff loop skips that same unit. As written, ECUTWFC and ECUTRHO only reach pw_scf_bands_hse and any relaxation step, so the preliminary SCF can still run with template defaults.
Proposed fix
if ECUTWFC is not None:
cutoffs_context = PlanewaveCutoffsContextProvider(wavefunction=ECUTWFC, density=ECUTRHO, isEdited=True).yield_data()
- for unit_name in ["pw_relax", "pw_vc-relax", "pw_scf_bands_hse"]:
+ for unit_name in ["pw_relax", "pw_vc-relax", "pw_scf", "pw_scf_bands_hse"]:
for swf in workflow.subworkflows:
unit = swf.get_unit_by_name(name=unit_name)
if unit:
unit.add_context(cutoffs_context)
swf.set_unit(unit)Also applies to: 439-446
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@other/materials_designer/workflows/band_structure_hse.ipynb` around lines 421
- 425, The preliminary SCF unit "pw_scf" is not receiving the user-specified
cutoffs (ECUTWFC/ECUTRHO) because the cutoff-application loop only targets
"pw_scf_bands_hse" and relaxation units; modify the logic that applies ECUTWFC
and ECUTRHO so it also locates and updates the unit returned by
preliminary_scf_subworkflow.get_unit_by_name(name="pw_scf") (same place you add
the SCF_KGRID) — add the PointsGridDataProvider-style context for cutoffs or the
existing cutoff data provider to that unit and call
preliminary_scf_subworkflow.set_unit(unit); repeat the same fix for the second
occurrence around the 439-446 region so both preliminary SCF instances receive
the user cutoffs.
| "if CLUSTER_NAME:\n", | ||
| " cluster = next((c for c in clusters if CLUSTER_NAME in c[\"hostname\"]), None)\n", | ||
| "else:\n", | ||
| " cluster = clusters[0]\n", | ||
| "\n", | ||
| "compute = Compute(\n", | ||
| " cluster=cluster,\n", | ||
| " queue=QUEUE_NAME,\n", | ||
| " ppn=PPN\n", | ||
| ")\n", | ||
| "print(f\"Using cluster: {compute.cluster.hostname}, queue: {QUEUE_NAME}, ppn: {PPN}\")" |
There was a problem hiding this comment.
Validate the cluster selection before constructing Compute.
When CLUSTER_NAME does not match any hostname, cluster becomes None, and the failure only shows up later. The empty-cluster case has the same problem.
Proposed fix
from mat3ra.ide.compute import Compute
+if not clusters:
+ raise RuntimeError("No clusters are available for the selected account.")
+
if CLUSTER_NAME:
cluster = next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None)
else:
cluster = clusters[0]
+
+if cluster is None:
+ raise RuntimeError(f"No cluster matched {CLUSTER_NAME!r}.")
compute = Compute(
cluster=cluster,📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "if CLUSTER_NAME:\n", | |
| " cluster = next((c for c in clusters if CLUSTER_NAME in c[\"hostname\"]), None)\n", | |
| "else:\n", | |
| " cluster = clusters[0]\n", | |
| "\n", | |
| "compute = Compute(\n", | |
| " cluster=cluster,\n", | |
| " queue=QUEUE_NAME,\n", | |
| " ppn=PPN\n", | |
| ")\n", | |
| "print(f\"Using cluster: {compute.cluster.hostname}, queue: {QUEUE_NAME}, ppn: {PPN}\")" | |
| "if not clusters:\n", | |
| " raise RuntimeError(\"No clusters are available for the selected account.\")\n", | |
| "\n", | |
| "if CLUSTER_NAME:\n", | |
| " cluster = next((c for c in clusters if CLUSTER_NAME in c[\"hostname\"]), None)\n", | |
| "else:\n", | |
| " cluster = clusters[0]\n", | |
| "\n", | |
| "if cluster is None:\n", | |
| " raise RuntimeError(f\"No cluster matched {CLUSTER_NAME!r}.\")\n", | |
| "\n", | |
| "compute = Compute(\n", | |
| " cluster=cluster,\n", | |
| " queue=QUEUE_NAME,\n", | |
| " ppn=PPN\n", | |
| ")\n", | |
| "print(f\"Using cluster: {compute.cluster.hostname}, queue: {QUEUE_NAME}, ppn: {PPN}\")" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@other/materials_designer/workflows/band_structure_hse.ipynb` around lines 527
- 537, Validate the selected cluster before constructing Compute: after
resolving cluster (the variable computed via CLUSTER_NAME lookup against
clusters), check that clusters is non-empty and that cluster is not None; if
clusters is empty raise/log a clear error and stop, and if CLUSTER_NAME did not
match any hostname either pick a sensible default (e.g., clusters[0]) only if
clusters is non-empty or else raise/log a descriptive error; ensure the error
message references CLUSTER_NAME and the clusters list and occurs before calling
Compute(...), so Compute(cluster=cluster, ...) never receives None.
| "projects = client.projects.list({\"isDefault\": True, \"owner._id\": ACCOUNT_ID})\n", | ||
| "project_id = projects[0][\"_id\"]\n", | ||
| "print(f\"\\u2705 Using project: {projects[0]['name']} ({project_id})\")" |
There was a problem hiding this comment.
Guard the default-project lookup.
If client.projects.list(...) returns no default project for the selected account, projects[0] raises IndexError and the notebook aborts. Please handle the empty result explicitly.
Proposed fix
projects = client.projects.list({"isDefault": True, "owner._id": ACCOUNT_ID})
+if not projects:
+ raise RuntimeError(
+ f"No default project found for account {ACCOUNT_ID}. Create one first or select a project explicitly."
+ )
project_id = projects[0]["_id"]
print(f"\u2705 Using project: {projects[0]['name']} ({project_id})")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@other/materials_designer/workflows/band_structure.ipynb` around lines 229 -
231, The code assumes client.projects.list(...) returns at least one project and
directly uses projects[0], which will raise IndexError if empty; modify the
block around client.projects.list(...) / projects / project_id so it checks
whether projects is non-empty (e.g., if not projects or len(projects) == 0) and
handle that case by either raising a clear error message, prompting the user, or
creating/selecting a fallback project before accessing projects[0], then proceed
to set project_id and call print only when a valid project is available.
| "if CLUSTER_NAME:\n", | ||
| " cluster = next((c for c in clusters if CLUSTER_NAME in c[\"hostname\"]), None)\n", | ||
| "else:\n", | ||
| " cluster = clusters[0]\n", | ||
| "\n", | ||
| "compute = Compute(\n", | ||
| " cluster=cluster,\n", | ||
| " queue=QUEUE_NAME,\n", | ||
| " ppn=PPN\n", | ||
| ")\n", | ||
| "print(f\"Using cluster: {compute.cluster.hostname}, queue: {QUEUE_NAME}, ppn: {PPN}\")" |
There was a problem hiding this comment.
Validate the cluster selection before constructing Compute.
When CLUSTER_NAME does not match any hostname, cluster becomes None, and the failure only shows up later. The empty-cluster case has the same problem.
Proposed fix
from mat3ra.ide.compute import Compute
+if not clusters:
+ raise RuntimeError("No clusters are available for the selected account.")
+
if CLUSTER_NAME:
cluster = next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None)
else:
cluster = clusters[0]
+
+if cluster is None:
+ raise RuntimeError(f"No cluster matched {CLUSTER_NAME!r}.")
compute = Compute(
cluster=cluster,🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@other/materials_designer/workflows/band_structure.ipynb` around lines 514 -
524, The code may pass a None cluster into Compute because cluster is set to
next(...) or clusters[0] without validating a match; before constructing Compute
(where Compute(..., cluster=cluster) is called) add a validation that cluster is
not None (and that clusters is non-empty), and if validation fails raise/throw a
clear error or choose a safe fallback; update the logic around CLUSTER_NAME,
clusters, and the cluster variable to either select a default cluster explicitly
or raise an informative exception (including CLUSTER_NAME and available
hostnames) so Compute is never constructed with cluster == None.
Summary by CodeRabbit
Release Notes