
Feature/SOF-7859 Update: specific example BS of Gr + N #273

Open
VsevolodX wants to merge 13 commits into main from
feature/SOF-7859

Conversation

@VsevolodX (Member) commented Mar 10, 2026

Summary by CodeRabbit

Release Notes

  • Documentation

    • Added a new example workflow demonstrating band structure calculation for nitrogen-doped graphene, including material setup, workflow configuration with debug and production modes, compute settings, job submission, monitoring, and results visualization.
  • Bug Fixes

    • Extended visualization support to display additional data types in property results.

@review-notebook-app

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@coderabbitai bot commented Mar 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

A Jupyter notebook demonstrating nitrogen-doped graphene band structure workflow is added. The API client is updated to clean workflow dictionaries during persistence, ensuring unit contexts are properly handled. Visualization functionality is extended to support additional data array formats.

Changes

  • Band Structure Workflow Example — other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb: New notebook providing an end-to-end workflow for the band structure calculation of nitrogen-doped graphene using Quantum ESPRESSO, featuring RUN_PROFILE modes (debug/production), OIDC authentication, material loading, workflow creation/modification, compute configuration, job submission, and result visualization.
  • Workflow Dictionary Handling — utils/api.py: Introduced _clean_workflow_dict() to sanitize unit-level context from workflows before persistence. Modified get_or_create_workflow() to return existing workflows directly and use cleaned dicts for new workflows. Updated create_job() to strip _id from embedded workflow payloads to preserve unit contexts.
  • Data Visualization Filter — utils/visualize.py: Extended the visualize_properties() data key filter to include "xDataArray" alongside "value" and "values", broadening support for result data formats.
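The broadened key filter could be sketched as follows. This is a minimal illustration only: the names DATA_KEYS and extract_data are hypothetical, and the real logic lives inside visualize_properties() in utils/visualize.py alongside the plotting code, which is not shown here.

```python
# Hypothetical sketch of the broadened data-key filter described above.
DATA_KEYS = ("value", "values", "xDataArray")

def extract_data(prop: dict):
    """Return the first payload found under a supported data key, else None."""
    for key in DATA_KEYS:
        if key in prop:
            return prop[key]
    return None
```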

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Feature/SOF-7843 TE NB #271: Directly modifies utils/api.py with overlapping workflow and job helper logic (get_or_create_workflow and create_job behavior changes).
  • SOF-7824 update: use properties API #269: Previously modified utils/visualize.py visualization filtering logic; this PR further extends the same filter with additional data format support.

Suggested reviewers

  • timurbazhirov

Poem

🐰 A graphene dream takes shape today,
With workflows clean, they find their way,
Quantum whispers through the band,
Nitrogen-doped, precisely planned,
Data dances in arrays new,
From debug mode to production's view!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

  • Title check — ❓ Inconclusive. The title 'Feature/SOF-7859 Update: specific example BS of Gr + N' only partially relates to the changeset. It references a specific example update using abbreviations (BS for band structure, Gr for graphene, N for nitrogen), but the main change adds a complete band structure workflow notebook with supporting API utility modifications. The title is overly abbreviated and doesn't clearly convey the primary contribution to a developer unfamiliar with the abbreviations. Resolution: consider a clearer title such as 'Add band structure workflow example for nitrogen-doped graphene' that conveys the main change without requiring knowledge of internal abbreviations or ticket numbers.

✅ Passed checks (2 passed)

  • Description Check — ✅ Passed. Check skipped: CodeRabbit's high-level summary is enabled.
  • Docstring Coverage — ✅ Passed. Docstring coverage is 100.00%, which meets the required threshold of 80.00%.




@coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (3)
other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb (3)

137-138: POLL_INTERVAL is defined but never used.

The variable is set to 60 seconds but not passed to wait_for_jobs_to_finish_async. If the intention is to configure the polling interval, you'll need to pass it to the function (if supported) or remove the unused variable.
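One way the interval could be wired through, sketched as a synchronous stand-in. wait_for_jobs and get_statuses here are hypothetical; the notebook's real call is the async wait_for_jobs_to_finish_async, whose actual signature would need to be checked before passing the parameter.

```python
import time

POLL_INTERVAL = 60  # seconds

def wait_for_jobs(get_statuses, poll_interval=POLL_INTERVAL, timeout=3600.0):
    """Poll job statuses until all reach a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        statuses = get_statuses()
        if all(s in ("finished", "error") for s in statuses):
            return statuses
        if time.monotonic() >= deadline:
            raise TimeoutError("jobs did not finish before the timeout")
        time.sleep(poll_interval)
```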

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`
around lines 137 - 138, POLL_INTERVAL is defined but never used — either pass it
into the wait_for_jobs_to_finish_async call or remove the variable; update the
invocation of wait_for_jobs_to_finish_async to accept the POLL_INTERVAL (e.g.,
pass POLL_INTERVAL as the polling interval/timeout parameter) if the function
supports it, or delete the unused POLL_INTERVAL definition and any related
comment to clean up dead code; look for the POLL_INTERVAL symbol and the
wait_for_jobs_to_finish_async call to apply the change.

618-624: Remove unused import.

PropertyName is imported but never used.

♻️ Proposed fix
-    "from mat3ra.prode import PropertyName\n",
     "from utils.visualize import visualize_properties\n",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`
around lines 618 - 624, Remove the unused import PropertyName from the top of
the notebook cell; locate the import line "from mat3ra.prode import
PropertyName" (near the imports used with client.properties.get_for_job and
visualize_properties) and delete it so only necessary imports remain (e.g., keep
"from utils.visualize import visualize_properties") to eliminate the
unused-symbol warning.

470-479: Remove unused variable.

workflow_id_or_dict is assigned but never used.

♻️ Proposed fix
     "from utils.generic import dict_to_namespace\n",
     "from utils.api import get_or_create_workflow\n",
     "\n",
-    "workflow_id_or_dict = None\n",
-    "\n",
     "saved_workflow_response = get_or_create_workflow(client, workflow, ACCOUNT_ID)\n",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`
around lines 470 - 479, Remove the unused local variable workflow_id_or_dict:
delete the assignment "workflow_id_or_dict = None" from the cell since it is
never read; if the intent was to pass an explicit id/dict into
get_or_create_workflow, instead pass that variable into
get_or_create_workflow(client, workflow, ACCOUNT_ID) or rename/reuse it
appropriately, otherwise simply remove the variable to avoid the
unused-assignment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`:
- Line 28: Fix the typo in the notebook cell string "1. Retrieve results: get
and display the properies." by changing "properies" to "properties" so the line
reads "1. Retrieve results: get and display the properties."; locate the exact
text in the band_structure_graphene.ipynb cell content and update it
accordingly.
- Around line 157-165: Remove the hardcoded OIDC_ACCESS_TOKEN assignment in the
notebook cell (the os.environ["OIDC_ACCESS_TOKEN"] line) and any other hardcoded
credentials (API_HOST/API_PORT/API_SECURE if they are sensitive); instead update
the cell to rely on externally set environment variables or the existing OIDC
flow in the next cell by deleting the literal token assignment and adding a
brief comment instructing developers to set OIDC_ACCESS_TOKEN (and other vars)
via their shell or a .env/local dev script before running the notebook.
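The hardcoded-token fix from the 157-165 comment above could look like this sketch. It assumes the notebook reads OIDC_ACCESS_TOKEN from the environment; the require_env helper is hypothetical and not part of the repository's utils.

```python
import os

def require_env(name: str) -> str:
    """Fetch a required environment variable, failing fast with guidance."""
    value = os.environ.get(name)
    if not value:
        raise EnvironmentError(
            f"Set {name} in your shell or a local .env script before running this notebook"
        )
    return value

# Instead of hardcoding credentials in the cell, e.g.:
# token = require_env("OIDC_ACCESS_TOKEN")
```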


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8964867e-d2ec-41d5-88c3-524d90c5f0ab

📥 Commits

Reviewing files that changed from the base of the PR and between 0f3c1cf and e991ffe.

📒 Files selected for processing (1)
  • other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb

Comment on lines +249 to +253
"source": [
"projects = client.projects.list({\"isDefault\": True, \"owner._id\": ACCOUNT_ID})\n",
"project_id = projects[0][\"_id\"]\n",
"print(f\"✅ Using project: {projects[0]['name']} ({project_id})\")"
]

⚠️ Potential issue | 🟡 Minor

Add guard for empty projects list.

If no default project exists for the account, accessing projects[0] will raise an IndexError.

🛡️ Proposed fix
     "source": [
      "projects = client.projects.list({\"isDefault\": True, \"owner._id\": ACCOUNT_ID})\n",
+     "if not projects:\n",
+     "    raise ValueError(f\"No default project found for account {ACCOUNT_ID}\")\n",
      "project_id = projects[0][\"_id\"]\n",
      "print(f\"✅ Using project: {projects[0]['name']} ({project_id})\")"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`
around lines 249 - 253, The code assumes client.projects.list(...) returns a
non-empty list and directly accesses projects[0], which will raise IndexError if
no default project exists; update the logic around projects, project_id and the
print line to check if projects is truthy/has length > 0 before accessing
projects[0], and handle the empty-case by either raising a clear error or
creating/selecting a fallback project (e.g., log a helpful message with
ACCOUNT_ID and abort or call the project-creation flow), ensuring all references
to projects[0] (used to set project_id and in the print) are only executed after
the presence check.

Comment on lines +515 to +530
"source": [
"from mat3ra.ide.compute import Compute\n",
"\n",
"# Select cluster: use specified name if provided, otherwise use first available\n",
"if CLUSTER_NAME:\n",
" cluster = next((c for c in clusters if CLUSTER_NAME in c[\"hostname\"]), None)\n",
"else:\n",
" cluster = clusters[0]\n",
"\n",
"compute = Compute(\n",
" cluster=cluster,\n",
" queue=QUEUE_NAME,\n",
" ppn=PPN\n",
")\n",
"print(f\"Using cluster: {compute.cluster.hostname}, queue: {QUEUE_NAME}, ppn: {PPN}\")"
]

⚠️ Potential issue | 🟠 Major

Handle case when specified cluster is not found.

If CLUSTER_NAME is set but doesn't match any cluster hostname, cluster will be None. This will cause an AttributeError on line 529 when accessing compute.cluster.hostname, or potentially earlier in the Compute constructor.

🛡️ Proposed fix
     "# Select cluster: use specified name if provided, otherwise use first available\n",
     "if CLUSTER_NAME:\n",
     "    cluster = next((c for c in clusters if CLUSTER_NAME in c[\"hostname\"]), None)\n",
+    "    if cluster is None:\n",
+    "        raise ValueError(f\"No cluster matching '{CLUSTER_NAME}' found. Available: {[c['hostname'] for c in clusters]}\")\n",
     "else:\n",
     "    cluster = clusters[0]\n",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`
around lines 515 - 530, When CLUSTER_NAME is set but no matching entry exists,
cluster becomes None which will later cause an AttributeError; update the
selection logic around CLUSTER_NAME/clusters to handle the not-found case: after
attempting next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None)
check if cluster is None and either raise a clear ValueError mentioning
CLUSTER_NAME and available cluster hostnames or fall back to a safe default
(e.g., clusters[0]) before constructing Compute; ensure this validation happens
prior to calling Compute(...) and before printing compute.cluster.hostname.

@coderabbitai bot left a comment

🧹 Nitpick comments (2)
other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb (1)

102-107: Consider using ASCII representation for gamma point.

The k-path uses Cyrillic "Г" (U+0413) for the gamma point. While this may display correctly, it could cause encoding issues in some environments or confusion when searching/comparing strings. Consider using "Gamma" or "G" if the visualization library supports it.
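A small normalization pass along the suggested lines. normalize_kpath_labels is a hypothetical helper, and whether the visualization library expects "Gamma" or "G" should be verified before adopting either label.

```python
# "\u0413" is the Cyrillic capital letter "Г" flagged in the comment above.
def normalize_kpath_labels(kpath, gamma_label="Gamma"):
    """Return a copy of the k-path with Cyrillic gamma points relabeled in ASCII."""
    return [
        {**point, "point": gamma_label if point["point"] == "\u0413" else point["point"]}
        for point in kpath
    ]
```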

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`
around lines 102 - 107, The KPATH definition uses the Cyrillic character "Г" for
the gamma point which can cause encoding/search issues; update the KPATH list
(variable name KPATH) to replace "{\"point\": \"Г\", \"steps\": 6}" with an
ASCII label such as "{\"point\": \"Gamma\", \"steps\": 6}" or "{\"point\":
\"G\", \"steps\": 6}" (whichever the visualization library expects) and verify
any plotting/search code that consumes KPATH recognizes the new label.
utils/api.py (1)

153-161: Defensive mutation concern depends on external library implementation.

This function relies on mat3ra.wode.Workflow.to_dict() behavior. If it returns shallow references to nested structures, modifying unit["context"] could mutate the original workflow object. However, standard library design patterns typically create new dict objects rather than shallow references. Since the original workflow is not reused after this function returns, the practical risk is minimal.

If defensive programming is preferred or if there are known issues with mat3ra.wode, consider using copy.deepcopy() before modifying. Otherwise, this is likely safe as-is.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@utils/api.py` around lines 153 - 161, The current _clean_workflow_dict
modifies the dict returned from Workflow.to_dict() which may contain shallow
references and could mutate the original Workflow; to fix, make a deep copy of
the result (e.g., using copy.deepcopy) into a new variable before iterating and
clearing unit["context"], then return that deep-copied dict—update references in
_clean_workflow_dict and document that we use a defensive deepcopy of
Workflow.to_dict() to avoid mutating the source.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b92cc0e9-08c8-49f4-84e8-2d66146aab32

📥 Commits

Reviewing files that changed from the base of the PR and between e991ffe and 8fddfed.

📒 Files selected for processing (2)
  • other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb
  • utils/api.py

@coderabbitai bot left a comment

♻️ Duplicate comments (3)
other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb (3)

513-524: ⚠️ Potential issue | 🟠 Major

Validate cluster selection before building Compute.

This cell still assumes a usable cluster is always available. If clusters is empty, or if CLUSTER_NAME matches nothing, Compute(...) / the print on Line 524 will fail with a less helpful runtime error.

🛡️ Suggested fix
 from mat3ra.ide.compute import Compute
 
 # Select cluster: use specified name if provided, otherwise use first available
+if not clusters:
+    raise ValueError("No clusters available for the current account")
+
 if CLUSTER_NAME:
     cluster = next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None)
+    if cluster is None:
+        raise ValueError(
+            f"No cluster matching '{CLUSTER_NAME}' found. Available: {[c['hostname'] for c in clusters]}"
+        )
 else:
     cluster = clusters[0]
 
 compute = Compute(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`
around lines 513 - 524, The code assumes a valid cluster is selected before
constructing Compute; validate that clusters is non-empty and that the selection
logic yields a non-None cluster (the variables/functions to look at are
clusters, CLUSTER_NAME, cluster, Compute, and compute). If clusters is empty, or
CLUSTER_NAME doesn't match any hostname, raise or log a clear error and abort
(or choose a safe fallback) instead of passing None into Compute; ensure the
print that accesses compute.cluster.hostname only runs after a successful
selection and handle/report the failure with a descriptive message.

238-240: ⚠️ Potential issue | 🟠 Major

Guard the default-project lookup before indexing.

This still assumes client.projects.list(...) returns at least one project. If it returns an empty list, the notebook fails on Line 239 with IndexError.

🛡️ Suggested fix
 projects = client.projects.list({"isDefault": True, "owner._id": ACCOUNT_ID})
+if not projects:
+    raise ValueError(f"No default project found for account {ACCOUNT_ID}")
 project_id = projects[0]["_id"]
 print(f"✅ Using project: {projects[0]['name']} ({project_id})")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`
around lines 238 - 240, The notebook assumes client.projects.list(...) returns
at least one entry and then indexes projects[0]; guard this by checking the
returned list before indexing: after calling client.projects.list(...) (the
projects variable) verify if projects is non-empty and if empty either raise a
clear error message or create/select a fallback project, otherwise proceed to
set project_id = projects[0]["_id"] and print the chosen project; update the
lookup around the projects variable and project_id assignment to avoid
IndexError when projects == [].

32-32: ⚠️ Potential issue | 🟡 Minor

Fix the typo in the results summary.

properies should be properties.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`
at line 32, Update the notebook cell text to correct the typo in the results
summary: change "Retrieve results: get and display the properies." to "Retrieve
results: get and display the properties." so the cell content in the
band_structure_graphene notebook reflects the correct word.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 915eca59-2d25-4c55-8044-e3a3b9463fd6

📥 Commits

Reviewing files that changed from the base of the PR and between 8fddfed and 6810376.

📒 Files selected for processing (2)
  • other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb
  • utils/visualize.py

@coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (2)
other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb (2)

240-242: ⚠️ Potential issue | 🟡 Minor

Guard projects[0] before indexing.

Accounts without a default project will raise IndexError here.

🛡️ Proposed fix
 projects = client.projects.list({"isDefault": True, "owner._id": ACCOUNT_ID})
+if not projects:
+    raise ValueError(f"No default project found for account {ACCOUNT_ID}")
 project_id = projects[0]["_id"]
 print(f"✅ Using project: {projects[0]['name']} ({project_id})")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`
around lines 240 - 242, The code assumes client.projects.list(...) returns a
non-empty list and directly indexes projects[0], which causes IndexError if no
default project exists; update the logic around projects, project_id and the
print statement to check for an empty list (e.g., if not projects: handle the
missing-default case by raising a clear error or creating/selecting a fallback
project) before accessing projects[0], and only set project_id =
projects[0]["_id"] and print the success message when a project is present.

494-496: ⚠️ Potential issue | 🟠 Major

Validate cluster selection before constructing Compute.

This still assumes clusters is non-empty and that CLUSTER_NAME matches at least one hostname. Otherwise this path fails with IndexError or AttributeError.

🛡️ Proposed fix
 clusters = client.clusters.list()
+if not clusters:
+    raise ValueError("No clusters are available for the authenticated account")
 print(f"Available clusters: {[c['hostname'] for c in clusters]}")
 
 # Select cluster: use specified name if provided, otherwise use first available
 if CLUSTER_NAME:
     cluster = next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None)
+    if cluster is None:
+        raise ValueError(
+            f"No cluster matching '{CLUSTER_NAME}' found. Available: {[c['hostname'] for c in clusters]}"
+        )
 else:
     cluster = clusters[0]

Also applies to: 515-526

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`
around lines 494 - 496, The notebook assumes client.clusters.list() returns a
non-empty list and that CLUSTER_NAME matches a hostname before building the
Compute object, which can raise IndexError/AttributeError; update the logic
around clusters = client.clusters.list() and the Compute construction (the code
that selects cluster by CLUSTER_NAME and uses clusters[0] or similar) to first
check for an empty list and raise/handle a clear error if no clusters are
available, then validate that a cluster with hostname == CLUSTER_NAME exists (or
fall back to a default selection) and raise/handle a descriptive error if not
found; ensure the selection code uses safe lookups instead of direct indexing so
Compute is only constructed with a valid cluster object.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`:
- Around line 470-475: The printed saved_workflow.id is from the template
returned by get_or_create_workflow (which strips unit contexts), not the actual
workflow you submit; fix by persisting and using the same workflow object you
submit: call get_or_create_workflow(client, workflow, ACCOUNT_ID) after you
finish editing k-grid/k-path/cutoff so the returned
saved_workflow_response/saved_workflow reflects the submitted workflow and then
submit using that saved_workflow (or, alternatively, after submit capture and
print the workflow id from the job submission response); reference
get_or_create_workflow, saved_workflow_response, saved_workflow, and the
in-memory workflow so the printed ID matches the submitted job.
- Around line 553-567: The code treats job_response from utils.api.create_job()
as a single dict but create_job returns a List[dict]; update the logic that uses
create_job (the variables job_response, dict_to_namespace, and job_id) to handle
a list: check that job_response is non-empty, pick the intended job dict (e.g.,
job_response[0] or iterate if multiple), convert that dict with
dict_to_namespace, and then read its _id into job_id; also add an error path for
an empty list and a short log message so the cell doesn’t assume a single dict.
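The list-handling fix for the 553-567 comment could be sketched as follows. first_job_id is a hypothetical helper, and dict_to_namespace is stubbed with SimpleNamespace under the assumption that utils.generic.dict_to_namespace behaves similarly.

```python
from types import SimpleNamespace

def dict_to_namespace(d: dict) -> SimpleNamespace:
    # Minimal stand-in for utils.generic.dict_to_namespace (assumption).
    return SimpleNamespace(**d)

def first_job_id(job_responses: list) -> str:
    """create_job() returns a List[dict]; pick the first entry safely."""
    if not job_responses:
        raise RuntimeError("create_job returned no jobs")
    job = dict_to_namespace(job_responses[0])
    return job._id
```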

---

Duplicate comments:
In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`:
- Around line 240-242: The code assumes client.projects.list(...) returns a
non-empty list and directly indexes projects[0], which causes IndexError if no
default project exists; update the logic around projects, project_id and the
print statement to check for an empty list (e.g., if not projects: handle the
missing-default case by raising a clear error or creating/selecting a fallback
project) before accessing projects[0], and only set project_id =
projects[0]["_id"] and print the success message when a project is present.
- Around line 494-496: The notebook assumes client.clusters.list() returns a
non-empty list and that CLUSTER_NAME matches a hostname before building the
Compute object, which can raise IndexError/AttributeError; update the logic
around clusters = client.clusters.list() and the Compute construction (the code
that selects cluster by CLUSTER_NAME and uses clusters[0] or similar) to first
check for an empty list and raise/handle a clear error if no clusters are
available, then validate that a cluster with hostname == CLUSTER_NAME exists (or
fall back to a default selection) and raise/handle a descriptive error if not
found; ensure the selection code uses safe lookups instead of direct indexing so
Compute is only constructed with a valid cluster object.
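
The `projects[0]` guard from the first bullet can be sketched like this; the dict shape with an `_id` key follows the review text, and the error message is illustrative:

```python
# Guarded default-project lookup, per the review guidance above.
def select_default_project(projects):
    if not projects:
        raise RuntimeError(
            "No default project found; create or select a project before continuing"
        )
    return projects[0]["_id"]

# Stand-in data; real code would call client.projects.list(...) first.
projects = [{"_id": "proj-123", "name": "Default"}]
project_id = select_default_project(projects)
print(f"Using project: {project_id}")  # → Using project: proj-123
```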

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 30284172-9c05-4e42-933e-49b5f0748f5c

📥 Commits

Reviewing files that changed from the base of the PR and between 0f02f6d and 8131d25.

📒 Files selected for processing (1)
  • other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb

Comment on lines +470 to +475
"from utils.generic import dict_to_namespace\n",
"from utils.api import get_or_create_workflow\n",
"\n",
"saved_workflow_response = get_or_create_workflow(client, workflow, ACCOUNT_ID)\n",
"saved_workflow = Workflow.create(saved_workflow_response)\n",
"print(f\"Workflow ID: {saved_workflow.id}\")"

⚠️ Potential issue | 🟠 Major

The saved workflow ID is not the workflow you actually submit.

get_or_create_workflow() strips unit contexts before persistence, but Line 556 submits the in-memory workflow with the edited k-grid, k-path, and cutoff settings intact. The printed saved_workflow.id is therefore a cleaned base template, not the workflow that produced the job, which makes the notebook’s traceability misleading.

🧭 Suggested clarification
-saved_workflow_response = get_or_create_workflow(client, workflow, ACCOUNT_ID)
-saved_workflow = Workflow.create(saved_workflow_response)
-print(f"Workflow ID: {saved_workflow.id}")
+base_workflow_response = get_or_create_workflow(client, workflow, ACCOUNT_ID)
+base_workflow = Workflow.create(base_workflow_response)
+print(f"Base workflow ID (without job-specific context): {base_workflow.id}")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"from utils.generic import dict_to_namespace\n",
"from utils.api import get_or_create_workflow\n",
"\n",
"saved_workflow_response = get_or_create_workflow(client, workflow, ACCOUNT_ID)\n",
"saved_workflow = Workflow.create(saved_workflow_response)\n",
"print(f\"Workflow ID: {saved_workflow.id}\")"
"from utils.generic import dict_to_namespace\n",
"from utils.api import get_or_create_workflow\n",
"\n",
"base_workflow_response = get_or_create_workflow(client, workflow, ACCOUNT_ID)\n",
"base_workflow = Workflow.create(base_workflow_response)\n",
"print(f\"Base workflow ID (without job-specific context): {base_workflow.id}\")"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`
around lines 470 - 475, The printed saved_workflow.id is from the template
returned by get_or_create_workflow (which strips unit contexts), not the actual
workflow you submit; fix by persisting and using the same workflow object you
submit: call get_or_create_workflow(client, workflow, ACCOUNT_ID) after you
finish editing k-grid/k-path/cutoff so the returned
saved_workflow_response/saved_workflow reflects the submitted workflow and then
submit using that saved_workflow (or, alternatively, after submit capture and
print the workflow id from the job submission response); reference
get_or_create_workflow, saved_workflow_response, saved_workflow, and the
in-memory workflow so the printed ID matches the submitted job.

Comment on lines +553 to +567
"job_response = create_job(\n",
" api_client=client,\n",
" materials=[saved_material],\n",
" workflow=workflow,\n",
" project_id=project_id,\n",
" owner_id=ACCOUNT_ID,\n",
" prefix=job_name,\n",
" compute=compute.to_dict(),\n",
")\n",
"\n",
"job = dict_to_namespace(job_response)\n",
"job_id = job._id\n",
"print(\"✅ Job created successfully!\")\n",
"print(f\"Job ID: {job_id}\")\n",
"display_JSON(job_response)"

⚠️ Potential issue | 🔴 Critical

Treat create_job() as a list-returning helper.

utils.api.create_job() returns List[dict], but this cell immediately treats job_response as a single job dict. dict_to_namespace(job_response) and job._id will break once the helper returns its documented list payload.

🧩 Proposed fix
-job_response = create_job(
+job_responses = create_job(
     api_client=client,
     materials=[saved_material],
     workflow=workflow,
     project_id=project_id,
     owner_id=ACCOUNT_ID,
     prefix=job_name,
     compute=compute.to_dict(),
 )
 
-job = dict_to_namespace(job_response)
+if not job_responses:
+    raise RuntimeError("No jobs were created")
+
+job_response = job_responses[0]
+job = dict_to_namespace(job_response)
 job_id = job._id
 print("✅ Job created successfully!")
 print(f"Job ID: {job_id}")
 display_JSON(job_response)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`
around lines 553 - 567, The code treats job_response from utils.api.create_job()
as a single dict but create_job returns a List[dict]; update the logic that uses
create_job (the variables job_response, dict_to_namespace, and job_id) to handle
a list: check that job_response is non-empty, pick the intended job dict (e.g.,
job_response[0] or iterate if multiple), convert that dict with
dict_to_namespace, and then read its _id into job_id; also add an error path for
an empty list and a short log message so the cell doesn’t assume a single dict.

@@ -0,0 +1,651 @@
{
Member

@timurbazhirov timurbazhirov Mar 12, 2026

We should use ECUTWFC of 40 per the GBRV guidelines


Reply via ReviewNB
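
For context, a Quantum ESPRESSO `&SYSTEM` namelist following that guideline would contain the fragment below (illustrative only; the notebook's generated input is the source of truth, and the `ecutrho` value is the charge-density cutoff commonly paired with GBRV ultrasoft pseudopotentials):

```
&SYSTEM
    ecutwfc = 40   ! wavefunction cutoff (Ry), per GBRV guidelines
    ecutrho = 200  ! charge-density cutoff (Ry) commonly used with GBRV
/
```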

@@ -0,0 +1,651 @@
{
Member

@timurbazhirov timurbazhirov Mar 12, 2026

Line #30.    bands_unit.input = [{"name": "bands.in", "content": (

We can comment it out and explain that this is the "last-resort" measure


Reply via ReviewNB

utils/api.py Outdated
for swf in workflow_dict.get("subworkflows", []):
    for unit in swf.get("units", []):
        unit["context"] = {}
return workflow_dict
Member

Should be inside Workflow

Member

As part of the Workflow class
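
A sketch of that refactor — moving the context-cleaning loop from `utils/api.py` into the `Workflow` class; the constructor and `to_dict` here are minimal stand-ins, not the real class:

```python
class Workflow:
    """Minimal stand-in; the real Workflow class has many more members."""

    def __init__(self, data):
        self._data = data

    def to_dict(self):
        return self._data

    def clean_unit_contexts(self):
        # Same loop as the utils/api.py helper, now owned by the class.
        workflow_dict = self.to_dict()
        for swf in workflow_dict.get("subworkflows", []):
            for unit in swf.get("units", []):
                unit["context"] = {}
        return workflow_dict

wf = Workflow({"subworkflows": [{"units": [{"context": {"kgrid": [6, 6, 1]}}]}]})
print(wf.clean_unit_contexts()["subworkflows"][0]["units"][0]["context"])  # → {}
```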

@VsevolodX VsevolodX changed the title update: 1st implementation Feature/SOF-7859 Update: specific example BS of Gr + N Mar 12, 2026
