Feature/SOF-7859 Update: specific example BS of Gr + N#273
Feature/SOF-7859 Update: specific example BS of Gr + N#273
Conversation
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughA Jupyter notebook demonstrating nitrogen-doped graphene band structure workflow is added. The API client is updated to clean workflow dictionaries during persistence, ensuring unit contexts are properly handled. Visualization functionality is extended to support additional data array formats. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (3)
other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb (3)
137-138:POLL_INTERVALis defined but never used.The variable is set to 60 seconds but not passed to
wait_for_jobs_to_finish_async. If the intention is to configure the polling interval, you'll need to pass it to the function (if supported) or remove the unused variable.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb` around lines 137 - 138, POLL_INTERVAL is defined but never used — either pass it into the wait_for_jobs_to_finish_async call or remove the variable; update the invocation of wait_for_jobs_to_finish_async to accept the POLL_INTERVAL (e.g., pass POLL_INTERVAL as the polling interval/timeout parameter) if the function supports it, or delete the unused POLL_INTERVAL definition and any related comment to clean up dead code; look for the POLL_INTERVAL symbol and the wait_for_jobs_to_finish_async call to apply the change.
618-624: Remove unused import.
PropertyNameis imported but never used.♻️ Proposed fix
- "from mat3ra.prode import PropertyName\n", "from utils.visualize import visualize_properties\n",🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb` around lines 618 - 624, Remove the unused import PropertyName from the top of the notebook cell; locate the import line "from mat3ra.prode import PropertyName" (near the imports used with client.properties.get_for_job and visualize_properties) and delete it so only necessary imports remain (e.g., keep "from utils.visualize import visualize_properties") to eliminate the unused-symbol warning.
470-479: Remove unused variable.
workflow_id_or_dictis assigned but never used.♻️ Proposed fix
"from utils.generic import dict_to_namespace\n", "from utils.api import get_or_create_workflow\n", "\n", - "workflow_id_or_dict = None\n", - "\n", "saved_workflow_response = get_or_create_workflow(client, workflow, ACCOUNT_ID)\n",🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb` around lines 470 - 479, Remove the unused local variable workflow_id_or_dict: delete the assignment "workflow_id_or_dict = None" from the cell since it is never read; if the intent was to pass an explicit id/dict into get_or_create_workflow, instead pass that variable into get_or_create_workflow(client, workflow, ACCOUNT_ID) or rename/reuse it appropriately, otherwise simply remove the variable to avoid the unused-assignment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`:
- Line 28: Fix the typo in the notebook cell string "1. Retrieve results: get
and display the properies." by changing "properies" to "properties" so the line
reads "1. Retrieve results: get and display the properties."; locate the exact
text in the band_structure_graphene.ipynb cell content and update it
accordingly.
- Around line 249-253: The code assumes client.projects.list(...) returns a
non-empty list and directly accesses projects[0], which will raise IndexError if
no default project exists; update the logic around projects, project_id and the
print line to check if projects is truthy/has length > 0 before accessing
projects[0], and handle the empty-case by either raising a clear error or
creating/selecting a fallback project (e.g., log a helpful message with
ACCOUNT_ID and abort or call the project-creation flow), ensuring all references
to projects[0] (used to set project_id and in the print) are only executed after
the presence check.
- Around line 157-165: Remove the hardcoded OIDC_ACCESS_TOKEN assignment in the
notebook cell (the os.environ["OIDC_ACCESS_TOKEN"] line) and any other hardcoded
credentials (API_HOST/API_PORT/API_SECURE if they are sensitive); instead update
the cell to rely on externally set environment variables or the existing OIDC
flow in the next cell by deleting the literal token assignment and adding a
brief comment instructing developers to set OIDC_ACCESS_TOKEN (and other vars)
via their shell or a .env/local dev script before running the notebook.
- Around line 515-530: When CLUSTER_NAME is set but no matching entry exists,
cluster becomes None which will later cause an AttributeError; update the
selection logic around CLUSTER_NAME/clusters to handle the not-found case: after
attempting next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None)
check if cluster is None and either raise a clear ValueError mentioning
CLUSTER_NAME and available cluster hostnames or fall back to a safe default
(e.g., clusters[0]) before constructing Compute; ensure this validation happens
prior to calling Compute(...) and before printing compute.cluster.hostname.
---
Nitpick comments:
In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`:
- Around line 137-138: POLL_INTERVAL is defined but never used — either pass it
into the wait_for_jobs_to_finish_async call or remove the variable; update the
invocation of wait_for_jobs_to_finish_async to accept the POLL_INTERVAL (e.g.,
pass POLL_INTERVAL as the polling interval/timeout parameter) if the function
supports it, or delete the unused POLL_INTERVAL definition and any related
comment to clean up dead code; look for the POLL_INTERVAL symbol and the
wait_for_jobs_to_finish_async call to apply the change.
- Around line 618-624: Remove the unused import PropertyName from the top of the
notebook cell; locate the import line "from mat3ra.prode import PropertyName"
(near the imports used with client.properties.get_for_job and
visualize_properties) and delete it so only necessary imports remain (e.g., keep
"from utils.visualize import visualize_properties") to eliminate the
unused-symbol warning.
- Around line 470-479: Remove the unused local variable workflow_id_or_dict:
delete the assignment "workflow_id_or_dict = None" from the cell since it is
never read; if the intent was to pass an explicit id/dict into
get_or_create_workflow, instead pass that variable into
get_or_create_workflow(client, workflow, ACCOUNT_ID) or rename/reuse it
appropriately, otherwise simply remove the variable to avoid the
unused-assignment.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8964867e-d2ec-41d5-88c3-524d90c5f0ab
📒 Files selected for processing (1)
other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb
other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb
Outdated
Show resolved
Hide resolved
other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb
Outdated
Show resolved
Hide resolved
| "source": [ | ||
| "projects = client.projects.list({\"isDefault\": True, \"owner._id\": ACCOUNT_ID})\n", | ||
| "project_id = projects[0][\"_id\"]\n", | ||
| "print(f\"✅ Using project: {projects[0]['name']} ({project_id})\")" | ||
| ] |
There was a problem hiding this comment.
Add guard for empty projects list.
If no default project exists for the account, accessing projects[0] will raise an IndexError.
🛡️ Proposed fix
"source": [
"projects = client.projects.list({\"isDefault\": True, \"owner._id\": ACCOUNT_ID})\n",
+ "if not projects:\n",
+ " raise ValueError(f\"No default project found for account {ACCOUNT_ID}\")\n",
"project_id = projects[0][\"_id\"]\n",
"print(f\"✅ Using project: {projects[0]['name']} ({project_id})\")"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "source": [ | |
| "projects = client.projects.list({\"isDefault\": True, \"owner._id\": ACCOUNT_ID})\n", | |
| "project_id = projects[0][\"_id\"]\n", | |
| "print(f\"✅ Using project: {projects[0]['name']} ({project_id})\")" | |
| ] | |
| "source": [ | |
| "projects = client.projects.list({\"isDefault\": True, \"owner._id\": ACCOUNT_ID})\n", | |
| "if not projects:\n", | |
| " raise ValueError(f\"No default project found for account {ACCOUNT_ID}\")\n", | |
| "project_id = projects[0][\"_id\"]\n", | |
| "print(f\"✅ Using project: {projects[0]['name']} ({project_id})\")" | |
| ] |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`
around lines 249 - 253, The code assumes client.projects.list(...) returns a
non-empty list and directly accesses projects[0], which will raise IndexError if
no default project exists; update the logic around projects, project_id and the
print line to check if projects is truthy/has length > 0 before accessing
projects[0], and handle the empty-case by either raising a clear error or
creating/selecting a fallback project (e.g., log a helpful message with
ACCOUNT_ID and abort or call the project-creation flow), ensuring all references
to projects[0] (used to set project_id and in the print) are only executed after
the presence check.
| "source": [ | ||
| "from mat3ra.ide.compute import Compute\n", | ||
| "\n", | ||
| "# Select cluster: use specified name if provided, otherwise use first available\n", | ||
| "if CLUSTER_NAME:\n", | ||
| " cluster = next((c for c in clusters if CLUSTER_NAME in c[\"hostname\"]), None)\n", | ||
| "else:\n", | ||
| " cluster = clusters[0]\n", | ||
| "\n", | ||
| "compute = Compute(\n", | ||
| " cluster=cluster,\n", | ||
| " queue=QUEUE_NAME,\n", | ||
| " ppn=PPN\n", | ||
| ")\n", | ||
| "print(f\"Using cluster: {compute.cluster.hostname}, queue: {QUEUE_NAME}, ppn: {PPN}\")" | ||
| ] |
There was a problem hiding this comment.
Handle case when specified cluster is not found.
If CLUSTER_NAME is set but doesn't match any cluster hostname, cluster will be None. This will cause an AttributeError on line 529 when accessing compute.cluster.hostname, or potentially earlier in the Compute constructor.
🛡️ Proposed fix
"# Select cluster: use specified name if provided, otherwise use first available\n",
"if CLUSTER_NAME:\n",
" cluster = next((c for c in clusters if CLUSTER_NAME in c[\"hostname\"]), None)\n",
+ " if cluster is None:\n",
+ " raise ValueError(f\"No cluster matching '{CLUSTER_NAME}' found. Available: {[c['hostname'] for c in clusters]}\")\n",
"else:\n",
" cluster = clusters[0]\n",🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`
around lines 515 - 530, When CLUSTER_NAME is set but no matching entry exists,
cluster becomes None which will later cause an AttributeError; update the
selection logic around CLUSTER_NAME/clusters to handle the not-found case: after
attempting next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None)
check if cluster is None and either raise a clear ValueError mentioning
CLUSTER_NAME and available cluster hostnames or fall back to a safe default
(e.g., clusters[0]) before constructing Compute; ensure this validation happens
prior to calling Compute(...) and before printing compute.cluster.hostname.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb (1)
102-107: Consider using ASCII representation for gamma point.The k-path uses Cyrillic "Г" (U+0413) for the gamma point. While this may display correctly, it could cause encoding issues in some environments or confusion when searching/comparing strings. Consider using "Gamma" or "G" if the visualization library supports it.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb` around lines 102 - 107, The KPATH definition uses the Cyrillic character "Г" for the gamma point which can cause encoding/search issues; update the KPATH list (variable name KPATH) to replace "{\"point\": \"Г\", \"steps\": 6}" with an ASCII label such as "{\"point\": \"Gamma\", \"steps\": 6}" or "{\"point\": \"G\", \"steps\": 6}" (whichever the visualization library expects) and verify any plotting/search code that consumes KPATH recognizes the new label.utils/api.py (1)
153-161: Defensive mutation concern depends on external library implementation.This function relies on
mat3ra.wode.Workflow.to_dict()behavior. If it returns shallow references to nested structures, modifyingunit["context"]could mutate the original workflow object. However, standard library design patterns typically create new dict objects rather than shallow references. Since the original workflow is not reused after this function returns, the practical risk is minimal.If defensive programming is preferred or if there are known issues with
mat3ra.wode, consider usingcopy.deepcopy()before modifying. Otherwise, this is likely safe as-is.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@utils/api.py` around lines 153 - 161, The current _clean_workflow_dict modifies the dict returned from Workflow.to_dict() which may contain shallow references and could mutate the original Workflow; to fix, make a deep copy of the result (e.g., using copy.deepcopy) into a new variable before iterating and clearing unit["context"], then return that deep-copied dict—update references in _clean_workflow_dict and document that we use a defensive deepcopy of Workflow.to_dict() to avoid mutating the source.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`:
- Around line 102-107: The KPATH definition uses the Cyrillic character "Г" for
the gamma point which can cause encoding/search issues; update the KPATH list
(variable name KPATH) to replace "{\"point\": \"Г\", \"steps\": 6}" with an
ASCII label such as "{\"point\": \"Gamma\", \"steps\": 6}" or "{\"point\":
\"G\", \"steps\": 6}" (whichever the visualization library expects) and verify
any plotting/search code that consumes KPATH recognizes the new label.
In `@utils/api.py`:
- Around line 153-161: The current _clean_workflow_dict modifies the dict
returned from Workflow.to_dict() which may contain shallow references and could
mutate the original Workflow; to fix, make a deep copy of the result (e.g.,
using copy.deepcopy) into a new variable before iterating and clearing
unit["context"], then return that deep-copied dict—update references in
_clean_workflow_dict and document that we use a defensive deepcopy of
Workflow.to_dict() to avoid mutating the source.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b92cc0e9-08c8-49f4-84e8-2d66146aab32
📒 Files selected for processing (2)
other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynbutils/api.py
There was a problem hiding this comment.
♻️ Duplicate comments (3)
other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb (3)
513-524:⚠️ Potential issue | 🟠 MajorValidate cluster selection before building
Compute.This cell still assumes a usable cluster is always available. If
clustersis empty, or ifCLUSTER_NAMEmatches nothing,Compute(...)/ the print on Line 524 will fail with a less helpful runtime error.🛡️ Suggested fix
from mat3ra.ide.compute import Compute # Select cluster: use specified name if provided, otherwise use first available +if not clusters: + raise ValueError("No clusters available for the current account") + if CLUSTER_NAME: cluster = next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None) + if cluster is None: + raise ValueError( + f"No cluster matching '{CLUSTER_NAME}' found. Available: {[c['hostname'] for c in clusters]}" + ) else: cluster = clusters[0] compute = Compute(🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb` around lines 513 - 524, The code assumes a valid cluster is selected before constructing Compute; validate that clusters is non-empty and that the selection logic yields a non-None cluster (the variables/functions to look at are clusters, CLUSTER_NAME, cluster, Compute, and compute). If clusters is empty, or CLUSTER_NAME doesn't match any hostname, raise or log a clear error and abort (or choose a safe fallback) instead of passing None into Compute; ensure the print that accesses compute.cluster.hostname only runs after a successful selection and handle/report the failure with a descriptive message.
238-240:⚠️ Potential issue | 🟠 MajorGuard the default-project lookup before indexing.
This still assumes
client.projects.list(...)returns at least one project. If it returns an empty list, the notebook fails on Line 239 withIndexError.🛡️ Suggested fix
projects = client.projects.list({"isDefault": True, "owner._id": ACCOUNT_ID}) +if not projects: + raise ValueError(f"No default project found for account {ACCOUNT_ID}") project_id = projects[0]["_id"] print(f"✅ Using project: {projects[0]['name']} ({project_id})")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb` around lines 238 - 240, The notebook assumes client.projects.list(...) returns at least one entry and then indexes projects[0]; guard this by checking the returned list before indexing: after calling client.projects.list(...) (the projects variable) verify if projects is non-empty and if empty either raise a clear error message or create/select a fallback project, otherwise proceed to set project_id = projects[0]["_id"] and print the chosen project; update the lookup around the projects variable and project_id assignment to avoid IndexError when projects == [].
32-32:⚠️ Potential issue | 🟡 MinorFix the typo in the results summary.
properiesshould beproperties.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb` at line 32, Update the notebook cell text to correct the typo in the results summary: change "Retrieve results: get and display the properies." to "Retrieve results: get and display the properties." so the cell content in the band_structure_graphene notebook reflects the correct word.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`:
- Around line 513-524: The code assumes a valid cluster is selected before
constructing Compute; validate that clusters is non-empty and that the selection
logic yields a non-None cluster (the variables/functions to look at are
clusters, CLUSTER_NAME, cluster, Compute, and compute). If clusters is empty, or
CLUSTER_NAME doesn't match any hostname, raise or log a clear error and abort
(or choose a safe fallback) instead of passing None into Compute; ensure the
print that accesses compute.cluster.hostname only runs after a successful
selection and handle/report the failure with a descriptive message.
- Around line 238-240: The notebook assumes client.projects.list(...) returns at
least one entry and then indexes projects[0]; guard this by checking the
returned list before indexing: after calling client.projects.list(...) (the
projects variable) verify if projects is non-empty and if empty either raise a
clear error message or create/select a fallback project, otherwise proceed to
set project_id = projects[0]["_id"] and print the chosen project; update the
lookup around the projects variable and project_id assignment to avoid
IndexError when projects == [].
- Line 32: Update the notebook cell text to correct the typo in the results
summary: change "Retrieve results: get and display the properies." to "Retrieve
results: get and display the properties." so the cell content in the
band_structure_graphene notebook reflects the correct word.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 915eca59-2d25-4c55-8044-e3a3b9463fd6
📒 Files selected for processing (2)
other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynbutils/visualize.py
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb (2)
240-242:⚠️ Potential issue | 🟡 MinorGuard
projects[0]before indexing.Accounts without a default project will raise
IndexErrorhere.🛡️ Proposed fix
projects = client.projects.list({"isDefault": True, "owner._id": ACCOUNT_ID}) +if not projects: + raise ValueError(f"No default project found for account {ACCOUNT_ID}") project_id = projects[0]["_id"] print(f"✅ Using project: {projects[0]['name']} ({project_id})")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb` around lines 240 - 242, The code assumes client.projects.list(...) returns a non-empty list and directly indexes projects[0], which causes IndexError if no default project exists; update the logic around projects, project_id and the print statement to check for an empty list (e.g., if not projects: handle the missing-default case by raising a clear error or creating/selecting a fallback project) before accessing projects[0], and only set project_id = projects[0]["_id"] and print the success message when a project is present.
494-496:⚠️ Potential issue | 🟠 MajorValidate cluster selection before constructing
Compute.This still assumes
clustersis non-empty and thatCLUSTER_NAMEmatches at least one hostname. Otherwise this path fails withIndexErrororAttributeError.🛡️ Proposed fix
clusters = client.clusters.list() +if not clusters: + raise ValueError("No clusters are available for the authenticated account") print(f"Available clusters: {[c['hostname'] for c in clusters]}") # Select cluster: use specified name if provided, otherwise use first available if CLUSTER_NAME: cluster = next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None) + if cluster is None: + raise ValueError( + f"No cluster matching '{CLUSTER_NAME}' found. Available: {[c['hostname'] for c in clusters]}" + ) else: cluster = clusters[0]Also applies to: 515-526
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb` around lines 494 - 496, The notebook assumes client.clusters.list() returns a non-empty list and that CLUSTER_NAME matches a hostname before building the Compute object, which can raise IndexError/AttributeError; update the logic around clusters = client.clusters.list() and the Compute construction (the code that selects cluster by CLUSTER_NAME and uses clusters[0] or similar) to first check for an empty list and raise/handle a clear error if no clusters are available, then validate that a cluster with hostname == CLUSTER_NAME exists (or fall back to a default selection) and raise/handle a descriptive error if not found; ensure the selection code uses safe lookups instead of direct indexing so Compute is only constructed with a valid cluster object.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`:
- Around line 470-475: The printed saved_workflow.id is from the template
returned by get_or_create_workflow (which strips unit contexts), not the actual
workflow you submit; fix by persisting and using the same workflow object you
submit: call get_or_create_workflow(client, workflow, ACCOUNT_ID) after you
finish editing k-grid/k-path/cutoff so the returned
saved_workflow_response/saved_workflow reflects the submitted workflow and then
submit using that saved_workflow (or, alternatively, after submit capture and
print the workflow id from the job submission response); reference
get_or_create_workflow, saved_workflow_response, saved_workflow, and the
in-memory workflow so the printed ID matches the submitted job.
- Around line 553-567: The code treats job_response from utils.api.create_job()
as a single dict but create_job returns a List[dict]; update the logic that uses
create_job (the variables job_response, dict_to_namespace, and job_id) to handle
a list: check that job_response is non-empty, pick the intended job dict (e.g.,
job_response[0] or iterate if multiple), convert that dict with
dict_to_namespace, and then read its _id into job_id; also add an error path for
an empty list and a short log message so the cell doesn’t assume a single dict.
---
Duplicate comments:
In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`:
- Around line 240-242: The code assumes client.projects.list(...) returns a
non-empty list and directly indexes projects[0], which causes IndexError if no
default project exists; update the logic around projects, project_id and the
print statement to check for an empty list (e.g., if not projects: handle the
missing-default case by raising a clear error or creating/selecting a fallback
project) before accessing projects[0], and only set project_id =
projects[0]["_id"] and print the success message when a project is present.
- Around line 494-496: The notebook assumes client.clusters.list() returns a
non-empty list and that CLUSTER_NAME matches a hostname before building the
Compute object, which can raise IndexError/AttributeError; update the logic
around clusters = client.clusters.list() and the Compute construction (the code
that selects cluster by CLUSTER_NAME and uses clusters[0] or similar) to first
check for an empty list and raise/handle a clear error if no clusters are
available, then validate that a cluster with hostname == CLUSTER_NAME exists (or
fall back to a default selection) and raise/handle a descriptive error if not
found; ensure the selection code uses safe lookups instead of direct indexing so
Compute is only constructed with a valid cluster object.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 30284172-9c05-4e42-933e-49b5f0748f5c
📒 Files selected for processing (1)
other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb
| "from utils.generic import dict_to_namespace\n", | ||
| "from utils.api import get_or_create_workflow\n", | ||
| "\n", | ||
| "saved_workflow_response = get_or_create_workflow(client, workflow, ACCOUNT_ID)\n", | ||
| "saved_workflow = Workflow.create(saved_workflow_response)\n", | ||
| "print(f\"Workflow ID: {saved_workflow.id}\")" |
There was a problem hiding this comment.
The saved workflow ID is not the workflow you actually submit.
get_or_create_workflow() strips unit contexts before persistence, but Line 556 submits the in-memory workflow with the edited k-grid, k-path, and cutoff settings intact. The printed saved_workflow.id is therefore a cleaned base template, not the workflow that produced the job, which makes the notebook’s traceability misleading.
🧭 Suggested clarification
-saved_workflow_response = get_or_create_workflow(client, workflow, ACCOUNT_ID)
-saved_workflow = Workflow.create(saved_workflow_response)
-print(f"Workflow ID: {saved_workflow.id}")
+base_workflow_response = get_or_create_workflow(client, workflow, ACCOUNT_ID)
+base_workflow = Workflow.create(base_workflow_response)
+print(f"Base workflow ID (without job-specific context): {base_workflow.id}")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "from utils.generic import dict_to_namespace\n", | |
| "from utils.api import get_or_create_workflow\n", | |
| "\n", | |
| "saved_workflow_response = get_or_create_workflow(client, workflow, ACCOUNT_ID)\n", | |
| "saved_workflow = Workflow.create(saved_workflow_response)\n", | |
| "print(f\"Workflow ID: {saved_workflow.id}\")" | |
| "from utils.generic import dict_to_namespace\n", | |
| "from utils.api import get_or_create_workflow\n", | |
| "\n", | |
| "base_workflow_response = get_or_create_workflow(client, workflow, ACCOUNT_ID)\n", | |
| "base_workflow = Workflow.create(base_workflow_response)\n", | |
| "print(f\"Base workflow ID (without job-specific context): {base_workflow.id}\")" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`
around lines 470 - 475, The printed saved_workflow.id is from the template
returned by get_or_create_workflow (which strips unit contexts), not the actual
workflow you submit; fix by persisting and using the same workflow object you
submit: call get_or_create_workflow(client, workflow, ACCOUNT_ID) after you
finish editing k-grid/k-path/cutoff so the returned
saved_workflow_response/saved_workflow reflects the submitted workflow and then
submit using that saved_workflow (or, alternatively, after submit capture and
print the workflow id from the job submission response); reference
get_or_create_workflow, saved_workflow_response, saved_workflow, and the
in-memory workflow so the printed ID matches the submitted job.
| "job_response = create_job(\n", | ||
| " api_client=client,\n", | ||
| " materials=[saved_material],\n", | ||
| " workflow=workflow,\n", | ||
| " project_id=project_id,\n", | ||
| " owner_id=ACCOUNT_ID,\n", | ||
| " prefix=job_name,\n", | ||
| " compute=compute.to_dict(),\n", | ||
| ")\n", | ||
| "\n", | ||
| "job = dict_to_namespace(job_response)\n", | ||
| "job_id = job._id\n", | ||
| "print(\"✅ Job created successfully!\")\n", | ||
| "print(f\"Job ID: {job_id}\")\n", | ||
| "display_JSON(job_response)" |
There was a problem hiding this comment.
Treat create_job() as a list-returning helper.
utils.api.create_job() returns List[dict], but this cell immediately treats job_response as a single job dict. dict_to_namespace(job_response) and job._id will break once the helper returns its documented list payload.
🧩 Proposed fix
-job_response = create_job(
+job_responses = create_job(
api_client=client,
materials=[saved_material],
workflow=workflow,
project_id=project_id,
owner_id=ACCOUNT_ID,
prefix=job_name,
compute=compute.to_dict(),
)
-job = dict_to_namespace(job_response)
+if not job_responses:
+ raise RuntimeError("No jobs were created")
+
+job_response = job_responses[0]
+job = dict_to_namespace(job_response)
job_id = job._id
print("✅ Job created successfully!")
print(f"Job ID: {job_id}")
display_JSON(job_response)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@other/materials_designer/specific_examples/workflows/band_structure_graphene.ipynb`
around lines 553 - 567, The code treats job_response from utils.api.create_job()
as a single dict but create_job returns a List[dict]; update the logic that uses
create_job (the variables job_response, dict_to_namespace, and job_id) to handle
a list: check that job_response is non-empty, pick the intended job dict (e.g.,
job_response[0] or iterate if multiple), convert that dict with
dict_to_namespace, and then read its _id into job_id; also add an error path for
an empty list and a short log message so the cell doesn’t assume a single dict.
| @@ -0,0 +1,651 @@ | |||
| { | |||
There was a problem hiding this comment.
| @@ -0,0 +1,651 @@ | |||
| { | |||
There was a problem hiding this comment.
Line #30. bands_unit.input = [{"name": "bands.in", "content": (
We can comment it out and explain that this is the "last-resort" measure
Reply via ReviewNB
utils/api.py
Outdated
| for swf in workflow_dict.get("subworkflows", []): | ||
| for unit in swf.get("units", []): | ||
| unit["context"] = {} | ||
| return workflow_dict |
There was a problem hiding this comment.
As part of the Workflow class
Summary by CodeRabbit
Release Notes
Documentation
Bug Fixes