Skip to content

Add disambiguationBehavior option#8054

Open
alesan99 wants to merge 12 commits intomainfrom
issue-3922-1
Open

Add disambiguationBehavior option#8054
alesan99 wants to merge 12 commits intomainfrom
issue-3922-1

Conversation

@alesan99
Copy link
Copy Markdown
Contributor

@alesan99 alesan99 commented May 4, 2026

Fixes #3922

image

Checklist

  • Self-review the PR after opening it to make sure the changes look good and
    self-explanatory (or properly documented)
  • Add relevant issue to release milestone
  • Add pr to documentation list
  • Add automated tests

Testing instructions

  • Go to workbench and create a dataset and mapping where multiple records are matched to a table (which requires disambiguation)
    • For example, if you have multiple agents named 'test' on your database, in a Collection Object dataset add the collector first name field and put 'test' in the field. It will ask you to disambiguate the agent.
  • Validate
  • By default, you should be asked to disambiguate records
  • You should be able to successfully upload the dataset once you disambiguate. Check to make sure all the records were uploaded correctly.
  • On the dataset, rollback and go back to the field mapper, and click the gear next to a field/column mapping
  • Disambiguation Behavior should appear correctly
  • You should be able to switch between disambiguation behaviors.
  • When you switch between behaviors, all fields that belong to that table should automatically also have their behaviors modified at the same time.
  • Validate
  • You should no longer be asked the same fields you were asked to disambiguate before.
  • You should be able to successfully upload the dataset. Check to make sure all the records were uploaded correctly.
  • Open an existing dataset.
  • Make sure there are no new errors when validating existing datasets.

Summary by CodeRabbit

  • New Features

    • Added a disambiguation behavior option for workbench upload mappings with "ask" (default) and "pickFirst" choices to control handling of ambiguous matches.
    • When "pickFirst" is chosen, multiple matches can be auto-resolved to the first match.
  • UI

    • Mapping options menu now lets users change disambiguation behavior; new UI labels/descriptions added.
  • Tests

    • Added tests covering the new disambiguation behavior.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 4, 2026

Review Change Stack

Warning

Rate limit exceeded

@alesan99 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 53 minutes and 56 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 6b2de6b2-2a80-46da-af23-081afb91d9ae

📥 Commits

Reviewing files that changed from the base of the PR and between 7df9e83 and f3fda4b.

📒 Files selected for processing (1)
  • specifyweb/backend/workbench/upload/column_options.py
📝 Walkthrough

Walkthrough

This PR introduces configurable "pick first" disambiguation behavior for WorkBench matching. A new DisambiguationBehavior type ("ask" | "pickFirst") is added to backend and frontend types and JSON schema, threaded through parsing and ParseResult, applied in upload-table match resolution to return the first match when configured, and exposed in the frontend mapping UI and reducer. Defaults remain "ask"; tests and fixtures updated.

Changes

WorkBench Disambiguation Behavior

Layer / File(s) Summary
Type definitions and schema contracts
specifyweb/backend/workbench/upload/column_options.py, specifyweb/frontend/js_src/lib/components/WbPlanView/uploadPlanParser.ts, specifyweb/backend/workbench/upload/upload_plan_schema.py
Add DisambiguationBehavior ("ask" | "pickFirst"). Extend backend ColumnOptions, ExtendedColumnOptions, and frontend ColumnOptions to include disambiguationBehavior. JSON schema and parse_column_options include the property with default "ask".
Parsing pipeline propagation
specifyweb/backend/workbench/upload/parsing.py, specifyweb/backend/workbench/upload/scoping.py, specifyweb/backend/workbench/upload/upload_attachments.py
ParseResult gains disambiguation_behavior. ParseResult.from_parse_success, filter_and_upload, and parse_with_picklist accept and forward the behavior (default "ask"). Scoping and attachment-plan builders propagate or set "ask" for generated columns.
Match resolution with pick-first logic
specifyweb/backend/workbench/upload/upload_table.py
BoundUploadTable._match checks parsed fields for disambiguation_behavior == "pickFirst" and, when n_matched > 1 and enabled, returns the first match instead of MatchedMultiple. New helper _disambiguation_pick_first() added.
Frontend state management
specifyweb/frontend/js_src/lib/components/WbPlanView/mappingReducer.ts, specifyweb/frontend/js_src/lib/components/WbPlanView/linesGetter.ts
Added ChangeDisambiguationBehaviorAction to update columnOptions.disambiguationBehavior for matching mapping lines (path-based). defaultColumnOptions now includes disambiguationBehavior: 'ask'.
Frontend UI components and wiring
specifyweb/frontend/js_src/lib/components/WbPlanView/Mapper.tsx, specifyweb/frontend/js_src/lib/components/WbPlanView/MapperComponents.tsx
Mapper wires onChangeDisambiguationBehavior to dispatch action. Mapping options menu renders disambiguation behavior radio choices and forwards selection to the handler.
UI localization
specifyweb/frontend/js_src/lib/localization/wbPlan.ts
Add localization strings for disambiguation behavior label, "Ask" option, and "Pick first" option and descriptions.
Tests and fixtures
specifyweb/backend/workbench/upload/tests/testparsing.py, specifyweb/frontend/js_src/lib/components/WbPlanView/__tests__/linesGetter.test.ts, specifyweb/frontend/js_src/lib/tests/fixtures/*
Backend tests updated and new DisambiguationBehaviorTests added to verify pickFirst. Existing tests and frontend fixtures updated to include disambiguationBehavior: "ask" where applicable.

Suggested reviewers

  • acwhite211
  • CarolineDenis
🚥 Pre-merge checks | ✅ 6
✅ Passed checks (6 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Add disambiguationBehavior option' clearly and concisely summarizes the main change: introducing a new disambiguation behavior configuration option to the Workbench uploader.
Linked Issues check ✅ Passed The PR implements all coding requirements from issue #3922: adds configurable 'ask' and 'pickFirst' disambiguation behavior options across backend (parsing, upload logic) and frontend (UI, reducer, localization), with automated tests validating the functionality.
Out of Scope Changes check ✅ Passed All changes are directly scoped to implementing the disambiguationBehavior feature: type definitions, schema updates, UI controls, reducer logic, test fixtures, and automated tests. No unrelated modifications detected.
Automatic Tests ✅ Passed PR includes automatic tests covering backend pickFirst disambiguation behavior and frontend fixture updates for the new field across test suites.
Testing Instructions ✅ Passed Comprehensive testing provided: backend test validates pickFirst behavior, frontend fixtures updated, default ask behavior tested via existing tests and explicit setup.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch issue-3922-1

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

alesan99 and others added 4 commits May 4, 2026 15:51
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
@alesan99 alesan99 modified the milestone: 7.12.2 May 6, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 8, 2026

Caution

Failed to replace (edit) comment. This is likely due to insufficient permissions or the comment being deleted.

Error details
{}

@alesan99 alesan99 marked this pull request as ready for review May 8, 2026 15:03
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
specifyweb/backend/workbench/upload/column_options.py (1)

21-24: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve disambiguationBehavior in shorthand serialization.

At Line [21], the shorthand return (return self.column) drops disambiguationBehavior. A pickFirst setting with otherwise default options is lost on save/load and reverts to "ask".

Proposed fix
     def to_json(self) -> dict | str:
-        if self.matchBehavior == "ignoreNever" and self.nullAllowed and self.default is None:
+        if (
+            self.matchBehavior == "ignoreNever"
+            and self.nullAllowed
+            and self.default is None
+            and self.disambiguationBehavior == "ask"
+        ):
             return self.column
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@specifyweb/backend/workbench/upload/column_options.py` around lines 21 - 24,
The shorthand branch in the serializer returns self.column and thereby drops
self.disambiguationBehavior (losing a pickFirst setting); update the conditional
that returns self.column in the method using
self.matchBehavior/self.nullAllowed/self.default so that the returned dict
includes disambiguationBehavior when it is set (or when it differs from the
default "ask") — e.g., construct a copy of self.column and inject
{"disambiguationBehavior": self.disambiguationBehavior} (or always include it)
before returning, keeping the rest of the shorthand logic intact.
🧹 Nitpick comments (1)
specifyweb/backend/workbench/upload/parsing.py (1)

35-47: ⚡ Quick win

Prefer keyword construction for ParseResult.

This PR had to update every positional ParseResult(...)/from_parse_success(...) call just to add one field. Switching these to keyword args, or routing the blank-value case through a helper, would make the next shape change much harder to miss.

♻️ Suggested cleanup
 `@classmethod`
-def from_parse_success(cls, ps: ParseSucess, filter_on: Filter, add_to_picklist: PicklistAddition | None, column: str,
-                       missing_required: str | None, disambiguationBehavior: DisambiguationBehavior):
-    return cls(filter_on=filter_on, upload=ps.payload, add_to_picklist=add_to_picklist, column=column,
-               missing_required=missing_required, disambiguationBehavior=disambiguationBehavior)
+def from_parse_success(
+    cls,
+    ps: ParseSucess,
+    filter_on: Filter,
+    add_to_picklist: PicklistAddition | None,
+    column: str,
+    missing_required: str | None,
+    disambiguationBehavior: DisambiguationBehavior,
+):
+    return cls(
+        filter_on=filter_on,
+        upload=ps.payload,
+        add_to_picklist=add_to_picklist,
+        column=column,
+        missing_required=missing_required,
+        disambiguationBehavior=disambiguationBehavior,
+    )

 def filter_and_upload(f: Filter, column: str, disambiguation_behavior: DisambiguationBehavior = 'ask') -> ParseResult:
-    return ParseResult(f, f, None, column, None, disambiguation_behavior)
+    return ParseResult(
+        filter_on=f,
+        upload=f,
+        add_to_picklist=None,
+        column=column,
+        missing_required=None,
+        disambiguationBehavior=disambiguation_behavior,
+    )

Also applies to: 53-54, 81-82

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@specifyweb/backend/workbench/upload/parsing.py` around lines 35 - 47, Update
all constructions of ParseResult and calls to ParseResult.from_parse_success to
use explicit keyword arguments for every field (e.g., filter_on=..., upload=...,
add_to_picklist=..., column=..., missing_required=...,
disambiguationBehavior=...) instead of relying on positional ordering; this
includes the other uses of ParseResult/from_parse_success in this module. If
there are places that pass a placeholder/empty value for a field, route that
through the existing from_parse_success helper (or add a small helper) so
callers don’t rely on positional defaults. Aim to make every ParseResult(...)
invocation use named parameters so future shape changes won’t break silencely.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@specifyweb/backend/workbench/upload/tests/testparsing.py`:
- Around line 886-900: The test is asserting the top-level CollectionObject
result instead of the nested Agent disambiguation; change the first assertion to
expect results[0].record_result to be Uploaded (since CO '123' is new) and add
an assertion that the nested agent outcome is Matched by checking
results[0].toOne['cataloger'].record_result is an instance of Matched; keep the
existing assertion that results[1].record_result is Uploaded so the second row's
CO behavior is still validated (refer to do_upload, results, record_result,
toOne['cataloger'], Matched, Uploaded).

In `@specifyweb/backend/workbench/upload/upload_plan_schema.py`:
- Around line 277-281: The JSON schema for the property disambiguationBehavior
has an invalid default "ignoreNever" that isn't listed in its enum; update the
schema for disambiguationBehavior (property name) so its "default" is one of the
allowed enum values ("ask" or "pickFirst") or remove the "default" key entirely
— e.g., change the default to "pickFirst" or "ask" to restore internal
consistency with the enum.

In `@specifyweb/frontend/js_src/lib/components/WbPlanView/mappingReducer.ts`:
- Around line 405-421: ChangeDisambiguationBehaviorAction updates matched lines
but incorrectly uses the triggering line's columnOptions as the base
(state.lines[action.line].columnOptions), which overwrites other lines'
matchBehavior/nullAllowed/default; change the spread to use the current
iteration's columnOptions (line.columnOptions) when constructing the new
columnOptions for each matched line, keeping disambiguationBehavior from action
and preserving each line's existing options; locate
ChangeDisambiguationBehaviorAction and replace the base spread accordingly while
still using isSameMappingPath(mappingPath, ...) to select lines and setting
changesMade: true.

---

Outside diff comments:
In `@specifyweb/backend/workbench/upload/column_options.py`:
- Around line 21-24: The shorthand branch in the serializer returns self.column
and thereby drops self.disambiguationBehavior (losing a pickFirst setting);
update the conditional that returns self.column in the method using
self.matchBehavior/self.nullAllowed/self.default so that the returned dict
includes disambiguationBehavior when it is set (or when it differs from the
default "ask") — e.g., construct a copy of self.column and inject
{"disambiguationBehavior": self.disambiguationBehavior} (or always include it)
before returning, keeping the rest of the shorthand logic intact.

---

Nitpick comments:
In `@specifyweb/backend/workbench/upload/parsing.py`:
- Around line 35-47: Update all constructions of ParseResult and calls to
ParseResult.from_parse_success to use explicit keyword arguments for every field
(e.g., filter_on=..., upload=..., add_to_picklist=..., column=...,
missing_required=..., disambiguationBehavior=...) instead of relying on
positional ordering; this includes the other uses of
ParseResult/from_parse_success in this module. If there are places that pass a
placeholder/empty value for a field, route that through the existing
from_parse_success helper (or add a small helper) so callers don’t rely on
positional defaults. Aim to make every ParseResult(...) invocation use named
parameters so future shape changes won’t break silencely.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 2bf69460-911b-4e80-aace-d8b058412006

📥 Commits

Reviewing files that changed from the base of the PR and between 753a2cf and 7c59a11.

📒 Files selected for processing (17)
  • specifyweb/backend/workbench/upload/column_options.py
  • specifyweb/backend/workbench/upload/parsing.py
  • specifyweb/backend/workbench/upload/scoping.py
  • specifyweb/backend/workbench/upload/tests/testparsing.py
  • specifyweb/backend/workbench/upload/upload_attachments.py
  • specifyweb/backend/workbench/upload/upload_plan_schema.py
  • specifyweb/backend/workbench/upload/upload_table.py
  • specifyweb/frontend/js_src/lib/components/WbPlanView/Mapper.tsx
  • specifyweb/frontend/js_src/lib/components/WbPlanView/MapperComponents.tsx
  • specifyweb/frontend/js_src/lib/components/WbPlanView/__tests__/linesGetter.test.ts
  • specifyweb/frontend/js_src/lib/components/WbPlanView/linesGetter.ts
  • specifyweb/frontend/js_src/lib/components/WbPlanView/mappingReducer.ts
  • specifyweb/frontend/js_src/lib/components/WbPlanView/uploadPlanParser.ts
  • specifyweb/frontend/js_src/lib/localization/wbPlan.ts
  • specifyweb/frontend/js_src/lib/tests/fixtures/mappinglines.1.json
  • specifyweb/frontend/js_src/lib/tests/fixtures/uploadplan.1.json
  • specifyweb/frontend/js_src/lib/tests/fixtures/wbplanviewlines.1.json

Comment thread specifyweb/backend/workbench/upload/tests/testparsing.py Outdated
Comment thread specifyweb/backend/workbench/upload/upload_plan_schema.py
@github-project-automation github-project-automation Bot moved this from 📋Back Log to Dev Attention Needed in General Tester Board May 8, 2026
@alesan99 alesan99 requested review from a team May 8, 2026 21:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Dev Attention Needed

Development

Successfully merging this pull request may close these issues.

Add "pick first" matching behavior to the WorkBench

1 participant