Add MSBuild target authoring skills (target-authoring, property-patterns, item-management, extension-points) #669

Merged
JanKrivanek merged 20 commits into dotnet:main from YuliiaKovalova:msbuild-target-authoring-skills
May 22, 2026
Conversation

@YuliiaKovalova
Member

Summary

Adds four new skills to the dotnet-msbuild plugin covering MSBuild target/props authoring best practices derived from analysis of the MSBuild repo's own .targets and .props files.

Closes #668

New Skills

1. target-authoring

Three-level target chain pattern (Before/Core/After), DependsOn chain extension, DependsOnTargets vs BeforeTargets/AfterTargets guidance, Returns vs Outputs, incremental build with Inputs/Outputs, naming conventions, and a complete target template.
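
As a rough illustration of the Before/Core/After chain described above, a minimal sketch might look like the following. The names `MyCodeGen`, `MyCodeGenDependsOn`, `MyCodeGenInput`, and the output file name are hypothetical and are not taken from the skill itself; `$(IntermediateOutputPath)` is assumed to be defined by the SDK.

```xml
<Project>
  <PropertyGroup>
    <!-- Hypothetical DependsOn property; consumers can extend it without editing this file -->
    <MyCodeGenDependsOn>
      BeforeMyCodeGen;
      CoreMyCodeGen;
      AfterMyCodeGen
    </MyCodeGenDependsOn>
  </PropertyGroup>

  <!-- Public entry point: does no work itself, only orders the chain -->
  <Target Name="MyCodeGen" DependsOnTargets="$(MyCodeGenDependsOn)" />

  <!-- Empty hook targets that consumers may redefine -->
  <Target Name="BeforeMyCodeGen" />
  <Target Name="AfterMyCodeGen" />

  <!-- The real work, with Inputs/Outputs so MSBuild can skip it when up to date -->
  <Target Name="CoreMyCodeGen"
          Inputs="@(MyCodeGenInput)"
          Outputs="$(IntermediateOutputPath)MyCodeGen.g.cs">
    <WriteLinesToFile File="$(IntermediateOutputPath)MyCodeGen.g.cs"
                      Lines="@(MyCodeGenInput->'// from %(Identity)')"
                      Overwrite="true" />
  </Target>
</Project>
```

Under these assumptions, a consuming project could add a step by redefining the property, for example `<MyCodeGenDependsOn>$(MyCodeGenDependsOn);MyExtraStep</MyCodeGenDependsOn>`, rather than hooking the core target directly.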

2. property-patterns

Conditional defaults, semicolon-delimited composition, path normalization, MSBuild string functions, TFM condition helpers, guard properties, feature gating, and fallback chains.
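
A few of these patterns can be sketched as below. This is an illustrative fragment rather than an excerpt from the skill; `EnableMyCodeGen` and `MyCodeGenToolPath` are hypothetical names.

```xml
<Project>
  <PropertyGroup>
    <!-- Conditional default: applies only if nothing earlier set a value -->
    <LangVersion Condition="'$(LangVersion)' == ''">latest</LangVersion>

    <!-- Semicolon-delimited composition: append to the inherited list instead of replacing it -->
    <NoWarn>$(NoWarn);CS1591</NoWarn>

    <!-- Feature gate (hypothetical): on by default, but callers can opt out -->
    <EnableMyCodeGen Condition="'$(EnableMyCodeGen)' == ''">true</EnableMyCodeGen>
  </PropertyGroup>

  <!-- Gated block: evaluated only while the feature is enabled -->
  <PropertyGroup Condition="'$(EnableMyCodeGen)' == 'true'">
    <MyCodeGenToolPath Condition="'$(MyCodeGenToolPath)' == ''">$(MSBuildThisFileDirectory)tools\</MyCodeGenToolPath>
  </PropertyGroup>
</Project>
```

The conditional forms leave room for a project file, a parent props file, or a command-line `-p:` switch to supply a different value.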

3. item-management

Include/Remove/Update semantics, batching (single-axis vs cross-product pitfall), item transforms, Exclude patterns, conditional item inclusion, PrivateAssets/ExcludeAssets metadata, and FileWrites registration for generated files.
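
The sketch below illustrates some of these item operations together. It is a hedged example, not the skill's own template: the `SchemaFile` item type, the `Namespace` metadata, the `schemas\drafts` and `legacy.xsd` paths, and the `GenerateSchemaStubs` target are all hypothetical, and `$(IntermediateOutputPath)` is assumed to come from the SDK.

```xml
<Project>
  <ItemGroup>
    <!-- Include with Exclude: every schema except drafts -->
    <SchemaFile Include="schemas\**\*.xsd" Exclude="schemas\drafts\**" />
    <!-- Remove: drop one item that was already included -->
    <SchemaFile Remove="schemas\legacy.xsd" />
    <!-- Update: change metadata on an existing item without adding a new one -->
    <SchemaFile Update="schemas\orders.xsd" Namespace="Contoso.Orders" />
  </ItemGroup>

  <Target Name="GenerateSchemaStubs" BeforeTargets="CoreCompile">
    <!-- Task batching over a single item type: the task runs once per SchemaFile -->
    <WriteLinesToFile File="$(IntermediateOutputPath)%(SchemaFile.Filename).g.cs"
                      Lines="// stub for %(SchemaFile.Identity)"
                      Overwrite="true"
                      WriteOnlyWhenDifferent="true" />
    <ItemGroup>
      <!-- Item transform: map each schema to its generated file -->
      <Compile Include="@(SchemaFile->'$(IntermediateOutputPath)%(Filename).g.cs')" />
      <!-- Register generated files so that Clean can remove them -->
      <FileWrites Include="@(SchemaFile->'$(IntermediateOutputPath)%(Filename).g.cs')" />
    </ItemGroup>
  </Target>
</Project>
```

Keeping the batched metadata references qualified with a single item type (`%(SchemaFile.…)`) is one way to stay on the single-axis side of the batching pitfall mentioned above.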

4. extension-points

CustomBefore/CustomAfter hooks, wildcard import directories with alphabetic ordering, import gating with control properties, NuGet package build extension layout (build/buildTransitive), Directory.Build discovery and multi-level hierarchy, and the import guard pattern.
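
A minimal sketch of the CustomBefore/CustomAfter hook pattern follows. `MySDK.targets` and the default file names are hypothetical; the property names follow the `CustomAfterMySDK` naming that appears later in this PR.

```xml
<!-- MySDK.targets (hypothetical SDK targets file) -->
<Project>
  <PropertyGroup>
    <!-- Default hook locations; callers can point these elsewhere -->
    <CustomBeforeMySDK Condition="'$(CustomBeforeMySDK)' == ''">$(MSBuildProjectDirectory)\Custom.Before.MySDK.targets</CustomBeforeMySDK>
    <CustomAfterMySDK Condition="'$(CustomAfterMySDK)' == ''">$(MSBuildProjectDirectory)\Custom.After.MySDK.targets</CustomAfterMySDK>
  </PropertyGroup>

  <!-- Guarded import: a no-op when the hook file does not exist -->
  <Import Project="$(CustomBeforeMySDK)" Condition="Exists('$(CustomBeforeMySDK)')" />

  <!-- The SDK's own properties, items, and targets would go here -->

  <Import Project="$(CustomAfterMySDK)" Condition="Exists('$(CustomAfterMySDK)')" />
</Project>
```

Defining both properties before importing them keeps the two imports symmetric, and the `Exists()` conditions mean a project without hook files builds unchanged.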

Tests

Each skill includes:

  • eval.yaml with scenarios, assertions, and rubric
  • Anti-pattern fixture files (.csproj, Directory.Build.props/targets) for the eval to review

Checklist

New skills:
- target-authoring: three-level target chain, DependsOn extension, naming conventions
- property-patterns: conditional defaults, composition, path normalization, TFM helpers
- item-management: Include/Remove/Update, batching, transforms, FileWrites registration
- extension-points: CustomBefore/After hooks, import gating, NuGet build extensions

Each skill includes eval.yaml tests with anti-pattern fixtures.

Closes dotnet#668
Copilot AI review requested due to automatic review settings May 19, 2026 09:34
@JanKrivanek
Member

/evaluate

Contributor

Copilot AI left a comment

Pull request overview

Adds four new authoring-focused skills to the dotnet-msbuild plugin (target authoring, property patterns, item management, and extension points) along with evaluation scenarios and anti-pattern fixture files to validate the agent’s review guidance.

Changes:

  • Added 4 new skills under plugins/dotnet-msbuild/skills/* documenting canonical MSBuild authoring patterns.
  • Added 4 new test suites under tests/dotnet-msbuild/* with eval.yaml scenarios and fixture projects/files containing intentional anti-patterns.
  • Added supporting fixture assets (Directory.Build.props/targets, schemas, JSON data, and placeholder source files) used by the eval scenarios.

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
|---|---|
| plugins/dotnet-msbuild/skills/target-authoring/SKILL.md | New guidance for structuring custom targets, chaining, Returns vs Outputs, and templates. |
| plugins/dotnet-msbuild/skills/property-patterns/SKILL.md | New guidance for conditional defaults, composition, path handling, and property functions. |
| plugins/dotnet-msbuild/skills/item-management/SKILL.md | New guidance for Include/Remove/Update, batching, transforms, and common pitfalls. |
| plugins/dotnet-msbuild/skills/extension-points/SKILL.md | New guidance for CustomBefore/After hooks, import guards, Directory.Build discovery, and NuGet build extensions. |
| tests/dotnet-msbuild/target-authoring/TargetAuthoring.csproj | Fixture project containing target-authoring anti-patterns for review. |
| tests/dotnet-msbuild/target-authoring/Placeholder.cs | Minimal source file for the target-authoring fixture. |
| tests/dotnet-msbuild/target-authoring/eval.yaml | Eval scenario asserting correct feedback on target-authoring anti-patterns. |
| tests/dotnet-msbuild/property-patterns/PropertyPatterns.csproj | Fixture project for property-patterns eval context. |
| tests/dotnet-msbuild/property-patterns/Placeholder.cs | Minimal source file for the property-patterns fixture. |
| tests/dotnet-msbuild/property-patterns/eval.yaml | Eval scenario asserting correct feedback on property definition anti-patterns. |
| tests/dotnet-msbuild/property-patterns/Directory.Build.props | Fixture props file containing property anti-patterns. |
| tests/dotnet-msbuild/item-management/ItemManagement.csproj | Fixture project containing item-management anti-patterns for review. |
| tests/dotnet-msbuild/item-management/Placeholder.cs | Minimal source file for the item-management fixture. |
| tests/dotnet-msbuild/item-management/eval.yaml | Eval scenario asserting correct feedback on item/batching/generation issues. |
| tests/dotnet-msbuild/item-management/schemas/users.xsd | Fixture schema input for item-management scenarios. |
| tests/dotnet-msbuild/item-management/schemas/orders.xsd | Fixture schema input for item-management scenarios. |
| tests/dotnet-msbuild/item-management/data/users.json | Fixture data input for item-management scenarios. |
| tests/dotnet-msbuild/item-management/data/orders.json | Fixture data input for item-management scenarios. |
| tests/dotnet-msbuild/item-management/Constants.g.cs | Fixture generated-file example used by the item-management scenario. |
| tests/dotnet-msbuild/extension-points/ExtensionPoints.csproj | Fixture project for extension-points eval context. |
| tests/dotnet-msbuild/extension-points/Placeholder.cs | Minimal source file for the extension-points fixture. |
| tests/dotnet-msbuild/extension-points/eval.yaml | Eval scenario asserting correct feedback on import guards and extensibility patterns. |
| tests/dotnet-msbuild/extension-points/Directory.Build.targets | Fixture targets file containing extension-point anti-patterns. |
| tests/dotnet-msbuild/extension-points/Directory.Build.props | Fixture props file defining RepoRoot for the extension-points scenario. |

Comment thread plugins/dotnet-msbuild/skills/target-authoring/SKILL.md Outdated
Comment thread plugins/dotnet-msbuild/skills/property-patterns/SKILL.md Outdated
Comment thread plugins/dotnet-msbuild/skills/extension-points/SKILL.md
github-actions Bot added a commit that referenced this pull request May 19, 2026
github-actions Bot added a commit that referenced this pull request May 19, 2026
@github-actions
Contributor

Skill Validation Results

| Skill | Scenario | Quality | Skills Loaded | Overfit | Verdict |
|---|---|---|---|---|---|
| item-management | Review item group management patterns | 5.0/5 → 5.0/5 | ✅ item-management; tools: skill, bash / ✅ item-management; msbuild-antipatterns; tools: task, glob, read_agent, skill | 🟡 0.48 | [1] |
| extension-points | Review MSBuild extension point patterns | 5.0/5 → 5.0/5 | ✅ extension-points; tools: skill | 🟡 0.26 | [2] |
| property-patterns | Review MSBuild properties for anti-patterns | 5.0/5 → 5.0/5 | ✅ property-patterns; tools: skill, edit / ⚠️ NOT ACTIVATED | 🟡 0.27 | [3] |
| target-authoring | Review custom target for authoring anti-patterns | 5.0/5 → 4.0/5 🔴 | ✅ target-authoring; tools: skill, bash, edit / ✅ target-authoring; tools: skill, task, glob, read_agent, edit, bash | 🟡 0.24 | [4] |

[1] ⚠️ High run-to-run variance (CV=0.54) — consider re-running with --runs 5. (Isolated) Quality unchanged but weighted score is -18.4% due to: judgment, quality, tokens (74017 → 114510)
[2] ⚠️ High run-to-run variance (CV=0.73) — consider re-running with --runs 5. (Plugin) Quality unchanged but weighted score is -3.2% due to: tokens (26497 → 53858), time (17.6s → 24.6s)
[3] (Plugin) Quality unchanged but weighted score is -2.1% due to: tokens (26010 → 33891), time (17.2s → 21.0s)
[4] ⚠️ High run-to-run variance (CV=1.14) — consider re-running with --runs 5. (Plugin) Quality unchanged but weighted score is -4.4% due to: tokens (58214 → 97213), time (38.3s → 55.2s)

Model: claude-opus-4.6 | Judge: claude-opus-4.6

🔍 Full Results - additional metrics and failure investigation steps

▶ Sessions Visualisation -- interactive replay of all evaluation sessions

- target-authoring: Reword DO NOT USE clause to exclude only deep
  incremental-build diagnostics, not basic Inputs/Outputs usage
- property-patterns: Close unclosed XML elements in String Functions
  snippet (PropertyGroup, TargetFrameworkMoniker)
- extension-points: Add missing CustomAfterMySDK property definition
  to match the import at bottom of example
Copilot AI review requested due to automatic review settings May 19, 2026 09:51
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 24 out of 24 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (3)

plugins/dotnet-msbuild/skills/extension-points/SKILL.md:150

  • The parent Directory.Build.props import example is missing an Exists(...) guard. Without it, the Import can fail when the file isn’t found (and the repo’s own directory-build-organization skill shows the guarded pattern). Consider adding an Exists condition to make the snippet robust and consistent.
```xml
<!-- src/Directory.Build.props -->
<Import Project="$([MSBuild]::GetPathOfFileAbove('Directory.Build.props', '$(MSBuildThisFileDirectory)..\'))" />
```

plugins/dotnet-msbuild/skills/property-patterns/SKILL.md:78

  • The IsPathRooted example passes $(MSBuildProjectExtensionsPath) unquoted into the property function. Quote the argument so paths containing spaces/special chars don’t break the function call, and keep function-call quoting consistent within the doc.
```xml
$([System.IO.Path]::Combine('$(MSBuildProjectDirectory)', '$(MSBuildProjectExtensionsPath)'))
```

plugins/dotnet-msbuild/skills/extension-points/SKILL.md:142

  • In the Directory.Build.props discovery snippet, quote $(MSBuildProjectDirectory) when passing it into GetDirectoryNameOfFileAbove to avoid issues with spaces/special characters in paths and to keep function-call quoting consistent with other examples.
```xml
<_DirectoryBuildPropsBasePath>
  $([MSBuild]::GetDirectoryNameOfFileAbove($(MSBuildProjectDirectory), 'Directory.Build.props'))
```

Comment thread plugins/dotnet-msbuild/skills/extension-points/SKILL.md Outdated
Comment thread plugins/dotnet-msbuild/skills/property-patterns/SKILL.md Outdated
- extension-points: Use full MSBuildExtensionsPath expression and
  consistent Exists() casing in wildcard import example
- property-patterns: Quote property arguments in NormalizePath call
@YuliiaKovalova
Member Author

/evaluate

@ViktorHofer
Member

This looks good, but before approving I would like to see an improvement in an evaluation run. My understanding was that the LLM with the publicly available documentation was good enough.

@ViktorHofer
Member

/evaluate

github-actions Bot added a commit that referenced this pull request May 19, 2026
github-actions Bot added a commit that referenced this pull request May 19, 2026
@github-actions
Contributor

Skill Validation Results

| Skill | Scenario | Quality | Skills Loaded | Overfit | Verdict |
|---|---|---|---|---|---|
| item-management | Review item group management patterns | 5.0/5 → 5.0/5 | ✅ item-management; tools: skill / ✅ item-management; including-generated-files; msbuild-antipatterns; tools: task, read_agent, skill | 🟡 0.45 | |
| extension-points | Review MSBuild extension point patterns | 4.7/5 → 4.7/5 | ✅ extension-points; tools: skill, edit / ✅ extension-points; tools: skill | 🟡 0.30 | [1] |
| property-patterns | Review MSBuild properties for anti-patterns | 5.0/5 → 5.0/5 | ✅ property-patterns; tools: skill / ⚠️ NOT ACTIVATED | 🟡 0.29 | [2] |
| target-authoring | Review custom target for authoring anti-patterns | 5.0/5 → 5.0/5 | ✅ target-authoring; tools: skill, edit, bash / ✅ target-authoring; tools: skill, task, read_agent | 🟡 0.29 | [3] |

[1] ⚠️ High run-to-run variance (CV=3.64) — consider re-running with --runs 5. (Isolated) Quality unchanged but weighted score is -24.0% due to: judgment, tokens (26649 → 89596), quality, tool calls (5 → 8), time (17.1s → 27.2s)
[2] ⚠️ High run-to-run variance (CV=0.57) — consider re-running with --runs 5. (Plugin) Quality unchanged but weighted score is -1.9% due to: tokens (25980 → 33909)
[3] ⚠️ High run-to-run variance (CV=3.22) — consider re-running with --runs 5. (Plugin) Quality unchanged but weighted score is -0.7% due to: tokens (47641 → 72285), time (36.2s → 49.3s)

Model: claude-opus-4.6 | Judge: claude-opus-4.6

🔍 Full Results - additional metrics and failure investigation steps

▶ Sessions Visualisation -- interactive replay of all evaluation sessions

Rewrite all 4 eval prompts as natural developer problem descriptions
instead of skill-aligned checklists. Softens technique-prescriptive
rubric items. Result: overfitting scores drop from 0.15-0.31 to
0.06-0.08 (all green). property-patterns now passes eval.
Copilot AI review requested due to automatic review settings May 19, 2026 10:43
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 24 out of 24 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

plugins/dotnet-msbuild/skills/property-patterns/SKILL.md:110

  • The IsNullOrWhitespace example passes $(TargetFrameworkProfile) without quoting it inside the property function call. This can break if the property is empty or contains spaces/special characters; quote it as a string argument so the sample is copy/paste safe.
```xml
<!-- IsNullOrWhitespace -->
<TargetFrameworkMoniker
    Condition="'$([System.String]::IsNullOrWhitespace($(TargetFrameworkProfile)))' != 'true'">
  $(TargetFrameworkIdentifier),Version=$(TargetFrameworkVersion),Profile=$(TargetFrameworkProfile)
</TargetFrameworkMoniker>
```

Comment thread plugins/dotnet-msbuild/skills/property-patterns/SKILL.md Outdated
Comment thread plugins/dotnet-msbuild/skills/extension-points/SKILL.md Outdated
Comment thread tests/dotnet-msbuild/target-authoring/TargetAuthoring.csproj Outdated
@Evangelink
Member

/evaluate

github-actions Bot added a commit that referenced this pull request May 20, 2026
github-actions Bot added a commit that referenced this pull request May 20, 2026
@github-actions
Contributor

Skill Validation Results

| Skill | Scenario | Quality | Skills Loaded | Overfit | Verdict |
|---|---|---|---|---|---|
| item-management | Diagnose item group and batching issues | 5.0/5 → 3.7/5 ⏰ 🔴 | ✅ item-management; tools: skill | ✅ 0.17 | [1] |
| item-management | Diagnose cascading item and batching bugs in code generation pipeline | 4.3/5 → 5.0/5 🟢 | ✅ item-management; tools: skill, glob / ⚠️ NOT ACTIVATED | ✅ 0.17 | [2] |
| item-management | Fix item management anti-patterns | 3.0/5 → 4.7/5 🟢 | ✅ item-management; tools: skill / ⚠️ NOT ACTIVATED | ✅ 0.17 | [3] |
| extension-points | Diagnose build extension point failures | 3.0/5 → 4.7/5 🟢 | ✅ extension-points; tools: skill | ✅ 0.11 | [4] |
| extension-points | Diagnose NuGet package and repo extension conflicts | 3.0/5 → 3.0/5 | ✅ extension-points; tools: skill, edit, glob / ✅ extension-points; tools: skill | ✅ 0.11 | [5] |
| extension-points | Fix extension point anti-patterns | 5.0/5 → 4.3/5 🔴 | ✅ extension-points; tools: skill, glob / ⚠️ NOT ACTIVATED | ✅ 0.11 | [6] |
| property-patterns | Diagnose shared build property issues | 4.0/5 → 4.0/5 | ✅ property-patterns; tools: skill, edit / ✅ property-patterns; tools: skill, glob, edit | ✅ 0.08 | [7] |
| property-patterns | Diagnose multi-level property hierarchy bugs | 4.7/5 → 4.7/5 | ✅ property-patterns; tools: skill | ✅ 0.08 | [8] |
| property-patterns | Fix shared property configuration | 4.3/5 → 4.3/5 | ✅ property-patterns; tools: skill, bash / ⚠️ NOT ACTIVATED | ✅ 0.08 | [9] |
| target-authoring | Diagnose custom target build regression | 2.7/5 → 4.7/5 🟢 | ✅ target-authoring; tools: skill, glob / ✅ target-authoring; tools: skill, bash | ✅ 0.07 | [10] |
| target-authoring | Diagnose broken SDK target chain across files | 3.0/5 → 3.0/5 | ✅ target-authoring; tools: skill / ⚠️ NOT ACTIVATED | ✅ 0.07 | [11] |
| target-authoring | Fix custom target anti-patterns | 4.3/5 → 4.0/5 🔴 | ✅ target-authoring; tools: skill, bash / ⚠️ NOT ACTIVATED | ✅ 0.07 | [12] |

[1] ⚠️ High run-to-run variance (CV=190%) — consider re-running with --runs 5
[2] ⚠️ High run-to-run variance (CV=360%) — consider re-running with --runs 5
[3] ⚠️ High run-to-run variance (CV=56%) — consider re-running with --runs 5
[4] ⚠️ High run-to-run variance (CV=79%) — consider re-running with --runs 5
[5] ⚠️ High run-to-run variance (CV=102%) — consider re-running with --runs 5. (Plugin) Quality unchanged but weighted score is -4.8% due to: tokens (57043 → 99549), time (52.5s → 70.4s)
[6] ⚠️ High run-to-run variance (CV=314%) — consider re-running with --runs 5
[7] ⚠️ High run-to-run variance (CV=320%) — consider re-running with --runs 5
[8] ⚠️ High run-to-run variance (CV=90%) — consider re-running with --runs 5. (Isolated) Quality unchanged but weighted score is -37.9% due to: judgment, quality, tokens (99324 → 143768), tool calls (11 → 15)
[9] ⚠️ High run-to-run variance (CV=282%) — consider re-running with --runs 5. (Isolated) Quality unchanged but weighted score is -7.5% due to: judgment
[10] ⚠️ High run-to-run variance (CV=64%) — consider re-running with --runs 5
[11] ⚠️ High run-to-run variance (CV=285%) — consider re-running with --runs 5
[12] ⚠️ High run-to-run variance (CV=286%) — consider re-running with --runs 5. (Plugin) Quality unchanged but weighted score is -4.1% due to: tokens (77863 → 278406), time (57.1s → 121.1s), tool calls (12 → 15)

⏰ timeout: run(s) hit the 240s scenario timeout limit; scoring may be affected because model execution was aborted before it could produce its full output (increase via timeout in eval.yaml)

Model: claude-opus-4.6 | Judge: claude-opus-4.6

🔍 Full Results - additional metrics and failure investigation steps

▶ Sessions Visualisation -- interactive replay of all evaluation sessions

Add 'Only activate in MSBuild/.NET build context.' prefix to all 4 skills
for consistency with existing skills in the plugin.

Add explicit 'diagnosing and fixing' and 'reviewing' keywords to USE FOR
sections so the SDK selects these skills over the broader
msbuild-antipatterns skill when prompts ask to fix or review specific
item/extension/property/target patterns.

Add 'general MSBuild anti-pattern catalog (use msbuild-antipatterns)' to
DO NOT USE FOR sections to help the SDK disambiguate.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 21, 2026 09:47
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 46 out of 46 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

tests/dotnet-msbuild/property-patterns/eval.yaml:52

  • The rubric item about LangVersion being "impossible for projects to override" doesn’t match MSBuild evaluation order for this fixture: values set in Directory.Build.props are overridden by later .csproj assignments. Align the rubric with the intended issue (e.g., earlier-layer/command-line overrides) or adjust the fixture so LangVersion is assigned late enough to block project overrides.

Comment thread tests/dotnet-msbuild/property-patterns/eval.yaml
Comment thread plugins/dotnet-msbuild/skills/extension-points/SKILL.md Outdated
Comment thread tests/dotnet-msbuild/target-authoring/eval.yaml
Comment thread tests/dotnet-msbuild/target-authoring/hard/CustomSdk.targets
All 4 new skills had descriptions exceeding the 1024-character maximum
enforced by the skill-validator check command. The Copilot SDK silently
ignores skills with over-length descriptions, causing 0% activation in
both isolated and plugin evaluation modes.

Shortened all 4 descriptions while preserving key activation keywords
(USE FOR / DO NOT USE FOR / 'Only activate in MSBuild/.NET build context').

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@JanKrivanek
Member

/evaluate

github-actions Bot added a commit that referenced this pull request May 21, 2026
github-actions Bot added a commit that referenced this pull request May 21, 2026
@github-actions
Contributor

Skill Validation Results

| Skill | Scenario | Quality | Skills Loaded | Overfit | Verdict |
|---|---|---|---|---|---|
| item-management | Diagnose item group and batching issues | 5.0/5 → 5.0/5 | ✅ item-management; tools: skill | ✅ 0.12 | [1] |
| item-management | Diagnose cascading item and batching bugs in code generation pipeline | 4.3/5 → 5.0/5 🟢 | ✅ item-management; tools: skill, bash, edit / ✅ item-management; tools: skill, edit, bash | ✅ 0.12 | [2] |
| item-management | Fix item management anti-patterns | 3.3/5 → 4.7/5 🟢 | ✅ item-management; tools: skill / ⚠️ NOT ACTIVATED | ✅ 0.12 | [3] |
| extension-points | Diagnose build extension point failures | 3.0/5 → 5.0/5 🟢 | ✅ extension-points; tools: skill | ✅ 0.11 | [4] |
| extension-points | Diagnose NuGet package and repo extension conflicts | 3.0/5 → 3.0/5 | ✅ extension-points; tools: skill, glob, edit / ✅ extension-points; tools: skill | ✅ 0.11 | [5] |
| extension-points | Fix extension point anti-patterns | 5.0/5 → 4.7/5 🔴 | ✅ extension-points; tools: skill / ✅ extension-points; tools: glob, skill | ✅ 0.11 | [6] |
| property-patterns | Diagnose shared build property issues | 4.0/5 → 4.0/5 | ✅ property-patterns; tools: skill, bash / ✅ property-patterns; tools: glob, skill, bash | ✅ 0.20 | [7] |
| property-patterns | Diagnose multi-level property hierarchy bugs | 4.3/5 → 4.3/5 | ✅ property-patterns; tools: skill | ✅ 0.20 | [8] |
| property-patterns | Fix shared property configuration | 4.0/5 → 4.0/5 | ✅ property-patterns; tools: skill / ⚠️ NOT ACTIVATED | ✅ 0.20 | [9] |
| target-authoring | Diagnose custom target build regression | 3.0/5 → 4.7/5 🟢 | ✅ target-authoring; tools: skill, edit, glob / ✅ target-authoring; tools: skill, edit, bash | 🟡 0.25 | |
| target-authoring | Diagnose broken SDK target chain across files | 3.0/5 → 3.0/5 | ✅ target-authoring; tools: skill / ✅ target-authoring; tools: skill, glob | 🟡 0.25 | [10] |
| target-authoring | Fix custom target anti-patterns | 3.7/5 → 5.0/5 🟢 | ✅ target-authoring; tools: skill, bash / ⚠️ NOT ACTIVATED | 🟡 0.25 | |

[1] ⚠️ High run-to-run variance (CV=61%) — consider re-running with --runs 5
[2] ⚠️ High run-to-run variance (CV=144%) — consider re-running with --runs 5. (Plugin) Quality improved but weighted score is -26.0% due to: judgment, quality, tokens (82492 → 167158), tool calls (10 → 13)
[3] ⚠️ High run-to-run variance (CV=78%) — consider re-running with --runs 5
[4] ⚠️ High run-to-run variance (CV=103%) — consider re-running with --runs 5
[5] ⚠️ High run-to-run variance (CV=58%) — consider re-running with --runs 5. (Isolated) Quality unchanged but weighted score is -4.3% due to: tokens (55727 → 92431), tool calls (9 → 11)
[6] ⚠️ High run-to-run variance (CV=130%) — consider re-running with --runs 5
[7] ⚠️ High run-to-run variance (CV=109%) — consider re-running with --runs 5. (Isolated) Quality unchanged but weighted score is -13.6% due to: judgment, quality
[8] ⚠️ High run-to-run variance (CV=126%) — consider re-running with --runs 5. (Isolated) Quality unchanged but weighted score is -20.8% due to: judgment, quality, tool calls (12 → 16)
[9] ⚠️ High run-to-run variance (CV=257%) — consider re-running with --runs 5
[10] ⚠️ High run-to-run variance (CV=231%) — consider re-running with --runs 5. (Plugin) Quality unchanged but weighted score is -0.4% due to: tokens (90754 → 119518)

Model: claude-opus-4.6 | Judge: claude-opus-4.6

🔍 Full Results - additional metrics and failure investigation steps

▶ Sessions Visualisation -- interactive replay of all evaluation sessions

@JanKrivanek JanKrivanek enabled auto-merge (squash) May 21, 2026 14:03
- target-authoring eval.yaml: fix prompt to say 'Outputs attribute' (not
  'Returns') since the fixture uses Outputs on a query target, which is
  the actual bug that causes MSBuild to skip re-execution
- property-patterns eval.yaml: reword prompt and rubric — the real issue
  is unconditional assignment breaking parent-child Directory.Build.props
  inheritance, not project files being unable to override properties
- extension-points SKILL.md: fix GetPathOfFileAbove example to factor
  the path into a property so Project= and Condition= use the same value
  consistently (avoiding the ..\\ vs ..\ discrepancy)
- CustomSdk.targets: add WriteLinesToFile to CoreCodeGen so it actually
  creates the .g.cs output files (prevents compilation failure in fixture)
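
The extension-points fix described above (one property feeding both `Project=` and `Condition=`) might look roughly like this. It is a sketch only: the property name `_ParentDirectoryBuildProps` is hypothetical, and the forward-slash `../` form is used here to sidestep the trailing-backslash ambiguity rather than because the skill necessarily does so.

```xml
<!-- src/Directory.Build.props -->
<Project>
  <PropertyGroup>
    <!-- Compute the parent path once (hypothetical property name) -->
    <_ParentDirectoryBuildProps>$([MSBuild]::GetPathOfFileAbove('Directory.Build.props', '$(MSBuildThisFileDirectory)../'))</_ParentDirectoryBuildProps>
  </PropertyGroup>

  <!-- Project= and Condition= read the same value, so they cannot drift apart -->
  <Import Project="$(_ParentDirectoryBuildProps)"
          Condition="'$(_ParentDirectoryBuildProps)' != '' and Exists('$(_ParentDirectoryBuildProps)')" />
</Project>
```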

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 21, 2026 16:00
auto-merge was automatically disabled May 21, 2026 16:00

Head branch was pushed to by a user without write access

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 46 out of 46 changed files in this pull request and generated 2 comments.

Comment thread tests/dotnet-msbuild/property-patterns/eval.yaml
Comment thread tests/dotnet-msbuild/property-patterns/eval.yaml
Scenario 1 (path merge bug):
- PropertyPatterns.csproj: add OutputPath using CustomOutputDir without a
  trailing separator, making the merge bug observable
  (artifacts\binPropertyPatterns\ instead of artifacts\bin\PropertyPatterns\)
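
To make the Scenario 1 merge bug concrete, here is a hedged sketch of the failure and one possible fix. The `artifacts\bin` value is assumed for illustration; the fixture's actual property values are not shown in this PR conversation.

```xml
<Project>
  <PropertyGroup>
    <!-- Assumed value for illustration: note there is no trailing separator -->
    <CustomOutputDir>artifacts\bin</CustomOutputDir>

    <!-- Buggy form: plain concatenation glues the two segments together,
         producing artifacts\binPropertyPatterns\ for PropertyPatterns.csproj -->
    <!-- <OutputPath>$(CustomOutputDir)$(MSBuildProjectName)\</OutputPath> -->

    <!-- One possible fix: normalize the directory before composing paths from it -->
    <CustomOutputDir>$([MSBuild]::EnsureTrailingSlash('$(CustomOutputDir)'))</CustomOutputDir>
    <OutputPath>$(CustomOutputDir)$(MSBuildProjectName)\</OutputPath>
  </PropertyGroup>
</Project>
```

Normalizing at the point where the directory property is defined means every later consumer can concatenate safely, instead of each one having to remember the separator.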

Scenario 2 (multi-level hierarchy bugs):
- hard/Directory.Build.props: make LangVersion unconditional so it
  overwrites the child src/Directory.Build.props LangVersion=preview,
  matching the rubric claim about parent unconditional assignments
- hard/Directory.Build.props: add NoWarn=NU1702 (conditional default)
  so parent suppressions are visible but lost when the child's
  unconditional <NoWarn>CS1591;IDE0005</NoWarn> overwrites them after import
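
For contrast with the Scenario 2 anti-patterns, a corrected parent/child pair could be sketched as follows. This is an assumption about what a fix might look like, not the fixture's contents: the `latest` value is hypothetical, and the import statement in the child file is omitted for brevity.

```xml
<!-- Directory.Build.props (parent), hypothetical corrected version -->
<Project>
  <PropertyGroup>
    <!-- Conditional defaults: a value set by the child before the import survives -->
    <LangVersion Condition="'$(LangVersion)' == ''">latest</LangVersion>
    <NoWarn Condition="'$(NoWarn)' == ''">NU1702</NoWarn>
  </PropertyGroup>
</Project>
```

```xml
<!-- src/Directory.Build.props (child), hypothetical corrected version -->
<Project>
  <PropertyGroup>
    <LangVersion>preview</LangVersion>
  </PropertyGroup>

  <!-- The parent Directory.Build.props is imported at this point -->

  <PropertyGroup>
    <!-- Append to the parent's suppressions instead of overwriting them -->
    <NoWarn>$(NoWarn);CS1591;IDE0005</NoWarn>
  </PropertyGroup>
</Project>
```

With this arrangement the child's `LangVersion=preview` is kept, and `NU1702` from the parent remains in `NoWarn` alongside the child's additions.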

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@JanKrivanek
Member

/evaluate

@JanKrivanek JanKrivanek merged commit 5f11701 into dotnet:main May 22, 2026
37 checks passed
github-actions Bot added a commit that referenced this pull request May 22, 2026
github-actions Bot added a commit that referenced this pull request May 22, 2026
@github-actions
Contributor

Skill Validation Results

| Skill | Scenario | Quality | Skills Loaded | Overfit | Verdict |
|---|---|---|---|---|---|
| item-management | Diagnose item group and batching issues | 5.0/5 → 5.0/5 | ✅ item-management; tools: skill / ⚠️ NOT ACTIVATED | ✅ 0.14 | [1] |
| item-management | Diagnose cascading item and batching bugs in code generation pipeline | 4.7/5 → 5.0/5 🟢 | ✅ item-management; tools: skill, edit | ✅ 0.14 | [2] |
| item-management | Fix item management anti-patterns | 3.0/5 → 4.7/5 🟢 | ✅ item-management; tools: skill, glob / ⚠️ NOT ACTIVATED | ✅ 0.14 | [3] |
| extension-points | Diagnose build extension point failures | 3.3/5 → 4.3/5 🟢 | ✅ extension-points; tools: skill | ✅ 0.10 | [4] |
| extension-points | Diagnose NuGet package and repo extension conflicts | 3.0/5 → 3.0/5 | ✅ extension-points; tools: skill, edit, glob / ✅ extension-points; tools: skill, edit | ✅ 0.10 | [5] |
| extension-points | Fix extension point anti-patterns | 4.3/5 → 5.0/5 🟢 | ✅ extension-points; tools: skill | ✅ 0.10 | [6] |
| property-patterns | Diagnose shared build property issues | 5.0/5 → 5.0/5 | ✅ property-patterns; tools: skill / ✅ property-patterns; tools: glob, skill | 🟡 0.24 | [7] |
| property-patterns | Diagnose multi-level property hierarchy bugs | 4.0/5 → 4.3/5 🟢 | ✅ property-patterns; tools: skill / ✅ property-patterns; tools: skill, bash | 🟡 0.24 | [8] |
| property-patterns | Fix shared property configuration | 4.7/5 → 5.0/5 🟢 | ✅ property-patterns; tools: skill / ⚠️ NOT ACTIVATED | 🟡 0.24 | [9] |
| target-authoring | Diagnose custom target build regression | 3.0/5 → 5.0/5 🟢 | ✅ target-authoring; tools: skill, bash, glob / ✅ target-authoring; tools: glob, skill | 🟡 0.23 | [10] |
| target-authoring | Diagnose broken SDK target chain across files | 2.7/5 → 3.0/5 🟢 | ✅ target-authoring; tools: skill / ✅ target-authoring; tools: skill, glob | 🟡 0.23 | [11] |
| target-authoring | Fix custom target anti-patterns | 4.7/5 → 4.0/5 🔴 | ✅ target-authoring; tools: skill, bash / ⚠️ NOT ACTIVATED | 🟡 0.23 | [12] |

[1] ⚠️ High run-to-run variance (CV=1985%) — consider re-running with --runs 5
[2] ⚠️ High run-to-run variance (CV=149%) — consider re-running with --runs 5. (Isolated) Quality improved but weighted score is -27.8% due to: quality, judgment, tokens (90063 → 144510), tool calls (10 → 14)
[3] ⚠️ High run-to-run variance (CV=77%) — consider re-running with --runs 5
[4] ⚠️ High run-to-run variance (CV=92%) — consider re-running with --runs 5
[5] ⚠️ High run-to-run variance (CV=770%) — consider re-running with --runs 5
[6] ⚠️ High run-to-run variance (CV=273%) — consider re-running with --runs 5. (Plugin) Quality improved but weighted score is -2.0% due to: tokens (98488 → 128490)
[7] ⚠️ High run-to-run variance (CV=99%) — consider re-running with --runs 5. (Plugin) Quality unchanged but weighted score is -2.5% due to: tokens (121252 → 171366), time (71.1s → 95.0s)
[8] ⚠️ High run-to-run variance (CV=326%) — consider re-running with --runs 5. (Isolated) Quality improved but weighted score is -1.4% due to: tool calls (12 → 16)
[9] ⚠️ High run-to-run variance (CV=967%) — consider re-running with --runs 5
[10] ⚠️ High run-to-run variance (CV=63%) — consider re-running with --runs 5
[11] (Isolated) Quality improved but weighted score is -3.6% due to: tokens (69056 → 105426), time (53.2s → 69.2s)
[12] ⚠️ High run-to-run variance (CV=57%) — consider re-running with --runs 5

Model: claude-opus-4.6 | Judge: claude-opus-4.6

🔍 Full Results - additional metrics and failure investigation steps

▶ Sessions Visualisation -- interactive replay of all evaluation sessions

Development

Successfully merging this pull request may close these issues.

[dotnet-msbuild] Add skills for custom target authoring, property patterns, item management, and extension points

6 participants