doc: add AI/LLM-assisted contributions policy #14223
RonnyPfannschmidt wants to merge 3 commits into pytest-dev:main from
Conversation
Add a policy section to CONTRIBUTING.rst requiring disclosure of AI tool usage, rejecting purely agentic contributions, and outlining consequences (public ban) for non-disclosure or abusive AI-generated PRs. Includes a Context section explaining the rationale: unsupervised agentic tools waste maintainer time and demonstrate disrespect for human reviewers.

Co-authored-by: Cursor AI <ai@cursor.sh>
Co-authored-by: Anthropic Claude <claude@anthropic.com>
Pull request overview
This pull request adds an AI/LLM-Assisted Contributions Policy to the pytest project's CONTRIBUTING.rst file. The policy aims to address concerns about low-quality, AI-generated contributions by establishing clear disclosure requirements and boundaries for AI tool usage. The change includes both the policy itself and an extensive Context subsection explaining the rationale behind it.
Changes:
- Adds a new "AI/LLM-Assisted Contributions Policy" section with mandatory disclosure requirements for AI tool usage
- Establishes that purely agentic (unsupervised AI) contributions will result in bans
- Includes a Context subsection explaining the motivation and reasoning behind the policy
CONTRIBUTING.rst
Outdated
    where an agent goes on a rampage of low-quality pull requests.
    Oftentimes this looks like a human beginner with fresh agent access trying to learn,
    but in reality it's just an unsupervised agentic tool wasting human time for bad-faith contributions.

    With a human we would be more than happy to help and guide them to the right path.
    With an agent we are losing human time we never get back.
    The promise of AI was to free up human time to focus on more important things like family and friends.

    Fully agentic contributions turn that around — there is no human learning or growing behind those bots,
    only soulless, semi-functional prompt adherence.

    We classify that as evil. Badly guardrailed unsupervised tools that run rampant in open-source projects are not smart;
    they are the combination of bad laziness and lack of critical thinking that costs everyone else.

    Part of this originates from us having access to coding agents and being painfully aware of the need to correctly prompt and supervise them.
    Even modern frontier models repeatedly make grave mistakes when working at framework/tooling level.

    Anyone running those unsupervised and unguarded on open-source projects is demonstrating complete disrespect for actual humans.
The Context subsection, while providing rationale, significantly deviates in tone from the rest of the document. The CONTRIBUTING.rst file consistently maintains a welcoming, professional, and instructional tone. Consider either moving this context to a separate document (e.g., a blog post or design decision document) or substantially revising it to match the professional tone of the rest of the document while still conveying the policy rationale.
Original:

    where an agent goes on a rampage of low-quality pull requests.
    Oftentimes this looks like a human beginner with fresh agent access trying to learn,
    but in reality it's just an unsupervised agentic tool wasting human time for bad-faith contributions.
    With a human we would be more than happy to help and guide them to the right path.
    With an agent we are losing human time we never get back.
    The promise of AI was to free up human time to focus on more important things like family and friends.
    Fully agentic contributions turn that around — there is no human learning or growing behind those bots,
    only soulless, semi-functional prompt adherence.
    We classify that as evil. Badly guardrailed unsupervised tools that run rampant in open-source projects are not smart;
    they are the combination of bad laziness and lack of critical thinking that costs everyone else.
    Part of this originates from us having access to coding agents and being painfully aware of the need to correctly prompt and supervise them.
    Even modern frontier models repeatedly make grave mistakes when working at framework/tooling level.
    Anyone running those unsupervised and unguarded on open-source projects is demonstrating complete disrespect for actual humans.

Suggested:

    where an agent produces a large number of low-quality pull requests.
    Oftentimes this can look similar to a human beginner with new access to tools and trying to learn,
    but in practice it is usually an unsupervised agentic tool generating changes without meaningful human review.
    When a human contributor is learning, we are glad to invest time to help, give feedback, and guide them in the right direction.
    With fully agentic, unsupervised tools, that same review effort does not support anyone's learning or growth.
    Instead, it diverts limited maintainer time away from improving the project and supporting engaged contributors.
    Fully agentic contributions invert the intended benefit of these tools: rather than saving time, they create avoidable review and triage work.
    There is no accountable human author thoughtfully iterating on feedback, only automated output driven by prompts.
    From our own experience using coding agents, we know they must be carefully prompted, supervised, and checked by humans.
    Even modern models can make serious mistakes when operating at framework or tooling level, and those mistakes can be subtle and time-consuming to diagnose.
    Running such tools unsupervised on open-source projects imposes this cost on maintainers and other contributors without their consent.
    Our goal with this policy is to set clear expectations, protect reviewer time, and ensure that contributions remain collaborative, respectful, and sustainable.
I agree with this comment from Copilot: the document starts respectful and professional, which I believe is the stance that the pytest project should take on the matter. Of course our opinions are our own, but this document should reflect the professional/respectful stance of the project as a whole.
I really like the suggestion provided by Copilot! I ask to consider using it directly in this document.
CONTRIBUTING.rst
Outdated
    With the advent of unsupervised agentic tools like OpenClaw, there has been a rise in low-quality contributions
    where an agent goes on a rampage of low-quality pull requests.
    Oftentimes this looks like a human beginner with fresh agent access trying to learn,
    but in reality it's just an unsupervised agentic tool wasting human time for bad-faith contributions.
The phrase "bad-faith contributions" may be too strong, as it implies intentional malice. Many contributors using unsupervised AI tools may not realize the impact rather than acting in bad faith. Consider rephrasing to "low-quality or inappropriate contributions" to focus on the outcome rather than attributing intent.
Original:

    but in reality it's just an unsupervised agentic tool wasting human time for bad-faith contributions.

Suggested:

    but in reality it's just an unsupervised agentic tool wasting human time by generating low-quality or inappropriate contributions.
CONTRIBUTING.rst
Outdated
    Context
    ~~~~~~~

    With the advent of unsupervised agentic tools like OpenClaw, there has been a rise in low-quality contributions
The reference to "OpenClaw" appears to be a specific tool name. If this is meant to be a general term or if the tool name is incorrect, this should be clarified. If it's a real tool, consider whether naming specific tools in the Context subsection could become outdated or be perceived as targeting specific products.
Original:

    With the advent of unsupervised agentic tools like OpenClaw, there has been a rise in low-quality contributions

Suggested:

    With the advent of unsupervised agentic tools, there has been a rise in low-quality contributions
Disagree, using OpenClaw as an example is useful.
CONTRIBUTING.rst
Outdated
    The promise of AI was to free up human time to focus on more important things like family and friends.

    Fully agentic contributions turn that around — there is no human learning or growing behind those bots,
    only soulless, semi-functional prompt adherence.
Also, there's an asymmetry at play: someone is prioritizing what we review without making an equivalent time or money investment. When a contributor thinks an issue is important enough to work on and does so, they effectively give their time to influence the prioritization of the project. Most GitHub Sponsors paid tiers include a "prioritize this issue" perk, generally an expensive tier intended for companies. These agentic contributions expect to prioritize what maintainers work on at near-zero cost to the sender.
I'm thinking of moving the context part to a blog post, as the language there is intentionally a bit more hostile, and it may be misplaced in docs that are intentionally more welcoming.
CONTRIBUTING.rst
Outdated
    With the advent of unsupervised agentic tools like OpenClaw, there has been a rise in low-quality contributions
    where an agent goes on a rampage of low-quality pull requests.
    Oftentimes this looks like a human beginner with fresh agent access trying to learn,
    but in reality it's just an unsupervised agentic tool wasting human time for bad-faith contributions.
I wouldn't say they are inherently made in bad faith; we don't need that harsh language, IMHO. Just stating the consequence (wasting human time) is good enough.
Original:

    but in reality it's just an unsupervised agentic tool wasting human time for bad-faith contributions.

Suggested:

    but in reality it's just an unsupervised agentic tool wasting human time.
Given the way the models go off the rails so easily, I consider it important to classify letting them run wild as bad faith.
But I don't think (and I might be optimistic here) that the humans behind them really intend that; that's why I don't think "bad faith" fits here.
CONTRIBUTING.rst
Outdated
    Oftentimes this looks like a human beginner with fresh agent access trying to learn,
    but in reality it's just an unsupervised agentic tool wasting human time for bad-faith contributions.

    With a human we would be more than happy to help and guide them to the right path.
Original:

    With a human we would be more than happy to help and guide them to the right path.

Suggested:

    Were the contribution made by an actual human, we would be more than happy to help and guide them to the right path.
CONTRIBUTING.rst
Outdated
    Fully agentic contributions turn that around — there is no human learning or growing behind those bots,
    only soulless, semi-functional prompt adherence.

    We classify that as evil. Badly guardrailed unsupervised tools that run rampant in open-source projects are not smart;
I don't think we need to use this harsh language... evil can have many other connotations.
Perhaps:
Original:

    We classify that as evil. Badly guardrailed unsupervised tools that run rampant in open-source projects are not smart;

Suggested:

    We classify that as wasteful. Badly guardrailed unsupervised tools that run rampant in open-source projects are not smart;
CONTRIBUTING.rst
Outdated
    only soulless, semi-functional prompt adherence.

    We classify that as evil. Badly guardrailed unsupervised tools that run rampant in open-source projects are not smart;
    they are the combination of bad laziness and lack of critical thinking that costs everyone else.
Original:

    they are the combination of bad laziness and lack of critical thinking that costs everyone else.

Suggested:

    they are the combination of laziness and lack of critical thinking that costs everyone else.
We should also update https://github.com/pytest-dev/pytest/blob/main/.github/PULL_REQUEST_TEMPLATE.md with a clear note about AI disclosure and forbidding unsupervised agents.

I left a comment on that section: I think the suggestion made by Copilot is on point and should be in the same document, as context is important to convey our intentions clearly.
Address PR feedback:
- "evil" → "wasteful" (nicoddemus)
- "bad laziness" → "laziness" (nicoddemus)
- "With a human" → "Were the contribution made by an actual human" (nicoddemus)
- Em dashes → commas for document consistency
- Break long lines for source readability
- Incorporate asymmetry argument from Pierre-Sassoulas: agentic PRs prioritize reviewer attention at near-zero cost to the sender

Add AI/LLM disclosure checkboxes to the PR template with a link to the full policy in CONTRIBUTING.rst.

Co-authored-by: Cursor AI <ai@cursor.sh>
Co-authored-by: Anthropic Claude <claude@anthropic.com>
I agree; my personal frustrations need to be separated from this text, which targets contributors. Personal rants have a different place. The amount of AI slop I see these days creates the strange desire to climb up a mountain and scream at the night sky.
I think this looks good, thanks @RonnyPfannschmidt!
Pierre-Sassoulas
left a comment
Look great, just a small suggestion
CONTRIBUTING.rst
Outdated
Original:

    but in reality it's just an unsupervised agentic tool wasting human time
    for bad-faith contributions.

Suggested:

    but in reality it's just an unsupervised agentic tool wasting maintainer time.
Let's assume good faith? (I suppose the intent is still to help, although I have already encountered AI researchers who were just trying out their creation in the wild using my time; those people were not deterred by common decency and are not going to be deterred by CONTRIBUTING.rst.)
cc @sirosen ^
webknjaz
left a comment
I shared this on Discord some time ago but in case not everyone saw, here's some more discussions/materials/tools:
.github/PULL_REQUEST_TEMPLATE.md
Outdated
    **If you used AI/LLM tools** (e.g. GitHub Copilot, ChatGPT, Claude, or similar) to help with this PR, you **must** disclose it below. State which tools were used and to what extent. Purely agentic contributions are not accepted. See our [AI/LLM-Assisted Contributions Policy](https://github.com/pytest-dev/pytest/blob/main/CONTRIBUTING.rst#aillm-assisted-contributions-policy).

    - [ ] This PR was made **without** AI/LLM assistance.
    - [ ] This PR used AI/LLM assistance (describe tools and extent below).
It'd be useful to get the prompts too.
Based on how I observe agent tool usage and refinement, that would be a mess.
The original prompt would be very useful to have, indeed. Often more useful than the result, tbh. And it would show the actual amount of effort that went into it, which would be a great deterrent against low-effort AI-driven fixes.
This is not something you can rightfully expect contributors to record and does not match with the wide variety of ways people use modern tooling. It also conveys zero information.
These checkboxes can only be used to discriminate against contributions based on non-technical merits. Meaning people will naturally self-correct towards not disclosing because of the discriminatory attitude the existence of this text implies.
I say this out of compassion. This will alienate contributors by trying to claim that how a change was made and by whom is more important than the substance.
Focus on the respectful interactions, outcomes, and code quality needs. Knowing what text editor or IDE someone used is 100% irrelevant.
See the first paragraph from https://devguide.python.org/getting-started/generative-ai/ for example.
(From a geeky agentic model use learning perspective: While I enjoy seeing prompts and sessions along with the exact specific details of which model version was used via what, that is an entirely unreasonable ask of people. It is not something that belongs in any project's policy.)
This is not something you can rightfully expect contributors to record and does not match with the wide variety of ways people use modern tooling. It also conveys zero information.
I kind of agree with this; I almost never get something done using an AI coding tool with just one prompt; often I end up using several. Are contributors expected to post the entire conversation in their PR description? That is not practical, and probably brings almost no value, that I can see at least.

In the future, it is very likely that every contribution will have some AI involved, ranging from basic AI features (like smarter autocomplete) to full-blown coding agents. Requiring users to disclose this fact does not seem very useful, TBH.

The intent of this policy, from my POV, is to stop/prevent fully automated contributions, where at no point a human was involved prior to the PR being submitted. I have no problem with people using AI tools to contribute, and I don't think we need to require them to disclose using those tools, both because disclosure brings little value and because it is not practical.
Currently there's no good way to track prompts and intents. Sometimes I literally prompt by pasting an issue link into the chat; sometimes I write down 5-10 paragraphs of detail and run planning rounds with Q&A by the agent, and on top of that I sometimes do rebasing and squashing with the agent. I'm not currently aware of a way to properly trace all that in a sensible manner.

Also, sometimes I swear at the LLMs when they do something extra messy, and, as with a good old hammer or saw or screwdriver, I like that the tool keeps no memory of my language there.
This is a non-starter and will fail. It is blatant discrimination.
CONTRIBUTING.rst
Outdated
    **Consequences of non-disclosure.** Failure to disclose AI involvement, or submitting
    low-effort AI-generated pull requests that waste reviewer time, will result in a
    **public ban** from the project. Flooding the project with AI-generated PRs is abuse
    of the maintainers' volunteer time and will be treated as such.
Most of your Context section below, and the asymmetry point, makes total sense to me. Very real. I understand why you're proposing this PR in reaction to the current state of affairs, but I think you've swung too far as written and will not get the outcomes you seek.

But this section is just plain cruel. Yes, of course projects ban spammers; we all do that all the time. Always have, always will. And there are a whole lot more of them of late. Feels. But this text as written attempts to threaten contributors with public shaming for effectively not disclosing what editor they used. WTF.

Engaging in public shaming of people is never okay. Making a policy of it is even worse.

Again, just focus on the outcomes you want: you do not want low-quality PRs that cost far more maintainer volunteer time than contributor effort.

I'd remove the "Consequences", "public ban", and "disclosure" wording; those fly in the face of reason. Focus on contributions being respectful of maintainer time.
Why is asking for disclosure equivalent to shaming? Disclosure just helps both parties start from the same page instead of walking on eggshells. "Can I discuss their use of AI? Will it offend them if I suggest it was AI-driven, when it was not?"
It's like what educators do these days. To quote a high school teacher I know, "there's no other way."
(Of course, there are other ways, but I'm just mentioning that for at least some educators, it feels like their experience has driven them to this result.)
The text "will result in a **public ban** from the project" reads as a direct threat of public shaming.
Mandatory disclosure invites discrimination and implies that it is okay to do so. I know OSS contributors who intentionally avoid disclosing to instead focus on the task at hand in part due to the unwelcoming hostile attitudes of others.
I think the sticking point for me is being mandatory. I'd like projects to encourage and welcome voluntary disclosure.
Otherwise it suggests someone is actually going to be policing changes and authors for their editor choice, effectively, and trying to burn them at the stake instead of focusing on the proposed change and whether it is being carried out in a manner that is respectful of maintainer time and attention.
Thanks, the word "public" escaped my notice, and so I'm glad I asked (even though this was just a drive-by comment arising from a link from another python project, which may not be welcome -- sorry).
We've been asking for disclosure for about two months at Django and calling it required, mostly because we have enough trouble asking for compliance with ticking a box that just says "my commit message mentions the ticket number". It hadn't occurred to me yet that people might think we were being hostile. I can feed this back to folks. 💚
What kind of discrimination are we talking about here, exactly?
I iterated on the language; I believe it's now more in line with the spirit and intent. I'd appreciate another review iteration.
Address PR review feedback: remove "mandatory disclosure", "public ban", and "consequences" language. Replace the hard disclosure requirement with asking contributors to credit AI agents as co-authors via Co-authored-by trailers. Rewrite the Context section in a professional tone, removing personal frustrations while keeping the core arguments about maintainer time asymmetry and unsupervised agent risks.

Co-authored-by: Cursor AI <ai@cursor.sh>
Co-authored-by: Claude claude-4.6-opus-high-thinking <noreply@anthropic.com>
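The co-author credit this commit describes uses git's standard commit-trailer convention, the same mechanism GitHub uses to attribute co-authors. A minimal sketch of what that looks like in practice; the repository, names, and email addresses here are hypothetical examples, not taken from this PR:

```shell
# Sketch: crediting a tool (or person) via git's Co-authored-by trailer.
# All names, emails, and the throwaway repo are hypothetical examples.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git -c user.name="Example Dev" -c user.email="dev@example.com" \
    commit --allow-empty -q \
    -m "doc: clarify AI policy wording" \
    -m "Co-authored-by: Example Agent <agent@example.com>"
# The trailer sits on its own line at the end of the commit body,
# where tooling such as `git interpret-trailers` can parse it.
git log -1 --format=%B
```

Trailers must appear in the final paragraph of the commit message (hence the separate `-m`), which is why they survive squashes only if the squashed message keeps that last block intact.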
I want this to be merged as a merge commit so we keep the iteration of the wording, from something born of frustration, almost a rage post, to the new language that reflects and takes into account the important and mindful feedback from people who care. Thanks to everyone for calling out the language issues and barriers here; that's indispensable help in going from something that comes from raw frustration to something that can ship and sail.
Personally, the main thing I would want from an AI policy is that the issues, PR descriptions, and comments (or more generally, "communication") are written by humans. I simply do not want to talk with some machine. The code is actually sometimes decent, and getting better. (I note that this very PR's description is in violation of this...) As a bonus, I think that requiring human communication would much reduce the automated PRs (at least from honest users), since IMO their users are unlikely to create them if doing so requires any actual effort from them...
Summary
CONTRIBUTING.rst

Motivation
Unsupervised agentic tools have led to a rise in low-quality contributions that waste maintainer time. This policy makes expectations explicit and protects reviewers.
Made with Cursor