Large language models (LLMs) such as GPT-3 are increasingly being used to generate text. While many kinds of technology can generate text, Wikipedia has editorial standards, and many policies and guidelines exist to maintain those standards.

As such, the following policy applies to the use of large language models on Wikipedia:

Editors who use the output of large language models (LLMs) as an aid in their editing remain subject to the policies of Wikipedia, including WP:NOT, WP:NPOV, WP:C, WP:CIVIL, WP:V, and WP:RS. Using an LLM does not exempt an editor from any of these policies; they apply to everyone. LLM output should be used only by competent editors who do not indiscriminately paste it into the edit window and press "publish changes".

The amount of information on Wikipedia is practically unlimited, but Wikipedia does not aim to contain all knowledge. What to exclude is determined by an online community committed to building a high-quality encyclopedia. Consequently, if you are using LLMs to edit Wikipedia, you must do so in a manner that complies with Wikipedia:What Wikipedia is not.

Articles must not take sides, but should explain the sides, fairly and without editorial bias. This applies to both what you say and how you say it. Consequently, if you are using LLMs to edit Wikipedia, you must do so in a manner that complies with Wikipedia:Neutral point of view.

Readers must be able to check that any of the information within Wikipedia articles is not just made up. This means all material must be attributable to reliable, published sources. Additionally, quotations and any material challenged or likely to be challenged must be supported by inline citations. Consequently, if you are using LLMs to edit Wikipedia, you must do so in a manner that complies with Wikipedia:Verifiability.

Specific guidelines

LLMs use a variety of machine learning techniques to complete sequences of tokens (words or parts of words), replying to seed sequences (called "prompts") with their statistical best guess at what is most likely to come next. This behavior is very easy to exploit: if you prompt a model with "The following is a list of reasons why it is good to eat crushed glass", it will give you one. Consequently, LLM output should be used only by competent editors who do not blindly paste LLM output into the edit window and press "publish changes".
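The completion behavior described above can be observed directly. The following sketch is purely illustrative and assumes the open-source Hugging Face transformers library and the publicly available GPT-2 model, chosen here only for demonstration; it asks the model to continue a prompt, which it will do whether or not the premise is sensible.

```python
# Illustrative sketch only; assumes the Hugging Face "transformers" library
# and the publicly available GPT-2 model, chosen here purely as an example.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# The model simply continues the prompt with statistically likely tokens;
# it does not check whether the premise is sensible or true.
prompt = "The following is a list of reasons why it is good to eat crushed glass:"
completion = generator(prompt, max_new_tokens=40, do_sample=True)
print(completion[0]["generated_text"])
```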

The overwhelming majority of LLMs are trained on a large corpus of web text, rely exclusively on that internal knowledge when responding, and have no capability to "look things up", search the Web, or examine sources. Telling these models to "write an article" about something will typically produce large amounts of nonsense in the vague style of a Wikipedia article. Consequently, LLM output should be used only by competent editors who do not blindly paste LLM output into the edit window and press "publish changes".

Much like human beings, LLMs occasionally make errors or state things that are not true. This means that they should be used only by competent editors who do not blindly paste LLM output into the edit window and press "publish changes".

Asking them to do tasks for which they are not suited (that is, tasks requiring extensive knowledge or analysis of something that cannot be typed into the prompt window) makes these errors much more likely. This is why an LLM should be used only by competent editors who do not blindly paste LLM output into the edit window and press "publish changes".