This is a guide to spotting violations of the Wikipedia copyright policy that are simple copy-and-pastes from other websites. Please remember to assume good faith when doing the important work of keeping Wikipedia compliant with CC BY-SA and, where co-licensed, GFDL. Keep in mind, too, that apparently copied content is not always a copyright issue: for example, Wikipedia may have had the content first, or the content may be public domain or compatibly licensed.

Signs that an article might be copy-and-pasted

There are a number of signs that an article might be copy-and-pasted. None of these are conclusive evidence, but more than one of these signs tends to be apparent in a copy-and-pasted article.

Indicative, but by no means conclusive, signs:

Strong signs of copy and pasting:

Irrefutable evidence:

Checking it out

Once alerted by one or more of these suspicious signs, check the article by copying a sentence or non-trivial sentence fragment that is unlikely to appear by chance in many documents and pasting it into a search engine. Then check any matching pages for further correspondence with the submitted article. Be aware that many sites "mirror" content from Wikipedia, so a search engine may find several sites with the exact content. Those sites should list Wikipedia as the source of the article, but they do not always do so. If you suspect you have encountered a mirror, Wikipedia:Mirrors and forks can help you identify known mirror sites. The Wayback Machine can also help confirm copying, but it is less useful for ruling copying out: it does not store every site or every page within a site, and it may lag by six months or so even on pages that it does store. For extra thoroughness, you may also want to check the "groups" or "books" options in Google.
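The search step described above can be sketched in Python. This is only an illustration, not an official tool: the function names are hypothetical, "longest sentence of at least eight words" is an assumed stand-in for "unlikely to be found by chance", and the example assumes Google's ordinary search URL and the Internet Archive's public availability endpoint at archive.org/wayback/available. The functions only build URLs; opening them and judging the matches is still a manual task.

```python
import re
from urllib.parse import quote_plus, urlencode


def pick_distinctive_sentence(text: str, min_words: int = 8) -> str:
    """Return the longest sentence with at least min_words words, or '' if none.

    Assumption: a long sentence is less likely to match other pages by chance.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    candidates = [s for s in sentences if len(s.split()) >= min_words]
    return max(candidates, key=len, default="")


def exact_phrase_query_url(
    sentence: str, base: str = "https://www.google.com/search?q="
) -> str:
    """Build a search URL that looks for the sentence as an exact quoted phrase."""
    phrase = sentence.strip().rstrip(".!?")
    return base + quote_plus(f'"{phrase}"')


def wayback_availability_url(page_url: str, timestamp: str) -> str:
    """Build a Wayback Machine availability query for the snapshot closest to
    timestamp (YYYYMMDD). Assumed endpoint; no request is sent here."""
    query = urlencode({"url": page_url, "timestamp": timestamp})
    return "https://archive.org/wayback/available?" + query


if __name__ == "__main__":
    article = (
        "Short one. The quick brown fox jumps over the lazy dog "
        "near the old mill. Tiny."
    )
    sentence = pick_distinctive_sentence(article)
    print(exact_phrase_query_url(sentence))
    print(wayback_availability_url("example.com/page", "20100101"))
```

Remember that a hit is not proof of copying: a match may be a Wikipedia mirror, and a missing Wayback snapshot does not show that the page did not exist at that date.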

Images copied from other websites are often uploaded here under the same filename. If you suspect that an image is a copyright violation, try searching Google Images for its filename to see whether the same image appears on other websites. Even if the image was uploaded under a different name, a Google Images search for relevant terms may help you find the original. TinEye and other reverse image search engines can also be useful.

To find the date when suspected copyrighted text was inserted into an article, you can use the WikiBlame tool. There is a link to WikiBlame (as well as to an alternative tool) on the 'View History' tab of every article: look for the line beginning "External tools: Find addition/removal" towards the top of that page. The tool lets you determine when specified text was inserted and compare that date against the date of the other source (assuming one was given). Sometimes old article text turns out to have been taken from Wikipedia and used without attribution on more recent blogs or websites. Understanding who has copied from whom is extremely helpful and avoids the embarrassment of making flawed accusations of WP:COPYVIO against good-faith editors. Where currently active editors appear to be violating copyright, it is appropriate to warn them and to request WP:REVDEL of all the subsequent edits containing that text. These edits can sometimes span a number of years.
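The idea of dating an insertion and comparing it with the external source can be sketched as follows. This is not WikiBlame's actual implementation: the function names and the in-memory revision list are hypothetical, and the binary search assumes that once the text was inserted it stayed in every later revision. If the text was removed and re-added, a linear scan from the oldest revision would be needed instead.

```python
from datetime import date
from typing import Optional, Sequence, Tuple

# Hypothetical revision record: (ISO date "YYYY-MM-DD", revision text)
Revision = Tuple[str, str]


def first_revision_containing(
    revisions: Sequence[Revision], needle: str
) -> Optional[Revision]:
    """Return the earliest revision whose text contains needle, or None.

    revisions must be ordered oldest first. Assumes the text, once inserted,
    is present in every later revision.
    """
    lo, hi = 0, len(revisions)
    while lo < hi:
        mid = (lo + hi) // 2
        if needle in revisions[mid][1]:
            hi = mid  # present here, so the insertion is at mid or earlier
        else:
            lo = mid + 1  # absent here, so the insertion is later
    return revisions[lo] if lo < len(revisions) else None


def wikipedia_had_it_first(inserted_on: str, source_published_on: str) -> bool:
    """True if the text was in the article before the other source's date."""
    return date.fromisoformat(inserted_on) < date.fromisoformat(source_published_on)


if __name__ == "__main__":
    history = [
        ("2008-01-01", "Intro."),
        ("2009-03-14", "Intro. Copied passage here."),
        ("2010-06-01", "Intro. Copied passage here. More."),
    ]
    hit = first_revision_containing(history, "Copied passage here.")
    if hit is not None:
        print(hit[0], wikipedia_had_it_first(hit[0], "2011-02-01"))
```

A date comparison like this is only as reliable as the publication date claimed by the other source, so treat the result as one piece of evidence rather than a verdict.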

If you suspect that a page is a copyright infringement

If you suspect a copyright infringement, you should at the very least raise the issue on that page's talk page. Others can then examine the situation and take action if needed. The most helpful piece of information you can provide is a URL or other reference to what you believe may be the source of the text. If the talk page is not watched, however, your note may go unseen. If you suspect this will be the case, please consider also using {{Copypaste}} or opening a section listing your concern at WT:CP.

Notes

  1. In very, very rare cases, official sites have been found to have updated to include Wikipedia's content after earlier versions had been used as sources for the Wikipedia page. If the Wayback Machine doesn't demonstrate that earlier versions of the suspected origin page differed from the current text, the content on Wikipedia should only be retained if there is strong evidence of natural evolution: that is, if content evolved gradually on Wikipedia over time and especially with the contribution of more than one person.

See also