Wikipedia talk:Plagiarism


This is an old revision of this page, as edited by Kmhkmh (talk | contribs) at 11:08, 4 November 2010 (→‎Difference: Copyright and Plagiarism Issues). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Simple and obvious

Rereading, I think this paragraph doesn't work:

Phrases that are the simplest and most obvious way to present information. Editors who claim that the phrasing at issue is plagiarism must show that there is an alternative phrasing that does not make the passage more difficult to read. If a proposed rephrasing may impair the clarity, or flow, of a paragraph, they must propose a rephrasing that avoids such side-effects, possibly by rephrasing content preceding and following the disputed passage, or even the whole paragraph. An objective measure of whether a proposed rephrasing makes the passage more difficult to read can be obtained by a readability tool such as Dispenser's Readability Analyser. However, issues about clarity and flow will have to be resolved by discussion.

It was a turbulent time when that was added, but I would imagine that the intent was to clarify that simple, non-creative content does not require attribution. That's basically what's implied by the first sentence, but the subsequent sentences seem to suggest that reading fluidity is the real issue here...and that doesn't make sense. Suppose a public domain source has an excellent description of the flight pattern of a Monarch butterfly and that rewriting it would dilute it. Does this eliminate the attribution requirement, when all that is needed to retain it is a note that it's copied verbatim? (If the passage is non-free, of course, it's even less useful, since copyright law doesn't care if you like the way the copyright holder puts it.)

I propose eliminating everything after the first sentence, replacing it with an expanded explanation that the reason this is not a plagiarism problem is because it lacks the degree of creativity requiring attribution. Maybe something like this:

Phrases that are the simplest and most obvious way to present information. Sentences such as "John Smith was born on 2 February, 1900" lack sufficient creativity to require attribution.

As it is currently written, anyone can claim that any non-free content they've imported requires no attribution if it has a better readability index. Again, all that's required if the content is free to permit its use is attribution. If it's non-free, readability is immaterial.

Thoughts? --Moonriddengirl (talk) 15:34, 19 July 2010 (UTC)[reply]

I agree. And I think your text version protects me from plagiarism accusations regarding the previous sentence. Hans Adler 21:05, 19 July 2010 (UTC)[reply]
I also agree (has that also protected me?). I also like Mrg's explanation of "simple, non-creative content does not require attribution", which perhaps could be included.
It does raise one issue though. When I have been involved in POV disputes where someone demands sources to justify a phrase, it can be extremely difficult not to use very similar wording to the original source, because if the wording differs, the person who wishes to emphasise a different POV will say that the wording does not accurately reflect the expert's point of view. For example take the phrases:
  • "the majority of Australian experts are considerably more circumspect" (Mark Levene p. 344, footnote 105).
  • however the majority of Australian experts are more circumspect (Wikipedia history wars --cites Levene and another source (for the rest of the sentence))
The words "majority of Australian experts" are mandated by WP:NPOV as accurately reflecting what was called "mass attribution" (e.g. "majority" is not "most" and not all "experts" are "historians"), and I would have liked to use a different word to "circumspect" but I could not find one that meant precisely the same thing (probably a lack of ability/imagination). If it were not for a POV dispute, I would have used a different word, but the phrase would have a slightly different meaning.
The question comes down to whether without turning a passage into a string of quotes, such phrases can be justified given the self imposed limitations we work under of accurately summarising the sources with no original research, and whether you (Mrg) intend your proposed wording to cover such instances. -- PBS (talk) 21:48, 19 July 2010 (UTC)[reply]
I doubt it. Occasionally, precision will demand direct (fair use) quotations. However, I think someone that repeatedly demands the exact words of a source takes Wikipedia too seriously, and should just go read the book. --Hroðulf (or Hrothulf) (Talk) 13:00, 20 July 2010 (UTC)[reply]
At the very least, they are misunderstanding WP:NOR.:) There are certainly occasionally sources so contentious that they require direct quotation to make sure they are accurately attributed and conveyed. I see that Lemkin has an in-sentence attribution. If not overused, this is generally taken in academic settings at least to authorize a slightly closer following of language. It also makes it easier to toss quotation marks around the odd "striking phrase". I would drop some quotation marks around that myself. In fact, I think I'll go have a shot at it. :) --Moonriddengirl (talk) 13:08, 20 July 2010 (UTC)[reply]
Oops! Levene is the source of the quote; not Lemkin. Hmmm. Well, still, the point stands, though I'm wrong in this instance. --Moonriddengirl (talk) 13:10, 20 July 2010 (UTC)[reply]

User notice template

Hi. I've created a notice template for unattributed copying of PD content, here. Wanted to let people know about it in case it can be improved. --Moonriddengirl (talk) 15:02, 11 September 2010 (UTC)[reply]

Where to place attribution templates

I note the section Wikipedia:Plagiarism#Where to place attribution added on July 20 2009. This advice seems to be in conflict with Wikipedia:MOS#Section headings which talks of "primary headings are then ==H2==, ===H3===, ====H4====, and so on up to ======H6======" and makes no mention of ad hoc emboldened section headings.

There does seem to be sense in using a section heading to regularise and highlight the attribution of text inserts into articles. Although hitherto I've placed attribution templates under an H2 ==Notes==, I could cope with placing it under an H3 ===Attribution===. Does anyone have thoughts on this?

I note that whatever advice is given here should probably be reflected in Wikipedia:Citing_sources, such that we establish a regular pattern for citations, references, notes, attribution, etc. --Tagishsimon (talk) 10:31, 13 September 2010 (UTC)[reply]

Why can you live with putting it under a section called notes and not at the end of a section called references? -- PBS (talk) 20:59, 13 September 2010 (UTC)[reply]
Attribution is not a section heading it is simply a bold line. We added it because the contributors to this talk page thought it was desirable to highlight the attribution. I considered suggesting that it should be a section header but rejected the idea for three reasons:
  • The first was that triggering a TOC on a stub for one extra line was a complication that most editors could do without.
  • The second was that I saw it as only a rearrangement of the lines already in the References section(s): usually {{1911}} or whatever were already in the list of general citations, and as such it was to highlight the inclusion of the attribution for the reader, which was one of the major plagiarism concerns of those involved in writing this guideline who were opposed to including any text from third party sources unless it was in quotes.
  • The third was that if Attribution became a section it would just add more confusion for editors over what the difference was between Notes, References and this other section called Attribution, it would also mean a rewrite of WP:CITE and WP:LAYOUT making the text in those guidelines more complicated than if Attribution remained just a bold line -- it is a relatively small number of pages which include unquoted text from third party sources.
I think the advantages of keeping it as a bold line (simplicity) outweigh the advantages of having it as a separate section header. -- PBS (talk) 20:59, 13 September 2010 (UTC)[reply]
We added it because the contributors to this talk page thought it was desirable to highlight the attribution. I've had a look through the archives of the talk pages of this guideline, but didn't find the discussion you allude to. Would you please point me to it. --Tagishsimon (talk) 16:17, 14 September 2010 (UTC)[reply]
The conversations from the time are in archive 5, but the two preceding archives are also relevant. -- PBS (talk) 23:10, 14 September 2010 (UTC)[reply]
I've searched Archive 5 thoroughly and found no discussions of the attribution header in it. Could you be any more specific about where these discussions are, please? --Tagishsimon (talk) 22:46, 15 September 2010 (UTC)[reply]
I myself just put the notices directly under the word "references". --Moonriddengirl (talk) 17:11, 14 September 2010 (UTC)[reply]
I also put it as the first thing in the Notes/References section (whichever one the article uses for its inline refs). I never actually noticed that this said to put it under its own fake heading. VernoWhitney (talk) 19:42, 14 September 2010 (UTC)[reply]
VernoWhitney, why would you want to put it above the {{reflist}} and not in the References section?
The reason I think that they are better at the bottom of the reference section is because sometimes there is more than one of them, and sometimes directly below the PD attribution is a list of the general references used in the PD article (and mixing up the references used by Wikipedia editors and secondary references used by the PD sources does not keep the clarity we need for "say where you got it").
The second reason is a technical one. Most of the attribution templates do not contain the code to work with {{sfn}} and {{harvnb}}, so if an article uses short citations there needs to be a standard {{cite book}} or whatever with the "ref=harv" parameter set (it is set by default in the {{citation}} template). The "mark 2" attribution templates I have been coding for {{1911}}, {{catholic}}, and {{DNB}} all have the "ref=harv" parameter set, but they are the exception to the rule. This means that for most articles, such as Western Allied invasion of Germany, the citation has to be given twice, and given that, I think it looks better if the attribution goes at the bottom. -- PBS (talk) 23:10, 14 September 2010 (UTC)[reply]
Most of the articles I add attribution to are freshly created and don't have separate Notes/References sections. I guess my logic is that the source for all of that material really is wherever you're copying it from, and so it should go in with all of the other footnotes which show directly where you got the info from. I just put it at the top of that section so people would be more likely to read it instead of skimming it along with all of the other footnotes. An attribution header would work just as well, but as I said, I just skimmed that part of the article before so I didn't know there was supposed to be one. VernoWhitney (talk) 01:56, 15 September 2010 (UTC)[reply]
I am at the moment working through the DNB articles putting in a "wstitle=" parameter, and as it happens, just after I asked this question of you I came across a few articles laid out the way you describe. Here are a couple, Henry Newcome and George Henry Harlow, and I think they look OK, but I think that William Kiffin, which has a mix, is a mess and will become even more so as other sources are cited as references. I have altered Charles Lucas (politician) but have left the note at the top, and also altered Richard Royston; the latter shows how an article like William Kiffin can be modified to use an "attribution" line along with the {{harvnb}} template and can take any number of other references. However these are the exception, and much more typical are articles like John Gayer and Rowland Searchfield. PBS (talk) 02:44, 15 September 2010 (UTC)[reply]
Sad to say I'm appalled by the (to me) new footnote style you're imposing on DNB articles, PBS. Henry Acton, for example, now conflates together the reference and the attribution, when previously there was a clear division between these two. The danger of going down the Henry Acton route in an article with multiple references is that attribution loses all prominence because it is buried in a mass of other references. My expectation is that we publish, in an article which has need to attribute a PD insert, a clear and self standing attribution statement, one that is entirely separate from a citation reference. (And I also gently point you to my reply above, where I'm still searching for the prior discussion, still not finding it in archive 5.) --Tagishsimon (talk) 23:10, 15 September 2010 (UTC)[reply]
If there is only one reference then it does not really matter if there is an attribution line or not, as no one who looks at the references section (such as in the article Henry Acton) is going to fail to notice it; it does no real harm to add it. In a more developed article, where there may be dozens of references, putting the references that include a "copying PD" line at the bottom of the reference section is in my opinion quite an elegant solution; take for example the article George Monck, 1st Duke of Albemarle -- PBS (talk) 09:34, 18 September 2010 (UTC)[reply]
It didn't used to be there, Verno. :) It used to just say "Attribution for compatibly licensed and public domain text is generally provided through the use of an appropriate attribution template, or similar annotation, placed in a "References section" near the bottom of the page." I don't remember noticing the change; I've gone right along with what I believed it said. Tagishsimon, I suspect he may be referring to this comment. Not a lengthy discussion, but the question of prominence was raised and nobody seems to have objected to his bold edit at the time. We can talk about changing the current recommendations, though, if you think them inappropriate. I myself think what it used to say was fine. It allowed leeway for getting "fancy" if need be and for the simpler situations I usually encounter. --Moonriddengirl (talk) 23:40, 15 September 2010 (UTC)[reply]
I'm at the point where I quite like PBS' suggestion of an Attribution header, other than that per WP:MOS, as I said at the top, it should in my view be in the form of an H2 or H3 header, not merely ad hoc emboldening. I think it important that we signal in no uncertain terms inserts of PD text, and the suggested header reinforces the template. I don't think that mere whim should override WP:MOS (which gets many more eyeballs than this page and the guidelines in which are, thus, arguably given with more force, or at least much broader consensus than we represent). And the same argument for prominence and clarity in attribution is (part of) what is distressing me about the approach taken in Henry Acton (and by now many other PBS touched articles), that references and attribution are being conflated. (I recognise that questions of prominence are value judgements and that YMMV applies.) So, what are we discussing?
  • Whether the recommendation for an Attribution header should remain
  • Whether, if it does, it should be emboldened by hand, or an H2 or an H3
  • Whether attribution templates can also serve as / be conflated with references, as in Henry Acton
  • And, for completeness, since there's discussion in other places [1] [2], whether we want attribution templates such as {{DNB Cite}} and {{1911}} to be adorned with logos such as a PD logo or a wikisource logo.
--Tagishsimon (talk) 00:49, 16 September 2010 (UTC)[reply]
Tagishsimon, using the term "decorative" is a cigarette punch (the Kray twins discovered, rather annoyingly, that people expected to be hit by them, so they used to offer a victim a cigarette before hitting them, because if a victim had their jaw open when they punched them it tended to result in a broken jaw). Using the term decorative in this context implies that it serves no purpose, but in the article Thomas More, for example, the Wikisource icon serves exactly the purpose it is supposed to serve in showing that there is a Catholic Encyclopedia article on Wikisource relating to Tom More.
Tagishsimon, you wrote "I quite like PBS' suggestion of an Attribution header". I have not suggested that it should be a header (see what I wrote above). In my opinion your previous usage of "notes" and "references" section headings and the content you have placed in those sections shows that you were confused by the guidance in WP:CITE. Having been through 1,000 articles in the last couple of weeks (changing unnamed parameters to "wstitle="), I have found that the majority of editors treat {{catholic}}, {{1911}} and {{DNB}} like any other reference, so your question "Whether attribution templates can also serve as / be conflated with references" is answered by usage: yes they can (and I can only conclude that you ask this question because of the rather unique way you have interpreted the usage of "notes" and "references" section headings) -- PBS (talk) 10:05, 18 September 2010 (UTC)[reply]

BTW for anyone who is interested: all three templates {{1911}}, {{Catholic}} and {{DNB}} have a flag called "inline=1" which allows them to be used like the {{citation-attribution}} "To aid with attribution at the end of a few sentences ..." e.g. {{1911|inline=1|wstitle=A}} returns:

  •  One or more of the preceding sentences incorporates text from a publication now in the public domain: Chisholm, Hugh, ed. (1911). "A". Encyclopædia Britannica (11th ed.). Cambridge University Press.

-- PBS (talk) 10:21, 18 September 2010 (UTC)[reply]

Some of this discussion goes a bit beyond my areas of participation. :) I'm not very visual, as my userpage attests. Here are a few examples of one of the PD templates I created (maybe the only one? I lose track) in use: Omaha Race Riot of 1919#References; Limos (mythology)#References; William Zouche#Notes. IMO, it stands out clearly enough from other references to draw attention to itself so that readers who check sources should not miss it. As far as I'm concerned, the primary purpose of this guideline is to get the attribution on the article in some clear way, and any way that does that works for me. I don't think we need to make it too elaborate. Sometimes, it is the only source we have, or one of the few, and I worry about overwhelming a stub article with sections. Take Cavum vaginale, for instance. Would it be improved by the addition of a subheader or bolding for attribution? I don't think so. I think we should avoid being overly directive here. --Moonriddengirl (talk) 13:15, 19 September 2010 (UTC)[reply]

In-text attribution

Hi Moonriddengirl, what's your objection to this? It's standard practice per V to use in-text attribution without quotation marks. SlimVirgin talk|contribs 15:39, 8 October 2010 (UTC)[reply]

Consensus in the development of this guideline was that copied content should be clearly denoted as copied. The section in which you've added the note says, "You can avoid any dispute concerning potential plagiarism by". I don't believe people can "avoid any dispute" by in-line attribution if the content they are copying is complex or lengthy. Certainly, not everybody is going to regard it as plagiarism, but I believe that it's not the failsafe the guideline would suggest it to be. --Moonriddengirl (talk) 15:46, 8 October 2010 (UTC)[reply]
The guideline can't contradict other policies, guidelines, and standard practice, MRG (e.g. see WP:QUOTEFARM). It's normal practice on Wikipedia and every other kind of publication to attribute without quotation marks, because no one wants their articles to become a list of quotes. Writing "Moonriddengirl said she loved it" is just as appropriate as "Moonriddengirl said, 'I love it!'". Quotation marks are for words that we want to draw attention to for some reason, perhaps because they are very distinctive, or legally or politically important.
You are right that copyright violations can't be avoided with in-text attribution alone, but they can't be avoided with quotation marks either. But plagiarism is avoided by clearly attributing the source next to the text you are citing; quotation marks are irrelevant in that sense. SlimVirgin talk|contribs
Writing "Moonriddengirl said she loved it" is a proper paraphrase. The content "I loved it" is short and almost entirely devoid of creativity. The attribution in that case is sufficient. Writing:

Moonriddengirl said the US government utilizes a "substantial similarity" test intended to determine if infringement exists and that Melville Nimmer produced subcategories of "substantial similarity" for which the courts search.

is a different matter. (That's copied from my userpage, I note, to avoid self-plagiarism. ;)) Quotation marks are for words that are copied. This is a guideline; WP:QUOTEFARM is an essay. Nevertheless, it agrees that "Quotations must always be clearly indicated as being quotations." Inline attribution is insufficient if creative language is retained. --Moonriddengirl (talk) 16:08, 8 October 2010 (UTC)[reply]
You just keep saying that, but without a source. Please provide a reliable source. It's clearly fine a great deal of the time, on Wikipedia and in every other form of publication, to add reported speech without quotation marks so long as you make clear who said it. SlimVirgin talk|contribs 17:19, 8 October 2010 (UTC)[reply]
University of Pittsburgh, section D: "Copying Distinctive Words or Phrases without Proper Attribution Is Plagiarism. The following use of Bettelheim’s “debilitated state of the ego” is plagiarism, even though the ideas are attributed to him, because there are no quotation marks around this distinctive phrase..." (I'll leave you to read the rest of the example yourself.) --Moonriddengirl (talk) 17:27, 8 October 2010 (UTC)[reply]
But it doesn't say you need quotation marks. You just need to make clear that the phrase is not your own, and that can easily be done with the writing, with the way you attribute in-text. It's just wrong to say that quotation marks are the only way to do that, and that webpage doesn't support your argument that it is. You're recommending a very poor writing practice. SlimVirgin talk|contribs 17:56, 8 October 2010 (UTC)[reply]
"The following use of Bettelheim’s “debilitated state of the ego” is plagiarism, even though the ideas are attributed to him, because there are no quotation marks around this distinctive phrase..." (emphasis added). --Moonriddengirl (talk) 18:17, 8 October 2010 (UTC)[reply]
A few more for you: [3]: "*Note that in the paraphrase, a very brief quotation is used. When you paraphrase, you cannot conveniently borrow the direct language of your source, however brief, without using quotation marks"; [4]: "Be sure you have enclosed the exact words of the source with quotation marks or you have set off longer quotes with indention. You are committing plagiarism if you neglect this critical punctuation, demarcating words you have gathered directly from sources"; [5]: "Use quotation marks to identify any unique term or phraseology you have borrowed exactly from the source"; [6]: "Additionally, paraphrasing is plagiarism where you fail to cite your original source and, in some cases, where you fail to use quotation marks as well.... even where the original source has been cited, plagiarism occurs where you fail to use quotation marks around words or phrases that show the author’s distinct and original thought or expression.... While determining whether certain words warrant quotation marks might seem dependent on the reader, you should always use quotation marks when in doubt." --Moonriddengirl (talk) 18:32, 8 October 2010 (UTC)[reply]

I think there is a misunderstanding going on. I guess SlimVirgin is probably thinking of quotations of entire paragraphs, using appropriate means for marking them as quotations. If there are no quotation marks this includes at the minimum starting a new paragraph and saying where it is from, but typically that new paragraph is also indented and/or in italics. I think Moonriddengirl is probably thinking of the recent big copyvio case involving an established editor and regular contributor to policy discussions, who thought copy/paste (without quotation marks or other markup) is the only way to avoid original research, and acted accordingly for years. Hans Adler 16:49, 8 October 2010 (UTC)[reply]

Yes, you're right; that's the kind of thing I am thinking about. We see a lot of that at WP:CP and WP:CCI, and I wouldn't want to inadvertently confuse our contributors into thinking that quotation marks aren't necessary. Conversations about plagiarism on Wikipedia (never mind copyvio) can get really ugly. :/ Clarity matters. When it comes to non-free content, there's already this in that section: "properly attributing any public-domain, or free-content text, that you place directly into an article." --Moonriddengirl (talk) 16:56, 8 October 2010 (UTC)[reply]
I don't see what difference adding quotation marks would make to the kind of thing you're describing. But the point is that quotation marks are not needed if there is in-text attribution, and if you want to change that you're going to need a wider consensus, because it would fly in the face of common writing practices and other policies. SlimVirgin talk|contribs 17:16, 8 October 2010 (UTC)[reply]
What other policies? --Moonriddengirl (talk) 17:18, 8 October 2010 (UTC)[reply]
In any case, (hypothetically contradictory) policies local to the project do not override legal standards of copyright which require that verbatim copying of non-free text be specifically attributed as direct quotations for any fair use claim to be plausible, nor can they supersede generally accepted standards of academic honesty. Peter Karlsen (talk) 18:04, 8 October 2010 (UTC)[reply]

SV, you say that to meet Wikipedia content policies and other guidelines, in-line attribution and an inline-citation are enough, but how is a reader to know if what I have just written is a summary of what you said or a direct quote unless quotes are marked as such? I would assume that although imitation is the sincerest form of flattery, to avoid copyright and plagiarism issues, I would have put in quotes an exact copy of your words. MRG has told me in the past that if a Wikipedia editor uses a well worn phrase -- as I did in the last sentence -- then those do not have to be quoted. On reading that last sentence would you assume that I was quoting verbatim what MRG said to me or paraphrasing what she said to me? -- PBS (talk) 01:11, 9 October 2010 (UTC)[reply]

Most of the time it doesn't matter. What is the difference between "Philip Baird Shearer said it was all nonsense," and "Philip Baird Shearer said: 'It is all nonsense'"?
Quotation marks are only needed if it's important or interesting for some reason to mark the exact words used, e.g. "U.S. President Philip Baird Shearer said the Russian president was 'talking nonsense.'" Then it would matter exactly what you had said. SlimVirgin talk|contribs 01:22, 9 October 2010 (UTC)[reply]
You asked for reliable sources for my position that inline attribution does not automatically eliminate plagiarism concerns. I have provided those. But I'm curious: do you have reliable sources to substantiate your view that "Quotation marks are only needed if it's important or interesting for some reason to mark the exact words used"? --Moonriddengirl (talk) 01:33, 9 October 2010 (UTC)[reply]

Consensus

Someone said the quotation-mark requirement was added after a discussion reached consensus about it. Can someone link to that discussion, please? SlimVirgin talk|contribs 17:59, 8 October 2010 (UTC)[reply]

I think this kind of escalation is premature. I am not sure at this point what this discussion is about, and I guess I am not the only one. I am still sure that we can get to unanimity if we start talking about concrete things that we want to allow or not, or that we want to encourage or not. I am absolutely sure that you don't want to allow the kind of thing that Moonriddengirl wants to prevent, and that we can find language that is acceptable to all once proper communication has started. Hans Adler 18:17, 8 October 2010 (UTC)[reply]
This conversation seems to begin with WP:NFC (policy and guideline). The policy has long permitted "in accordance with the guideline, use brief verbatim textual excerpts from copyrighted media, properly attributed or cited to its original source or author" and the guideline has long said, "Copyrighted text that is used verbatim must be attributed with quotation marks or other standard notation, such as block quotes." On the 7th, a contributor changed the policy by incorporating some of the guideline into it (here), following which Slim changed the policy thusly (perhaps not realizing that this contradicted the guideline, which had long been incorporated by reference...although not after her edit). Her change was reverted. A bit of back and forth editing followed and some conversation at Wikipedia talk:Non-free content#"If appropriate" quotation marks. It seems very likely to me that this conversation is an extension of that. It's my perception based on the thrust of conversation there that the goal here is to avoid unnecessary use of quotation marks. --Moonriddengirl (talk) 19:46, 8 October 2010 (UTC)[reply]
Yes, it's clearly about unnecessary use of quotation marks. But it's all extremely abstract so far. I would like to see a concrete example of the kind of literal quotation without quotation marks that SV is arguing for. Hans Adler 20:10, 8 October 2010 (UTC)[reply]
Example: I've just added to Ezra Pound (issue at point in bold):

At a literary salon in February 1909, he befriended the novelist Olivia Shakespear—Yeats's former lover and the subject of his The Lover Mourns for the Loss of Love—and her daughter, Dorothy, Pound's future wife, who Iris Barry said carried herself with the air of a young Victorian lady out skating, in strong contrast to Pound.[1]

The skating analogy comes from Iris Barry, who described Dorothy as "carrying herself delicately with the air, always, of a young Victorian lady out skating."
According to Moonriddengirl, attributing the description to Iris Barry is not enough. I would also have to place some of these words in quotation marks. But there is no other publication that would require that of a writer. Quotation marks there are entirely optional, depending on the extent to which I want to draw attention to the phrase. SlimVirgin talk|contribs 00:16, 9 October 2010 (UTC)[reply]
Thanks. I think that has clarified things sufficiently and we can all agree that quoting in this way is not plagiarism. The problem seems to be how we can make it clear that this is acceptable, without inadvertently encouraging some of our less literate editor colleagues to write something like the following:

At a literary salon in February 1909, he befriended the novelist Olivia Shakespear—Yeats's former lover and the subject of his The Lover Mourns for the Loss of Love—and her daughter, Dorothy, Pound's future wife, who always carried herself delicately with the air of a young Victorian lady out skating.[2]

I believe that's the kind of thing that Moonriddengirl is constantly cleaning up. I am sure it's not a nice job and she would prefer having more time for writing content. The problem is that the editors who write like that also tend to be rather bad at understanding the nuances of any instructions we give them. If we give them a chance to misunderstand what we tell them as permitting what they want to do, they won't miss it. Hans Adler 16:41, 9 October 2010 (UTC)
I'm afraid you may have misunderstood me. I didn't say it was "a discussion." I said, "in the development of this guideline". There are six pages of archived talk, and the question of directly noting copied content has been raised repeatedly. The guideline says, with respect to non-free content, "In addition to the edit summary note, be sure to attribute the material either by using blockquotes, or quotation marks, by using an attribution template, using an inline citation and/or adding your own note in the reference section of the article to indicate that language has been used verbatim." But all that misses the initial point. You added your change to a section that says (specifically), "You can avoid any dispute concerning potential plagiarism by..." (in your words) "providing in-text attribution without quotation marks, and referencing the source". I've quoted a number of reliable sources for you now which indicate that quotation marks are required to avoid plagiarism. Accordingly, it is not true to say that you can avoid any dispute concerning plagiarism by "providing in-text attribution without quotation marks, and referencing the source", and it does no service to readers of this guideline to tell them otherwise. If they don't use quotation marks, even if the content is free, they certainly may encounter very vitriolic disputes about plagiarism, and if they do it extensively with non-free sources, they might wind up at WP:CP. --Moonriddengirl (talk) 18:49, 8 October 2010 (UTC)
You said there had been consensus in the development of the guideline, and you reverted me on that basis, so could you link to wherever you feel consensus was expressed that all quotations must be in quotation marks, and that it's never okay simply to write that Moonriddengirl said SlimVirgin was an idiot? :) SlimVirgin talk|contribs 23:14, 8 October 2010 (UTC)
No, I reverted you on the basis that there is "No consensus for this" (quoting my edit summary); you made a change, and I disagreed. You then asked me my problem with it, and I've explained...including with the external WP:RS you requested. If you want to change the guideline, you're welcome to try to achieve consensus for that. --Moonriddengirl (talk) 23:33, 8 October 2010 (UTC)
Please don't tell me the word string above would have to be rendered as:
who "Iris Barry" said carried "herself ... with the air ... of a young Victorian lady out skating".
Save me. Save our readers from it. One of the principles of good quotation technique is to shield readers from bad English or formatting, and to integrate what others write/say smoothly into the grammar of the WP text. Provided the original meaning is not changed substantively (which it is not, here) and the attribution is there (yup), it would be objectionable not to do this. I copy-edit some of The Signpost's journalistic stuff each week. There, I sometimes see that readers would be forced to jump through hoops (ellipsis points, weaving in and out of quoted fragments) in the supposed service of black-letter law on this matter. Or exposed to rather bad prose in quoted material. No one, not the original source, not our readers, not WP, is served well by such inflexibility. Tony (talk) 04:05, 9 October 2010 (UTC)
In the example given who said "in strong contrast to Pound"? Was it Iris Barry? If not, then why does the citation come after that phrase? Also isn't the sentence ambiguous, as one can take the contrast to be in the deportment of Dorothy and Pound, or a contrast of opinion over Dorothy's deportment?
"You can avoid any dispute concerning potential plagiarism by: ... providing in-text attribution without quotation marks, and referencing the source" (shouldn't that be citing the source?) Can we tease out of this bundle and discuss the two issues of copyright and plagiarism (as at the moment we seem to be conflating the two) and see what common ground there is and what the differences are? For example I don't think anyone is trying to impose a crude three word rule. -- PBS (talk) 11:36, 9 October 2010 (UTC)
First, in spite of Slim's statement that "According to Moonriddengirl, attributing the description to Iris Barry is not enough..." I haven't actually weighed in on the proper punctuation of that snippet. That said, I believe, in keeping with the sources I supplied above to demonstrate problems with Slim's assertion that "You can avoid any dispute concerning potential plagiarism by..." (in her words) "providing in-text attribution without quotation marks, and referencing the source", that quotation marks would serve the sentence here:

Pound's future wife, who Iris Barry said carried herself "with the air...of a young Victorian lady out skating"

This would satisfy the plagiarism standards set out by those sources. People who weave in and out of ellipses are badly paraphrasing. My personal opinion (which I have stated on Wikipedia in the past, if Slim needs verification) is that where in-line attribution is used we can more closely follow our sources, though I do follow the convention of using quotation marks to indicate creative word choice. But none of this has anything to do with a black letter law on quotation marks. It's to do with quite the opposite, a blanket assertion that "You can avoid any dispute concerning potential plagiarism by..." (emphasis added). You can't. I've verified that with enough sources to demonstrate significant alternative viewpoints. Leaving aside non-free content concerns, the question of when and how to use quotation marks to avoid plagiarism is entirely up to the community...whether we choose to embrace a standard that meets the most rigorous expectations or not is up to us. --Moonriddengirl (talk) 12:02, 9 October 2010 (UTC)
I would have to endorse Moonriddengirl's position here. Slimvirgin's proposed modification to the guideline is – likely inadvertently – far too broad in its scope. It implies that inline citation is always an appropriate alternative to the use of quotation marks; it directly contradicts the previous instruction regarding verbatim copying. While I suspect that everyone participating in this discussion has a reasonable grasp of what plagiarism actually means and wouldn't be tempted to (mis)read the guideline that way, we should strive to write in such a way that we won't confuse editors who aren't familiar with proper academic sourcing standards. I regularly deal with editors who believe that it is appropriate to copy entire sentences or even paragraphs from outside sources, as long as they tack on an external link and mayhaps substitute a synonym in a couple of places. Editors naively reading the modified version of the guideline would feel that that sort of wholesale copying is acceptable. TenOfAllTrades(talk) 15:40, 9 October 2010 (UTC)

Proposal

The first section currently ends as follows: Template:Blockquotetop You can avoid any dispute concerning potential plagiarism by:

  • rewriting text completely into your own words, using multiple referenced sources;
  • marking any material you copy as a verbatim quote, using quotation marks, and referencing the source;[3]
  • properly attributing any public-domain, or free-content text, that you place directly into an article.

  1. ^ Montgomery, Paul L. "Ezra Pound: A Man of Contradictions", The New York Times, 2 November 1972.
  2. ^ Montgomery, Paul L. "Ezra Pound: A Man of Contradictions", The New York Times, 2 November 1972.
  3. ^ Note that the amount of text you quote from non-free sources must be limited to comply with non-free content guidelines.

Template:Blockquotebottom

Contrary to what the section title suggests, this does not try to define plagiarism but only gives a bright-line rule for those who want to play it safe. How about something like the following instead:

Template:Blockquotetop Defining and identifying plagiarism is not as easy as it may appear, but we can establish some bright lines:

Recognising obvious plagiarism
  • More than 20 words are copied from a source with no or minimal rephrasing. The source does not appear in a citation.
  • More than 20 words are copied from a source with no or minimal rephrasing. The source appears in a citation, but nothing indicates to the reader that information from the source was copied rather than independently rephrased or summarised.
Playing it safe
  • Rewrite text completely into your own words, using multiple referenced sources.
  • Mark any material you copy as a verbatim quote, using quotation marks, and referencing the source.[1]
  • Properly attribute any public-domain, or free-content, text that you place directly into an article.[2]

There is a huge area between these two bright lines, and most of it is plagiarism. That said, competent writers have techniques for copying text verbatim without quotation marks and still attributing it correctly to the source. Only try this if you are really sure you understand plagiarism better than 50% of American undergraduate students.[3]


  1. ^ Note that the amount of text you quote from non-free sources must be limited to comply with non-free content guidelines.
  2. ^ See 1911 for an example.
  3. ^ Roig, Miguel (1997), "Can undergraduate students determine whether text has been plagiarized?", The Psychological Record

Template:Blockquotebottom I probably missed some important things, but maybe this can serve as inspiration for a version that satisfies all concerns. Hans Adler 17:40, 9 October 2010 (UTC)

I think it's a sensible direction; I like your "There is a huge area between these two bright lines, and most of it is plagiarism". "Playing it safe" is probably good verbiage there. I'm not really comfortable with "That said, competent writers have techniques for copying text verbatim without quotation marks and still attributing it correctly to the source." It may be worth noting that some view inline attribution as adequate to avoid plagiarism, but, according to the sources I quote above, copying text verbatim without quotation marks is plagiarism even where attributed, and a good many Wikipedians are likely to agree. It's not a question of the competence of the writer, but of the definition and standards of plagiarism adopted. There really just is not a bright line definition of plagiarism; it varies by culture and context. I also worry a bit about setting any number as a definitive standard under "obvious plagiarism." I'm afraid people will overlook the "most of it is plagiarism" bit that follows for lesser copying and defend with a "but I only copied 19 words!" Five or six words in a striking phrase can be plagiarism, particularly if uncited. --Moonriddengirl (talk) 18:52, 9 October 2010 (UTC)
Hans, the problem with playing it safe is that you encourage quote farming and very poor writing, and we see this too much on Wikipedia already. The couple lived "in the heart of Oxford," after paying "a large sum" for a "sunny apartment". It's very much to be avoided, so we need to make the point that text can be attributed by saying who said something. SlimVirgin talk|contribs 20:28, 9 October 2010 (UTC)
I think TenOfAllTrades has summed up the problem we face. If we put any number in there it will be abused by people of less than good faith. I have been involved in a number of plagiarism cases over the last year and the thing that has shocked me and made me cynical is that when it is pointed out to the offenders they at first plead ignorance (often claiming that they did it before this guideline was written), but then not one of them has helped clean up the mess they have created even though most of them are very prolific contributors to WP.
"non-free sources" is open to abuse because it has more than one meaning. As Frank Zappa once said "If you can't be free, at least you can be cheap". I downloaded it from a non-pay-per-view site, so it cost me nothing, so it's free.
If we put in a limit it will be abused, because it is a question of judgement as to what constitutes plagiarism. E.g. if it is a list of members of a parliament in alphabetic order, then it may be a direct copy over 1000 words long, or if it is the first sentence of a biography "John Smith (1 June 1920 - 10 January 2000) was a chemist who won the Nobel Prize for chemistry in 1950 for work on pork scratchings". In the last example SV gives, I think we should be encouraging people to summarise using other words if possible. I would have thought that all the phrases you have used were cliche enough to be used without quotes, but if the sentence in the original was "The couple lived in the heart of Oxford, after paying a large sum for a sunny apartment" then I think we should encourage its rewrite, e.g. "The couple bought an expensive flat in the centre of Oxford and [moved in during ...].". The whole "original" sentence is only 17 words long, so it fits comfortably under a 20-word limit which, if adopted, would allow someone just to copy the sentence from the original. So far from "encourage[ing] quote farming and very poor writing", I think the current wording is encouraging creative summarising and discouraging plagiarism.
Perhaps the solution lies in rearranging the information already in the guideline. How about grouping the three sections "Definition of plagiarism", "Why plagiarism is a problem" and "What is not plagiarism" as subsections under a new section heading? Then moving "What is not plagiarism" above "Why plagiarism is a problem" and moving the words after "You can avoid any dispute.." out of the definition section (as they are not definitions) into a new subsection called something like "How to avoid plagiarism" and placing it as the last subsection in the new section? This would allow people to see the already large number of things that when copied are not plagiarism, before suggesting rules for avoiding plagiarism with other text. -- PBS (talk) 21:56, 9 October 2010 (UTC)
To be honest, I'd much prefer to see over-quotation and over-citation than under-citation. Wikipedia articles are often works in progress, and we expect each editor to contribute according to his strengths. After a 'researcher' identifies important sources and statements which ought to form the basis for an article (or which might be added to an existing article), a 'copyeditor' can mould that clumsily-assembled raw material into brilliant prose, while a 'wikignome' correctly formats the references and footnotes. At the same time, editors who engage excessively in quote-farming – or whose contributions habitually degrade well-written articles – can be taken aside and gently encouraged to improve their approach to contributing. It is far easier to repair an article which relies too heavily on clearly-cited quotations than one which contains an excess of concealed and unreported copying or too-close paraphrasing, largely because the former is obvious to casual inspection while the latter is not. Trying to discourage the use of quotation marks probably won't reduce the amount of quote mining; it will just increase the amount of inconspicuous plagiarism.
It's also possible that advice on how to reduce one's reliance on quotations could (or should) become part of some other policy or guideline. Generally, an overuse of quotations in article writing is more an issue of bad writing than one of plagiarism. While still troubling, it's a bit out of our scope here (except in the cases where quotations entirely or near-entirely substitute for an article's prose; but closely paraphrasing someone else's article to use as our own without quotes is equally problematic). TenOfAllTrades(talk) 23:17, 9 October 2010 (UTC)

Phrases 'may' require quotation marks

SV, you made this change which inserted "may" into "Note that even with inline attribution, distinctive words or phrases may require quotation marks." Have you seen what is written in the policy section of WP:NFC?

Articles and other Wikipedia pages may, in accordance with the guideline, use brief verbatim textual excerpts from copyrighted media, properly attributed or cited to its original source or author, and specifically indicated as direct quotations via quotation marks, <blockquote>, or a similar method.

Below the policy is a section called "Text" which seems to confirm the wording here ("Copyrighted text that is used verbatim must be attributed with quotation marks or other standard notation, such as block quotes") without the addition of the word "may". It seems to me that you need to address the issue in WP:NFC, and then modify this one once a change is made to that page. -- PBS (talk) 22:16, 29 October 2010 (UTC)

Regardless of what any page says, it's a matter of fact that writers do not need to use quotation marks to signal what someone said. Indirect speech is perfectly fine. There is no difference between "Philip said the shop had shut," and "Philip said 'The shop has shut.'"
One of the very frustrating things about WP is people trying to reinvent the wheel. We don't have to work everything out from first principles. :) SlimVirgin talk|contribs 22:22, 29 October 2010 (UTC)
There are no distinctive phrases in there, though. It was distinctive phrases we were discussing above and for which I supplied sources. (Plagiarism isn't a matter of fact, really. That cats are mammals is a "matter of fact," but plagiarism is not codified by law. It's highly subjective.) That said, I don't really object myself to the addition of the word "may" in that context. That section is about how to avoid any dispute of potential plagiarism, and it has the same effect whether the word "may" is included or not: if people don't mark distinctive phrases and encounter plagiarism disputes, they'll know they should have. --Moonriddengirl (talk) 22:36, 29 October 2010 (UTC)
SV, assuming the Philip is not me, how do I know if Philip said exactly that or if you are summarising? For example "Sir Robert Armstrong said he told the truth" is a summary of "Sir Robert Armstrong said he was 'economical with the truth'", but if I have no rule in place how do you know if the former is a summary or his words? In this example, I assume that the quoted words "economical with the truth" are what he said and the rest is a summary of what he said. -- PBS (talk) 00:05, 30 October 2010 (UTC)
And if you want to add quotation marks to "economical with the truth," you're free to do that. This policy is about how we guard against plagiarism. The way to do that is to add in-text attribution when you're using another person's words, or close to them, and if you want to add quotation marks too when it's direct speech that's fine. SlimVirgin talk|contribs 01:27, 31 October 2010 (UTC)

This guideline is constantly mixing up plagiarism and copyright issues which are totally different from one another. I am surprised I should be the only one noticing that (didn't read the archives, though). Detailed reasoning here. --Pgallert (talk) 10:57, 4 November 2010 (UTC)

I'm not that comfortable with the use of that term either, and personally I consider "plagiarism" a problem only when it represents a copyvio of some sort or contains unsourced material as well. But for those 2 cases we have separate guidelines anyway. However, since the guideline has been around for 2 years now (not so long that I would consider it a core RL), I guess it has to be considered somewhat established. --Kmhkmh (talk) 11:08, 4 November 2010 (UTC)