Monday, March 10, 2008

Another plagiarist bites the dust (anatomy of a plagiarizing paper)

Plagiarism is the practice of claiming or implying original authorship of (or incorporating material from) someone else's written or creative work, in whole or in part, into one's own without adequate acknowledgement. Unlike cases of forgery, in which the authenticity of the writing, document, or some other kind of object itself is in question, plagiarism is concerned with the issue of false attribution.

Within academia, plagiarism by students, professors, or researchers is considered academic dishonesty or academic fraud and offenders are subject to academic censure.

Plagiarism is different from copyright infringement. While both terms may apply to a particular act, they emphasize different aspects of the transgression. Copyright infringement is a violation of the rights of the copyright holder, when material is used without the copyright holder's consent. On the other hand, plagiarism is concerned with the unearned increment to the plagiarizing author's reputation that is achieved through false claims of authorship.
(Source: Wikipedia [WebCite])

I've always been interested in plagiarism - perhaps because I tend to be quite "open" about my ideas and intellectual products, releasing them on the web immediately as they arise. Being open about ideas, however, does not mean that I am always thrilled to see my ideas, words, phrases and products reused without proper attribution, which remains the cornerstone of scholarship. In fact, plagiarism is probably the #1 enemy of openness. (Readers of this blog will notice that, having been burned a couple of times, I am particularly anal about pointing out how blog entries should be cited. Feel free to use anything posted here - but for God's sake don't just copy & paste anything - if you use copy & paste, add some quotation marks and a reference to this blog, preferably using WebCite).

I have seen plagiarism in all shapes and forms, I know how prevalent it is, and I am not afraid to speak out and to blow the whistle, because I see it as a major threat to scholarship and openness.

In 1999, in a high-profile case that was also reported in Nature (Dalton R. Professors use web to catch students who plagiarize...and author gets similar paper retracted) (WebCite), one academic author copied & pasted extensive sections from my website and reused them in a (subsequently retracted) peer-reviewed publication. A detailed account of this incident was also published in JMIR (Eysenbach G. Report of a case of cyberplagiarism - and reflections on detecting and preventing academic misconduct using the Internet. J Med Internet Res 2000;2(1):e4), which helped to coin the term "cyberplagiarism". This JMIR editorial also contained some considerations of what journals should do to avoid this plague, namely to use automated software to check for signs of plagiarism. One of the epublishing innovations I am proud of is that JMIR was (to my knowledge) the first (and remains to date the only?) journal that has implemented a policy of checking submissions routinely for plagiarism, using the TurnItIn plagiarism checker, a now very successful product and company, founded by John Barrie (Barrie JM, Presti DE. Digital plagiarism - The web giveth and the web shall taketh. J Med Internet Res 2000;2(1):e6).

These experiences also inspired me to do a little bit of research in that area on the prevalence of scientific misconduct, with some worrying results (Eysenbach G. Medical students see that academic misconduct is common. BMJ 2001;322:1307).

Since then I have been plagiarized many times again, and - while still a nuisance - I've almost learned to accept it as the most sincere form of flattery. I've encountered plagiarism in all forms and flavors: web-to-paper plagiarism (somebody who writes a paper lifts something from my webpage without attribution), web-to-web plagiarism (somebody lifts an idea and text from a website I created and recreates the same site without attribution - see this blog entry), software-to-software plagiarism (another research group takes open source software developed in my group, removes the copyright statements, and rebrands it as a project/idea initiated in their group without giving attribution), and also paper-to-paper plagiarism (words and ideas from my papers are reused in other publications without proper attribution).

The latest case of paper-to-paper plagiarism was pointed out to me just three days ago. A watchful student sent me an email alerting me to the fact that a paper published in a medical informatics journal (not JMIR!) seemed to contain some paragraphs taken from my BMJ article on "Consumer Health Informatics", which was not cited (see this ithenticate report page - boxed paragraphs marked with [6] are from my paper). But this was only half the story.
I ran the entire published article through the plagiarism checker software which we use at JMIR (now called ithenticate - thanks to John Barrie's company for making this software available to JMIR!), and the result was quite shocking. Almost 50% of that published article (a review) appears to be cobbled together (i.e. copied & pasted) from various websites, abstracts etc. The excerpt from the report below speaks for itself. Everything displayed in boxes has been recognized by ithenticate as having been lifted verbatim from a website or an abstract. 5% of the paper was copied & pasted from my BMJ article on "Consumer Health Informatics" (which was not cited at all), and 7% was copied from an EU call for papers (which was also not cited). A lot of information appears to be copied & pasted from abstracts of various papers, which were cited, but in these cases the author failed to indicate (using quotation marks) that these were direct quotes.
Perhaps even more shocking is that the author is not an inexperienced student, but a rather senior health informatician. What is also worrisome is that it is unclear how this blatant work of plagiarism could slip through the editorial process of that particular medical informatics journal, edited by respected medical informaticians. The plagiarism could have been easily spotted even in the absence of sophisticated plagiarism checking software by just skimming through the paper: one paragraph appears to be stolen from the AMIA Consumer Health Informatics website (again, no reference is made to it) - and it is obvious that it is stolen from a workshop report, because the plagiarist actually forgot to remove revealing phrases such as "In this workshop, participants will discuss...". It is unclear whether any editor, let alone any peer reviewer, ever read this paper in the first place.
There are some other juicy details in this developing story, but I have to bite my tongue at this point and wait to see what the editorial and institutional investigations will reveal.

For now, I'll let the ithenticate report speak for itself (remember, all boxed paragraphs were found by the software on websites or in other articles. The author used no quotation marks in the entire article). I have removed details on the author's name and the title of the manuscript at this point.


Figure 1a-d. ithenticate report highlighting stolen paragraphs (boxed) in a published, plagiarizing paper from a medical informatics journal (not JMIR!).











Please cite as: Eysenbach, Gunther. Another plagiarist bites the dust (anatomy of a plagiarizing paper). Gunther Eysenbach Random Research Rants Blog. 2008-03-10. URL:http://gunther-eysenbach.blogspot.com/2008/03/another-plagiarist-bites-dust-anatomy.html. Accessed: 2008-03-10. (Archived by WebCite® at http://www.webcitation.org/5WDlfpXAB)



UPDATE 27/05/2008:
The editor of the journal in question has informed me that the plagiarizing paper will be retracted. The following statement will be published as the retraction notice:

The author of the article
[CITATION OF THE PLAGIARIZING PAPER]
has verified that he used a substantial fragment of text without attributing it to
Eysenbach G. Recent Advances: Consumer Health Informatics. BMJ 2000; 320:1713-16.

The author of the article including the unacknowledged material states that his failure to attribute was unintended and the result of hurried completion of his paper. He extends his deepest apology to the author of the original text and the readers of the [SERIAL] for this plagiarism. The editors of the [SERIAL] apologize to the non-cited author of the original article and to the readers of the [SERIAL] for not detecting and excluding this plagiarism from publication. They thank those who brought it to their attention. It is the policy of the [SERIAL] that all articles comply with professional standards of publication to avoid plagiarism. This requires that only short fragments of text from original sources be quoted (enclosed in quotation marks) and be immediately followed by an accurate citation of the source from which the fragment was taken. Because the fragment in question was neither short nor cited for its origin, the author has asked that his paper be withdrawn, and the editors hereby publish this retraction.

I'll let the reader decide whether the retraction statement above is appropriate or downplays the extent of the plagiarism - given that (as shown above) this is not just a failure to attribute a single citation. It also remains unclear why the paper was not reviewed in the first place, even though the SERIAL claims that its contributions are peer-reviewed.
The plagiarizing author is by the way also an official (a working group chair) of a scientific society (SERIAL is the official publication of that society), and the editor of the SERIAL is the president of that society. And to my knowledge, there were no consequences for the author from that society (while I resigned from my position within this society).
I guess this case is as embarrassing for the editor(s) of that journal as it is for the author, which is why there is a common interest in keeping this case "low profile".

3 comments:

eHealth PhD student said...

I am the student who discovered this case. It was really easy to find out. The paper is incoherent, the order of the sentences makes no sense, etc. It was very unpleasant for me to find such a paper in a good journal, which is supported by a leading scientific institution in Medical Informatics, written by a senior "researcher". I hope that the editors will handle it properly and that the paper will soon be retracted. Also, I hope that the author will just apologize for his misconduct and will not look for an excuse, such as being in a hurry. It will be hard to believe that his misconduct was unintended. Should the author of this fraudulent paper hold a high position in scientific organizations?

It is the first time that I have reported something "weird". I have occasionally found "weird" things during my short career in research (I just started my PhD). In most cases my fellow researchers advised me to leave it and not dig any deeper. Btw, the tendency to advise against taking any action was stronger among the senior researchers. Maybe Thomas S. Kuhn was right and science is conservative by nature.

I have found two other weird things, which I have not reported to the editors. The first one was a case where two papers (one technical and another medical) about 2 different experiments had the same patient's outcome graph (or almost the same). Maybe it was just a coincidence. The second weird case is a special track of a well-known conference where 85% of the papers have authors on the scientific committee. Almost everybody advised me not to take any action about these 2 weird cases.

Is that normal? Or is it a symptom of something wrong in eHealth research? Why have so many researchers advised me not to report weird things?

Best regards,


PS: If you want to contact me, please leave me a comment in my anonymous blog, http://skepticalehealthresearcher.blogspot.com/ .

Anonymous said...

Are you aware that the author of the plagiarizing paper has been promoted to vice-president of an international scientific organization and also to vice-dean of Tech. University since the case was found?

What is wrong in our field?