Unread citations

In fact, scientists often don’t read what they reference; instead, they copy citations from the reference lists of other papers. You can do this and get away with it until one day you copy a citation that carries the DNA of someone else’s misprint. Then you can be identified and brought to justice (much as biological DNA evidence helps to convict criminals who committed more serious offences than yours).

 

Figure 1. Distribution of misprints in citations to one renowned paper, ranked according to the frequency of their repetition. (This figure is from cond-mat/0212043.)

 

Figure 1 shows the distribution of misprints in citations to one renowned paper, in the rank-frequency representation introduced by Zipf. Among 4300 citations to the paper in question, 196 contained misprints, of which only 45 were distinct. One can estimate the ratio of the number of readers to the number of citers as the ratio of the number of distinct misprints, D, to the total number of misprints, T. Indeed, we know that among the T citers, T - D copied, because they repeated someone else’s misprint. For the D others, we have no evidence that they copied, so, following the presumption of innocence, we assume that they read. The fraction of citations that were read by the citing authors, R, is thus R = D/T = 45/196 ≈ 23%. A more careful analysis, which employed stochastic modeling, gave a very close number (see cond-mat/0212043, cond-mat/0401529).
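The estimate R = D/T is plain arithmetic on the counts quoted above. A minimal Python sketch follows; the function name read_fraction is only an illustrative choice and does not come from the cited papers.

```python
def read_fraction(distinct_misprints: int, total_misprints: int) -> float:
    """Estimate R = D / T: the share of misprinted citations that were
    not demonstrably copied (first appearance of each distinct misprint)."""
    if total_misprints <= 0:
        raise ValueError("total_misprints must be positive")
    if not 0 <= distinct_misprints <= total_misprints:
        raise ValueError("distinct_misprints must lie between 0 and total_misprints")
    return distinct_misprints / total_misprints


# Counts quoted in the text: 196 misprinted citations, 45 distinct misprints.
D, T = 45, 196
copied = T - D              # citers who repeated someone else's misprint
R = read_fraction(D, T)

print(f"copied misprints: {copied}")
print(f"estimated read fraction R = {R:.3f}")
```

This is only the naive estimate; the stochastic modeling mentioned in the text is not reproduced here.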

 

Theory of citing

During the “Manhattan Project” (the making of the nuclear bomb), Fermi asked Gen. Groves, the head of the project, what the definition of a “great” general is. Groves replied that any general who had won five battles in a row might safely be called great. Fermi then asked how many generals are great. Groves said about three out of every hundred. Fermi reasoned that, since the opposing forces in most battles are roughly equal in strength, the chance of winning one battle is ½ and the chance of winning five battles in a row is (½)^5 = 1/32, or about 3%. “So you are right, General, about three out of every hundred. Mathematical probability, not genius.”
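Fermi’s argument can be checked numerically. The sketch below computes the exact chance of a five-battle winning streak and compares it with a small simulation. It assumes, as the anecdote does, that each battle is an independent coin flip with winning probability ½; the function names are illustrative.

```python
import random


def chance_of_streak(p_win: float = 0.5, battles: int = 5) -> float:
    """Exact probability of winning `battles` independent battles in a row."""
    return p_win ** battles


def simulate_generals(n_generals: int, battles: int = 5,
                      p_win: float = 0.5, seed: int = 0) -> float:
    """Fraction of simulated generals who win every one of their battles."""
    rng = random.Random(seed)
    great = sum(
        all(rng.random() < p_win for _ in range(battles))
        for _ in range(n_generals)
    )
    return great / n_generals


exact = chance_of_streak()              # (1/2)**5 = 1/32
simulated = simulate_generals(100_000)

print(f"exact chance of five wins in a row: {exact:.5f}")
print(f"simulated fraction of 'great' generals: {simulated:.5f}")
```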

 

A commonly accepted measure of “greatness” for scientists is the number of citations to their papers. The majority of papers are either never cited or get just a few citations. At the same time, there are renowned papers with thousands of citations. A thousand is not five: winning a thousand citations by chance alone is impossible. One is tempted to conclude that the papers that achieve such citation counts must be “great”. Not exactly so when the majority of citations are copied from the lists of references used in other papers. This way, a paper that has already been cited is likely to be cited again, and after it is cited again it is even more likely to be cited in the future. We find ourselves in a perfect position to launch a frontal attack on the whole institution of scientific citation indexing. The model of random-citing scientists (see cond-mat/0305150) was inspired by Fermi’s insight and justified by the aforementioned repeat misprints. It is as follows: when a scientist is writing a manuscript, he picks three random papers, cites them, and also copies a quarter of their references. The model accounts quantitatively for the empirically observed citation distribution (see Fig. 2). Simple mathematical probability, not genius, can explain why some papers are cited a lot more than others.
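The one-sentence description of the model can be turned into a toy simulation. The sketch below is one possible reading of that sentence, not the implementation used in cond-mat/0305150. It assumes that papers are published one at a time, that the three papers are picked uniformly from all earlier papers, and that “copies a quarter of their references” means each reference is copied independently with probability 1/4. All names and parameters are illustrative.

```python
import random


def simulate_random_citing(n_papers: int = 20_000, n_picked: int = 3,
                           copy_prob: float = 0.25, seed: int = 0) -> list[int]:
    """Return the citation count of each paper after n_papers are published.

    Assumed reading of the model: every new paper cites n_picked earlier
    papers chosen uniformly at random and copies each of their references
    with probability copy_prob.
    """
    rng = random.Random(seed)
    references: list[list[int]] = []   # references[i]: papers cited by paper i
    citations: list[int] = []          # citations[i]: times paper i was cited

    for new in range(n_papers):
        cited: set[int] = set()
        if new > 0:
            # The first few papers have fewer than n_picked predecessors.
            for picked in rng.sample(range(new), min(n_picked, new)):
                cited.add(picked)                  # cite the picked paper
                for ref in references[picked]:     # copy some of its references
                    if rng.random() < copy_prob:
                        cited.add(ref)
        for paper in cited:
            citations[paper] += 1
        references.append(list(cited))
        citations.append(0)

    return citations


citations = simulate_random_citing()
mean_citations = sum(citations) / len(citations)
uncited = sum(1 for c in citations if c == 0) / len(citations)

print(f"mean citations per paper: {mean_citations:.1f}")
print(f"most cited paper: {max(citations)} citations")
print(f"fraction never cited: {uncited:.2f}")
```

Every paper in this toy model is cited by the same blind rule, yet the run produces a few papers with far more citations than the average, which is the qualitative point of the argument. Comparing the resulting distribution with real citation data, as in Fig. 2, would need the actual data set and is not attempted here.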

 

Figure 2. Outcome of the model of random-citing scientists compared to actual citation data. (This figure is from cond-mat/0305150.)

 

 

Some popular articles about and discussions of this research:

Scientists exposed as sloppy reporters

(in Spanish) Cita a ciegas

(in German) Kopiert statt gelesen

(in Polish) Jak rozpoznac wielkosc uczonego

(in German) Wissenschaftlicher Ruhm kann Zufall sein

(in Russian)  Об учёных

(in Norwegian) Forskerne kopierer hverandres feil

(in German) Freudsche Versprecher entlarven Forscher

(in French) L'homme qui a cité l'homme qui a cité l'homme qui a cité l'ours

(in Portuguese) Cientista evita ler fontes originais antes de citá-las

Citational Slips and Stochastic Analysis

(in Chinese)  最新统计表明 科研人员抄袭率惊人

(in Bulgarian)  УЧЕНИТЕ ЦИТИРАЛИ НЕКОРЕКТНО

(in Finnish) Tutkijat jättävät lähdeaineistoa lukematta

(in Danish) Forskere fusker med citater

(in Vietnamese) Các nhà khoa học những báo cáo viên cẩu thả

(in Italian) La Cupola della pubblicazione scientifica

Scientists Don't Read the Papers They Cite

(in Japanese) 論文の著者は引用文献を本当に読んでいるか?

(in Thai) อะ...แฮ่ม..เมื่อนักวิทยาศาสตร์ไม่ยอมทำการบ้าน!

Scientific Researchers Routinely Fudge Citations

Why Almost Everything Written About Treating Blood Pressure Is Wrong