Tuesday, November 11, 2008

Google Uses Searches to Track Flu’s Spread (and forgets to say where this idea came from)

See also: Robert Lemos. Sick Searchers Help Track Flu. MIT Technology Review, Nov 12 [Archived in WebCite]
----
I am currently in a mild state of shock. A couple of years ago, when Google.org had just been created and Larry Brilliant was appointed its CEO, I tried to collaborate with them on my research into the correlation between searches on Google and flu symptoms (published in 2006 here: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1839505). My hope was that a visionary guy like Larry would be open to a collaboration to explore my idea: the correlation between Internet searches and disease outbreaks (most notably influenza).
Larry Brilliant explained in his 2006 TED presentation (after which he was made executive director of Google.org) that his TED wish was an early detection system for disease outbreaks through Internet monitoring of published reports. The "idea" to analyze and aggregate media reports was not original - it was directly "inspired" by Ron St John's GPHIN (another project made in Canada) - but at least he gave GPHIN credit. Monitoring and aggregating what people publish (news items, Internet postings etc.) is what I called "supply-based" infodemiology / infoveillance. However, what Larry Brilliant was not thinking about at that time, and what was not (and is not) part of GPHIN or any other infoveillance system, is what I call "demand-based infodemiology" (infoveillance), i.e. automatically analyzing what people are searching for on the Internet. At the time Larry became CEO of Google.org I was already working on (and publishing / speaking about) this.
I was hoping that Google.org would perhaps be open to fund this project, or to share data.
What I did not expect is that they would just go ahead and do what I proposed themselves, without ever getting back to me!
Today the NYT reported that Google.org has -- mmh, let's say "adopted" my idea, without giving any credit to its origin.
In the past 12 hours, at least half a dozen people who know about my infodemiology work have emailed me and asked "Hey, isn't this what you were doing, why aren't you on that paper?".
The NYT even goes so far as to (wrongly) report that "Google Flu Trends appears to be the first public project that uses the powerful database of a search engine to track the emergence of a disease." Wrong - apparently this reporter didn't do his homework or check the published literature. In fact, I started doing this line of research about 4-5 years ago - in 2003 - and talked about this at the CDC and at various conferences (e.g. at AMIA in 2006, where my paper "Infodemiology: Tracking Flu-Related Searches on the Web for Syndromic Surveillance" won the Distinguished Paper Award, and in 2007 on a joint panel organized by John Brownstein from Healthmap).
Healthmap, by the way, just got a few million from Google.org.
The Google.org team led by Dr Brilliant (according to the NYT article) even managed to get a paper accepted in Nature (while I haven't been as lucky, see below). Their paper, "Detecting influenza epidemics using search engine query data" by Jeremy Ginsberg, Matthew H. Mohebbi, Rajan S. Patel, Lynnette Brammer, Mark S. Smolinski and Larry Brilliant, is here.
Google.org obviously has every right to use and publish their data, and perhaps it is a case of "great minds think alike" (though I have not seen any presentation or publication from them which precedes mine), but I wonder how outside researchers who submit proposals and suggestions to Google.org can be certain that it won't just steal their ideas. Google.org does not have the same structure and level of accountability as traditional funding agencies or charities - in fact, it is set up as a company. As an academic, my currency is reputation and getting credit where credit is due. It would have been so easy in this case!

-------------
The following is the slide deck for my "infodemiology" and "infoveillance" experiments, which I have been conducting since 2004. The slides were presented at various AMIA meetings, at the CDC, at an NCI/NSF workshop, and in other places. For the record: The terms infodemiology and infoveillance were created by me!





Here is a presubmission inquiry letter I sent to Nature Medicine in October 2005 (the editor said this is not something they would consider publishing. Apparently, after 3 years and with authors from Google.org and the CDC on the paper, the situation has changed!):

Research Letter


Using Google for Syndromic Surveillance: Web Searches for Flu Symptoms as Predictor for Influenza Outbreaks


Editor,

We are considering submission of a short paper (500-800 words, with 1 figure) entitled

Web Searches for Flu Symptoms as Predictor for Influenza Outbreaks

An increasing proportion of people in industrialized countries are using the Internet to seek health information (1), often before they visit a health professional. An interesting question is whether tracking health information seeking behaviour of populations over time can be used for public health purposes, particularly syndromic surveillance, which has been defined by the CDC as “surveillance using health-related data that precede diagnosis and signal a sufficient probability of a case or an outbreak to warrant further public health response”. While most syndromic surveillance systems rely on data from clinical encounters with health professionals, monitoring for example sick-leave prescriptions, house calls, hospital- or pharmacy-based data (2), we explored whether an automated analysis of trends in Internet searches can be useful to predict outbreaks such as influenza epidemics. We analyzed data from the Canadian flu season 2004/2005 over a period of 33 weeks from week 41/2004 (Oct 3-9) to week 20/2005 (May 15-21), comparing Fluwatch data on the number of influenza lab tests for Influenza A or B conducted in sentinel laboratories (“lab tests”), the number of lab tests testing positive (“cases”), and the number of cases of influenza like illness (ILI) reported by sentinel physicians with clicks in Google on a sponsored link triggered by influenza-related keywords. In our analysis, search engine clicks were a better predictor for flu cases than ILI reported by sentinel physicians. These data suggest that for diseases where consumers are likely to consult the Internet first before they visit a physician, tracking Internet search behaviour may be a valuable method complementing traditional methods of syndromic surveillance.
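
For readers who want to see what such a comparison looks like in practice, here is a minimal sketch in Python. It assumes Pearson correlation as the measure of how well a weekly signal tracks lab-confirmed cases; the letter above does not say which statistic was used. The weekly numbers and the names `pearson`, `cases`, `clicks` and `ili_reports` are invented for illustration only and are not the Fluwatch or Google data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly counts, made up for this example (NOT real data).
cases = [2, 5, 11, 30, 62, 48, 21, 9]               # lab-confirmed flu cases
clicks = [40, 75, 160, 390, 700, 560, 270, 120]     # clicks on a flu-related ad
ili_reports = [12, 10, 18, 25, 31, 40, 33, 20]      # ILI reports by physicians

r_clicks = pearson(clicks, cases)
r_ili = pearson(ili_reports, cases)

print(f"clicks vs cases: r = {r_clicks:.2f}")
print(f"ILI vs cases:    r = {r_ili:.2f}")
```

A higher coefficient for one series only says that it moves more closely with the case counts in the same week. A real analysis would also have to look at lead and lag between the series, which this sketch does not attempt.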



Response

Decision on presubmission inquiry NMED-PI29222

From: medicine@natureny.com
To: geysenba@uhnres.utoronto.ca
Date: Oct 31 2005 - 2:09pm


31st Oct 2005

Dear Dr. Eysenbach,

Thank you for asking us to consider your proposed article "Web Searches for Flu Symptoms as Predictor for Influenza Outbreaks" (NMED-PI29222). We have given the paper our careful consideration but sadly have decided that submission to Nature Medicine would not be appropriate.

We receive a great many presubmission inquiries and given the very limited space in our journal we are only able to invite formal submission of a few of these. Thus, the competition is fierce. In choosing which to invite we consider the likely broad appeal of papers, the level of advance they offer over previously published work and whether or not they adequately address subjects that will interest and be accessible to both scientists and clinicians within the biomedical research community. In this case we were not persuaded that your proposed article would compete well with the others that we have received. In making this decision we do not intend a criticism of the work -- indeed we are sure that others in your field will find it of significant value -- but merely question its appropriateness for Nature Medicine.

Nevertheless, thank you very much for giving us the opportunity to consider your work. I am sorry that we cannot be more positive and hope you are soon able to interest an alternative journal in your work.

Sincerely,


Clare Thomas, PhD
Associate Editor
Nature Medicine

This email has been sent through the NPG Manuscript Tracking System NY-610A-NPG& MTS



My paper was also turned down by the BMJ, CMAJ, JAMA, Lancet, and Science. In all but one case (CMAJ) the editors did not even send it out for peer review. Here is the Science rejection letter.

From: Debbie Dennison [ddennison@science-int.co.uk]
Sent: November 4, 2005 7:14 AM
To: geysenba@uhnres.utoronto.ca
Cc: cash@science-int.co.uk
Subject: Decision on your Science manuscript 1122097 Eysenbach

4 November 2005


Dr. Gunther Eysenbach
Centre for Global eHealth
Fraser Elliott Bldg, 4th fl
University Health Network
190 Elizabeth St
Toronto ON M5G2K5
CANADA

Ref: 1122097

Dear Dr. Eysenbach:

Thank you for submitting your manuscript "Tracking Web Searches for
Syndromic Surveillance" to Science. Because your manuscript was not given a
high priority rating during the initial screening process, we will not be
able to send it out for in-depth review. Although your analysis is
interesting, we feel that the scope and focus of your paper make it more
appropriate for a more specialized journal. We are therefore notifying you
so that you can seek publication elsewhere.

We now receive many more interesting papers than we can publish. We
therefore send for in-depth review only those papers most likely to be
ultimately published in Science. Papers are selected on the basis of
discipline, novelty, and general significance, in addition to the usual
criteria for publication in specialized journals. Therefore, our decision
is not necessarily a reflection of the quality of your research but rather
of our stringent space limitations.

We wish you every success when you submit the paper elsewhere.

Sincerely,



Caroline Ash, Ph.D.
Senior Editor


Debbie Dennison
Editorial Assistant
Science International
Bateman House
82-88 Hills Road
Cambridge
CB2 1LQ

Phone: +44 (0)1223 326500
Fax: +44 (0)1223 326501




Reference:

1. G. Eysenbach. Infodemiology: Tracking Flu-Related Searches on the Web for Syndromic Surveillance. AMIA Annu Symp Proc. 2006; 2006: 244–248. http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1839505


2 comments:

Sérgio Nunes said...

Hi,

Using data from Google Adsense is very ingenious. Excellent idea.

I've found that your work is cited in a manuscript linked from the Google Flu Trends page:

http://www.google.org/about/flutrends/manuscript.pdf

Anonymous said...

hi gunther

this is strange. btw - i had met you at the AMIA, 2006 and showed you the system we have built as a proto-type - epitrends! just came across this randomly today though the day i read the google flu tracking news, i thought about your paper and AMIA presentation.