Background: Search engine activity has been used as an indicator of disease incidence in developed countries. The lack of high-quality data on Internet use and disease surveillance has hampered similar studies in developing countries.
Objective: To evaluate the correlation between Google search volumes for “fever” and febrile infectious disease outbreaks reported by the Integrated Disease Surveillance Program in India.
Methods: Data on Google search volume for “fever” in India between January 2014 and December 2014 were downloaded from the Google Trends Insights website. Weekly data on outbreaks and case counts of infectious causes of febrile illness (dengue, PUO/fever, chikungunya, and typhoid) were obtained from the Integrated Disease Surveillance Program website. Spearman’s rho was calculated to estimate the unadjusted correlation between search activity and weekly disease metrics. Time-series analysis of Google search query volume and disease metrics was performed using the two-step Engle-Granger method to ascertain whether they shared a common stochastic drift.
Results: The unadjusted correlation between search activity and disease metrics was statistically significant for dengue outbreaks (r=0.632, p<0.001) and cases (r=0.673, p<0.001); PUO/fever outbreaks (r=0.315, p=0.026) and cases (r=0.323, p=0.022); chikungunya cases (r=0.374, p=0.007); and total outbreaks (r=0.581, p<0.001) and cases (r=0.615, p<0.001). In the test for cointegration with search activity, the residuals were stationary for both disease outbreaks (p=0.048) and case counts (p=0.041).
Conclusions: Search volume correlated with disease outbreaks and case counts. The weekly time trends of search activity were cointegrated with both disease outbreaks and case counts, indicating that search activity and infectious disease outbreaks and case counts were related. However, owing to the non-representative nature of the disease metrics, non-uniform access to the Internet, and its limited use for seeking health information, it would be inappropriate to use search engine query volume to predict or forecast disease outbreaks in the Indian context.
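
The two analyses named in the Methods, Spearman's rho and the two-step Engle-Granger procedure, can be illustrated with a minimal standard-library Python sketch. This is not the authors' code, and the abstract does not state which software was used. The function names, the synthetic 52-week series, and the choice of a Dickey-Fuller regression with no lagged differences are all assumptions made for illustration. The sketch returns only the test statistic; turning it into a p-value requires Engle-Granger (MacKinnon) critical values rather than ordinary t tables (roughly -3.34 at the 5% level, asymptotically, for two series with a constant).

```python
import math
import random


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def engle_granger_statistic(y, x):
    """Two-step Engle-Granger test statistic (simplified: no lagged differences).

    Step 1: fit y = alpha + beta * x by ordinary least squares.
    Step 2: regress the differenced residuals on the lagged residuals
            (no constant) and return the t-statistic of that slope.
    A strongly negative value suggests stationary residuals (cointegration).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    beta = (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))
    alpha = my - beta * mx
    resid = [b - alpha - beta * a for a, b in zip(x, y)]

    lagged = resid[:-1]
    diffs = [resid[t] - resid[t - 1] for t in range(1, n)]
    s_ll = sum(e * e for e in lagged)
    gamma = sum(e * d for e, d in zip(lagged, diffs)) / s_ll
    sse = sum((d - gamma * e) ** 2 for e, d in zip(lagged, diffs))
    std_err = math.sqrt(sse / (len(diffs) - 1) / s_ll)
    return gamma / std_err


# Synthetic weekly data (hypothetical, NOT the study's data): case counts
# follow a random walk and search volume tracks them with stationary noise.
random.seed(0)
weekly_cases = [20.0]
for _ in range(51):
    weekly_cases.append(weekly_cases[-1] + random.gauss(0, 1))
weekly_searches = [2.0 + 0.5 * c + random.gauss(0, 1) for c in weekly_cases]

rho = spearman_rho(weekly_searches, weekly_cases)
eg_stat = engle_granger_statistic(weekly_searches, weekly_cases)
print(f"Spearman rho: {rho:.3f}")
print(f"Engle-Granger statistic: {eg_stat:.2f}")
```

In practice, established routines such as `scipy.stats.spearmanr` and `statsmodels.tsa.stattools.coint` would be preferable, since they also report p-values and handle lag selection.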