
Estimating the Proportion of Health-Related Websites Disclosing Information That Can Be Used to Assess Their Quality

Final Report - May 30, 2006

Executive Summary

Widespread and growing use of the Internet as a medium for disseminating and gathering information has raised concerns about users' ability to assess the quality of the health and medical information presented on Internet websites. The Office of Disease Prevention and Health Promotion (ODPHP) has identified six types of information that should be publicly disclosed to users of health-related websites: the identity of the website sponsors (Identity), the purpose of the site (Purpose), the source of the information provided (Content and Content Development), policies for protecting the confidentiality of personal information (Privacy), how the site solicits user feedback and is evaluated (User Feedback), and how the content is updated (Content Updating). As part of the Healthy People 2010 initiative, ODPHP has established a national objective to increase the proportion of health-related websites that disclose information consistent with these six criteria (Communication Objective 11-4). Mathematica Policy Research, Inc. (MPR), under contract to ODPHP, has developed, tested, and implemented a methodology for estimating the proportion of health websites that disclose information consistent with the identified criteria.


We defined "health-related website" to include websites associated with a wide variety of sponsoring organizations that provide information for staying well, for preventing and managing disease, and for making decisions about health, health care, health products, or health services. Using information generated by Hitwise, a commercial vendor that tracks Internet traffic, we identified 3,608 health-related websites that had been visited by Internet users in the United States during October 2005. We then stratified the 3,608 websites into two groups: (1) the "target stratum" of the 213 most frequently visited sites, which account for 60 percent of all visits, and (2) the 3,395 sites in the "remainder." We drew a simple random sample from each stratum. Our final sample of 102 websites included 52 from the target stratum and 50 from the remainder.[1]
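The stratified design described above can be sketched in a few lines of Python. This is an illustrative sketch only, not MPR's procedure: the function names, the placeholder site identifiers, and the compliance counts are hypothetical, and weighting each stratum by its share of the 3,608 sites is an assumption about how an all-sites estimate might be formed. Only the stratum sizes (213 and 3,395) and the final sample sizes (52 and 50) come from the report. For simplicity, the sketch draws the final sample sizes directly rather than drawing a larger sample and replacing ineligible sites.

```python
import random


def draw_stratified_sample(site_ids_by_stratum, sample_sizes, seed=0):
    """Draw a simple random sample (without replacement) from each stratum."""
    rng = random.Random(seed)
    return {
        stratum: rng.sample(site_ids, sample_sizes[stratum])
        for stratum, site_ids in site_ids_by_stratum.items()
    }


def stratified_proportion(stratum_sizes, compliant_counts, sample_sizes):
    """Combine per-stratum sample proportions, weighting each stratum
    by its share of all sites in the sampling frame (an assumption)."""
    total_sites = sum(stratum_sizes.values())
    return sum(
        (stratum_sizes[s] / total_sites) * (compliant_counts[s] / sample_sizes[s])
        for s in stratum_sizes
    )


# Stratum sizes and final sample sizes are taken from the report.
STRATUM_SIZES = {"target": 213, "remainder": 3395}
SAMPLE_SIZES = {"target": 52, "remainder": 50}

# Placeholder site identifiers standing in for the real sampling frame.
frame = {
    name: [f"{name}-{i}" for i in range(size)]
    for name, size in STRATUM_SIZES.items()
}
sample = draw_stratified_sample(frame, SAMPLE_SIZES)

# Hypothetical compliance counts, for illustration only (not study results).
hypothetical_compliant = {"target": 26, "remainder": 10}
estimate = stratified_proportion(STRATUM_SIZES, hypothetical_compliant, SAMPLE_SIZES)
```

Because the remainder stratum holds about 94 percent of the sites, an estimate weighted this way is dominated by the remainder sample, which is one reason the report presents the frequently visited sites separately.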

We developed technical specifications for determining compliance with each of the six disclosure criteria, enumerating the disclosure elements required for compliance as well as for accessibility to users (in most cases, within two clicks of the home page). We then drafted and pretested a data collection instrument on a subsample of 10 websites, and revised the protocols based on the findings. Two reviewers then evaluated the websites from the final sample; 24 websites were independently evaluated by both reviewers (to assess inter-rater reliability[2]), and 78 websites were singly reviewed. Once the data were collected, cleaned, and validated, we coded all responses for scoring and analysis. We determined compliance at the criterion level and for disclosure elements subsumed under each criterion.
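The report's second footnote cites a Kappa coefficient as one measure of inter-rater reliability on the doubly reviewed sites. The sketch below shows how an unweighted Cohen's kappa for two raters can be computed. It assumes, without confirmation from the report, that this is the variant used; the function name and the example ratings are hypothetical and are not study data.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both raters must score the same non-empty set of items.")
    n = len(ratings_a)

    # Observed agreement: share of items on which the raters gave the same score.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of the raters' marginal rates, summed over categories.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; agreement is trivially complete.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical compliance judgments (1 = compliant, 0 = not) on eight items.
rater_a = [1, 1, 1, 1, 0, 0, 0, 0]
rater_b = [1, 1, 1, 0, 0, 0, 0, 1]
kappa = cohens_kappa(rater_a, rater_b)
```

In this toy example the raters agree on six of eight items, but half of that agreement would be expected by chance, so kappa is well below the raw agreement rate.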


None of the 102 websites reviewed for this analysis met all six of the disclosure criteria enumerated in Healthy People 2010 Objective 11-4, and only 6 complied with more than three criteria. Figure A displays the frequency of compliance for the whole sample, the sites most frequently visited, and the remainder, by the number of criteria in compliance.

Of the six criteria, Privacy was met most often, followed by User Feedback. The lowest levels of compliance were on the Content and Content Development criterion and the Content Updating criterion, both of which required specific disclosure elements on three randomly selected items of health content. On all six criteria, a somewhat higher proportion of websites in the most frequently visited "target" stratum were compliant than of websites drawn from the remainder.

Figure B displays the frequency of compliance for the whole sample, the target stratum of sites most frequently visited, and the remainder, by each of the six criteria.

Figure A. Estimates of Compliance for All Health Websites and Frequently Visited Sites, by Number of Criteria in Compliance


Figure B. Estimates of Compliance for All Health Websites and Frequently Visited Sites, by Criterion


There was a noteworthy lack of consistency in how or where websites disclosed information relating to the criteria. The disclosure elements on which compliance was high reflect the few conventions that have emerged in practice to convey information about identity, privacy, and purpose, and to differentiate advertising content from other information. However, no such conventions govern the disclosure of other critical pieces of information, notably sources of funding, editorial oversight, authorship, and dating of information.

The same qualities that make the Internet appealing as a medium to search for information—the ability to navigate quickly through multiple pages, sites, and sources—also complicate the task of disclosure. Many websites provide ready access to health information from a variety of different sources, but very few consistently disclose information on authorship or content updating on randomly selected items of health content. It is also unclear whether stated policies found through links to affiliated sites are intended to apply to the home site. While advertising may be clearly labeled, hyperlinks to what appears to be health information sometimes take the user to commercial promotions.

The small sample size of this study limits the reliability of several of our baseline estimates and our ability to detect statistically significant progress on later assessments. Notwithstanding these limitations, this baseline estimate of health websites' compliance with the disclosure criteria clearly identifies both the areas on which some progress has been made and those on which future improvement efforts must focus. The marginally better performance among the websites most frequently visited suggests that some of the conventions in practice that will improve disclosure in the future may be emerging. A qualitative analysis of the practices used by the better-performing websites could offer useful insights and guidance for improvement.
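To illustrate why samples of about 50 sites per stratum limit precision, the sketch below computes a standard error for a stratified proportion using the standard textbook estimator with a finite population correction. The report does not state which variance estimator MPR used, so this formula is an assumption, and the compliance proportions are hypothetical. Only the stratum and sample sizes come from the report.

```python
import math


def stratified_standard_error(strata):
    """Standard error of a stratified proportion estimate.

    `strata` is a list of (population_size, sample_size, sample_proportion)
    tuples, one per stratum. Each stratum contributes
    W^2 * (1 - n/N) * p * (1 - p) / (n - 1), where W is its population share.
    """
    total = sum(population for population, _, _ in strata)
    variance = 0.0
    for population, sample_size, proportion in strata:
        weight = population / total
        fpc = 1 - sample_size / population  # finite population correction
        variance += weight ** 2 * fpc * proportion * (1 - proportion) / (sample_size - 1)
    return math.sqrt(variance)


# (stratum size, sample size) from the report; proportions are hypothetical.
strata = [
    (213, 52, 0.5),    # target stratum
    (3395, 50, 0.2),   # remainder
]
standard_error = stratified_standard_error(strata)
margin_95 = 1.96 * standard_error  # normal-approximation 95% margin of error
```

With these hypothetical inputs the 95 percent margin of error is on the order of ten percentage points, so only fairly large changes in compliance between assessments would be distinguishable from sampling noise.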

1 We selected in each stratum a larger equal probability sample than we expected to need in order to replace sites found to be ineligible. Of the 150 sites in the larger sample, 48 were ineligible, leaving a final sample of 102.

2 Our inter-rater reliability Kappa coefficient of 0.81 and Lin concordance correlation coefficient of 0.80 suggest moderate to substantial agreement between the raters.