We defined "health-related website" to include websites associated with a wide variety of sponsoring organizations that provide information for staying well, for preventing and managing disease, and for making decisions about health, health care, health products, or health services. Using information generated by Hitwise, a commercial vendor that tracks Internet traffic, we identified 3,608 health-related websites that had been visited by Internet users in the United States during October 2005. We then stratified the 3,608 websites into two groups—(1) the "target stratum" of the 213 most-frequently-visited sites, which account for 60 percent of all visits; and (2) the 3,395 sites in the "remainder"—and drew a simple random sample from each stratum. Our final sample of 102 websites included 52 from the target stratum and 50 from the remainder.1
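As an illustration only, the two-stratum selection described above can be sketched in a few lines of Python. This is not the procedure or software used for the study. The site names, the function name `draw_stratified_sample`, and the random seed are hypothetical, and the Hitwise ranking is replaced by a placeholder list ordered from most to least visited.

```python
import random

# Figures taken from the text: 3,608 sites, of which the 213 most visited
# form the target stratum and the other 3,395 form the remainder.
TOTAL_SITES = 3608
TARGET_STRATUM_SIZE = 213


def draw_stratified_sample(ranked_sites, target_size, n_target, n_remainder, seed=2005):
    """Draw a simple random sample, without replacement, from each stratum.

    ranked_sites must be ordered from most to least visited.
    """
    rng = random.Random(seed)  # arbitrary fixed seed so the draw is repeatable
    target = ranked_sites[:target_size]
    remainder = ranked_sites[target_size:]
    return rng.sample(target, n_target), rng.sample(remainder, n_remainder)


# Hypothetical placeholder names standing in for the ranked site list.
ranked_sites = [f"site-{rank:04d}" for rank in range(1, TOTAL_SITES + 1)]

target_sample, remainder_sample = draw_stratified_sample(
    ranked_sites, TARGET_STRATUM_SIZE, n_target=52, n_remainder=50
)

# Sampling fraction in each stratum (sampled sites / sites in the stratum).
target_fraction = len(target_sample) / TARGET_STRATUM_SIZE
remainder_fraction = len(remainder_sample) / (TOTAL_SITES - TARGET_STRATUM_SIZE)
print(f"target stratum: {len(target_sample)} sites, fraction {target_fraction:.3f}")
print(f"remainder: {len(remainder_sample)} sites, fraction {remainder_fraction:.3f}")
```

The sketch draws the final counts of 52 and 50 directly. In the study itself a larger sample was drawn and ineligible sites were then dropped, so the sketch simplifies that step. The sampling fractions differ sharply between the two strata, which is the point of stratifying: the most frequently visited sites are sampled far more densely than the remainder.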
We developed technical specifications for determining compliance with each of the six disclosure criteria, enumerating the disclosure elements required for compliance as well as for accessibility to users (in most cases, within two clicks of the home page). We then drafted and pretested a data collection instrument on a subsample of 10 websites, and revised the protocols based on the findings. Two reviewers then evaluated the websites from the final sample; 24 websites were independently evaluated by both reviewers (to assess inter-rater reliability2), and 78 websites were singly reviewed. Once the data were collected, cleaned, and validated, we coded all responses for scoring and analysis. We determined compliance at the criterion level and for disclosure elements subsumed under each criterion.
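To make the inter-rater reliability check concrete, the sketch below computes a kappa statistic for two reviewers rating the same 24 websites. It assumes Cohen's kappa applied to a single compliant/non-compliant judgment per website; the source does not state which form of kappa was used or at what level it was computed. The ratings are invented for illustration and are not the study's data, and `cohens_kappa` is a hypothetical helper name.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who rated the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected), where p_expected is
    the agreement expected by chance given each rater's category frequencies.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(ratings_a)
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    categories = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Invented ratings for 24 doubly reviewed websites (True = compliant).
# The raters agree on 22 websites and disagree on 2.
rater_a = [True] * 10 + [False] * 14
rater_b = [True] * 9 + [False] * 1 + [True] * 1 + [False] * 13

kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```

Kappa discounts the agreement that two reviewers would reach by chance alone, which is why it is preferred to a raw percentage of matching ratings when most websites fall into one category.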
None of the 102 websites reviewed for this analysis met all six of the disclosure criteria enumerated in Healthy People 2010 Objective 11-4, and only 6 complied with more than three criteria. Figure A displays the frequency of compliance for the whole sample, the sites most frequently visited, and the remainder, by the number of criteria in compliance.
Of the six criteria, Privacy was met most often, followed by User Feedback. The lowest levels of compliance were on the Content and Content Development criterion and the Content Updating criterion, both of which required specific disclosure elements on three randomly selected items of health content. Across all six criteria, a somewhat higher proportion of websites was compliant in the "target" stratum of most frequently visited sites than in the remainder.
Figure B displays the frequency of compliance for the whole sample, the target stratum of sites most frequently visited, and the remainder, by each of the six criteria.
DISCUSSION AND IMPLICATIONS
There was a noteworthy lack of consistency in how and where websites disclosed information relating to the criteria. The disclosure elements on which compliance was high reflect the few conventions that have emerged in practice to convey information about identity, privacy, and purpose, and to differentiate advertising content from other information. However, no such conventions govern the disclosure of other critical pieces of information, notably sources of funding, editorial oversight, authorship, and the dating of information.
The same qualities that make the Internet appealing as a medium to search for information—the ability to navigate quickly through multiple pages, sites, and sources—also complicate the task of disclosure. Many websites provide ready access to health information from a variety of different sources, but very few consistently disclose information on authorship or content updating on randomly selected items of health content. It is also unclear whether stated policies found through links to affiliated sites are intended to apply to the home site. While advertising may be clearly labeled, hyperlinks to what appears to be health information sometimes take the user to commercial promotions.
The small sample size of this study limits the reliability of several of our baseline estimates and our ability to detect statistically significant progress on later assessments. Notwithstanding these limitations, this baseline estimate of health websites' compliance with the disclosure criteria clearly identifies both the areas on which some progress has been made and those on which future improvement efforts must focus. The marginally better performance among the websites most frequently visited suggests that some of the conventions in practice that will improve disclosure in the future may be emerging. A qualitative analysis of the practices used by the better-performing websites could offer useful insights and guidance for improvement.
1 We selected in each stratum a larger equal probability sample than we expected to need in order to replace sites found to be ineligible. Of the 150 sites in the larger sample, 48 were ineligible, leaving a final sample of 102.
2 Our inter-rater reliability Kappa coefficient of 0.81 and Lin concordance correlation coefficient of 0.80 suggest moderate to substantial agreement between the raters.