See No Evil
Please see the Kaiser
Family Foundation site for general information on our study, See
No Evil: How Internet Filters Affect the Search for Online
Health Information.
Publications and other documents
- Richardson, C., P. Resnick, D. Hansen, and V. Rideout, Does
Pornography-Blocking Software Block Access to Health Information on the
Internet? Journal of the American Medical Association, Dec. 11,
2002. 288(22) 2887-2894.
- Resnick, P., D. Hansen, and C. Richardson, Calculating Error Rates for
Filtering Software. Forthcoming in Communications of the ACM (preprint
version in PDF format)
- Richardson, C., H. Derry, D. Hansen, and P. Resnick, Adolescents
Searching for Health Information on the Internet: An observational study.
Journal of Medical Internet Research, Oct. 17, 2003. 5(4):e25.
- Executive Summary for the project available from the Kaiser
Family Foundation site
- Appendices available from the Kaiser
Family Foundation site. These provide the details about testing dates, configurations,
etc., as well as some additional results about Google's filtered search engine.
Selected Media Coverage:
Data and Analysis
The main data set
The main data set includes one row for each site that was tested. Some URLs
appear multiple times; these duplicates should be removed for certain kinds of
analyses, as was done in the analysis files below. The data dictionary file
provides some useful information about how to interpret the columns.
Analysis: Stata log files
- Analysis of search results, filters.log
(eliminates duplicate URLs returned from more than one search)
- Analysis of search results by search string, filterss.log
(the numbers from the JAMA article and the executive summary that are broken
down by search string come from here; eliminates duplicate URLs only if they
were returned by more than one search engine for a single search string,
allowing analysis of results by search string)
- Analysis of top health sites, filterb.log
(eliminates duplicate URLs found in more than one category at Yahoo or
Google)
- Analysis of search engine performance, filterse.log
(not reported as part of the main study; analyzes the number of health, porn,
and other sites returned by different search engines when given health
search terms)
- Analysis of Google filtered search, as compared to Google regular search, filtersearchprep.log
and filtersearch.log
(not reported as part of the main study; reported briefly in an online appendix
available from the Kaiser Family Foundation site)
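
The log files above differ mainly in how duplicate URLs are dropped before counting. As a rough illustration only, the two rules used for the search results (one row per URL overall, versus one row per URL within each search string) might be sketched in Python as below. This is not the Stata code from the log files. The column names (url, search_string, engine) and the sample rows are hypothetical; consult the data dictionary for the real column names.

```python
def drop_duplicates(rows, key_fields):
    """Keep the first row seen for each distinct combination of key_fields."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Hypothetical rows: the column names and values are invented for illustration
# and do not come from the actual data set.
rows = [
    {"url": "http://a.example/", "search_string": "diabetes", "engine": "google"},
    {"url": "http://a.example/", "search_string": "diabetes", "engine": "yahoo"},
    {"url": "http://a.example/", "search_string": "depression", "engine": "google"},
    {"url": "http://b.example/", "search_string": "depression", "engine": "yahoo"},
]

# Rule 1: one row per URL, regardless of which search returned it.
unique_urls = drop_duplicates(rows, ["url"])

# Rule 2: one row per URL within each search string, so a URL returned by
# several engines for the same string counts once for that string.
unique_per_string = drop_duplicates(rows, ["search_string", "url"])

print(len(unique_urls), len(unique_per_string))
```

Under the second rule a URL can still appear once for each search string that returned it, which is what allows results to be broken down by search string.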
Cached web sites
Cached copies of the first two levels of web sites, as of the time that
searches and filtering were done, are available to researchers who wish to do
follow-up analysis. The cache is very large (more than 1GB zipped) and contains
both copyrighted and pornographic images, so we are not making it available for
general download. Please send email if
you have a legitimate research use for this data set, and we will arrange to
send it to you.
The Study Team
University of Michigan School of Information
- Paul Resnick, project director
- Derek Hansen
Department of Family Medicine, University of Michigan Medical School and VA
Health Services Research and Development Service
University of Michigan Health Media Research Laboratory
- Holly Derry
- Ian Jones
- Mike Nowak
- Ed Saunders
- Vic Strecher
Department of Biostatistics, University of Michigan School of Public Health
Kaiser Family Foundation