This project was the first enterprise-wide ACSI application and probably the largest enterprise Web evaluation project to date in the US government. The project implemented the largest number of ACSI surveys (55) at any one government agency. Other agencies using the ACSI have multiple measures but in smaller numbers; for example, the Centers for Medicare and Medicaid Services are using 20, the US Department of State is using 15, the US Department of Agriculture uses 9, and the US Department of the Treasury uses 8 (personal communication, Ron Oberbillig, Federal Consulting Group, US Department of the Treasury, April 16, 2007).
The trans-NIH ACSI project met all of the original study and evaluation goals: a broad cross-section of NIH websites participated; the trans-NIH project leadership team drew from several NIH organizations and functioned very well for the 2-year project duration; NIH Web staff attendance at quarterly meetings was good to excellent; the project evaluation methodology was well designed, funded, and fully implemented; and the evaluation itself succeeded in identifying useful information on the site-specific and trans-NIH impacts of using the ACSI, as well as in assessing the success of the project as a whole.
Multimedia Appendix 5 is a PowerPoint presentation highlighting select evaluation and trans-NIH results, presented at the last trans-NIH meeting to be held as part of the project (October 2006). Multimedia Appendix 4 is a PowerPoint presentation discussing the enterprise-wide approach, presented at the Federal Consulting Group’s ACSI Web Survey Group quarterly meeting (March 2007).
A majority of participating website teams reported significant benefits and new knowledge from the ACSI survey results and from being involved in the overall project process. The more experienced and better funded so-called “power users” among the participating NIH websites were able to use the ACSI as a ready-to-use customer satisfaction metric that provided pre-approved OMB clearance (a major advantage in streamlining the start-up process) and as a tool for incorporating custom questions into the survey in order to identify specific website issues and problems. Power users also employed the ACSI results as a source of information about site visitor demographics and as a means to analyze the satisfaction levels and information retrieval results of visitor subgroups to identify needed site improvements. The power users utilized the ACSI as a source of information for planning any follow-up or parallel work involving additional evaluation methods and as an archive of survey data for future use and analysis in website redesign and information enhancements.
These power users were able to apply the ACSI survey results to benchmark their particular NIH websites against other government and private sector websites and to gain insights about and opportunities for improving their Web presence through site-specific feedback. The ACSI results allowed power users to respond more quickly and effectively to the ever-evolving and changing Web environment and to help determine the impact of website changes and evaluate whether Web-based information dissemination programs are performing significantly better or worse over a defined period of time.
As a group, the participating NIH websites performed very well overall against US government and private sector benchmarks. The power user NIH websites—again, typically the larger and more heavily used, staffed, and funded websites—tended to have higher satisfaction scores than other participating websites. These websites also were more likely to use several evaluation methods in order to triangulate results and obtain more complete inferences and interpretations. However, with all NIH websites included, the NIH-wide average satisfaction score exceeded the government-wide average from the beginning of the project until the end.
As a consequence, NIH as a whole, and some individual NIH organizations, received significant positive media coverage of their Web performance during the course of the project [53-57]. Also, NIH received the first ever e-government award from the Federal Consulting Group / US Department of the Treasury (the Customer Performance Achievement Award), conferred by the OMB Administrator for Electronic Government and Information Technology in recognition of the success of the trans-NIH ACSI project.
Websites varied in their ability to implement the ACSI and utilize results. The majority of participating websites were able to implement the ACSI and receive survey results, including satisfaction scores. Some sites implemented the ACSI but did not collect enough completed surveys to produce satisfaction scores, either because of low traffic on the website or because the ACSI was implemented too late in the study. However, these sites were able to obtain the results of their custom questions. The ACSI, like any other online user survey, does not work well with low-traffic websites: it simply takes too long to obtain the minimum sample needed for statistically significant results.
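The arithmetic behind this limitation can be sketched roughly as follows. This is an illustration only, not part of the project methodology: the function name, the intercept rate (the share of visitors shown the survey invitation), and the minimum of 300 completed surveys are assumptions introduced here. Only the response rate of roughly 5% reflects the project experience, and it is treated here, again as an assumption, as the share of invited visitors who complete the survey.

```python
import math


def days_to_minimum_sample(daily_visitors, intercept_rate, response_rate, min_completes):
    """Estimate the days needed to collect `min_completes` completed surveys.

    All parameters are hypothetical inputs for illustration:
      daily_visitors -- average visitors per day
      intercept_rate -- fraction of visitors shown the survey invitation
      response_rate  -- fraction of invited visitors who complete the survey
      min_completes  -- completed surveys required before scores are reported
    """
    completes_per_day = daily_visitors * intercept_rate * response_rate
    if completes_per_day <= 0:
        # No completed surveys are ever collected.
        return math.inf
    return math.ceil(min_completes / completes_per_day)


if __name__ == "__main__":
    # Hypothetical traffic levels; 300 completes and a 50% intercept rate are assumed.
    for visitors in (200, 2_000, 20_000):
        days = days_to_minimum_sample(visitors, 0.5, 0.05, 300)
        print(f"{visitors} visitors/day: about {days} day(s) to reach the minimum sample")
```

Under these assumed values, the collection time falls in proportion to traffic, which is why a late start compounds the problem for a low-traffic site.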
Due to the large number of websites involved, the trans-NIH project, out of necessity, implemented the ACSI in stages, determined in part by the degree of readiness of each website to participate. This generally meant that the more experienced better-staffed websites (including sites that had been pilot testing the ACSI) fully implemented the ACSI earlier and had more time to collect survey results. Other sites were not ready to implement the ACSI until late in the project. In addition, some sites that dropped out were replaced by others late in the project. The late starters in some cases did not have sufficient time to generate enough completed surveys.
Website teams that used the ACSI the longest tended to be satisfied with and find value in its use, especially for planning site changes and comparing versions of the website before and after revisions or redesigns. Teams with relatively later start dates and/or slow rates of collecting completed ACSI surveys were more likely to be dissatisfied with the ACSI because they did not have sufficient time or opportunity to receive and/or act on ACSI survey results.
Relative inexperience in using the survey may also have been related to perceived value because of the complexity of the survey results. The ACSI, unlike simpler survey methods, generates multidimensional results based on both standardized and custom questions. Segmentation of results, while analytically powerful, can also be daunting to the inexperienced.
In addition to time and experience, this project experience suggests that other key factors drive successful use of the ACSI and, by extension, similar online survey methods: staff and management buy-in, adequate resources, staff training and understanding, the website design cycle, and technical support.
Across all participating NIH websites, the Web teams derived substantially greater value from their custom question data and from segmentation data (breaking out results by specific types of visitors, information seeking goals, demographics, etc) than from the standardized ACSI questions. The custom question data provided many Web teams with valuable insight about visitor profiles and visit characteristics. For example, through cross-correlations between responses to custom and standardized questions, Web teams were able to identify visitor subgroups that were less satisfied and highlight needed website improvements. Many teams also took advantage of having a continuous source of customer feedback to track visitor responses (as reflected in satisfaction scores) to website improvements implemented in response to ACSI data.
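A minimal sketch of this kind of segmentation, assuming each completed survey is available as a simple record, might look like the following. The field names (`visitor_type`, `satisfaction`), the helper names, and the sample records are all invented for illustration; they are not ACSI data and do not reflect the ACSI scoring model, which is more involved than a simple average.

```python
from collections import defaultdict


def mean_score_by_segment(responses, segment_key, score_key="satisfaction"):
    """Average a numeric score within each subgroup defined by a custom question."""
    totals = defaultdict(lambda: [0.0, 0])  # segment -> [sum of scores, count]
    for response in responses:
        bucket = totals[response[segment_key]]
        bucket[0] += response[score_key]
        bucket[1] += 1
    return {segment: total / count for segment, (total, count) in totals.items()}


def least_satisfied_segment(responses, segment_key, score_key="satisfaction"):
    """Return the subgroup with the lowest average score."""
    means = mean_score_by_segment(responses, segment_key, score_key)
    return min(means, key=means.get)


if __name__ == "__main__":
    # Made-up records, for illustration only.
    sample = [
        {"visitor_type": "patient", "satisfaction": 80},
        {"visitor_type": "patient", "satisfaction": 70},
        {"visitor_type": "researcher", "satisfaction": 60},
    ]
    print(mean_score_by_segment(sample, "visitor_type"))
    print(least_satisfied_segment(sample, "visitor_type"))
```

In practice, a subgroup with few responses would need to be interpreted cautiously before driving any site change.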
The ACSI, like all online surveys in the Web environment, has relatively low response rates (typically about 5%, but ranging from 3% to 7%). The ACSI uses random intercepts and several cross-checks to help assure that nonresponse bias is minimized, but such bias is still a concern and warrants greater attention in the academic and survey research communities. NLM, NCI, and NHLBI, three of the participating NIH organizations, had used online surveys for several years prior to the ACSI. The prior surveys placed greater emphasis on the custom questions and less on standardized questions or benchmarking. Comparison of results about site visitors between the prior surveys and the ACSI results for several websites (eg, MedlinePlus, AIDSinfo, and TOXNET at NLM, and the NHLBI website) indicated that similar results were obtained between the earlier surveys and the ACSI surveys [22, 23; personal communication, Cindy Love, April 30, 2007; personal communication, Mark Malamud, October 9, 2007]. This suggests that the ACSI survey results can be considered reasonably valid, and not unduly affected by nonresponse bias, unless there are undetected sources of nonresponse bias affecting all surveys over an extended time frame.
However, it is best not to rely too heavily on any one Web evaluation methodology. As noted earlier, a multidimensional approach is warranted and has been adopted by the more experienced, better-funded NIH websites. The survey of NIH Web teams indicates that 21 of the participating teams practice, to varying degrees, a multidimensional approach. In addition to the ACSI, during the time of the trans-NIH project, 19 of the 21 websites also used Web log software, 18 used usability testing, 11 used expert or heuristic reviews, 4 used other types of surveys, 4 used focus groups, 3 used audience measurement and profiling, and 1 indicated other.
Conclusions
The trans-NIH leadership team believed in the importance of Web evaluation going into the trans-NIH ACSI project and was motivated to make the ACSI available to a broad group of NIH websites. The hope was to significantly increase the use of online customer surveys, the ACSI being a particular variant of the general class, within the NIH Web community. Further, the hope was that the project would not only increase NIH staff understanding of the value of this and other forms of Web evaluation, but also strengthen the management and financial support for Web evaluation at NIH.
The project was successful in increasing the use of and interest in online surveys and enhancing the understanding of the strengths and limitations of such surveys. A majority of participating websites found considerable added value in the survey process and results. However, many of the Web teams gave a clear indication in the project evaluation survey that notwithstanding the benefits, it was uncertain or questionable whether they would be able to fund the modest (US $20,000 or so per year per website) cost of renewing the ACSI from their own funds if central NIH funds were no longer available. As it turned out, central funding was not continued beyond the 2-year project life of this trans-NIH project, and each participating NIH website had to make its own decision whether to continue, and, if so, find its own funding to do so. The result was that only about one quarter of the NIH websites renewed their ACSI license, and half of those renewals were the early experimenters who had been using the ACSI for the longest time.
For this trans-NIH project, the US $18,000 survey license fee per website was considered to be competitive with other online survey options in terms of cost and to offer a better value added per dollar when considering the other benefits of the ACSI. For those websites wishing to continue, the FCG and ForeSee Results offered an ACSI “lite” version at US $15,000 (compared to US $25,000 for full service), but even at that price point there were relatively few renewals.
The NIH was fortunate to have the support of the Evaluation Set-Aside Program for the trans-NIH ACSI project. Much was learned, and many websites received significant added value, in their own estimation. But this was an experiment, not an ongoing operational activity. Without central funding, only the more experienced better-resourced larger websites, for the most part, continued with the ACSI.
Thus, a final lesson learned from the trans-NIH ACSI project experiment is the tenuous nature of Web evaluation in the age of e-government, when OMB and departmental policies are placing ever greater emphasis on Web-based delivery of government information and services. A parallel commitment to adequate evaluation of those Web-based activities may well be needed in order to help assure that the potential of the Web and other information technologies to improve customer and citizen satisfaction is fully realized.