Lessons in End User Experience from the ACA Exchange Launch
By Aaron Rudger | October 4, 2013
October 1 marked the official start of health care insurance access through the Affordable Care Act on the 17 state exchange websites and the federal government’s new HealthCare.gov site. Reports indicate that citizens around the country have flocked to the sites, overwhelming many of them. The launch of these sites provides a unique case study in WebOps planning and ongoing operations.
Planning for a large “launch” event like Tuesday’s exchange opening is not easy. Keynote has been working with customers in the e-tail, entertainment, publishing and technology industries for years, helping them with “spike” demand readiness. One of the lessons Keynote has learned with our customers is that effective planning is part art, part science and part culture. No new platform release is ever without issue, so the ability to recover and adapt during a big release is as important as the preparation, and largely dependent on it.
One best practice for readiness preparation commonly used in the retail industry is deep analysis of historical user behavior to create models for stress or load testing. Load tests using real traffic from the Internet can be run against a website using these models, simulating a range of user behavior and scenarios.
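As a minimal sketch of how such a model might be built, the Python below turns a list of historical sessions into a weighted mix of scenarios and samples a plan for simulated users from it. The session data, page names and function names are hypothetical and purely illustrative. They are not taken from any Keynote product or from the Exchange sites.

```python
import random
from collections import Counter


def build_scenario_weights(sessions):
    """Convert historical sessions (sequences of page names) into scenario frequencies."""
    counts = Counter(tuple(s) for s in sessions)
    total = sum(counts.values())
    return {scenario: n / total for scenario, n in counts.items()}


def sample_scenarios(weights, n_users, seed=0):
    """Pick one scenario per simulated user, in proportion to its historical frequency."""
    rng = random.Random(seed)  # fixed seed so a test plan can be reproduced
    scenarios = list(weights)
    probs = [weights[s] for s in scenarios]
    return rng.choices(scenarios, weights=probs, k=n_users)


# Hypothetical history: each tuple is the path one past visitor took through a site.
history = [
    ("home", "search", "product", "checkout"),
    ("home", "search"),
    ("home", "search"),
    ("home", "product"),
]

weights = build_scenario_weights(history)
plan = sample_scenarios(weights, n_users=1000)
```

A real load test would then drive each sampled scenario against the site from distributed locations. The sketch covers only the modeling step, which is the part that depends on having traffic history.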
The Exchange websites had limited or no traffic history to draw from. We do not know if they attempted to load test their sites with real Internet traffic, but doing so effectively would have been difficult. Our friends at AppDynamics have also discussed other challenges the Exchange sites faced.
However, estimating the anticipated amount of traffic to the home page of the site could have been done with demographic and other forecasting data. We presume most of the Exchange sites did plan and test the website “front door” experience. On this measure, it looks like the Exchange sites have scored quite well. Keynote began measuring the performance and availability of these home pages around 3pm PT on Oct 1, using our network of agents nationwide. In aggregate, they’ve delivered a good user experience (3.21 seconds page load time / 97.7% availability) and, more importantly, shown steady improvement over the past 3 days:
In the first 2 days of monitoring, we noticed that 11 of the 18 sites averaged good page load times, below the industry best practice of 3 seconds. The Connecticut and Federal (HealthCare.gov) sites averaged well below this benchmark, with page load times of 1.93 and 1.65 seconds, respectively.
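To illustrate how figures like these can be derived from raw measurements, here is a rough Python sketch that computes an average page load time and an availability percentage per site, then lists the sites under the 3-second benchmark. The site names and sample values are invented for illustration. Treating availability as the share of successful measurements is an assumption of this sketch and not a statement of how Keynote calculates it.

```python
from statistics import mean

BENCHMARK_SECONDS = 3.0  # the 3-second benchmark cited above


def summarize(samples):
    """Return (average load time, availability %) for one site.

    samples: load times in seconds, with None marking a failed measurement.
    Availability here is assumed to be the share of successful measurements.
    """
    ok = [s for s in samples if s is not None]
    availability = 100.0 * len(ok) / len(samples) if samples else 0.0
    avg = mean(ok) if ok else None
    return avg, availability


# Hypothetical measurements, not real monitoring data.
measurements = {
    "site-a": [1.8, 2.1, None, 2.4],
    "site-b": [3.5, 4.0, 3.9, 3.6],
}

summary = {site: summarize(samples) for site, samples in measurements.items()}
fast_sites = [
    site
    for site, (avg, _) in summary.items()
    if avg is not None and avg < BENCHMARK_SECONDS
]
```

Failed measurements are excluded from the average so that an outage shows up in the availability figure without skewing the load time.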
The availability chart above shows strong improvement across the 18 sites, demonstrating their ability to adapt to the fluid production demands since launching October 1. A pattern that emerges when we look at some of the availability dips and the measurement detail is the use of nighttime maintenance to adapt and improve both performance and delivery. Here we see the Vermont site using this strategy to work through issues it has been facing:
In the past 24 hours, the portal.HealthConnect.Vermont.gov site has maintained a very responsive median User Time of 2.69 seconds.
The practice of recovering, adapting and improving site performance and reliability is critical to any site’s success. Rapid re-testing of a site with real Internet traffic is a best practice that can be difficult for operations teams. Hopefully “lean” operational practices will be used to quickly “rinse and repeat” stability testing across these sites. We’re encouraged by what we’ve monitored so far.