Testing Inside the Box: Getting Down to Downloads
By Product Management | July 22, 2013
In my last post, Big Picture Before the Deep Dive, I shared the big picture view I got after I tested the native Facebook app on an iPhone 4S. I clicked on a point in the graph, and showed you part of the detailed activity report.
Given that the scatterplot showed us that the average time to launch, upload a photo and refresh the newsfeed was 15 seconds, a user experience time of 36 seconds for three basic steps is unacceptable. (That user experience time, remember, excludes user swipes and taps and so on: this is all just data transfer time.) In order to diagnose and fix the issue, I have to know which of the many connections the mobile app makes is failing. That's what the rest of the report shows me.
Welcome to Application Activity – results from inside the box.
At first glance it looks a lot like a web waterfall chart, but it isn't one. Remember, mobile apps aren't about how quickly they download files to build a browser page or transfer data. They're about how well they make, maintain, and end connections with a number of servers. So the column at the far left, which in a web waterfall lists downloaded files, here lists each connection the mobile app makes. And not just connections to a server: you'll see that the launch phase also involved executing HTTP and API calls, no doubt to manage the login process.
The chart represents activity over time. I'm not sure if you can clearly see the color legend at the top right (click on the image to see a larger version), so let me point out three things.
Above the main panel, the three colored horizontal bars indicate the time spent on app launch (purple), photo upload (green) and newsfeed update (light blue).
In the main panel, the orange lines indicate data sent by the device and the dark blue lines data received by the device. The longer the line, the more time the transfers took.
At the far right are the actual amounts of data transferred for each connection.
Given just that brief explanation, it probably doesn't take you any longer than it did me to unmask the performance culprit: photo upload. Another quick look in the main panel shows us which data transfers took the most time (the longest orange and blue lines, which I circled). Following that row to the far left uncovers the server that's misbehaving. And since I know the IP address of the device, I can quickly home in on that device in the server logs to uncover backend issues.
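If you had the same per-connection data as rows rather than as a chart, the reading I just described could be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `Transfer` record, its field names, the host names and every number in the sample are made up for the example, and are not the report's actual export format or the real test results.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    """One colored line in the chart (hypothetical record layout)."""
    connection: str   # row label at the far left (made-up host names below)
    phase: str        # "launch", "upload", or "refresh"
    direction: str    # "sent" (orange) or "received" (dark blue)
    start_s: float    # seconds since the start of the test
    end_s: float

    @property
    def duration_s(self) -> float:
        # The longer the line, the more time the transfer took.
        return self.end_s - self.start_s


def slowest_connection(transfers):
    """Return (connection, total transfer seconds) for the busiest row."""
    totals = {}
    for t in transfers:
        totals[t.connection] = totals.get(t.connection, 0.0) + t.duration_s
    return max(totals.items(), key=lambda item: item[1])


# Invented sample data, for illustration only.
SAMPLE_TRANSFERS = [
    Transfer("api.example.test", "launch", "sent", 0.0, 0.5),
    Transfer("api.example.test", "launch", "received", 0.5, 2.0),
    Transfer("upload.example.test", "upload", "sent", 3.0, 20.0),
    Transfer("upload.example.test", "upload", "received", 20.0, 24.0),
    Transfer("feed.example.test", "refresh", "received", 25.0, 27.5),
]

if __name__ == "__main__":
    name, seconds = slowest_connection(SAMPLE_TRANSFERS)
    print(f"Slowest connection: {name} ({seconds:.1f} s of transfer time)")
```

Summing sent and received time per row is the code equivalent of looking for the longest orange and blue lines and then following that row to the far left.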
Now, at this point, I don't know what's up with that server—or a related router, load balancer, CDN and so on. But my monitoring has pinpointed, in seconds, that it's the root cause of intermittent performance failures.
None of these connections are seen by web performance tools. If a native app fails and you don't have this inside-the-box visibility, finding the source of the problem is going to take a lot of time, kind of like figuring out what's wrong with a car by driving it instead of by running onboard diagnostics.
There's more yet to understand about what's going on, both inside and outside the box. Web app testers record the app in action during a test, so they can see how it behaved at points of performance failure. Mobile app testers need the same thing: the ability to view their app in action at the point of failure. To do this, we run a video timeline that replays what the app was doing when the problem occurred.
And that's what I'll show you next time.