Web Performance Delivered: How to Get Effective Feedback and Seamless Monitoring
About the Webcast
When you're delivering 365 million emails in a single day, you may not think web performance really matters all that much.
But for Constant Contact, web performance is the only thing that matters—at least when one of their 600,000 customers needs to launch a holiday email campaign that is vital to their business.
Michael Monteiga shares how Constant Contact has crafted a feedback loop that keeps everyone (Operations and Development Teams) in sync about the state of their consumer website, SaaS application, and developer APIs.
Watch to find out about the technology stack at Constant Contact and the ways they monitor it from their customers' perspectives for reliability and improvement.
- The role of the network operations center
- What "no-blame" means and how to do it successfully
- Which metrics and tooling are used to manage performance
- Why performance data can make you happy
- How to mash data from multiple sources and get even happier
- When monitoring happiness becomes contagious
Learn more about Keynote Web Monitoring.
Hi everyone, thank you for joining us. My name is Katie Fallon, I'm with O'Reilly Media, and I will be your host for today's webcast. We would like to begin today's presentation by saying thank you to our sponsor, Keynote, and let you know that Keynote is trusted by some of the world's best-known brands to help them make every digital interaction count. Now a part of Dynatrace, Keynote offers cloud-based testing, monitoring and performance analytics solutions for web and mobile. Their customers analyze more than 700 million mobile and website performance measurements each day to keep their apps and sites both reliable and delightful for users. Thank you again, Keynote.
Our speaker today is Michael Montega, presenting Web Performance Delivered: How to Get Effective Feedback and Seamless Monitoring. Michael is currently a NOC analyst for Constant Contact, a technology company that champions the needs of smaller organizations to build successful, lasting customer relationships. Formerly, Michael was a computer programmer for the U.S. Air Force and held various support-related roles within the financial and software industries, with a focus on monitoring, service management, incident management and problem management.
I will turn the program over to Mike in just a moment, but first let me go over a few housekeeping things to help you get the most out of today's webcast. You will want to open the group chat widget if you haven't already done so. This is where we will interact with each other and where you will submit your questions. We find that our audience usually has a lot of good knowledge to share, so we encourage you to chat freely during the event. However, if you have questions for Mike, please preface them with a capital letter Q so we know it's for him, and we will make sure that he sees it when he is ready to review questions and answers. You can also open, move and resize any other widgets. If you would like to tweet from the Twitter widget, you will need to give us permission to access your account. Our hashtag today is #velocityconf, and the Twitter widget will automatically add it to your tweet so you don't have to. If you have any problems during the event, we encourage you to look at the help widget, which is very thorough, but if you continue to have problems, post in the chat room and one of our staff will help you. And remember, the best thing you can do for a good audio stream is close any apps that could be interfering. We are recording this event, and we will send an email to everyone who registered when the recording is available, which is usually within 48 hours. At this time I would like to turn the program over to Mike for his presentation. Hi Mike.
Hi, this is Mike Montega coming at you from Waltham, Massachusetts, headquarters for Constant Contact. We are a Keynote customer, and I'm going to go through a slide deck that we put together. We are going to go over Constant Contact and the challenges of being an email marketing firm, an overview of our NOC (network operations center), how our NOC and Constant Contact use Keynote, and what we are looking forward to doing.
So, who is Constant Contact? Well, Constant Contact is a company that helps small businesses do more business. A small business could be the bakery down the street that employs four people, and all they want to do is sell muffins. It could be the local yoga studio, or it could be a nonprofit. One example of a nonprofit is a group called More Than Words, which provides job training for at-risk youth. The training they give is through a café that they have, so the youth can learn barista skills; they write a newsletter, which is sent out via Constant Contact software; and they also have an online book business.
We have been revolutionizing the success formula for small businesses and nonprofits since 1998. We have over six hundred thousand customers worldwide, and we have award-winning coaching and product support. We offer an all-in-one marketing platform that helps these small businesses drive repeat business and find new customers through newsletters, announcements, different offers, online listings, event registrations and such. Some interesting facts: we have reached 331 million dollars in revenue. We have sent, on behalf of our customers, 365 million emails in a day. We have peaked at 9,400 emails delivered in 1 minute. We have sent about 68 billion emails on behalf of our customers. We are at around 1,400 employees and we continue to grow.
We have offices in Waltham, Massachusetts; Loveland, Colorado; Boca Raton, Florida; the Tribeca area in New York City; Battery Park in New York; and San Francisco, California. We have a presence in the United Kingdom as well. Our customer base is as follows: about 85 percent of our customers have fewer than 25 employees, and about 65 percent have ten or fewer employees.
Now I'm going to give an overview of what the NOC responsibilities are here at Constant Contact. Like most NOCs, we are put into action mainly when things go wrong. We are a little bit unique and are sort of a hybrid NOC. We work as a liaison between the product delivery teams, coordinating monitoring, trending, admin procedures, and tech and app support. We coordinate and take part in communications with third-party vendors. We create incident tickets. We perform initial triage; we resolve the issue in some cases, or if it's really that bad of a situation, we escalate to the appropriate teams. We are responsible for our outage notifications. We establish conference calls between ops and development teams, and we send out notifications for production issues based on guidelines that we have. We host daily meetings. We review critical and major incidents. We review the next 24 hours as well as the past 24 hours. We offer proactive monitoring on all of our products and services. We provide first-level support. We are also responsible for incident management. We champion the post-mortem process and outage notifications. We deal with problem management. We support just about all maintenance that is going on. We coordinate activity, and we measure the impact on customer experience using Keynote.
Keynote has what they call agents all over the world. Think of them as machines in different geographic locations. Within North America we use 25 agents, and outside of North America we use 11 agents. Within the NOC, we do application monitoring, website user interface monitoring and API monitoring. We have a measurement repository, and we use Keynote measurements during our post-mortems. We also use some of those measurements for daily, weekly and year-end reporting. Within Constant Contact, the NOC is not the only team to use Keynote. Some of our development teams use it, such as the teams behind our different APIs; network engineering uses it for ISP-specific issues that may come up and for general network troubleshooting; and the website team uses it and gets reports daily and weekly. Our use of Keynote within the organization keeps growing.
So, we are going to pose a question out there as a poll, and you may see it come up on your screen now or very soon. Simply answer yes or no: are you currently a Keynote user? I will give you about 45 seconds to respond. This is also a good moment to add any questions on what you have heard so far, whether it be a question about Constant Contact in general or about the NOC at Constant Contact. So, when we look at the results of the poll, it looks like about 90 percent of the listeners are not currently using Keynote and about 10 percent are.
Happiness is something we all want to achieve, but for us in the NOC at Constant Contact using Keynote, happiness is having our measurements the way we want them, and for you it may be your measurements, the way you want them. So, a simple case study: we have an environment consisting of 12 different cells. We know if our overall service is down, but is there a way we can have insight into each individual cell? Well, there is, through Keynote. You can see the graph that is up on the screen right now, and it shows an example of 12 separate cells. You can see performance statistics and you can see availability, but it doesn't show you over how long or how many measurements. Of course, management likes pictures, so we can show them more data. We can have commingled performance and we can have commingled availability, but what do those lines mean?
Management may say, wow, that is great, but that sure is a lot of screens to look at. Well, we can solve that: one screen with a combination graph. Although it is very small on the screen right now, we can see in the upper right the first slide that was shown, and the other two slides that were previously shown are on there as well. We find that a very useful tool here when trying to get data or statistics out. On that screen you can toggle individual measurements as well, so if you want to see one or two cells, you can compare them very easily. You can also see some outliers. Someone may ask if it is really taking that long, and it may just be one check in Sydney, Australia, going halfway across the globe. Can we specifically pinpoint service disruptions? Yes. Unfortunately it happens, and we do get disruptions in service just like everyone else. We can pinpoint them, and disruptions are easy to see. Good old-fashioned red is bad and green is good; even though we are not zoomed in, we can see that section of red in this display from afar. We can pinpoint specific time ranges as well, and in a table we sort by time.
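The idea of pinpointing a time range from a time-sorted table can be sketched in a few lines of Python. This is only an illustration: the (timestamp, ok) pair layout and the function name failure_windows are assumptions, not a Keynote export format, and the sample data is made up.

```python
from datetime import datetime

def failure_windows(measurements):
    """Group consecutive failed checks into (start, end) windows.

    `measurements` is an iterable of (timestamp, ok) pairs. This shape is
    hypothetical and is not a Keynote data format.
    """
    windows = []
    start = last = None
    for ts, ok in sorted(measurements):
        if not ok:
            if start is None:
                start = ts          # first failure of a new window
            last = ts               # extend the current window
        elif start is not None:
            windows.append((start, last))
            start = last = None     # a passing check closes the window
    if start is not None:
        windows.append((start, last))
    return windows

# Made-up sample: one check per minute, failing for part of the run.
day = datetime(2015, 1, 1)
sample = [(day.replace(hour=6, minute=m), not (6 <= m <= 10)) for m in range(15)]
```

Calling failure_windows(sample) returns a single window for the stretch of failing checks, which is the same information the red section of a display conveys at a glance.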
If you look at this screen shot right here, we can see that at 6:06 we received an internal server error and that status continued up until 6:10. Here at Constant Contact, we make several API calls per hour to Keynote, and we take our measurements and ingest them. Our scripts insert measurements into our local database, and each transaction gets its own table. We have approximately 80 tables at this point, and we have one table for Keynote agents to use for querying. We also have custom searches outside of the Keynote UI. That gives us the ability to do deep dives and go back quite a ways. We have two years' worth of data and over 31 million rows. So, that brings us to: what is that data, why do we have it, where are we going and what are we going to do now? Well, one thing we can use it for, and other people seem to find it very useful, is trending. Here is an actual example that we noticed when we did our monthly report: latency on one of our websites was increasing. So, we took a deeper dive and did monthly averages on items such as first page, interactive page and the number of page bytes. If you look at this slide, you can easily see that the page bytes were increasing along with latency. Trending can also help us with where we want to be. We need to look at trending; sometimes it can lead to other discoveries, as we can see here. One benefit of ingesting the data is that we can take our data from a synthetic source, Keynote, and our real user throughput, and lay them on top of each other.
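The monthly averaging described above can be sketched as a single query over one per-transaction table. This is a minimal sketch under stated assumptions: the webcast does not say which database is used, so SQLite stands in here, and the table name (homepage), the column names and the sample rows are all hypothetical.

```python
import sqlite3

def monthly_averages(conn, table):
    """Return (month, avg latency in seconds, avg page bytes) rows, oldest first."""
    # Table names cannot be bound as SQL parameters, so validate before formatting.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    query = f"""
        SELECT strftime('%Y-%m', measured_at) AS month,
               AVG(latency_s)  AS avg_latency_s,
               AVG(page_bytes) AS avg_page_bytes
        FROM {table}
        GROUP BY month
        ORDER BY month
    """
    return conn.execute(query).fetchall()

# Hypothetical schema and made-up measurements, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE homepage (measured_at TEXT, latency_s REAL, page_bytes INTEGER)"
)
conn.executemany(
    "INSERT INTO homepage VALUES (?, ?, ?)",
    [
        ("2015-01-10 09:00:00", 2.0, 900000),
        ("2015-01-20 09:00:00", 3.0, 1100000),
        ("2015-02-10 09:00:00", 4.0, 1500000),
    ],
)
rows = monthly_averages(conn, "homepage")
```

Putting average latency and average page bytes in the same row per month is what makes it easy to see the two rising together.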
So, in this screen shot here, we can see that from 9:00 a.m. up to a little bit past 2:00 p.m. our requests per minute have increased. Overall, other than some of those spikes in our latency, we can see that it stays fairly steady even though requests per minute increase. At times the NOC is asked for specific data on demand, whether from a dev team or from a management team, and we are able to give that to people with an easy click of a URL, which just hits our internal site. We can give them data such as this: if they have one item of data, we give them an average over time, or we can give them multiple on-demand metrics. We can also present some data in a classic table. One interesting request that we had was: give us our five fastest and five slowest points within the last four hours for one of our websites. I think everyone can see that in there, and it is interesting to see which geographic locations are fastest for us and slowest for us. Looking forward, we want to allow other teams access to the Keynote UI. Even though we are very happy to offer data tables or URLs with data on request, we think we need to offer Keynote access so that people can go on at will, whenever they want, and see what is going on with their specific applications. We are looking more into mobile. That market is starting to grow for us. From a NOC perspective, we are not charged with monitoring mobile transactions, at least not yet. We are going to continue to get the best value out of our measurements and keep diving into them to see what they mean and what our trends are, short term and long term. Now I would like to open up this section for any questions people may have. Katie, if you could come back on line to facilitate questions, that would be great.
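The "five fastest and five slowest" request mentioned above amounts to averaging per location and taking both ends of the ranking. A rough sketch, assuming measurements arrive as (location, seconds) pairs already filtered to the last four hours; the function name and the data shape are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def fastest_and_slowest(samples, n=5):
    """Return (fastest, slowest) lists of (location, mean seconds).

    `samples` is an iterable of (location, seconds) pairs, assumed to be
    already limited to the time window of interest.
    """
    by_location = defaultdict(list)
    for location, seconds in samples:
        by_location[location].append(seconds)
    ranked = sorted(
        ((loc, mean(vals)) for loc, vals in by_location.items()),
        key=lambda item: item[1],
    )
    # Slowest list is reversed so the worst location comes first.
    return ranked[:n], ranked[-n:][::-1]
```

With fewer than n locations the two lists simply overlap, which is acceptable for a quick on-demand report.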
Sure, we have a few questions already. I just want to remind everyone who is on the call to open up your group chat widget, if you haven't already done so, and you can submit your questions for Mike through there. Just put a Q in front of it so that we know it's for him. A question from Stephen: what technology is being used to compare Keynote metrics against data sourced elsewhere, such as requests per minute?
Stephen, that is a great question. For requests per minute, we use New Relic, and we take that data and put it into Graphite along with our Keynote data, at least for that overlay that you saw earlier, and we just have it on a screen in the NOC that keeps updating every minute, I believe. Next question.
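For readers unfamiliar with Graphite, its plaintext protocol accepts one metric per line in the form "path value timestamp", by default on TCP port 2003. The sketch below shows only that general mechanism; how Constant Contact actually feeds Graphite is not described in the webcast, and the metric path and host name here are hypothetical.

```python
import socket
import time

def graphite_line(path, value, timestamp=None):
    """Format one metric as a Graphite plaintext line: 'path value timestamp'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"

def send_metrics(lines, host="graphite.example.internal", port=2003):
    """Send formatted lines to a Carbon plaintext listener.

    The host name is a placeholder; 2003 is Carbon's default plaintext port.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))
```

Writing a synthetic latency series and a requests-per-minute series under sibling paths is one way two sources could end up on the same graph.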
Terry asks: cell sounds like a domain-specific term. When you have the opportunity, can you define it?
Sure, a cell is just a silo, a way of dividing up one of the services that we have here, which is pretty interesting in a way. I believe in the screen shot we broke it down into 12 cells. If one cell goes down, it doesn't give us a 100 percent outage; it gives us a 1/12th outage for that specific service. So, it's just the term we use in house here, and we could call it a silo or anything else that would mean the same thing.
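The arithmetic behind the 1/12th figure can be written out as a tiny sketch. It assumes each cell carries an equal share of the service, which the webcast implies but does not state; the function name is hypothetical.

```python
def service_availability(cells_up, total_cells=12):
    """Fraction of the service still available, assuming equal-sized cells."""
    if not 0 <= cells_up <= total_cells:
        raise ValueError("cells_up must be between 0 and total_cells")
    return cells_up / total_cells

# One of 12 cells down: the outage is 1/12 of the service, about 8.3 percent.
impact = 1 - service_availability(cells_up=11)
```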
How do you monitor your third-party suppliers such as CDN or plug-in providers?
We use Akamai here, and we just have basic health checks against Akamai. Other than that, when some third parties have issues, we can see them in the waterfall graphs that we have in our Keynote monitoring. A lot of times, we get calls from customers. Of course, the goal of any organization such as ours is to be able to identify issues before they happen, but the customers are using the product constantly. I would say a good percentage of the time, we do hear from the customer first.
Do you utilize real user monitoring as well as Keynote?
We are currently looking for a solution in Keynote/Dynatrace - it's one of the solutions being looked at and considered, but that is not within the realm of my responsibilities.
Gregory wants to know if you find Keynote six usable or do you just stick with Keynote five.
That is a great question, and I think it's great because the version is not even something I pay attention to. We just keep using it and we keep putting more measurements in, and it gives us what we need, so I have not even considered having to upgrade or checking whether we have upgraded.
How do you work with your development team to deal with issues that you find in the NOC?
Well, here at Constant Contact the development teams are many, and we have constant communication with them whenever there are issues. There are no walls to break down; we have escalation lists and chat rooms that we use. Or we can just walk over to someone's desk. It's a very open environment; like I mentioned, there just aren't many walls anywhere here. It's an interesting culture that is very relaxed, and the customer is always first. That is just the culture here.
How correlated are the issues you catch in the NOC with releases or change?
Can you repeat that question?
Is this regarding Keynote or outside of Keynote? Of course I can't get an answer because I'm just talking to the phone. So, in Keynote, we only see bumps in latency with releases. It is very, very infrequent that a change or a release done at Constant Contact causes a major outage of a service. I believe that in 2014 we had over 500 deployments, and I think we had 10 or 11 that caused any sort of bump in a production service.
Ronald asks, is Keynote useful for troubleshooting down to a specific cause of a problem?
Yes. We had an issue sometime in the past few months where Adobe had released a deployment that caused some circular action in our Test and Target solution. It wasn't very evident that there was a cyclic issue happening, but when we looked into Keynote and pulled up our waterfall graphs, it was evident right away. Normally we would see about 30 assets in one specific waterfall graph, and we saw over two thousand. It was easy to identify that there was a loop happening, which of course was causing distress to customers and to our services. Our development team, who happened to have a relationship with Adobe, then reached out to them, and Adobe took care of the issue.
Well, that looks like all of the questions that we have today, Mike, so, I'd like to just thank you so much for presenting such an outstanding webcast to our audience and sharing your knowledge and expertise with us.
You are very welcome.
Before we close out, we would like to once again thank our sponsor, Keynote, and let you know that Keynote is trusted by some of the world's best-known brands to help them make every digital interaction count. Now a part of Dynatrace, Keynote offers cloud-based testing, monitoring and performance analytics software solutions for web and mobile. Their customers analyze more than 700 million mobile and website performance measurements each day to keep their apps and sites both reliable and delightful for users. Thanks everyone for joining us today, and we would like to invite you to visit O'Reilly and take a look at all of our upcoming webcasts. We have lots of events coming up and they are all free. You can also sign up for our newsletter, which comes out every week and highlights what is happening at O'Reilly webcasts. We would also like to remind everyone that our Velocity conference is coming up October 12th through 14th in New York City. If you have enjoyed this webcast and you are interested in these topics, Velocity is probably the conference for you. I'm going to put the URL for the conference in our group chat right now so you can take a look, and hopefully we will see you there. Thanks again for joining us; this will conclude our webcast today. Goodbye everyone.
Duration: 30 minutes