Join the IMPACT coaches for a deep dive on a new topic every month in our free virtual event series.

Register Here

A Breakdown of HubSpot’s Outage Retrospective for the Non-Technical User

By Carina Duffy

If you’re a HubSpot user, unless you were on vacation from March 29 until now (if you were, I’m jealous!), you recently experienced one of the most catastrophic outages HubSpot has ever had.

Honestly, it was a mess. Most of the time during product outages you’ll have one or two tools that aren’t working or are having bugs, and it’s resolved within minutes or hours.

This time it was just about everything: emails, form submissions, workflows, lists, sales tools, CRM, imports, analytics. It’s hard to find any pieces of the tool that WEREN’T affected during that time.

What’s even more troubling is that this outage took about 36 hours to be mostly resolved, and the backlog of data from those 36 hours was still being processed at the time this article was released.

Thankfully, during all of this, HubSpot’s crisis communication was solid. Updates were regular and timely (although are they ever frequent enough??) given the scale of the situation.

JD Sherman (HubSpot’s COO) released an article on March 29 with an apology and an outline of next steps for the team -- namely, doing an in-depth retrospective on the cause of the issue and how they’ll make sure it won’t happen again.

That retrospective was delivered on April 4. You can read the full article here. There’s a lot of detail in there about how their systems are structured and what exactly happened. If you’re not into all of the “geek-speak,” we’ve got you covered.

A Quick Review of How HubSpot’s Infrastructure Works

HubSpot uses a combination of software systems -- Kafka and ZooKeeper -- that allow all of the HubSpot tools to talk to each other and all of the data to be processed effectively.

Both these software systems have redundancies and safeguards built into them so that if some servers crash, other servers can pick up the slack, and end users don’t experience any issues.
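If it helps to picture how that kind of redundancy works, here’s a minimal sketch in Python. It is a toy model only: the `Cluster` class and the server names are made up for illustration and are not how HubSpot, Kafka, or ZooKeeper actually implement failover.

```python
# Toy model of redundancy: a request is served by the first healthy server.
# The Cluster class and server names are hypothetical, for illustration only.

class Cluster:
    def __init__(self, servers):
        # Track a "healthy" flag per server; every server starts healthy.
        self.healthy = {name: True for name in servers}

    def crash(self, name):
        # Mark one server as down.
        self.healthy[name] = False

    def handle(self, request):
        # Try servers in order and let the first healthy one answer.
        for name, ok in self.healthy.items():
            if ok:
                return f"{name} handled {request}"
        # Only when every copy is down does the end user see a failure.
        raise RuntimeError("outage: no healthy servers left")


cluster = Cluster(["server-a", "server-b", "server-c"])
cluster.crash("server-a")
print(cluster.handle("form submission"))  # server-b picks up the slack
```

As long as at least one server is healthy, the request still gets handled, which is why a single crash normally goes unnoticed by end users.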

So What Broke?

It’s a bit difficult to explain without getting super technical, but think about it like a series of unfortunate events.

High strain was put onto ZooKeeper, causing parts of it to crash. Typically, ZooKeeper recovers quickly, but in this case, it took several minutes. The delay in recovery then broke the communication between ZooKeeper and Kafka, causing Kafka to crash.
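For the curious, the idea of a slow recovery knocking over a dependent system can be sketched like this. It is a simplified illustration, assuming the dependent system gives up once its coordinator has been unreachable for longer than some timeout; the function name and the 30-second value are hypothetical, not HubSpot’s real settings.

```python
# Toy model of a cascading failure between a coordinator and a system that depends on it.
# The timeout value and function name are assumptions for illustration only.

SESSION_TIMEOUT_SECONDS = 30  # hypothetical value, not a real HubSpot setting


def dependent_survives(coordinator_downtime_seconds, timeout=SESSION_TIMEOUT_SECONDS):
    """Return True if the dependent system rides out the coordinator's downtime."""
    # A short blip is tolerated; downtime longer than the timeout is not.
    return coordinator_downtime_seconds <= timeout


print(dependent_survives(5))    # quick recovery: the dependent system stays up
print(dependent_survives(300))  # recovery takes several minutes: it goes down too
```

The point of the sketch is that the same crash can be harmless or catastrophic depending only on how long recovery takes.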

Even though the team was able to restore ZooKeeper, the damage was done in Kafka and it wasn’t able to recover. What made things worse was a second ZooKeeper outage combined with attempts to restart Kafka, which together started to cause data corruption.

Why Did It Take So Long to Fix?

Corrupted data? That sounds bad. And well, it is. This is actually why some things took so long to come back online.

When the HubSpot team realized that the server recovery was starting to corrupt data, they had a decision to make: either focus on recovering data (and safeguarding against corrupted data) or focus on restoring the tools.

They decided to focus on recovering data to ensure that there would be no gaps in historical data for customers (which in the long run they believe to be the right decision, and I’d personally agree!). This is the reason that the affected tools took almost 36 hours to be restored.

So, in the name of protecting customer data, HubSpot manually recovered a whoooole bunch of our data, and then was able to restore the affected tools.

This is also why you’re still seeing (at the time this article was published) the “continuing to process data from March 28 & 29” status message from HubSpot.

What Now?

Now that we know exactly what happened, HubSpot’s got a plan to make sure this never happens again. An interesting note in all of this is that HubSpot’s own teams use many of their tools across different parts of the business, so this affected not only their customers but also their own business (even more motivation to make sure it never happens again!).

They’re making changes in a few different areas to protect against another outage: technical/infrastructure, reliability, testing, and communication.

Technical / Infrastructure

As is to be expected, HubSpot will be doing some restructuring of their server clusters to make sure it’s not even possible to have an outage this large again. By doing this, any outage that does happen should be restricted to a small piece of the platform, and the recovery time for issues should be significantly quicker.


Reliability

HubSpot does have a team of people who test and upgrade their systems, but it hasn’t been as high a priority as it should have been. Now, they’ll have a dedicated team of people who will “oversee new standards, frequencies, and resources to ensure that we're consistently evaluating our key infrastructure systems for code fixes and critical patches without gaps.”


Testing

Along with investing much more heavily in the reliability of their platform, HubSpot is also increasing how often and how deeply they test their systems. Again, it’s not that these processes didn’t exist before, but this outage uncovered some gaps in how often they test for massive failures, as well as how comprehensively they test these systems.


Communication

Lastly, HubSpot is committing to making their communication during any major incident more frequent and helpful, specifically in the minutes and hours immediately following an issue.

Their status updates will now include more detailed explanations of what is going on, as well as when the next update can be expected.

In Conclusion

No one here is pretending that this outage wasn’t bad. Not even HubSpot. But one of the things I appreciate the most about HubSpot as an organization is their transparency and willingness to admit when they’ve messed up.

They know the impact this had on their customers, and on their own business, and they’re actively seeking to make sure it never happens again.

So, even if you’re a little rattled by this outage, know that improvements are being made, fixes are being implemented, and HubSpot will continue to make their product the best it can be. Okay -- HubSpot lovefest over!

Published on April 9, 2019
