
Yesterday, Basecamp shared the cause behind the outage Basecamp 3 faced on November 8. The outage lasted nearly five hours, from 7:21 am CST to 12:11 pm. During that time, users could still read existing messages, to-do lists, and files, but they could not enter new information or alter existing information.

David Heinemeier Hansson, the creator of Ruby on Rails and founder & CTO at Basecamp, said in his post that this was the worst outage Basecamp has faced in probably 10 years:

“It’s bad enough that we had the worst outage at Basecamp in probably 10 years, but to know that it was avoidable is hard to swallow. And I cannot express my apologies clearly or deeply enough.”

Key causes behind the Basecamp 3 outage

Every activity a user performs, whether posting a message, updating a to-do list, or applauding a comment, is tracked in Basecamp’s events table. The root cause of Basecamp going into read-only mode was that the ID column on this very busy events table hit its ceiling of 2,147,483,647.
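For reference, that ceiling comes from the storage size of the column: a 4-byte signed integer tops out at 2^31 - 1, while an 8-byte big integer tops out at 2^63 - 1. The plain Ruby below is not Basecamp’s code, just the arithmetic behind both limits.

```ruby
# Ceilings for signed ID columns:
int_max    = 2**31 - 1  # 4-byte integer => 2_147_483_647
bigint_max = 2**63 - 1  # 8-byte bigint  => 9_223_372_036_854_775_807

puts int_max     # 2147483647
puts bigint_max  # 9223372036854775807
```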

Secondly, the programming framework Basecamp uses, Ruby on Rails, updated its default for database tables in version 5.1, released in 2017. This update lifted the headroom for records from 2,147,483,647 to 9,223,372,036,854,775,807 on all tables. But the column in Basecamp’s database was still configured as a regular integer rather than a big integer.
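In Rails terms, the default primary key type is determined by the version annotation on the migration that created the table. The sketch below is illustrative rather than Basecamp’s actual schema, but it shows how a table created under the pre-5.1 defaults ends up with a 4-byte integer ID:

```ruby
# Illustrative only: the migration's Rails version annotation decides the
# default primary key type. Under Migration[5.0] the id column is a 4-byte
# integer (max 2,147,483,647); under Migration[5.1] and later it defaults
# to an 8-byte bigint (max 9,223,372,036,854,775,807).
class CreateEvents < ActiveRecord::Migration[5.0]
  def change
    create_table :events do |t|
      t.timestamps
    end
  end
end
```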

The complete timeline of the outage

7:21 am CST: Basecamp ran out of ID numbers on the events table because the column in the database was configured as a regular integer rather than a big integer. A regular integer runs out at 2,147,483,647, while a big integer can grow up to 9,223,372,036,854,775,807.

7:29 am CST: The team started working on a database migration to update the column type from regular integer to big integer (a sketch of what such a migration might look like follows this timeline). They then tested the fix on a staging database to make sure it was safe.

7:52 am CST: The test on the staging database verified that the fix was correct, so the team moved on to apply the change to the production database table. Due to the huge size of the production database, the migration was estimated to take about one hour and forty minutes.

10:56 am CST–11:52 am CST: The database upgrade was completed, but the data still had to be verified and configurations updated to ensure no other problems would surface once the application was back online.

12:22 pm CST: After successful verification, Basecamp came back online.

12:33 pm CST: Basecamp went down again because the intense load of the application coming back online overwhelmed the caching server.

12:41 pm CST: Basecamp came back online after the team switched over to the backup caching servers.
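Basecamp has not published the exact migration it ran, but a column-type change of this kind generally looks something like the sketch below. The class name is made up, and depending on the database adapter, related foreign key columns and auto-increment settings may need the same treatment.

```ruby
# Hypothetical sketch of the 7:29 am fix: widen the events table's id
# column from a 4-byte integer to an 8-byte bigint. On a table this large
# the underlying ALTER TABLE rewrites every row, which is why the
# production run was estimated at roughly an hour and forty minutes.
class WidenEventsIdToBigint < ActiveRecord::Migration[5.1]
  def up
    change_column :events, :id, :bigint
  end

  def down
    change_column :events, :id, :integer
  end
end
```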

To read the entire update on Basecamp’s outage, check out David Heinemeier Hansson’s post on Medium.

Read Next

GitHub October 21st outage RCA: How prioritizing ‘data integrity’ launched a series of unfortunate events that led to a day-long outage

Google Kubernetes Engine was down last Friday, users left clueless of outage status and RCA

Azure DevOps outage root cause analysis starring greedy threads and rogue scale units