December 22nd BlackBerry Outage Timeline and Overview of Events, According to BoxTone

By Kevin Michaluk on 23 Dec 2009 02:10 pm

Our friends at BoxTone hit us up with an email earlier today with some information on how things played out yesterday from both their and their customers' perspective. We've talked about BoxTone lots of times here before on CrackBerry - they focus on enterprise companies that have a LOT of BlackBerry Smartphones running and provide monitoring and management services. An afternoon like yesterday is where their services really make an impact. Here's what they told us (all times EST):

  • Between 3:00 and 4:00 PM - Problems with BBM and BIS internet browsing reported around the web (I personally experienced around 3:30 PM).
  • Between 6:30 and 7:00 PM - The problem extended to BES email, preventing the delivery of BES emails to and from BlackBerry smartphones. At each of our customers, BoxTone detected a greater than normal quantity of users with messages pending, based on our learned baseline of what is normal for each server and carrier, and immediately sent a warning alert to our customers before the flood of user calls (Sample email alert below). BoxTone also placed all affected BES and Carriers in a Critical state on our customers' Operations Dashboards (depicted by the red dots next to each BES and carrier). The steady growth in Pending Messages beginning around 6:45 is annotated in the attached screenshot and continued until the issue was resolved early this morning. From our monitoring data, it appears that BES were able to communicate with the RIM NOC throughout the outage; however, the NOC was unable to deliver messages.
  • At approximately 12:09 AM, BoxTone detected a brief disconnect in the SRP connection of each BES to the NOC; it appears RIM reset the NOC SRP connection to complete their fixes. Following this reset, delivery of BES mail resumed.
  • By 2:45 AM or earlier, BoxTone detected that most of our customers had returned to their normal (baselined) service levels, and that the backlog of pending mail had been delivered. BoxTone generated notifications informing our users that their service levels had returned to normal and updated the status of the BES and carriers to Normal.
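
For the curious, here is a rough sketch of how that kind of baseline check could work. To be clear, this is not BoxTone's actual method: the function name `classify`, the three-standard-deviation threshold, and the sample numbers are all assumptions made up for illustration. The idea is simply to learn what "normal" looks like for a server from past samples of users with pending messages, then flag the server as Critical when the current count climbs well above that.

```python
from statistics import mean, stdev

def classify(history, current, k=3.0):
    """Return 'Critical' if `current` is well above the learned baseline.

    history: past counts of users with pending messages (hypothetical data)
    current: the latest count for the same server or carrier
    k:       how many standard deviations above the mean counts as abnormal
             (an assumed threshold, not a documented BoxTone setting)
    """
    baseline = mean(history)
    # stdev needs at least two samples; fall back to zero spread otherwise
    spread = stdev(history) if len(history) > 1 else 0.0
    threshold = baseline + k * spread
    return "Critical" if current > threshold else "Normal"

# Hypothetical samples for one BES on an ordinary day
history = [4, 6, 5, 7, 5, 6, 4, 5]

print(classify(history, 6))   # a typical reading
print(classify(history, 40))  # a sudden pile-up of pending messages
```

A real monitoring product would presumably keep separate baselines per server and carrier, as described above, and account for time of day. This sketch only shows the basic compare-against-baseline step.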

Pretty interesting stuff. Big thanks to Mitch and the BoxTone team for sharing it with us. You can learn more about their services at BoxTone.com.

Kevin Michaluk, "Founder, Editor in Chief", 3921 articles, 3285 forum posts
28 comments

krssosic

so it was less than 12 hours....I was affected for about 8 since I don't use my phone at 1am...I don't know what the big deal is. Unless you're like Corporate America or OBAMA who lives on their phone, everyone should just stop complaining, it's up and running now.(wishful thinking...) =|

webmastir

that's crap. we are paying a crazy amount of $$$ for this unstable service!

shaleem

No, the real crap is no one is forced to use a particular service. Anyone who thinks that ANY service is 100% trouble free is just plain unrealistic. If you don't like what you're paying for, then leave. Like I said in a different post, there ain't no anchor on your behind!

jmaher1023

I see your point, shaleem, but you are wrong on one account: there is a "2 year" anchor called a service contract that holds our behinds down on this. After all, what's the use of having a Blackberry if you don't have the email, internet, and messaging support to go with it! That being said; in my past 3 years of using Blackberries, this has only happened twice that I can remember...so really, I don't see what all the fuss is about!

hardyhar99

I see your point, jmaher1023, but you are also incorrect: the "2 year" anchor called a service contract is just that...a contract. You do not have to accept the contract terms, you only do because you want the deeply discounted price for that device. This is simply capitalism and an open market at its best.

To that, happy holidays and a prosperous (and BB troublefree) new year!!!

tarund

from BoxTone about "communicating" with their customers. I believe BlackBerry users would've been far more calm if RIM/BlackBerry was keeping open comm with its users.

Two outages within 7 days - that is very unlikely but it shouldn't happen. Hey RIM, maybe you should invest more time and resource into Development\QA\Staging environments and then roll-out patches/upgrades to Production environments?

MobiJew

agreed. nothing else to say. just agreed :)

brianklewis1

Just one of the pitfalls of being on the cutting edge of technology. I'm happy so far with my BlackBerry. For those who aren't, maybe a free flip phone would have been a safe investment.

sunrisepromo#CB

Stop complaining! It could be worse...

You could have had an iPhone :P

Snypa

i totally agree. BB's are awesome. Glad everything is back.!

Laura Knotek

iPhones, Droids, and Win Mobile devices did not lose their data services.

MobiJew

thats y i love my droid :)

Neosurfur

He isn't saying that the iPhone lost its service during the BB outage. He's referring to the crappy coverage of the iPhone in general.

mattvait

no at&t just asks their customers to not use their data services....

ecfinn

I started getting new email sometime mid-morning today. I didn't get all of my backlogged email until at least 2PM today. So I had email undelivered for almost 20 hours. I'm glad it's back too, but I'm going to disagree with when things got back to normal.

Hengonz81

Blackberry internet is very slow....why is Verizon Wireless Droid 10X faster with internet????

jmaher1023

Like most other smart phones, the Droid connects directly to the web via the carrier's broadband access. BBs, on the other hand, are routed through RIM's servers before being granted access to the web. That's why your carrier will have a data package for WinMo, Palm and Android smart phones, and then a separate RIM data package for BBs. I will say this though: Verizon, thanx to RIM's unlimited BB data package for $30 a month, dropped their unlimited data package for other smart phones from $45 a month to $30! So, good job RIM! ;-)

P.S.~ Try using Opera Mini, it seems to load pages faster than the BB web browser.

sk8er_tor

The new browser with OS 5 is much faster than the old browser with OS 4.x.

twester65

I was so thankful last night that I didn't take that operations job with RIM. Outages happen, but total collapse of a network is not something to live through more than once.

bmike-vt

odd but i was able to use opera mini to check the web and snag my email from various webmail servers... but bb browser nor my bb email would connect. beweather was sporadic in updating... i figured if bis was down i wouldn't be able to use opera. glad that i had it.

mhonard

Of course I was installing the new 9700 OS and updating my BES connection when all this hell broke loose. In a panic (as I need BES over the holidays) I downgraded my OS and re-established my BES connection only to find out all my problems were the result of this outage. What a waste of time and energy !!!!!!!!!!!!!!!!!!

archpopoy

My BBMs been acting up again..... browser failing at times...

theroadwarrior

My 9700 (on BES) started acting up again too! Just around 5PM PST (12/23/09), my browser can't find the server & email (outgoing) messages have the pending clock status icon.
I am in the SF Bay Area. This is a carbon copy of what took place yesterday. I was in Portland 12/22/09.
Really Sux.

Snarfler

at 12:03 AM EST on 12/23/09, crackberry hit a new all time high of 17,688 users online

machine3

Sure there's an anchor! What do you think the Early Termination Fee is, a life vest?

LJBB

I was wondering why it wasn't working yesterday morning.

bobaloo

While the outage and its timing was frustrating for me while I took off work to get Christmas shopping finished, I don't think it's worth cursing RIM and threatening to spread their shortcoming at the top of my lungs.

Yes, it sucks to be incommunicado, but the percentage of uptime versus downtime is almost nominal. I mean, if RIM were out for three full 24-hour days, it's still less than 1% downtime over the course of a one-year period. At least we were able to make phone calls and use SMS during the outage.

Just yesterday my power went out at work because a ComEd line broke loose from the ice buildup. Sometimes things happen that are beyond our control. What's important is how the correction is handled. Just like ComEd in that case, RIM reacted promptly to a large scale problem.

I'm sure there were others who were more critically inconvenienced than I was, but that's my opinion.

R1cE

Outages happen. But good job to RIM for fixing the problem swiftly. Every network is going to run into bugs, big or small.