Status

Charter (Spectrum) Outage Oct. 24-26 Report

Beginning on October 24th at 02:36:56 PDT during a scheduled maintenance window, our circuit to Charter (Spectrum) AS20115 entered a loss of service state and was not restored until 57 hours later. Roller Network is disappointed with the delayed response from Charter (Spectrum) to an issue that was created by their own maintenance activities, and with the failure of the maintenance group to ensure that circuits they removed from service are restored following such activities.

  • At 10:02 Oct. 24 we called to notify Charter (Spectrum) that our circuit never recovered following maintenance, in which Charter (Spectrum) intentionally placed the circuit into a loss of service state. We were informed that Charter (Spectrum) would not individually troubleshoot our issue because there was a possibly related outage, and we were referred to ticket 50233491.
  • At 14:45 Oct. 24 we again called asking for an ETA on handling our outage. No ETA was given, however we insisted that a ticket be opened and linked to ticket 50233491 (ticket 50234369). We were assured that someone would follow up with us (this did not happen).
  • The following day at 09:26 on Oct. 25 we called to inquire 1) why our circuit was still down and 2) why nobody had followed up with us. We were informed that no followup was made because their notes said the circuit was restored the previous night around 9pm. However, it was not actually restored, and we suggested that, since we had an open ticket, someone should ideally have contacted us to ask whether it was indeed restored.
  • At 12:16 Oct. 25 we called again to inquire about the status. We were informed that according to the notes nobody had looked at it since our last call at 09:26. No information was given as to why not. At this point the circuit had been down for 33 hours.
  • At 12:48 Oct. 25, a full 34 hours after first loss of service, we finally received a callback asking us to verify site power (which of course we had) before they could send a tech out.
  • At 13:34 Oct. 25 we received a call from a tech indicating they were en route.
  • At around 15:10 Oct. 25 the tech concluded that our ongoing outage was caused by our circuit being migrated to a new core router while all related configuration was discarded in the migration, specifically all of our BGP configuration for both IPv4 and IPv6. Without BGP the circuit is useless.
  • At 15:21 Oct. 25 we placed our BGP neighbors into “shutdown” state. If Charter (Spectrum) maintenance, or whoever is responsible for such work, deleted our configurations, they would have to recreate them, and we would require an audit of the new configuration before restoring BGP in a controlled manner, since the circuit could no longer be trusted.
  • At 21:20 Oct. 25 we stopped actively requesting updates while waiting for Charter (Spectrum) to pass the request to whatever group handles new configurations, since Charter (Spectrum) maintenance failed to include migration of existing configurations in their process. However, Charter (Spectrum) decided overnight to un-migrate our circuit back to its original router and original configuration rather than attempt to move our configurations to the new router, the migration to which had caused the loss of service condition.
  • At 11:51 on Oct. 26 we were finally able to obtain confirmation in writing from Charter (Spectrum) that our circuit was un-migrated and that the original configurations we had last audited with Charter (Spectrum) on August 3rd were back in place, and we returned our BGP neighbors to active state. The total outage duration was 2 days, 9 hours (57 hours) from first loss of service to final confirmation that the circuit was restored to its pre-migration condition.
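
The elapsed-time figures in the timeline above can be checked with a few lines of Python. This is only an arithmetic sketch: the year is a placeholder because the report does not state one, the helper name is ours, and the final confirmation is taken as 11:51 on Oct. 26, the date implied by the report's Oct. 24-26 title and its 57-hour total.

```python
from datetime import datetime

YEAR = 2000  # placeholder only; the report does not give the year

# First loss of service: Oct. 24 at 02:36:56 (all times local)
start = datetime(YEAR, 10, 24, 2, 36, 56)

def elapsed_hours(month, day, hour, minute):
    """Whole hours elapsed since first loss of service."""
    delta = datetime(YEAR, month, day, hour, minute) - start
    return int(delta.total_seconds() // 3600)

status_call = elapsed_hours(10, 25, 12, 16)   # "down for 33 hours"
callback = elapsed_hours(10, 25, 12, 48)      # "a full 34 hours"
confirmed = elapsed_hours(10, 26, 11, 51)     # written confirmation

print(status_call, callback, confirmed)
print(divmod(confirmed, 24))  # (days, remaining hours)
```

The last line splits the total into days and hours, matching the "2 days, 9 hours" stated in the report.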

Service was ultimately restored after 57 hours; however, internally Charter (Spectrum) does not recognize this since it overlapped two maintenance windows. Since the circuit was physically restored to a location that it was intentionally moved away from, we fully expect Charter (Spectrum) to make a second attempt at a maintenance window for another migration. Whether or not Charter (Spectrum) will be able to perform this task correctly remains to be seen.

Roller Network disagrees with Charter (Spectrum)’s position that “maintenance” is not responsible for failing to return a circuit to service, and we further assert that whether or not an outage is planned – in this case clearly poorly planned – performing maintenance is still an outage. The sole difference is contractual as to what refunds may be owed or whether or not such could be considered as default of contract. Our circuit went into loss of service state directly due to “maintenance” and was not returned to service, thus “maintenance” is the root cause. From a customer service perspective the ethical course of action would be to cancel any future maintenance and revert all changes performed for failing to complete such within its designated window, rare or not (it was argued that doing so is unnecessary because maintenance failing to successfully complete a task is a “rare” occurrence). Roller Network does not believe it is a customer’s responsibility to make sure “maintenance” performs their job(s) correctly.

Editorial Note: This incident highlights why working with a small business like Roller Network is better than a large company. At no time did our account manager (who was CC’d on all correspondence) offer to step in to help or escalate our case, nor did they follow up to see if our issue was being handled properly. Charter (Spectrum)’s maintenance group, the group one would expect to know exactly what they did to break our circuit, disregarded our issue as a problem for another group since it ceases to be their problem past 6AM even if they fail to restore it to working condition by that time. At Roller Network, we do not pass blame between departments, and we always strive to make sure our customers are in working order; it’s literally our job. Our business with Charter (Spectrum) was treated as unimportant and ultimately irrelevant to them. Charter (Spectrum) is only interested in securing new business for short term gains, disregarding the long term interests of their customers. And that’s the biggest point we can make in our favor: as a small business, when you work with Roller Network you are important to us as an individual on an ongoing, long term basis.

SA 3.4.0 on mail

Updates have been applied to mail.rollernet.us which include bringing SpamAssassin up to version 3.4.0. No issues were observed with mail2.

SA 3.4.0 on mail2

Updates have been applied to mail2.rollernet.us which include bringing SpamAssassin up to version 3.4.0. We’re going to wait before doing the same thing to mail.rollernet.us in case there are any underlying problems that show up outside of testing.

Emergency Maintenance Notice for Generator 1

Initial Notice: March 1 2017 @ 14:46

Generator 1 serving colocation Suite 1 and our business offices has experienced an engine controller fault which alerted through active monitoring. Diagnostics have been completed and a replacement controller will be expedited from Salt Lake City, but is not anticipated to arrive until Thursday, March 2. Suite 1 AC UPS runtime is approximately 80 minutes in the event of utility power failure.

Generator 2 serving colocation Suite 2 is a separate, unaffected system.

If you have any questions about your colocation please contact us through normal channels.

Progressive updates will be posted here.

UPDATE Thu Mar 2 16:10:04 PST 2017: The engine controller on generator 1 has been replaced and successfully tested.

Mail: New URIBL Restrictions on SMTP AUTH

Starting on December 19th we’re no longer going to accept mail submitted to our SMTP AUTH system with a URI classified as black, grey, red, or gold on URIBL. These categories contain domains that are either actively used by spammers or found in unsolicited bulk mail. This is currently being treated as a trial change.

The “grey” category is a special case. It contains domains that are used by bulk or commercial mail, which may not be spammers in the strict sense, but are nevertheless against our policy for SMTP AUTH submissions (no bulk or marketing mail).
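
For readers curious how a URIBL classification is typically read, here is a rough sketch. It is not our production filter. It assumes the public multi.uribl.com DNS zone, where the last octet of a 127.0.0.x answer is a bitmask (1 = query blocked, 2 = black, 4 = grey, 8 = red). The function names are hypothetical, and the gold category is left out because we do not assume a return code for it here.

```python
import socket

# Bitmask values in the last octet of a multi.uribl.com answer.
# "gold" is deliberately omitted: no return code is assumed for it.
URIBL_BITS = {"blocked": 1, "black": 2, "grey": 4, "red": 8}

# Categories this sketch treats as grounds for refusing a submission.
REJECT = {"black", "grey", "red"}

def decode(answer):
    """Turn an answer such as '127.0.0.6' into a set of category names."""
    last_octet = int(answer.rsplit(".", 1)[1])
    return {name for name, bit in URIBL_BITS.items() if last_octet & bit}

def lookup(domain, zone="multi.uribl.com"):
    """Query the list for a domain; an empty set means it is not listed."""
    try:
        return decode(socket.gethostbyname(f"{domain}.{zone}"))
    except socket.gaierror:
        return set()  # no answer: the domain is not on any list

def should_reject(categories):
    """True if any listed category is one this sketch rejects."""
    return bool(categories & REJECT)
```

For example, `should_reject(lookup("example.com"))` would perform a live DNS query. A "blocked" answer means the query itself was refused by the list operator, so it is not treated as a listing.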

Our goal is quality of mail through our submission system in order to maintain a high reputation for our customers that depend on us for routine communication. Because we are not a “bulk mail” outfit we are unfortunately unable to accommodate bulk mail use cases, and it would be more appropriate to use the mailer’s service to send bulk mail instead of ours.

If you find you’re having trouble submitting messages after this change, contact support so we can investigate. Marketing-type mail should be sent directly to the intended recipient and not rely on forwarding or resending.