eXo Community Outage: Post Mortem and What You Can Learn

What happened…

In the early morning of Friday, January 17th 2014, we experienced an outage of three services: eXo Community, eXo Blog and eXo Documentation. The services were fully restored at 2:00 pm PST on Friday. Unplanned downtime of any length is unacceptable to us. In this case we fell short of both eXo Tribe’s expectations and our own.

For 12 hours, we worked flat out to restore full access as soon as possible. Although we shared some brief updates along the way, we owe you a detailed explanation of what happened and what we have learned.

[Figure: Page Load Times]

[Figure: Downtime Event]

At 2 am on Friday, January 17th, 2014, the server hosting our three services, eXo Community, eXo Blog and eXo Documentation, crashed. We detected the problem immediately and tried to reset the server, but it would not start. After a quick assessment of the incident, we decided to migrate its storage to a new server. After three hours we succeeded in restoring two services: eXo Blog and eXo Documentation. Unfortunately, despite our best efforts, the crash had damaged the eXo Community data.

To restore service as fast as possible, we recovered from our latest backups. We were able to restore most functionality within 2.5 hours, but during the data restoration we found that the hard disk drive storing the Community database was faulty: its read and write speeds were very slow. A quick disk check found bad sectors on the drive. We immediately replaced the faulty disk with a new one and relaunched the data restoration. This problem slowed the recovery process, and it took until 2 pm PST on Friday for the eXo Community service to return fully.

And what we’re doing about it…

Regularly check the state of all servers

Over the past few months our infrastructure has grown rapidly to support thousands of users. We routinely upgrade and repurpose our servers, and we use several monitoring tools to track their state. However, we did not perform regular deep system diagnostics. In addition, a quick system check should be run before any data restoration to ensure that the whole system comes back in a healthy state. We will keep both points in mind from now on.
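As an illustration of the kind of quick check that could run before a restoration, here is a minimal Python sketch. It is not the tooling we use: the function names, the scratch-file approach and the 20 MB/s threshold are all assumptions made for this example. It only measures how fast a scratch file can be written to and read back from the target directory, so it would flag a very slow disk but does not replace a proper bad-sector scan or deep diagnostics.

```python
import os
import tempfile
import time


def measure_disk_throughput(directory, size_mb=16):
    """Write then read a scratch file in `directory`.

    Returns (write_mb_per_s, read_mb_per_s). Hypothetical helper for
    illustration only.
    """
    chunk = os.urandom(1024 * 1024)  # 1 MiB of incompressible data
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(size_mb):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # push the data to the device, not just the cache
        write_s = max(time.perf_counter() - start, 1e-9)

        # The read may be served from the OS page cache, so treat the
        # read figure as optimistic.
        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(1024 * 1024):
                pass
        read_s = max(time.perf_counter() - start, 1e-9)
    finally:
        os.remove(path)  # always clean up the scratch file
    return size_mb / write_s, size_mb / read_s


def is_safe_to_restore(directory, min_mb_s=20.0):
    """Return True if both measured speeds reach the assumed threshold."""
    write_mb_s, read_mb_s = measure_disk_throughput(directory)
    return write_mb_s >= min_mb_s and read_mb_s >= min_mb_s
```

A restoration script could call `is_safe_to_restore` on the database volume first and stop early if it returns `False`, rather than discovering the slow disk halfway through the restore.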

Faster disaster recovery

When running infrastructure at large scale, the standard practice of running multiple replicas provides redundancy. However, should those replicas fail, the only option is to restore from the latest backup. The standard tool used to recover MySQL data from backups is slow when dealing with large data sets.

To speed up our recovery, we are going to write a tool that parallelizes the replay of binary logs. This should enable much faster recovery from large MySQL backups.
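The tool itself is not written yet, so the following Python sketch only illustrates the general idea under stated assumptions. It treats the log as an already-parsed list of `(schema, statement)` pairs and assumes that statements touching different schemas are independent of each other, which will not hold for cross-schema transactions. The names `partition_by_schema`, `replay_in_parallel` and `apply_statement` are hypothetical, and nothing here reads a real MySQL binary log.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_by_schema(events):
    """Group statements by schema, keeping the original order in each group."""
    streams = defaultdict(list)
    for schema, statement in events:
        streams[schema].append(statement)
    return dict(streams)


def replay_in_parallel(events, apply_statement, max_workers=4):
    """Replay each schema's statements in order, with schemas running concurrently.

    `apply_statement(schema, statement)` is a caller-supplied callback that
    would execute the statement against the database being restored.
    Returns a dict mapping each schema to the number of statements replayed.
    """
    streams = partition_by_schema(events)

    def replay_stream(item):
        schema, statements = item
        for statement in statements:  # strict order within one schema
            apply_statement(schema, statement)
        return schema, len(statements)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(replay_stream, streams.items()))


# Usage with an in-memory stand-in for the database.
events = [
    ("community", "INSERT 1"),
    ("blog", "INSERT 2"),
    ("community", "UPDATE 3"),
]
applied = defaultdict(list)
counts = replay_in_parallel(events, lambda schema, stmt: applied[schema].append(stmt))
```

The design choice is that ordering is preserved only where it matters in this simplified model, inside a single schema, while unrelated schemas no longer wait for each other. A real tool would need a finer and safer notion of independence than the schema name.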

We know that you rely on eXo Community to get things done, and we’re very sorry for the disruption. We wanted to share these technical details to shed some light on what we’re doing in response.

Thanks for your patience and support.

Also, feel free to ping the team on the eXo Community website if you wish to know more.

Tung Tran
