DocsBot AI - Bot Training DB Errors – Incident details

Bot Training DB Errors

Resolved
Operational
Started 17 days ago
Lasted 3 days

Affected

Website

Partial outage from 1:08 AM to 4:26 AM, Operational from 4:26 AM to 3:47 PM, Partial outage from 3:47 PM to 5:27 PM, Degraded performance from 5:27 PM to 6:34 PM

Updates
  • Resolved
    This incident has been resolved.
  • Monitoring
    The reboot seems to have solved the DB issues. A small number of new bots created during the outage may be broken; we are recreating those now.

  • Update
    Our cloud provider is having trouble finding the root cause, but they are rebooting the DB for now as a temporary fix (this worked last week for the same issue).

  • Identified
    The incident seems to have recurred, impacting training sources. We are communicating with our cloud provider.

  • Update
    Note: any sources added or refreshed during the outage may be temporarily stuck in a failed or queued state. You can click Retry if available, wait for stuck sources to time out, or add the same source again and delete the old one.

  • Resolved
    We triggered a restart and minor update of the DB cluster, and that seems to have fixed the write issues. We are currently monitoring the result, and awaiting a full postmortem from our cloud provider.

  • Investigating
    We have received reports of problems with our vector database provider when creating or updating sources, and our monitoring confirms them.

    This may show up as timeouts or as unusual errors such as: "class TenantDocument has multi-tenancy enabled, but request was without tenant"

    We are currently investigating this incident and are in contact with our cloud provider.
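
For anyone adding or refreshing sources from a script rather than the dashboard, the retry advice in the updates above can be sketched as a retry loop with exponential backoff. This is only an illustration under stated assumptions: `TransientSourceError`, `retry_with_backoff`, and the `operation` callable are hypothetical names invented here, not part of any DocsBot API, and the delays are arbitrary.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientSourceError(Exception):
    """Hypothetical error standing in for a timeout or failed source write."""


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds or `max_attempts` is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientSourceError:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            # Wait 1x, 2x, 4x, ... base_delay between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Here `operation` would wrap whatever call adds or refreshes a source and raise `TransientSourceError` on a timeout or failed write. Capping the attempts keeps a script from retrying indefinitely while the database is still down.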