Impact of AWS Issues
Mezmo is no longer seeing impacts from the AWS events of today. We will continue to monitor our services and respond if we notice any abnormalities.
Telemetry data pipeline and log analysis solutions.
Source: auto
Category: Development
Adapter: STATUSPAGE IO
Verified: Pending review
Current state: Operational (checked 14m ago)
Components: 14
Active incidents: 0
Maintenance: 0
90d uptime: 100%
Impact of AWS Issues
Oct 20, 5:00 PM
Normalized official status-page data for incidents, maintenance, components, and history.
Known uptime: 100% (1 known history day)
Components tracked: 14 (0 outage, 0 degraded)
Incidents indexed: 50 (0 active right now)
Maintenance windows: 8 (0 active or scheduled)
Components with the most recent status-page events.
Alerting: Operational
Archiving: Operational
Destinations: Operational
Ingestion / Sources: Operational
Livetail: Operational
Component changes, incidents, and maintenance windows grouped by day.
Operational days: 1
Degraded days: 0
Outage days: 0
Maintenance days: 0
Unknown days: 89
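The day counts above are consistent with the 100% known uptime figure if uptime is measured only over days with recorded history. The page does not state the formula Uptimus uses, so the following is a minimal sketch under that assumption; the function name `known_uptime` and the choice to count degraded, outage, and maintenance days against uptime are hypothetical.

```python
from typing import Optional


def known_uptime(operational: int, degraded: int, outage: int,
                 maintenance: int, unknown: int) -> Optional[float]:
    """Return uptime as a percentage of known days, or None if no day is known.

    Assumption: days with unknown status are excluded from the denominator,
    so `unknown` is accepted for completeness but does not affect the result.
    """
    known_days = operational + degraded + outage + maintenance
    if known_days == 0:
        return None  # nothing recorded, so no meaningful percentage
    return 100.0 * operational / known_days


# Day counts listed above for the 90-day window.
print(known_uptime(operational=1, degraded=0, outage=0,
                   maintenance=0, unknown=89))  # prints 100.0
```

Under this reading, the 100% figure rests on a single known day, so it says little about the other 89 days in the window.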
Latest outages and degradations detected from the official status page.
Mezmo is no longer seeing impacts from the AWS events of today. We will continue to monitor our services and respond if we notice any abnormalities.
The issues with Search in Live Tail have been resolved, and Search performance is back to normal.
This incident has been resolved.
Mezmo has made a backend change that we believe should mitigate this issue in the future. Mezmo will continue to proactively monitor the platform for any signs of recurrence. Please reach out through our normal support channels if you are still experiencing issues at this time or need further information about this incident. We apologize for the impact this incident may have had on your services.
The issue has been resolved. The processing of the backlog is progressing, and all delayed logs will be available shortly.
This incident has been resolved.
This incident has been resolved. All services are fully operational.
The Pipeline service has resumed. All services are fully operational.
This incident has been resolved.
**Dates:**
Start Time: Thursday, December 12, 2024 at 19:24 UTC
End Time: Thursday, December 12, 2024 at 20:54 UTC
Duration: 1 hour and 30 minutes

**What happened:**
Log lines submitted to Mezmo for ingestion into Log Analysis were never made available in our WebUI for Searching, Graphing, and Timelines, neither during the incident nor afterwards. This affected a significant number of accounts. Log lines were still passed through Pipeline and remained available at all times within Live Tail. The log data still triggered both Telemetry Pipeline and Log Analysis based alerts, and was still archived in both places.

**Why it happened:**
A single pod within our indexing service ran out of disk space. This was due to a sudden increase in the volume of log lines sent to Mezmo by accounts that happened to be assigned to this pod.

Our pods are configured to limit how much disk space is available for writing new data; these limits should have prevented any pod from running out of disk space in any scenario. After the incident, we discovered that the limits had been configured incorrectly, which explains why it was possible for this pod to run out of disk space.

Our service is designed to tolerate the loss of a single pod within its indexing service without any widespread impact to customers. Instead, for reasons still under investigation, the impact expanded to many other pods.

Our service is also designed to retain submitted log lines even if the indexing portion of our service cannot process them immediately; these log lines can be indexed later, when the service is functional again. In this incident, however, the log lines were never indexed. The reason for this failure is also under investigation.

**How we fixed it:**
We restarted the pod that had run out of disk space. It immediately had enough free disk space to accept new log lines for indexing. All other indexing pods also returned to a normal operational state.

**What we are doing to prevent it from happening again:**
We have properly configured our pods to prevent them from running out of disk space. This step alone should prevent any recurrence of the same problem.

We have updated our monitoring to send high priority alerts when any indexing pod is in danger of running out of disk space. These alerts were in place before, but set to “low” priority; they did not come to our attention in time to prevent the incident.

We will continue to actively investigate why the impact spread to other indexing pods and why log lines were not retained for later indexing.
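The monitoring change described in the postmortem, where disk-space alerts are raised to high priority before a pod fills up, can be illustrated with a small threshold check. This is only a sketch and not Mezmo's implementation: the function names and the 80% and 90% thresholds are hypothetical, and a real deployment would more likely read pod volume metrics from its monitoring system than call `shutil.disk_usage` locally.

```python
import shutil


def disk_alert_priority(used_fraction: float,
                        warn_at: float = 0.80,
                        page_at: float = 0.90) -> str:
    """Map a disk usage fraction (0.0 to 1.0) to an alert priority.

    The thresholds are illustrative assumptions, not values from the postmortem.
    """
    if not 0.0 <= used_fraction <= 1.0:
        raise ValueError("used_fraction must be between 0.0 and 1.0")
    if used_fraction >= page_at:
        return "high"  # close to full: should reach on-call immediately
    if used_fraction >= warn_at:
        return "low"   # worth a look, not yet urgent
    return "none"


def check_path(path: str = "/") -> str:
    """Measure usage of the filesystem holding `path` and classify it."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        return "none"  # nothing to measure on an empty or virtual filesystem
    return disk_alert_priority(usage.used / usage.total)


print(disk_alert_priority(0.95))  # prints high
```

Keeping the classification separate from the measurement makes the thresholds easy to test without filling a real disk.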
Scheduled and completed maintenance windows are separated from incidents.
The scheduled maintenance has been completed.
The scheduled maintenance has been completed.
The scheduled maintenance has been completed.
The scheduled maintenance has been completed.
The scheduled maintenance has been completed.
The search issue has been resolved for now. We'll continue to work on the improvements.
The scheduled maintenance has been completed.
The scheduled maintenance has been completed.
Uptimus tracks the official Mezmo status page, normalizes upstream events, and separates incidents from scheduled maintenance.
Official source: https://status.mezmo.com
Adapter: STATUSPAGE IO
Alert streams: Incidents, component changes, and maintenance windows.
Public SEO page: Indexable status history for users searching outage information.
Regional reports can be layered on top of official provider status when user signals are available.
Showing 1 to 14 of 14 tracked components.
| Component | Status | Type | Last changed |
|---|---|---|---|
| Log Analysis | Operational | Group | Not recorded |
| Pipeline | Operational | Group | Not recorded |
| Ingestion / Sources | Operational | Component | Not recorded |
| Log Ingestion (Agent/REST API/Code Libraries): Log ingestion endpoint for all agent traffic, REST API and traffic from various code libraries such as Node.js, Ruby, Python, etc. | Operational | Component | Not recorded |
| Log Ingestion (Heroku): Log ingestion endpoint for all Heroku traffic. | Operational | Component | Not recorded |
| Processors | Operational | Component | Not recorded |
| Destinations | Operational | Component | Not recorded |
| Log Ingestion (Syslog): Log ingestion endpoint for all syslog traffic. | Operational | Component | Not recorded |
| Web App | Operational | Component | Not recorded |
| Web App | Operational | Component | Not recorded |
| Search | Operational | Component | Not recorded |
| Alerting | Operational | Component | Not recorded |
| Livetail | Operational | Component | Not recorded |
| Archiving: Archiving service for logs. | Operational | Component | Not recorded |
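The component rows above come from a Statuspage-hosted page, and this page groups days into five states: operational, degraded, outage, maintenance, and unknown. A rough sketch of how upstream component statuses might be folded into those five states is shown below. The mapping, the function names, and the sample payload are all assumptions made for illustration; the sample is invented data, not a capture of Mezmo's status, and Uptimus's actual normalization rules are not documented here.

```python
import json
from collections import Counter

# Assumed mapping from Statuspage component status strings to the five
# states this page uses. Both outage levels collapse into "outage".
STATE_MAP = {
    "operational": "operational",
    "degraded_performance": "degraded",
    "partial_outage": "outage",
    "major_outage": "outage",
    "under_maintenance": "maintenance",
}


def normalize(status: str) -> str:
    """Return the normalized state, or "unknown" for unrecognized values."""
    return STATE_MAP.get(status, "unknown")


def tally(components_json: str) -> Counter:
    """Count components per normalized state in a components-style payload."""
    payload = json.loads(components_json)
    return Counter(normalize(c.get("status", ""))
                   for c in payload.get("components", []))


# Invented sample payload with hypothetical component names.
SAMPLE = json.dumps({"components": [
    {"name": "example-a", "status": "operational"},
    {"name": "example-b", "status": "operational"},
    {"name": "example-c", "status": "degraded_performance"},
    {"name": "example-d", "status": "some_new_state"},
]})

counts = tally(SAMPLE)
print(counts["operational"], counts["degraded"], counts["unknown"])
```

Falling back to "unknown" rather than raising keeps the tally usable if the upstream provider introduces a status value the mapping does not cover.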
Mezmo is currently marked as Operational in Uptimus based on the latest official status page check.
Supported status page providers are checked continuously by our scraper scheduler. The public page is cached briefly for SEO and performance.
No. Uptimus stores incidents and maintenance windows separately when the upstream provider exposes enough detail.
Yes. Create an Uptimus workspace, follow this provider, and choose email, push, or webhook notifications.