Gemini App in Workspace and Gemini Side Panel (Gemini in Workspace) customers are experiencing “Something Went Wrong” errors.
# Incident Report ## Summary On Wednesday, June 10, 2026, at 03:30 US/Pacific, an incident occurred affecting the availability of Gemini services across all surfaces, including web, mobile (iOS and Android), and integrations within Google Chrome. The incident resulted in an elevated error rate for approximately 6 hours and 55 minutes, with full service restoration confirmed after a total duration of 14 hours and 49 minutes. We sincerely apologize for the disruption this incident caused to your business. We know how much you rely on Google Cloud, and we regret the impact on your productivity. We are working to address the root cause and prevent this from occurring in the future. ## Root Cause The disruption was caused by extreme read contention within the foundational database service that manages tool deployment metadata. The primary cause was an index design issue within the database where a column used for tracking deployment expirations contained a high volume of rows with similar values. This resulted in "hotspotting," where traffic was concentrated on a very small number of database shards, overwhelming them. Additionally, the service utilized an in-memory cache with a very short Time To Live (TTL) of only one minute, requiring frequent database refreshes. The failure sequence was as follows: 1. **Technical Trigger:** The trigger was an upward spike in the Queries Per Second (QPS) from the frontend. The underlying system was already running close to maximum usage and the incoming QPS crossed a tipping threshold beyond which the underlying spanner started getting overloaded. 2. **Systemic Effect:** This short window, combined with an indexing anomaly where specific metadata fields containing empty values were clustered on a single database partition, led to a "hot shard" condition. 3. **Contention and Latency:** As request volume increased, the internal tool management service experienced a more than 10x surge in database calls because the cache hit rate dropped significantly. 4. **Failure State:** The underlying storage layer reached its capacity for handling incoming queries, and database failure rates skyrocketed to 60%. This led to cache hit rate dropping down to 50%. Automated validation failed to prevent this because the indexing strategy did not sufficiently account for the concentration of data rows when specific identifiers were absent. ## Remediation and Prevention The issue was detected by internal real-time monitoring alerts indicating high processing error rates. Engineering teams performed several targeted actions to restore the service: * **Throttling:** Applied Rate Access Control Lists (RateACLs) to the most affected database shards to allow internal caches to recover. * **Index Redistribution:** Gradually redistributed common values in the database index to random values in a specific target range, effectively spreading the load across multiple shards. * **Cache Policy Update:** Deployed a configuration change to increase the in-memory cache Time To Live (TTL) from 1 minute to 20 minutes, significantly reducing database load. * **Deadline Adjustment:** Shortened the Remote Procedure Call (RPC) deadlines for database queries to prevent request pile-ups during periods of high latency. ## Prevention To prevent a recurrence, Google engineering will be implementing the following actions: * **Index Design Standards:** Restructure of database index to prevent "hotspotting". * **Adaptive Cache Tuning:** Implement more robust cache policies that prevent load amplification during periods of backend stress. * **Enhanced Hotspot Monitoring:** Deploy improved monitoring and alerting specifically designed to detect uneven database load distribution before it impacts user traffic. * **Service Resiliency:** Implement advanced load protection mechanisms, such as smart task resizing and query coalescing for preventing spanner overloads. ## Detailed Description of Impact The impact window occurred between 03:30 US/Pacific and 10:25 US/Pacific on June 10, 2026\. **Gemini App (Web and Mobile):** * High Failure Rates: Users experienced a 50% error rate when attempting to send prompts, receiving "Something went wrong" messages . * Service Unavailability: The issue affected both consumer accounts and enterprise Google Workspace customers . * Enterprise Google Workspace Users encountered disruptions when attempting to attach assets from Google Drive within the web application (gemini.google.com). Affected users observed the "Add from Drive" functionality to be visually disabled and unresponsive following interaction with the "+" prompt. **Integrations:** * **Google Chrome Integration:** The Gemini Side Panel and integrated conversational features encountered persistent timeouts.