Service Outage: We’re Working on It

Incident Report for LoanNEX

Postmortem

Azure Front Door Service Disruption – October 29-30, 2025

Service Disruption – October 29, 2025

Between 11:03 AM CST and 6:05 PM CST on October 29, 2025, customers experienced intermittent latency, timeouts, and errors when accessing our services. Our Azure Front Door (AFD) instance was affected from 11:03 AM CST until 3:00 PM CST, with intermittent connectivity issues until 4:30 PM CST.

Root Cause: An upstream infrastructure provider (Azure Front Door) experienced a configuration deployment failure that caused widespread node failures. This directly impacted our service availability as we rely on this infrastructure for global content delivery.

Resolution: Our engineering team worked with the infrastructure provider throughout the incident. The provider blocked further configuration changes at 11:30 AM CST and deployed a last known good configuration starting at 11:40 AM CST. Phased recovery was completed by 6:05 PM CST.

Impact: Customers experienced intermittent service disruptions including slow page loads, timeout errors, and connection failures during the incident window. Service availability has been fully restored to normal levels.

Next Steps: We are reviewing our dependencies and implementing additional monitoring and failover capabilities to reduce the impact of similar upstream incidents. A detailed internal retrospective has been completed and is available upon request.

Posted Oct 29, 2025 - 19:55 CDT

Resolved

This incident has been resolved.
Posted Oct 29, 2025 - 19:47 CDT

Update

Update - 5:20 pm CST, 29 October 2025

Status: Monitoring
We continue to closely monitor platform performance. Services remain accessible with only minor intermittent issues as Azure completes their infrastructure recovery.

Microsoft's expected full resolution remains 23:20 UTC. We will provide another update within 2 hours or if conditions change.
Thank you for your continued patience.
Posted Oct 29, 2025 - 17:26 CDT

Monitoring

Update - 3:00 pm CST, 29 October 2025

Status: Monitoring - Partial Recovery
Our platform is now responding following Azure's configuration deployment. However, we are continuing to monitor closely as the full recovery process is still underway.

Current State:
✓ Core services are accessible
⚠️ Intermittent issues may still occur as Azure completes node recovery
🔄 Performance stabilization in progress

What to Expect:
You should be able to access our platform, but may experience occasional slowness, timeouts, or temporary errors as Azure continues routing traffic through recovering infrastructure. Service quality will continue improving as Microsoft completes its mitigation efforts.

Azure's Timeline:
Microsoft expects full resolution by 23:20 UTC (approximately 4 hours from now). We will continue monitoring throughout this period and will confirm full stability once Azure completes their recovery.

Next Update:
We will provide another update within 1 hour, or immediately if we observe any significant changes in service stability.
Thank you for your patience as we work through this recovery period.
Posted Oct 29, 2025 - 15:06 CDT

Identified

Update - 2:28 pm CST, 29 October 2025

Status: Observing Recovery
Good news - Microsoft Azure has successfully deployed their stable configuration and recovery is now underway.

Current Progress:
✓ Last known good configuration deployed successfully
✓ Azure Portal access restored (direct routing active)
🔄 Node recovery and traffic rerouting in progress
🔄 Service improvements ongoing as healthy nodes come online

What You're Seeing:
We are still experiencing stability issues. These should gradually resolve as Azure continues recovering infrastructure nodes. We will keep you updated.

Expected Full Resolution:
Microsoft anticipates complete mitigation by 23:20 UTC (approximately 4 hours from now) as they finish recovering all affected nodes.

Known Limitations:
Configuration changes remain temporarily blocked on Azure infrastructure
Some Azure Portal extensions (e.g., Marketplace) may still experience intermittent loading issues

Next Update:
We will provide another update within 1 hour or sooner if significant progress is made.
We appreciate your patience as recovery continues. Service quality should continue improving steadily over the next several hours.
Posted Oct 29, 2025 - 14:29 CDT

Update

Update - 1:18 pm CST, 29 October 2025
Microsoft Azure has initiated deployment of their last known good configuration and expects initial signs of recovery within approximately 30 minutes.

Recovery Timeline:
Phase 1 (In Progress): Deploying stable configuration - Expected completion ~17:50 UTC
Phase 2 (Next): Node recovery and traffic rerouting to healthy infrastructure

Current Impact:
Our services remain affected during this recovery process. You may continue to experience access issues or degraded performance until Azure completes both recovery phases.

Restrictions:
Microsoft has temporarily blocked all customer configuration changes to its infrastructure while it completes the rollback. This restriction will remain in place until full mitigation is achieved.

What to Expect:
Service stability should begin improving within the next 30-45 minutes as Azure's deployment completes. We will monitor closely and provide another update once we confirm meaningful recovery.

Thank you for your continued patience.
Posted Oct 29, 2025 - 13:19 CDT

Update

Update - 12:33 pm CST, 29 October 2025
Microsoft Azure has identified the root cause as an inadvertent configuration change to Azure Front Door (AFD) services starting at approximately 16:00 UTC.

Current Status:
Microsoft is actively rolling back to the last known good configuration
Azure Portal access has been restored via failover routing
Our services remain impacted while the AFD rollback is in progress

What This Means:
You may still experience intermittent issues or degraded performance as Microsoft completes its infrastructure rollback. We are closely monitoring the situation and will restore full service as soon as Azure's remediation is complete.

Next Update:
Microsoft has committed to providing updates within 30 minutes. We will post our next update shortly after receiving additional information from Azure.

We appreciate your patience as this incident is resolved.
Posted Oct 29, 2025 - 12:34 CDT

Update

We are continuing to investigate this issue.
Posted Oct 29, 2025 - 11:34 CDT

Investigating

Service Disruption - Azure Infrastructure Issue
Status: Investigating
Last Updated: 11:18 AM, 29 October 2025
We are currently experiencing service disruptions due to an ongoing issue with Microsoft Azure Portal infrastructure. This is affecting our ability to deliver normal service levels.

What's Happening:
Microsoft Azure is experiencing portal access issues affecting multiple services, including LoanNEX. Our platform relies on Azure infrastructure, and we are directly affected by this outage.

Impact:
You may experience difficulties accessing our platform or encounter degraded performance during this time.

What We're Doing:
Actively monitoring the situation with Microsoft Azure
Our technical team is on standby to restore full service immediately once Azure resolves their infrastructure issues
We will provide updates as more information becomes available

We understand this disruption is frustrating and sincerely apologize for the inconvenience. This issue is beyond our direct control, but we remain committed to restoring service as quickly as possible.
For real-time updates on the Azure issue, you can visit: Azure Status Page
https://azure.status.microsoft/en-us/status

We will continue to post updates here as the situation develops.
Posted Oct 29, 2025 - 11:34 CDT
This incident affected: Integrations (LoanNEX API, Pricing Parser Bot, Partner Integrations), LoanNEX Qualifier (LoanNEX Qualifier (NexApp), LockIt (Lock services)), Notifications (Email Activity), and Web Applications (Quick Quote, Client Admin).