Errors
The Errors collection provides a centralized system for tracking, analyzing, and resolving errors across the platform. It helps developers and operators understand issues, prioritize fixes, and improve system reliability.
Overview
Errors are captured automatically throughout the platform, providing detailed information about exceptions, failures, and unexpected conditions. The Errors collection offers:
- Comprehensive error tracking
- Error grouping and classification
- Frequency and impact analysis
- Integration with debugging tools
- Resolution tracking
Error Structure
Each error record contains detailed information about what went wrong:
interface Error {
id: string
timestamp: Date
type: string
message: string
stack?: string
source: {
service: string
file?: string
function?: string
line?: number
column?: number
}
context: {
user?: string
request?: {
method?: string
url?: string
headers?: Record<string, string>
body?: any
}
environment: string
version?: string
}
metadata?: Record<string, any>
count: number
firstSeen: Date
lastSeen: Date
status: 'open' | 'acknowledged' | 'resolved' | 'ignored'
priority?: 'low' | 'medium' | 'high' | 'critical'
assignee?: string
}
Error Types
Errors are categorized into several types:
Runtime Errors
Errors that occur during application execution, including:
- Uncaught exceptions
- Promise rejections
- Timeouts
- Memory issues
API Errors
Errors related to API interactions:
- Invalid requests
- Authentication failures
- Rate limiting
- Integration failures
Validation Errors
Errors resulting from invalid data:
- Schema validation failures
- Constraint violations
- Type mismatches
System Errors
Errors at the infrastructure level:
- Resource exhaustion
- Configuration issues
- Dependency failures
Working with Errors
Querying Errors
Errors can be queried using various filters:
// Example: Find all critical errors in the production environment
const criticalErrors = await Errors.find({
priority: 'critical',
'context.environment': 'production',
status: 'open',
})
Error Grouping
Similar errors are automatically grouped to reduce noise and help focus on unique issues:
// Example: Get error groups with count > 10
const frequentErrorGroups = await Errors.aggregate([{ $group: { _id: '$type', count: { $sum: '$count' } } }, { $match: { count: { $gt: 10 } } }, { $sort: { count: -1 } }])
Error Resolution
Track and update the status of errors:
// Example: Mark an error as resolved
await Errors.updateOne(
{ id: 'error-123' },
{
$set: {
status: 'resolved',
resolvedAt: new Date(),
resolvedBy: 'user-456',
resolution: 'Fixed in PR #789',
},
},
)
Best Practices
- Contextual Information: Include sufficient context with each error to aid debugging
- Error Classification: Use consistent error types and categories
- Priority Assignment: Assign appropriate priority levels based on impact
- Resolution Tracking: Document how errors were resolved to prevent recurrence
- Error Budgets: Establish error budgets for services to balance reliability and development velocity
Integration with Other Observability Tools
Errors integrate with other observability components:
- Logs: Errors generate corresponding log entries with full context
- Traces: Errors are linked to distributed traces for request context
- Metrics: Error rates and patterns are tracked as metrics
- Alerts: Critical errors trigger alerts based on configurable thresholds
API Reference
List Errors
GET / api / errors
Query parameters:
type
: Filter by error typestatus
: Filter by statuspriority
: Filter by priorityfrom
: Start timestampto
: End timestamplimit
: Maximum number of errors to return
Get Error by ID
GET /api/errors/:id
Update Error Status
PATCH /api/errors/:id
Request body:
{
"status": "acknowledged",
"assignee": "user-123",
"comment": "Investigating this issue"
}