DBGorilla Test Inventory & Strategy

Backend integration + Frontend E2E tests categorized by feature area

Backend Integration Tests: 872
Frontend E2E Tests: 169
Quarantined (BE): 284
Quarantined (FE): 41
Create DBs via API: 494
Overlap Areas: 12

Frontend/Backend Overlap

These features are tested by BOTH backend integration tests and frontend E2E tests. Consider whether both are necessary.

Overlap Analysis

1. Database Connection / CRUD
BE test_db_crud_routes_integration.py (16), test_db_cluster_detection_integration.py (3), test_db_creation_permission_integration.py (4), test_db_server_management_routes_integration.py (4)
FE connect-database.spec.ts (10), connect-database-demo.spec.ts (5), add-cluster.spec.ts (3), server-management.spec.ts (2)
E2E tests the full flow (form fill, submit, verify in sidebar) and exercises the API underneath. 27 BE tests are redundant with 20 E2E tests. Keep E2E as the source of truth. Reduce BE to permission edge cases (RBAC, creation restrictions) that the UI doesn't cover.
2. Authentication & Login
BE test_auth_routes_integration.py (20), test_auth_utils.py (1)
FE auth-test.spec.ts (4), authentication@cross-browser.spec.ts (16)
E2E covers the real user login flow. Backend tests cover token internals (generation, validation, refresh) that the UI doesn't exercise directly. Keep both -- low overlap.
3. LLM Server Management & Chat
BE test_llm_routes_integration.py (5), test_llm_server_routes_integration.py (2), test_llm_chat_integration.py (4), test_use_llm_chat_routes_integration.py (26), test_chat_agents_e2e.py (19)
FE llm-setup.spec.ts (1), smoke-chat.spec.ts (2)
E2E llm-setup.spec.ts tests the FULL sysop-to-user flow (add server, discover, enable, assign tiers, chat). That one test validates what 56 BE tests cover in isolation. Expand E2E coverage, reduce BE to edge cases (SSRF validation, error responses, RBAC).
4. Schema Inspection
BE test_postgres_schema_routes_integration.py (12), test_schema_refresh_background_dedup.py (4), test_schema_refresh_uat.py (5)
FE smoke-schema-explorer.spec.ts (4)
E2E schema-explorer tests prove the user can navigate, search, and export schemas. BE tests cover refresh mechanics and caching internals the UI doesn't touch. Keep both -- genuinely different layers.
5. Query Optimization
BE test_query_optimization_artifact_routes_integration.py (18), test_query_optimization_v0_2_routes_integration.py (11), test_query_optimization_microservice_integration.py (4)
FE smoke-query-optimizer.spec.ts (1)
The single E2E smoke test only checks that the UI renders, while 33 BE tests cover the optimization engine. Expand E2E to test a real optimization flow end-to-end. 17 of the BE tests are quarantined anyway (Redis/microservice unavailable).
6. Feedback
BE test_feedback_routes_integration.py (17)
FE feedback-onprem.spec.ts (6) all quarantined
E2E tests the real user experience (widget, admin queue) but is quarantined because it switches FEEDBACK_TYPE at runtime. Fix: run E2E feedback tests in a dedicated onprem deployment where mode is already set. Reduce BE to API-only validation (error cases, auth checks). The E2E tests are more valuable when they work.
7. Clone Management
BE test_clone_deployment_routes_integration.py (21), test_clone_target_routes_integration.py (1), test_e2e_clone_workflow.py (1)
FE smoke-clone-management.spec.ts (1), snapshot-creation.spec.ts (7)
E2E snapshot-creation tests the user flow (form, create, verify). BE has 21 deployment tests covering internal lifecycle. Keep E2E as primary validation, reduce BE to edge cases and error paths the UI can't trigger.
8. PL/pgSQL Check
BE test_postgres_plpgsql_check_routes_integration.py (16), test_plpgsql_check_analyzer.py (1)
FE plpgsql-check.spec.ts (12)
29 tests for one feature. E2E tests validate the full user flow (create procedure, run check, see results). This is MORE valuable than BE tests that hit the API in isolation. Keep all 12 E2E tests, reduce BE to the 1 analyzer test + RBAC checks only.
9. Alerts
BE test_alerting_routes_integration.py (9) all quarantined, test_sysop_internal_alerts_routes_integration.py (19)
FE smoke-system-alerts.spec.ts (1)
BE alerting all quarantined (Grafana unavailable in CI). E2E is 1 smoke test. Expand E2E to cover alert creation flow when Grafana is available. BE internal alerts (19 tests) test sysop-only paths the UI doesn't cover -- keep those.
10. RAG / Knowledge Base
BE test_use_llm_rag_routes_integration.py (1), test_shared_knowledge_base_integration.py (1)
FE smoke-rag-knowledgebase.spec.ts (2)
Minimal coverage on both sides. The E2E tests cover document upload and display, which is the user flow. Expand E2E as RAG matures.
11. Session Management
BE (covered in auth_routes)
FE session-multi-tab.spec.ts (15) all quarantined, session-timeout.spec.ts (13) all quarantined
28 E2E tests, all quarantined (flaky BroadcastChannel + token injection). These test REAL browser behavior (multi-tab sync, session timeout) that BE can never cover. Worth fixing -- they validate user-facing session UX that matters. BE auth tests only cover the token API, not the browser experience.
12. User Management
BE test_user_management_routes_integration.py (18), test_user_management_rbac_integration.py (15)
FE (no direct E2E tests)
Backend only -- no E2E coverage. Add E2E tests for user invite, role change, deactivation flows. These are admin-facing features that should be validated through the UI.
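The per-area totals quoted in the analysis above (27 vs 20, 56, 33, 29) can be reproduced from the per-file counts. The short sketch below only re-adds numbers already listed in this section; the dictionary layout and the `totals` helper are illustrative, not part of any DBGorilla tooling.

```python
# Per-file test counts copied from the overlap list above.
be_counts = {
    "Database Connection / CRUD": [16, 3, 4, 4],
    "LLM Server Management & Chat": [5, 2, 4, 26, 19],
    "Query Optimization": [18, 11, 4],
    "PL/pgSQL Check": [16, 1],
}
fe_counts = {
    "Database Connection / CRUD": [10, 5, 3, 2],
    "LLM Server Management & Chat": [1, 2],
    "Query Optimization": [1],
    "PL/pgSQL Check": [12],
}


def totals(counts):
    """Sum the per-file counts for each feature area."""
    return {area: sum(per_file) for area, per_file in counts.items()}


be_totals = totals(be_counts)
fe_totals = totals(fe_counts)

for area in be_counts:
    print(f"{area}: BE={be_totals[area]} FE={fe_totals[area]}")
```

Running the same tally over all 12 areas is a quick way to keep the summary sentences in sync when tests are added or removed.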

All Categories

Testing Strategy

E2E tests are the source of truth. They prove features work from the user's perspective. Backend integration tests exist only when they cover something the UI can't reach.

The Testing Pyramid

Based on Google's testing philosophy: fast unit tests at the base, E2E at the top, minimal integration in between.

Pyramid tiers, top to bottom:
Onprem
E2E (Happy Paths): ~50 tests | < 5 min. Proves features work for the user.
API Integration: ~20 tests | < 1 min | shrinking. Only what E2E and unit tests can't reach.
Unit Tests (Edge Cases + Error Paths + Validation, DI / Repository Pattern): ~2000+ tests | < 2 min | milliseconds each. All edge cases, fast, no infrastructure.
CURRENT STATE: 872 integration tests doing all 3 jobs.

Test Layers

1. Unit Tests: Edge Cases + Error Paths

What: All edge cases, error paths, RBAC permutations, input validation, SSRF checks, tenant isolation. Uses DI/repository pattern -- inject fakes, no network, no live backend.

Where: backend/tests/unit/, services/*/tests/unit/, packages/*/tests/

Runner: pytest + xdist (-n auto), in-memory SQLite

Speed target: < 2 minutes

Key insight: As the repository pattern expands, edge cases currently in 872 integration tests move here. Fast, comprehensive, no infrastructure needed.
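As one illustration of an edge case that fits this layer, here is a minimal sketch of an SSRF check tested with an injected fake instead of real DNS. The names `is_url_allowed` and `fake_resolver`, and the exact rejection rules, are assumptions made for the example; they are not taken from the DBGorilla codebase.

```python
import ipaddress
from typing import Callable, Dict, Iterable, List
from urllib.parse import urlparse

# Hypothetical resolver signature: hostname -> iterable of IP address strings.
Resolver = Callable[[str], Iterable[str]]


def is_url_allowed(url: str, resolve: Resolver) -> bool:
    """Allow only http(s) URLs whose host resolves exclusively to public addresses."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    addresses = list(resolve(parsed.hostname))
    if not addresses:
        # Unresolvable hosts are rejected rather than silently allowed.
        return False
    for raw in addresses:
        ip = ipaddress.ip_address(raw)
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            return False
    return True


def fake_resolver(table: Dict[str, List[str]]) -> Resolver:
    """Build an in-memory resolver so the test never touches the network."""
    return lambda host: table.get(host, [])


resolver = fake_resolver({
    "llm.example.com": ["93.184.216.34"],        # public address
    "internal.example.com": ["10.0.0.5"],        # private range
    "metadata.example.com": ["169.254.169.254"],  # link-local
})
```

Because the resolver is passed in, each permutation runs in well under a millisecond and needs no live backend, which is the property this layer depends on.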

2. E2E Tests: Happy Paths

What: Full user flows through the browser against a live deployment. Happy paths only -- prove the feature works end-to-end. If a user can do it in the UI, test it through the UI.

Where: frontend/tests/e2e/

Runner: Playwright, sharded across K8s pods

Speed target: < 5 minutes

3. API Integration Tests: Wiring Only

What: Minimal. Only tests things neither unit tests nor E2E can reach -- external webhooks, internal-only endpoints with no UI, API version compatibility. As DI expands, this layer shrinks toward zero.

Where: backend/tests/api/ (new)

Runner: pytest against live deployment, single pod

Speed target: < 1 minute

4. Onprem / Mode-Specific Tests: Scheduled

What: Tests requiring specific deployment config (onprem mode, feedbackType=none). Run daily, not per-PR.

Where: backend/tests/onprem/, frontend/tests/e2e-onprem/ (new)

Runner: Daily workflow with dedicated onprem deployment

What Goes Where

E2E Tests (Layer 2) -- test these

API Tests (Layer 3) -- minimal, shrinking

DELETE from current integration tests

Onprem Tests (Layer 4) -- scheduled daily

Test Rules

1. No runtime state mutation. Tests must not switch runtime config (FEEDBACK_TYPE, feature flags) on a shared backend. If a test needs a specific mode, it runs in a dedicated deployment (Layer 4).
2. Database reuse. Tests must use persistent databases pre-registered by setup. Only tests that specifically test connect/disconnect create ephemeral databases.
3. Independence. Every test runs independently. No ordering dependencies. No shared mutable state. If a test creates data, it cleans up after itself.
4. Timeout budget. E2E: 30s per action, 5 min per spec. API: 10s per request, 3 min total. Unit: 5s per test.
5. Quarantine discipline. Flaky tests get quarantined with a reason. Reviewed weekly. Not fixed in 2 weeks = deleted or rewritten.
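Rule 5 lends itself to an automated weekly check. The sketch below is one possible shape for it, assuming a quarantine registry that records a test id, a reason, and the date the test was quarantined. The `QuarantineEntry` type, the `overdue` function, and the sample entries are hypothetical; only the two-week limit and the "must have a reason" requirement come from the rule.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

QUARANTINE_LIMIT = timedelta(weeks=2)  # "Not fixed in 2 weeks" from rule 5


@dataclass(frozen=True)
class QuarantineEntry:
    test_id: str          # hypothetical identifier, e.g. a spec file name
    reason: str           # rule 5 requires a stated reason
    quarantined_on: date


def overdue(entries: List[QuarantineEntry], today: date) -> List[QuarantineEntry]:
    """Return entries with no reason, or quarantined longer than the limit."""
    return [
        entry
        for entry in entries
        if not entry.reason.strip() or today - entry.quarantined_on > QUARANTINE_LIMIT
    ]


# Illustrative data only; these ids and dates are made up.
today = date(2025, 3, 15)
entries = [
    QuarantineEntry("example-a.spec.ts", "flaky timing", date(2025, 3, 10)),
    QuarantineEntry("example-b.spec.ts", "external service down", date(2025, 2, 20)),
    QuarantineEntry("example-c.spec.ts", "", date(2025, 3, 14)),
    QuarantineEntry("example-d.spec.ts", "race in setup", date(2025, 3, 1)),
]
flagged = [entry.test_id for entry in overdue(entries, today)]
print(flagged)
```

An entry quarantined exactly 14 days ago is not flagged here; whether the boundary should be inclusive is a policy choice the rule leaves open.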

Directory Structure

backend/
  tests/
    unit/             # Layer 1 -- keep as-is
    api/              # Layer 3 -- NEW (replaces integration/)
      rbac/           # Role-based access control edge cases
      validation/     # Input validation and error responses
      security/       # SSRF, tenant isolation, rate limiting
      sysop/          # Sysop-only endpoint tests
      compatibility/  # API version compatibility
    integration/      # DEPRECATED -- migrate to api/ or delete
    onprem/           # Layer 4 -- onprem-specific
frontend/
  tests/
    e2e/              # Layer 2 -- expand coverage
      flows/          # Full user workflows
      smoke/          # Quick render checks
      visual/         # Screenshot comparison
      pages/          # Page objects
      utils/          # Test utilities
    e2e-onprem/       # Layer 4 -- onprem feedback, mode switching

Speed Targets

E2E tests: ~10 min (current), < 5 min (target)
API integration: ~11 min (current), < 3 min (target)
Total dev-deploy: ~17 min (current), < 8 min (target)

Coverage Gaps

Features and pages that exist in the codebase but have zero test coverage.

Frontend Pages Untested: 38
Backend Route Files Untested: 29
API Endpoints Untested: 107
Middleware Untested: 3
Frontend Page Coverage: 22%

Frontend Pages -- No E2E Coverage

Tier 1 -- Critical: Core user-facing features with complex interactions
E2E
DataAccessPage /database/:id/data-access
4 tabs: table browser, natural language queries, query builder, SQL executor. Most important user feature, zero tests.
E2E
DatabaseQueryPerformancePage /database/:id/query-performance
Query performance analytics and fingerprint drilldown. Core DBA workflow.
E2E
DatabaseAdvisorPage /database/:id/advisor
AI-powered database recommendations. Key differentiating feature.
E2E
AdminOrganizationManagementPage /admin/settings
User management, email settings, org analysis, API keys, integrations tabs. Critical admin functionality.
E2E
PlanOperationsPage /database/:id/plans
Plan creation and execution. Ties analyzers to databases.
Tier 2 -- High: Important features and admin flows
E2E
Clone Management /clone/:deploymentId/*
6 pages: deployment, data mocking, query, schema explorer, performance, data access. Full clone lifecycle untested.
E2E
Observability Pages /observability/*
4 pages: notifications, activity, metrics dashboards, logs/tracing. Zero coverage.
E2E
Sysop Pages /sysop/*
4 pages: settings, exporters, usage analytics, AI feedback analytics. Admin-only, zero coverage.
E2E
UsersManagementPage /admin/users
User CRUD table. Role assignment, deactivation.
E2E
DatabaseIssueTrackingPage /database/:id/issue-tracking
Issue list and detail pages. Issue lifecycle.
E2E
RunsHistoryPage + RunDetailPage /database/:id/runs, /runs/:runId
Plan run history and detail views.
Tier 3 -- Medium: Supplementary features
E2E
TopologyBrowserPage + TopologyComparisonPage /database/:id/topology-browser, /topology/compare
Schema topology visualization and epoch comparison.
E2E
ManageUserProfilePage /profile
User profile editing. Has unit test but no E2E.
E2E
ExtensionsPage /database/:id/postgres-extensions
PostgreSQL extension management.
E2E
ClusterDetailPage /cluster/:clusterId
Cluster detail view and node management.
E2E
IntegrationsPage /admin/integrations
Slack, Trello OAuth callback flows.
E2E
PerformanceOperationsPage /database/:id/performance-operations
Performance tuning operations.

Backend Routes -- No Test Coverage

Critical -- Security & Infrastructure
BE
keycloak_routes.py 7 endpoints
OIDC login, callback, token exchange, logout, device config. Core auth flow, zero tests post AUTH-001.
BE
health_routes.py 3 endpoints
K8s liveness/readiness probes. Complex directory writability + alert consumer state checks.
BE
config_routes.py 1 endpoint
Public (no auth) frontend config endpoint. Keycloak SSO discovery, WebSocket URLs. Frontend depends on this before login.
High -- Core Features
BE
chat_routes.py 17 endpoints
Message CRUD, voting, deletion, export, search, title updates. Core messaging feature.
BE
response_routes.py 3 endpoints
LLM response generation including streaming. Core AI feature.
BE
rag_document_routes.py 9 endpoints
Document upload (sync/async), deletion, listing, summary, chunks. RAG pipeline.
BE
topology_graph_routes.py 8 endpoints
Graph browsing, snapshots, comparisons, history.
BE
chat_file_routes.py 3 endpoints
File upload/download in chat. Security implications.
BE
postgres_query_optimization_routes.py 5 endpoints
Query optimization results CRUD.
Medium -- Feature Completeness
BE
llm_assignments_routes.py 5 endpoints, cluster_routing_routes.py 5 endpoints, postgres_extension_routes.py 4 endpoints, chat_admin_routes.py 4 endpoints
Plus 11 more route files with 1-7 endpoints each. See full inventory for details.
Untested Middleware
BE
body_size_middleware.py, logging_middleware.py, openlit_middleware.py
Request size enforcement, request/response logging, OpenLIT tracing. All 3 have zero unit or integration tests.
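Middleware of this kind can usually be unit tested without a server by driving the ASGI callable directly with fake `receive` and `send` functions. The sketch below shows the idea for request size enforcement. `BodySizeLimitMiddleware` is a hypothetical stand-in written for this example, not the contents of body_size_middleware.py, and it only inspects the declared Content-Length header rather than counting streamed bytes.

```python
import asyncio


class BodySizeLimitMiddleware:
    """Hypothetical ASGI middleware: reject requests whose declared size exceeds a limit."""

    def __init__(self, app, max_bytes: int):
        self.app = app
        self.max_bytes = max_bytes

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            headers = dict(scope.get("headers", []))
            declared = headers.get(b"content-length")
            if declared is not None and declared.isdigit() and int(declared) > self.max_bytes:
                # Short-circuit with 413 Payload Too Large; the inner app is never called.
                await send({"type": "http.response.start", "status": 413,
                            "headers": [(b"content-length", b"0")]})
                await send({"type": "http.response.body", "body": b""})
                return
        await self.app(scope, receive, send)


async def ok_app(scope, receive, send):
    """Stand-in downstream app that always answers 200."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


def status_for(app, content_length: int) -> int:
    """Drive the ASGI app once with fake callables and return the response status."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {
        "type": "http",
        "method": "POST",
        "path": "/",
        "headers": [(b"content-length", str(content_length).encode())],
    }
    asyncio.run(app(scope, receive, send))
    return sent[0]["status"]


app = BodySizeLimitMiddleware(ok_app, max_bytes=1024)
print(status_for(app, 10), status_for(app, 2048))
```

The same `status_for` style harness would work for the logging and tracing middleware, with assertions on captured log records or spans instead of the status code.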

Testability Architecture

Parts of the backend are easier to test than others. The repository pattern with dependency injection enables fast unit tests that don't need a live API. Expanding this pattern reduces the need for slow integration tests.

Easy to Test (Repository Pattern)

Hard to Test (Direct Dependencies)

Current State

Routes use DI: 82% (60/73)
Services are direct: 56% (109/194)
Services use DI: 43% (83/194)
Services mixed pattern: 2

Routes are in good shape -- most use Depends(). The problem is the 109 service files that directly call async_session() or external clients. These force integration tests to run against a live backend.

DI / Repository Pattern (83 services)

Direct Dependencies (109 services)

Strategy: Make More Code Easy to Test

1. Expand the repository pattern. New features should use the dbg-repositories + dbg-services pattern. Services accept repositories as constructor args, not inline DB calls.
2. Inject external dependencies. Grafana, Redis, Weaviate, LiteLLM clients should be injected, not imported directly. This lets unit tests swap them for fakes.
3. Move logic out of routes. Route handlers should be thin -- validate input, call service, return response. All logic lives in services that can be unit tested independently.
4. Integration tests for wiring only. Once services are unit-testable with DI, integration tests only need to verify the wiring (routes call the right service with the right args). This is what E2E covers naturally.
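The first two points can be sketched as follows: a service takes its repository as a constructor argument, and a unit test passes an in-memory fake in its place. `User`, `UserRepository`, `UserService`, and `FakeUserRepository` are hypothetical names chosen for illustration; the actual dbg-repositories and dbg-services interfaces may look different.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Protocol


@dataclass
class User:
    id: int
    email: str
    active: bool = True


class UserRepository(Protocol):
    """The only persistence surface the service is allowed to see."""

    def get(self, user_id: int) -> Optional[User]: ...
    def save(self, user: User) -> None: ...


class UserService:
    """Business logic with the repository injected, no inline DB session."""

    def __init__(self, users: UserRepository):
        self._users = users

    def deactivate(self, user_id: int) -> bool:
        """Deactivate a user; return False if missing or already inactive."""
        user = self._users.get(user_id)
        if user is None or not user.active:
            return False
        user.active = False
        self._users.save(user)
        return True


class FakeUserRepository:
    """In-memory fake used by unit tests in place of a real database."""

    def __init__(self, users: Iterable[User] = ()):
        self._rows: Dict[int, User] = {user.id: user for user in users}

    def get(self, user_id: int) -> Optional[User]:
        return self._rows.get(user_id)

    def save(self, user: User) -> None:
        self._rows[user.id] = user


repo = FakeUserRepository([User(id=1, email="a@example.com")])
service = UserService(repo)
first_call = service.deactivate(1)
second_call = service.deactivate(1)
missing_call = service.deactivate(99)
```

With this shape, the missing-user and already-inactive cases are covered in milliseconds, and the route handler only has to pass a real repository into the same constructor.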

Impact: Migrating the top 10 most-tested services to the repository pattern would let ~200 current integration tests become fast unit tests. Combined with E2E for user flows, integration tests drop from 872 to under 100.

Code to Remove

Dead code, debug-only features, and deprecated dependencies. Removing these narrows the test target.

Frontend Components to Delete: 8
Pages to Delete: 6
Grafana Service Files: 14
Test Page Tabs to Remove: 13
Safe to Delete -- No Dependencies
FE
Test Page Components (8 files)
components/features/testing/FaviconTester.tsx
components/features/testing/StreamingChatComponent.tsx
components/features/testing/RagAgentTest.tsx
components/features/testing/QueryPerformanceTest.tsx
components/features/testing/QueryOptimizationArtifactTest.tsx
components/features/testing/ClusterManagementTester.tsx
pages/AllPlaybooksPageTest.tsx
Only imported by TestPage.tsx. No external usage. Remove files and imports.
FE
Sysop Pages (3 files)
pages/sysop/MetricsExporterManagementPage.tsx
pages/sysop/UsageAnalyticsPage.tsx
pages/sysop/AIFeedbackAnalyticsPage.tsx
Only imported in App.tsx routes. Remove pages, routes, and related components:
components/sysop/ExporterStatsCard.tsx
components/sysop/HealthDistributionChart.tsx
components/sysop/ExportersByTenantTable.tsx
hooks/useSysopAIFeedback.ts
FE
Project Pages (2 files)
pages/CreateProjectPage.tsx
pages/ProjectDetailPage.tsx
Dev-only routes. Remove pages, routes, and all project-related components.
FE
13 Test Page Tabs
Remove tabs from TestPage.tsx: notifications, database-list, servers, server-databases, rag, logo, chat-tests, cluster-testing, auth-forms, query-performance, all-playbooks, query-optimization-artifacts, query-optimizer-v1. Keep 4 tabs: email-templates, schema-cache-refresh, plpgsql-check, topology-epochs.
Requires Refactor -- Has Dependents
BE
Grafana Service Module (14 files, ~4000 lines)
backend/app/services/grafana/*
Already optional via ALERTING_BACKEND="native" (default). Alerting system has dual modes -- Grafana is loaded only when backend is "grafana" or "both". To remove:
1. Verify ALERTING_BACKEND is "native" in all envs
2. Remove Grafana initialization from alerting_service.py
3. Delete services/grafana/ module (14 files)
4. Remove GRAFANA_URL, GRAFANA_API_KEY from config.py
5. Remove grafanaWebhookSecret from Helm values
6. Update scripts: initialize_internal_alerts.py, check_internal_alerts.py
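Step 1 of the list above can be checked mechanically before anything is deleted. The sketch below assumes the per-environment values of ALERTING_BACKEND have already been collected into a dictionary; how they are collected (Helm values, env files, a secrets store) is left open. The function name and the sample environments are hypothetical, while the "native" default and the "grafana" and "both" modes come from the description above.

```python
from typing import Dict, List, Optional


def envs_still_using_grafana(env_settings: Dict[str, Optional[str]]) -> List[str]:
    """Return environment names whose ALERTING_BACKEND is not "native".

    env_settings maps an environment name to its ALERTING_BACKEND value,
    or None when the variable is unset. Unset or empty values are treated
    as the documented default, "native" (an assumption of this sketch).
    """
    offenders = []
    for env_name, value in sorted(env_settings.items()):
        backend = (value or "native").strip().lower()
        if backend != "native":
            offenders.append(env_name)
    return offenders


# Illustrative values only; real environment names and settings will differ.
sample = {"dev": None, "staging": "both", "prod": "native", "onprem": "grafana"}
blocking = envs_still_using_grafana(sample)
print(blocking)
```

An empty result is the precondition for steps 2 through 6; any environment listed still loads the Grafana module at startup.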
BE
Guides Service & Routes
backend/app/api/v0_1/guide_routes.py
Guide components are used in production pages (DatabaseDetailsPage, ManageLLMsPage, ExtensionsPage, DocumentationPage). Need to remove guide service + routes from backend AND all guide references from frontend pages before deleting.
BE
Sysop Exporter & AI Feedback Endpoints
Remove from sysop_routes.py: GET /exporters, GET /exporters/stats. Remove from chat_admin_routes.py: GET /analytics/votes, GET /analytics/popular-messages, GET /stats/chat-sessions. Keep tenant and defaults endpoints.
Keep for Now
BE
OpenLIT Tracing (20+ files)
Decorators and context managers are non-invasive (functions work without them). Deeply integrated into agent/tool system. Leave in place -- can revisit if tracing backend changes.
FE
Debug Components
OnboardingStateTester.tsx -- used in DebugPanelContent.tsx
PubSubTester.tsx -- used in DebugPanelContent.tsx
Part of the debug panel which is useful for development. Keep.

Migration Plan

1. Organize E2E
Categorize existing E2E tests. Identify gaps (features with no E2E coverage). Fix flaky tests (session-timeout, accessibility, connect-database timeouts).

2. Create backend/tests/api/
New directory structure. Migrate RBAC, validation, security tests from integration/. Small, fast, independent -- no database creation.

3. Expand E2E Coverage
Add E2E tests for features currently only covered by BE integration: query execution, analyzer lifecycle, alert creation, user management. Each new E2E test replaces 5-15 BE integration tests.

4. Deprecate backend/tests/integration/
Remove BE integration tests now covered by E2E. Keep only tests migrated to api/. Target: 90% reduction in integration test count (872 -> ~80).

5. Onprem Testing
Create daily onprem deploy workflow. Move feedback-onprem and mode-switching tests there. Add onprem-specific E2E tests.