These features are tested by BOTH backend integration tests and frontend E2E tests. Consider whether both are necessary.
Overlap Analysis
1. Database Connection / CRUD
BE: test_db_crud_routes_integration.py (16), test_db_cluster_detection_integration.py (3), test_db_creation_permission_integration.py (4), test_db_server_management_routes_integration.py (4)
FE: connect-database.spec.ts (10), connect-database-demo.spec.ts (5), add-cluster.spec.ts (3), server-management.spec.ts (2)
The E2E tests cover the full flow (form fill, submit, verify in sidebar) and exercise the API underneath, so the 27 BE tests largely duplicate the 20 E2E tests. Keep E2E as the source of truth. Reduce BE to permission edge cases (RBAC, creation restrictions) that the UI doesn't cover.
2. Authentication & Login
BE: test_auth_routes_integration.py (20), test_auth_utils.py (1)
FE: auth-test.spec.ts (4), authentication@cross-browser.spec.ts (16)
E2E covers the real user login flow. Backend tests cover token internals (generation, validation, refresh) that the UI doesn't exercise directly. Keep both -- low overlap.
3. LLM Server Management & Chat
BE: test_llm_routes_integration.py (5), test_llm_server_routes_integration.py (2), test_llm_chat_integration.py (4), test_use_llm_chat_routes_integration.py (26), test_chat_agents_e2e.py (19)
FE: llm-setup.spec.ts (1), smoke-chat.spec.ts (2)
E2E llm-setup.spec.ts tests the FULL sysop-to-user flow (add server, discover, enable, assign tiers, chat). That one test validates what 56 BE tests cover in isolation. Expand E2E coverage, reduce BE to edge cases (SSRF validation, error responses, RBAC).
4. Schema Inspection
BE: test_postgres_schema_routes_integration.py (12), test_schema_refresh_background_dedup.py (4), test_schema_refresh_uat.py (5)
FE: smoke-schema-explorer.spec.ts (4)
E2E schema-explorer tests prove the user can navigate, search, and export schemas. BE tests cover refresh mechanics and caching internals the UI doesn't touch. Keep both -- genuinely different layers.
5. Query Optimization
BE: test_query_optimization_artifact_routes_integration.py (18), test_query_optimization_v0_2_routes_integration.py (11), test_query_optimization_microservice_integration.py (4)
FE: smoke-query-optimizer.spec.ts (1)
The single E2E smoke test only checks that the UI renders, while 33 BE tests cover the optimization engine. Expand E2E to test a real optimization flow end-to-end. 17 of the BE tests are quarantined anyway (Redis/microservice unavailable).
6. Feedback
BE: test_feedback_routes_integration.py (17)
FE: feedback-onprem.spec.ts (6), all quarantined
E2E tests the real user experience (widget, admin queue) but is quarantined because it switches FEEDBACK_TYPE at runtime. Fix: run E2E feedback tests in a dedicated onprem deployment where mode is already set. Reduce BE to API-only validation (error cases, auth checks). The E2E tests are more valuable when they work.
7. Clone Management
BE: test_clone_deployment_routes_integration.py (21), test_clone_target_routes_integration.py (1), test_e2e_clone_workflow.py (1)
FE: smoke-clone-management.spec.ts (1), snapshot-creation.spec.ts (7)
E2E snapshot-creation tests the user flow (form, create, verify). BE has 21 deployment tests covering internal lifecycle. Keep E2E as primary validation, reduce BE to edge cases and error paths the UI can't trigger.
8. PL/pgSQL Check
BE: test_postgres_plpgsql_check_routes_integration.py (16), test_plpgsql_check_analyzer.py (1)
FE: plpgsql-check.spec.ts (12)
29 tests for one feature. E2E tests validate the full user flow (create procedure, run check, see results). This is MORE valuable than BE tests that hit the API in isolation. Keep all 12 E2E tests, reduce BE to the 1 analyzer test + RBAC checks only.
9. Alerts
BE: test_alerting_routes_integration.py (9), all quarantined; test_sysop_internal_alerts_routes_integration.py (19)
FE: smoke-system-alerts.spec.ts (1)
BE alerting all quarantined (Grafana unavailable in CI). E2E is 1 smoke test. Expand E2E to cover alert creation flow when Grafana is available. BE internal alerts (19 tests) test sysop-only paths the UI doesn't cover -- keep those.
10. RAG / Knowledge Base
BE: test_use_llm_rag_routes_integration.py (1), test_shared_knowledge_base_integration.py (1)
FE: smoke-rag-knowledgebase.spec.ts (2)
Minimal coverage on both sides. The E2E tests cover document upload and display, which is the user flow. Expand E2E as RAG matures.
11. Session Management
BE: (covered in auth_routes)
FE: session-multi-tab.spec.ts (15), all quarantined; session-timeout.spec.ts (13), all quarantined
28 E2E tests, all quarantined (flaky BroadcastChannel + token injection). These test REAL browser behavior (multi-tab sync, session timeout) that BE can never cover. Worth fixing -- they validate user-facing session UX that matters. BE auth tests only cover the token API, not the browser experience.
12. User Management
BE: test_user_management_routes_integration.py (18), test_user_management_rbac_integration.py (15)
FE: (no direct E2E tests)
Backend only -- no E2E coverage. Add E2E tests for user invite, role change, deactivation flows. These are admin-facing features that should be validated through the UI.
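The per-feature counts above can be tallied to see where coverage is lopsided. The sketch below only transcribes the counts listed in this section (quarantined tests included); the names `COUNTS`, `totals` and `backend_heavy`, and the 5-to-1 threshold, are illustrative choices rather than anything defined by the project.

```python
# (backend integration tests, frontend E2E tests) per feature,
# transcribed from the counts listed in the overlap analysis.
COUNTS = {
    "Database Connection / CRUD": (27, 20),
    "Authentication & Login": (21, 20),
    "LLM Server Management & Chat": (56, 3),
    "Schema Inspection": (21, 4),
    "Query Optimization": (33, 1),
    "Feedback": (17, 6),
    "Clone Management": (23, 8),
    "PL/pgSQL Check": (17, 12),
    "Alerts": (28, 1),
    "RAG / Knowledge Base": (2, 2),
    "Session Management": (0, 28),  # BE side is folded into the auth tests
    "User Management": (33, 0),
}


def totals(counts):
    """Return (total BE tests, total E2E tests) across all features."""
    be = sum(b for b, _ in counts.values())
    fe = sum(f for _, f in counts.values())
    return be, fe


def backend_heavy(counts, ratio=5):
    """Features where BE tests outnumber E2E tests by at least `ratio` to 1."""
    return [name for name, (be, fe) in counts.items() if be >= ratio * max(fe, 1)]
```

With the 5-to-1 threshold this flags LLM, Schema Inspection, Query Optimization, Alerts and User Management, which are also the areas where the analysis recommends expanding E2E or where E2E is missing entirely.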
Testing Strategy
E2E tests are the source of truth. They prove features work from the user's perspective. Backend integration tests exist only when they cover something the UI can't reach.
The Testing Pyramid
Based on Google's testing philosophy: fast unit tests at the base, E2E at the top, minimal integration in between.
Test Layers
Layer 1: Unit Tests -- Edge Cases + Error Paths
What: All edge cases, error paths, RBAC permutations, input validation, SSRF checks, tenant isolation. Uses DI/repository pattern -- inject fakes, no network, no live backend.
Key insight: As the repository pattern expands, edge cases currently in 872 integration tests move here. Fast, comprehensive, no infrastructure needed.
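A minimal sketch of what a Layer 1 test could look like once a service takes its repository as a constructor argument. The service name, the role names and the `can_create_database` method are hypothetical, chosen only to illustrate an RBAC edge case being tested with a fake and no network.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: int
    role: str


class InMemoryUserRepo:
    """Fake repository: exposes the lookup a real repo would, backed by a dict."""

    def __init__(self, users):
        self._users = {u.id: u for u in users}

    def get(self, user_id):
        return self._users.get(user_id)


class DatabaseCreationService:
    """Hypothetical service; the repository is injected, not created inline."""

    ALLOWED_ROLES = {"admin", "dba"}  # assumed role names

    def __init__(self, user_repo):
        self._users = user_repo

    def can_create_database(self, user_id):
        user = self._users.get(user_id)
        return user is not None and user.role in self.ALLOWED_ROLES


def test_rbac_permutations():
    repo = InMemoryUserRepo([User(1, "admin"), User(2, "viewer")])
    service = DatabaseCreationService(repo)
    assert service.can_create_database(1)
    assert not service.can_create_database(2)
    assert not service.can_create_database(999)  # unknown user
```

Because nothing here touches a database or an HTTP client, each permutation runs in memory, which is what lets this layer absorb the edge cases that integration tests carry today.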
Layer 2: E2E Tests -- Happy Paths
What: Full user flows through the browser against a live deployment. Happy paths only -- prove the feature works end-to-end. If a user can do it in the UI, test it through the UI.
Where: frontend/tests/e2e/
Runner: Playwright, sharded across K8s pods
Speed target: < 5 minutes
Layer 3: API Integration Tests -- Wiring Only
What: Minimal. Only tests things neither unit tests nor E2E can reach -- external webhooks, internal-only endpoints with no UI, API version compatibility. As DI expands, this layer shrinks toward zero.
Where: backend/tests/api/ (NEW)
Runner: pytest against live deployment, single pod
Speed target: < 3 minutes
Layer 4: Onprem / Mode-Specific Tests -- Scheduled
What: Tests requiring specific deployment config (onprem mode, feedbackType=none). Run daily, not per-PR.
Runner: Daily workflow with dedicated onprem deployment
What Goes Where
E2E Tests (Layer 2) -- test these
Database connect, discover, view in sidebar, disconnect
Schema exploration, search, export
Query execution, results, explain plans
LLM provider setup, chat, responses
Analyzers -- run, view results, act on recommendations
Plans & runs -- create, execute, view
Clone management -- create, verify, teardown
Alert rules -- create, trigger, notifications
User management -- invite, role change, access
Auth flows -- login, logout, session, multi-tab
Accessibility -- keyboard nav, ARIA, focus
Visual regression -- screenshots across deploys
Sysop admin -- tenant management, user verification, config overrides
API Tests (Layer 3) -- minimal, shrinking
Webhook/callback from external systems (no UI trigger)
Internal-only endpoints with no UI at all
API version compatibility (v0.1 vs v0.2)
RBAC edge cases -- move to unit tests with DI
Input validation -- move to unit tests with DI
SSRF protection -- move to unit tests with DI
Tenant isolation -- move to unit tests with DI
Rate limiting -- move to unit tests with DI
DELETE from current integration tests
Connect database and verify it appears
Create plan and run it
Submit feedback and list it
Configure LLM and chat
Create snapshot and verify
Add server and discover databases
Refresh schema and check cache
Any test that duplicates an E2E flow
Onprem Tests (Layer 4) -- scheduled daily
Feedback in onprem mode (pending queue)
FEEDBACK_TYPE switching
Features gated by runtime.type
Onprem-specific UI behavior
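One way to keep Layer 4 tests out of per-PR runs is to skip them unless the process is pointed at an onprem deployment. The sketch below uses the standard library's `unittest.skipUnless`; the `RUNTIME_TYPE` environment variable is an assumption standing in for however the deployment exposes runtime.type, and the test body is a placeholder.

```python
import os
import unittest

# RUNTIME_TYPE is a hypothetical environment variable; substitute whatever
# the daily onprem workflow actually exports.
IS_ONPREM = os.environ.get("RUNTIME_TYPE") == "onprem"


@unittest.skipUnless(IS_ONPREM, "requires a dedicated onprem deployment (Layer 4)")
class FeedbackPendingQueueTest(unittest.TestCase):
    def test_feedback_lands_in_pending_queue(self):
        # Placeholder: a real test would submit feedback and read the queue.
        self.assertTrue(IS_ONPREM)


def run_suite():
    """Run the onprem-only test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FeedbackPendingQueueTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The test never switches the mode itself; it only checks which deployment it was given, which keeps it compatible with the rule against mutating runtime config on a shared backend.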
Test Rules
1. No runtime state mutation. Tests must not switch runtime config (FEEDBACK_TYPE, feature flags) on a shared backend. If a test needs a specific mode, it runs in a dedicated deployment (Layer 4).
2. Database reuse. Tests must use persistent databases pre-registered by setup. Only tests that specifically test connect/disconnect create ephemeral databases.
3. Independence. Every test runs independently. No ordering dependencies. No shared mutable state. If a test creates data, it cleans up after itself.
4. Timeout budget. E2E: 30s per action, 5 min per spec. API: 10s per request, 3 min total. Unit: 5s per test.
5. Quarantine discipline. Flaky tests get quarantined with a reason. Reviewed weekly. Not fixed in 2 weeks = deleted or rewritten.
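The quarantine rule is easy to let slip without tooling, so a small check could back the weekly review. This is a sketch only: the `QuarantineEntry` record and both helper functions are hypothetical, and the project may track quarantined tests in a different form.

```python
from dataclasses import dataclass
from datetime import date, timedelta

MAX_QUARANTINE_AGE = timedelta(weeks=2)  # "not fixed in 2 weeks"


@dataclass(frozen=True)
class QuarantineEntry:
    test_id: str
    reason: str
    quarantined_on: date


def overdue(entries, today):
    """Entries quarantined for more than two weeks: delete or rewrite these."""
    return [e for e in entries if today - e.quarantined_on > MAX_QUARANTINE_AGE]


def missing_reason(entries):
    """Entries that break the 'quarantined with a reason' part of the rule."""
    return [e for e in entries if not e.reason.strip()]
```

A usage example, with made-up dates: an entry for session-multi-tab.spec.ts quarantined on 2024-03-01 with reason "flaky BroadcastChannel" would be reported by `overdue(entries, date(2024, 3, 20))`, while one quarantined on 2024-03-15 would not.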
Directory Structure
backend/tests/
  unit/              # Layer 1 -- keep as-is
  api/               # Layer 3 -- NEW (replaces integration/)
    rbac/            # Role-based access control edge cases
    validation/      # Input validation and error responses
    security/        # SSRF, tenant isolation, rate limiting
    sysop/           # Sysop-only endpoint tests
    compatibility/   # API version compatibility
  integration/       # DEPRECATED -- migrate to api/ or delete
  onprem/            # Layer 4 -- onprem-specific
frontend/tests/
  e2e/               # Layer 2 -- expand coverage
    flows/           # Full user workflows
    smoke/           # Quick render checks
    visual/          # Screenshot comparison
    pages/           # Page objects
    utils/           # Test utilities
  e2e-onprem/        # Layer 4 -- onprem feedback, mode switching
Speed Targets
E2E tests: ~10 min (current), < 5 min (target)
API integration: ~11 min (current), < 3 min (target)
Total dev-deploy: ~17 min (current), < 8 min (target)
Coverage Gaps
Features and pages that exist in the codebase but have zero test coverage.
38 Frontend Pages Untested
29 Backend Route Files Untested
107 API Endpoints Untested
3 Middleware Untested
22% Frontend Page Coverage
Frontend Pages -- No E2E Coverage
Tier 1 -- Critical: Core user-facing features with complex interactions
E2E: DataAccessPage (/database/:id/data-access). 4 tabs: table browser, natural language queries, query builder, SQL executor. Most important user feature, zero tests.
E2E: DatabaseQueryPerformancePage (/database/:id/query-performance). Query performance analytics and fingerprint drilldown. Core DBA workflow.
llm_assignments_routes.py -- 5 endpoints
cluster_routing_routes.py -- 5 endpoints
postgres_extension_routes.py -- 4 endpoints
chat_admin_routes.py -- 4 endpoints
Plus 11 more route files with 1-7 endpoints each. See full inventory for details.
Untested Middleware
BE: body_size_middleware.py, logging_middleware.py, openlit_middleware.py
Request size enforcement, request/response logging, OpenLIT tracing. All 3 have zero unit or integration tests.
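Middleware of this kind can usually be unit tested without a running server by calling it as a plain ASGI callable. The sketch below is a hypothetical stand-in, not the contents of body_size_middleware.py: it assumes the limit is enforced on the declared Content-Length header and that oversized requests get a 413, and it uses only the standard library.

```python
import asyncio


class BodySizeLimit:
    """Hypothetical stand-in: rejects HTTP requests whose declared
    Content-Length exceeds max_bytes (streamed bodies are not checked)."""

    def __init__(self, app, max_bytes):
        self.app = app
        self.max_bytes = max_bytes

    async def __call__(self, scope, receive, send):
        headers = dict(scope.get("headers", []))
        length = int(headers.get(b"content-length", b"0"))
        if scope["type"] == "http" and length > self.max_bytes:
            await send({"type": "http.response.start", "status": 413, "headers": []})
            await send({"type": "http.response.body", "body": b"Payload Too Large"})
            return
        await self.app(scope, receive, send)


async def ok_app(scope, receive, send):
    """Downstream app that always answers 200."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


def status_for(content_length, max_bytes=16):
    """Drive the middleware with a fake scope and return the response status."""
    sent = []

    async def send(message):
        sent.append(message)

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    scope = {
        "type": "http",
        "headers": [(b"content-length", str(content_length).encode())],
    }
    asyncio.run(BodySizeLimit(ok_app, max_bytes)(scope, receive, send))
    return sent[0]["status"]
```

The same harness shape (fake scope, recording `send`) would apply to the logging and tracing middleware, with assertions on what they emit instead of on the status code.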
Testability Architecture
Parts of the backend are easier to test than others. The repository pattern with dependency injection enables fast unit tests that don't need a live API. Expanding this pattern reduces the need for slow integration tests.
Easy to Test (Repository Pattern)
+ packages/dbg-repositories/ -- clean data access layer
+ packages/dbg-services/ -- business logic with injected repos
+ Services using BaseRepository -- swap real DB for in-memory
+ Can test with fakes/mocks, no network, no live backend
+ Fast: milliseconds per test
Hard to Test (Direct Dependencies)
- Routes that call services directly without DI
- Services with hardcoded async_session() calls
- Code that talks to Grafana, Redis, Weaviate, LiteLLM inline
- Requires full live deployment to test
- Slow: seconds to minutes per test
Current State
82% -- Routes use DI (60/73)
56% -- Services are direct (109/194)
43% -- Services use DI (83/194)
2 -- Services mixed pattern
Routes are in good shape -- most use Depends(). The problem is the 109 service files that directly call async_session() or external clients. These force integration tests to run against a live backend.
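To make the service-side problem concrete, here is a sketch of a service that receives its session factory instead of calling async_session() inside each method. The service name, the `fetch_tables` method and the fake session are all hypothetical; the point is only that an injected async context manager can be swapped for an in-memory fake.

```python
import asyncio
from contextlib import asynccontextmanager


class SchemaCacheService:
    """Hypothetical service: the session factory is a constructor argument
    rather than a hardcoded async_session() call in the method body."""

    def __init__(self, session_factory):
        self._session_factory = session_factory

    async def table_count(self, database_id):
        async with self._session_factory() as session:
            tables = await session.fetch_tables(database_id)
        return len(tables)


class FakeSession:
    """Stands in for a real database session; returns canned rows."""

    def __init__(self, tables_by_db):
        self._tables_by_db = tables_by_db

    async def fetch_tables(self, database_id):
        return self._tables_by_db.get(database_id, [])


def fake_session_factory(tables_by_db):
    """Build a zero-argument factory with the same 'async with' shape."""

    @asynccontextmanager
    async def factory():
        yield FakeSession(tables_by_db)

    return factory


def demo():
    service = SchemaCacheService(fake_session_factory({7: ["users", "orders"]}))
    known = asyncio.run(service.table_count(7))
    unknown = asyncio.run(service.table_count(8))
    return known, unknown
```

In production wiring the same constructor would receive the real session factory, so the service code does not change between the unit test and the live deployment.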
DI / Repository Pattern (83 services)
alerting/* (16 files) -- alert registry, templates, all alert types
grafana/* (8 files) -- alert rules, contact points, data sources
1. Expand the repository pattern. New features should use the dbg-repositories + dbg-services pattern. Services accept repositories as constructor args, not inline DB calls.
2. Inject external dependencies. Grafana, Redis, Weaviate, LiteLLM clients should be injected, not imported directly. This lets unit tests swap them for fakes.
3. Move logic out of routes. Route handlers should be thin -- validate input, call service, return response. All logic lives in services that can be unit tested independently.
4. Integration tests for wiring only. Once services are unit-testable with DI, integration tests only need to verify the wiring (routes call the right service with the right args). This is what E2E covers naturally.
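The recommendations above can be sketched together: an external client hidden behind a small interface, a service that owns the logic, and a handler that only translates between request and service. Everything named here (`AlertBackend`, `AlertRuleService`, the handler, the status codes) is hypothetical and framework-free for the sake of a runnable example; it is not the project's actual alerting code.

```python
from typing import Protocol


class AlertBackend(Protocol):
    """What the service needs from an alerting client (assumed interface)."""

    def create_rule(self, name: str, threshold: float) -> str: ...


class FakeAlertBackend:
    """Records calls instead of talking to Grafana or any other live system."""

    def __init__(self):
        self.created = []

    def create_rule(self, name, threshold):
        self.created.append((name, threshold))
        return f"rule-{len(self.created)}"


class AlertRuleService:
    """Business rules live here, so they can be unit tested with a fake."""

    def __init__(self, backend: AlertBackend):
        self._backend = backend

    def create(self, name, threshold):
        if not name.strip():
            raise ValueError("rule name must not be empty")
        if threshold <= 0:
            raise ValueError("threshold must be positive")
        return self._backend.create_rule(name.strip(), threshold)


def create_alert_rule_handler(payload, service):
    """Thin handler: read fields, call the service, shape the response."""
    try:
        rule_id = service.create(payload.get("name", ""), payload.get("threshold", 0))
    except ValueError as exc:
        return 422, {"error": str(exc)}
    return 201, {"id": rule_id}
```

With this split, the only thing left for a wiring test to prove is that the real route passes the real service to the handler, which an E2E flow exercises anyway.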
Impact: Migrating the top 10 most-tested services to the repository pattern would let ~200 current integration tests become fast unit tests. Combined with E2E for user flows, integration tests drop from 872 to under 100.
Code to Remove
Dead code, debug-only features, and deprecated dependencies. Removing these narrows the test target.
8 Frontend Components to Delete
6 Pages to Delete
14 Grafana Service Files
13 Test Page Tabs to Remove
Safe to Delete -- No Dependencies
FE: Test Page Components (8 files)
components/features/testing/FaviconTester.tsx
components/features/testing/StreamingChatComponent.tsx
components/features/testing/RagAgentTest.tsx
components/features/testing/QueryPerformanceTest.tsx
components/features/testing/QueryOptimizationArtifactTest.tsx
components/features/testing/ClusterManagementTester.tsx
pages/AllPlaybooksPageTest.tsx
Only imported by TestPage.tsx. No external usage. Remove files and imports.
FE: Sysop Pages (3 files)
pages/sysop/MetricsExporterManagementPage.tsx
pages/sysop/UsageAnalyticsPage.tsx
pages/sysop/AIFeedbackAnalyticsPage.tsx
Only imported in App.tsx routes. Remove pages, routes, and related components:
components/sysop/ExporterStatsCard.tsx
components/sysop/HealthDistributionChart.tsx
components/sysop/ExportersByTenantTable.tsx
hooks/useSysopAIFeedback.ts
FE: Project Pages (2 files)
pages/CreateProjectPage.tsx
pages/ProjectDetailPage.tsx
Dev-only routes. Remove pages, routes, and all project-related components.
Grafana Service Module (14 files, ~4000 lines)
backend/app/services/grafana/*
Already optional via ALERTING_BACKEND="native" (default). Alerting system has dual modes -- Grafana is loaded only when backend is "grafana" or "both". To remove:
1. Verify ALERTING_BACKEND is "native" in all envs
2. Remove Grafana initialization from alerting_service.py
3. Delete services/grafana/ module (14 files)
4. Remove GRAFANA_URL, GRAFANA_API_KEY from config.py
5. Remove grafanaWebhookSecret from Helm values
6. Update scripts: initialize_internal_alerts.py, check_internal_alerts.py
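The first removal step (confirming no environment still selects Grafana) could be checked mechanically. The sketch below encodes only what is stated above: the default is "native" and Grafana loads for "grafana" or "both". The function names and the shape of the `envs` mapping are assumptions for illustration.

```python
import os

GRAFANA_MODES = {"grafana", "both"}


def grafana_enabled(env=os.environ):
    """True when the configured alerting backend would load the Grafana module."""
    return env.get("ALERTING_BACKEND", "native") in GRAFANA_MODES


def environments_blocking_removal(envs):
    """Given {environment name: env vars}, list those still relying on Grafana."""
    return sorted(name for name, env in envs.items() if grafana_enabled(env))
```

For example, `environments_blocking_removal({"dev": {}, "staging": {"ALERTING_BACKEND": "both"}})` returns `["staging"]`, so the module should not be deleted until that list is empty.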
BE: Guides Service & Routes
backend/app/api/v0_1/guide_routes.py
Guide components are used in production pages (DatabaseDetailsPage, ManageLLMsPage, ExtensionsPage, DocumentationPage). Need to remove guide service + routes from backend AND all guide references from frontend pages before deleting.
BE: Sysop Exporter & AI Feedback Endpoints
Remove from sysop_routes.py: GET /exporters, GET /exporters/stats.
Remove from chat_admin_routes.py: GET /analytics/votes, GET /analytics/popular-messages, GET /stats/chat-sessions.
Keep tenant and defaults endpoints.
Keep for Now
BE: OpenLIT Tracing (20+ files)
Decorators and context managers are non-invasive (functions work without them). Deeply integrated into agent/tool system. Leave in place -- can revisit if tracing backend changes.
FE: Debug Components
OnboardingStateTester.tsx -- used in DebugPanelContent.tsx
PubSubTester.tsx -- used in DebugPanelContent.tsx
Part of the debug panel which is useful for development. Keep.
Migration Plan
1. Organize E2E
Categorize existing E2E tests. Identify gaps (features with no E2E coverage). Fix flaky tests (session-timeout, accessibility, connect-database timeouts).
2. Create backend/tests/api/
New directory structure. Migrate RBAC, validation, security tests from integration/. Small, fast, independent -- no database creation.
3. Expand E2E Coverage
Add E2E tests for features currently only covered by BE integration: query execution, analyzer lifecycle, alert creation, user management. Each new E2E test replaces 5-15 BE integration tests.
4. Deprecate backend/tests/integration/
Remove BE integration tests now covered by E2E. Keep only tests migrated to api/. Target: 90% reduction in integration test count (872 -> ~80).
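As a rough sanity check on these targets, the figures quoted in this document can be combined. The sketch assumes, as a simplification not stated in the plan, that the ~200 tests moving to unit tests and the tests retired by new E2E coverage do not overlap, and that everything else is retired at the quoted 5-15 tests per new E2E test.

```python
import math

CURRENT_INTEGRATION = 872  # current integration test count (from the plan)
TARGET_INTEGRATION = 80    # approximate end state (from the plan)
MOVED_TO_UNIT = 200        # approximate tests that become unit tests via DI


def reduction_pct(before, after):
    """Percentage reduction from `before` to `after`, to one decimal place."""
    return round(100 * (before - after) / before, 1)


def e2e_tests_needed(to_retire, replaced_per_e2e):
    """New E2E tests required if each one retires `replaced_per_e2e` BE tests."""
    return math.ceil(to_retire / replaced_per_e2e)


# Tests that must be retired through E2E coverage under the assumptions above.
to_retire = CURRENT_INTEGRATION - MOVED_TO_UNIT - TARGET_INTEGRATION
```

Under those assumptions, 872 -> 80 is a 90.8% reduction, and retiring the remaining 592 tests would take roughly 40 to 119 new E2E tests depending on where in the 5-15 range each one falls.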