GBase 8c Performance Tuning: From Statistics to Execution Plan Optimization
Source: Dev.to
Introduction
As data volumes and concurrent access grow, database performance becomes critical to system stability. Drawing from official technical manuals, this article distills core performance‑optimization techniques for GBase 8c, the China‑domestically developed distributed database. Topics include statistics updates, execution‑plan intervention, SQL rewriting, and key parameter configuration.
GBase 8c is a cost‑based distributed database. Its optimizer acts like a precise navigation system, choosing what it considers the optimal execution path (execution plan) for every SQL statement. This decision relies heavily on “map data” — statistics.
The optimizer estimates row counts and costs based on statistics gathered by the ANALYZE command (stored in pg_class and pg_statistic system tables). Outdated statistics are like using an old map—they can mislead the optimizer into generating poor execution plans.
Core principle: Before attempting any complex tuning, first ensure your statistics are accurate and up to date.
Updating Statistics
Statistics are the source data the planner uses to generate plans.
-- Update statistics for a single table
ANALYZE tablename;
-- Update statistics for all tables in the current database
ANALYZE;
-- Collect multi‑column statistics for correlated columns
ANALYZE tablename((column_1, column_2));Automation
GBase 8c provides the autovacuum daemon to automatically reclaim space and update statistics. Control its behavior with the autovacuum and autovacuum_mode parameters (e.g., set to mix mode for both cleanup and analysis).
Detecting Performance Problems
Business signals – interface timeouts, slow application responses, or errors – are the most direct indicators of performance issues.
Proactive detection – conduct database inspections or query slow logs.
Enable slow logging:
track_stmt_stat_level = 'OFF,L0' # example setting
enable_stmt_track = onQuery slow logs:
SELECT *
FROM dbe_perf.get_global_slow_sql_by_timestamp(
'2024-05-07 04:00:00',
'2024-05-07 04:10:00'
);Guiding the Optimizer with Plan Hints
When the optimizer chooses an undesirable plan, GBase 8c offers Plan Hint functionality. Add hints using the /*+ ... */ comment syntax before the SQL statement. Separate multiple hints with spaces.
Hint Syntax Overview
| Hint type | Example |
|---|---|
| Join order | Leading((t1 t2 t3)) |
| Join method | NestLoop(t1 t2), HashJoin(t1 t2), MergeJoin(t1 t2) |
| Scan method | IndexScan(t1 index_name), SeqScan(t1), IndexOnlyScan(t1 index_name) |
| Estimated rows | Rows(t1 #10) (suggests table t1 has about 10 rows) |
Example: Forcing a NestLoop Join
-- Default may be HashJoin
EXPLAIN
SELECT *
FROM stu s
JOIN stu_info i ON s.id = i.stu_id;
-- Force NestLoop with a Hint
SELECT /*+ NestLoop(s i) */ *
FROM stu s
JOIN stu_info i ON s.id = i.stu_id;Caution: Hints bypass the optimizer’s cost calculations. Use them as a last resort and continually verify their effectiveness.
Adjusting Database Parameters
Tweaking configuration parameters can nudge execution plans toward better strategies.
| Parameter | Effect |
|---|---|
work_mem | Increases memory available for sorts and hash joins, reducing disk I/O. |
shared_buffers | Controls data caching; larger values benefit read‑heavy workloads. |
Optimizer switches (e.g., enable_hashjoin, enable_indexscan) | Disable specific strategies, forcing the optimizer to consider alternatives (similar to hints but applied at session or global level). |
Example: Setting work_mem
gs_guc set -Z coordinator -D /path/to/data_dir -c "work_mem = 16MB"SQL Rewriting Best Practices
| Recommendation | Reason |
|---|---|
Use UNION ALL instead of UNION when duplicates are impossible | Avoids expensive deduplication. |
Add IS NOT NULL filters on join columns if NULLs are common | Allows early data reduction, improving join efficiency. |
Use TRUNCATE instead of DELETE to clear tables | Faster and releases disk space immediately. |
Explicitly list column names in INSERT statements | Improves readability and stability. |
| Choose the appropriate storage model (row‑store vs. column‑store) based on workload (OLTP vs. OLAP) | Aligns physical storage with query patterns. |
Conclusion
Performance tuning for GBase 8c is a systematic process. Accurate statistics, judicious use of plan hints, careful parameter configuration, and thoughtful SQL rewriting are interlinked steps. Integrating these methods into daily operations will help you effectively tackle the growing challenge of slow queries in your GBase environment.