diff --git a/product/en/docs-mogdb/v3.0/reference-guide/guc-parameters/9-query-planning/4-other-optimizer-options.md b/product/en/docs-mogdb/v3.0/reference-guide/guc-parameters/9-query-planning/4-other-optimizer-options.md index 873c03417ac442cc3bd8c32df7e2e0980d692aaa..f45b36c541df3fcc2fcc52b7065be92e23a8a847 100644 --- a/product/en/docs-mogdb/v3.0/reference-guide/guc-parameters/9-query-planning/4-other-optimizer-options.md +++ b/product/en/docs-mogdb/v3.0/reference-guide/guc-parameters/9-query-planning/4-other-optimizer-options.md @@ -100,7 +100,7 @@ This parameter is a USERSET parameter. Set it based on instructions provided in - **intargetlist**: Uses the In Target List query rewriting rules (subquery optimization in the target column). - **predpushnormal**: Use the Predicate Push query rewriting rule (push the predicate condition to the subquery). - **predpushforce**: Uses the Predicate Push query rewriting rules. Push down predicate conditions to subqueries and use indexes as much as possible for acceleration. -- **predpush**: Selects the optimal plan based on the cost in **predpushnormal** and **predpushforce**. +- **predpush**: Selects the optimal plan based on the cost in **predpushnormal** and **predpushforce**. **Note**: In rare scenarios, the **predpush** rewriting rule may fail to generate a valid plan, so thorough testing is recommended before enabling this option. **Default value**: **magicset** diff --git a/product/en/docs-mogdb/v3.1/reference-guide/guc-parameters/9-query-planning/4-other-optimizer-options.md b/product/en/docs-mogdb/v3.1/reference-guide/guc-parameters/9-query-planning/4-other-optimizer-options.md index 5df88a93ee7e7a352a6fda58a6014f16e1b11230..d470f7fe9b61c3faa565118f5fe212b5e06f8999 100644 --- a/product/en/docs-mogdb/v3.1/reference-guide/guc-parameters/9-query-planning/4-other-optimizer-options.md +++ b/product/en/docs-mogdb/v3.1/reference-guide/guc-parameters/9-query-planning/4-other-optimizer-options.md @@ -152,7 +152,7 @@ This parameter is a USERSET parameter. Set it based on instructions provided in - **intargetlist**: Uses the In Target List query rewriting rules (subquery optimization in the target column). - **predpushnormal**: Use the Predicate Push query rewriting rule (push the predicate condition to the subquery). - **predpushforce**: Uses the Predicate Push query rewriting rules. Push down predicate conditions to subqueries and use indexes as much as possible for acceleration. -- **predpush**: Selects the optimal plan based on the cost in **predpushnormal** and **predpushforce**. +- **predpush**: Selects the optimal plan based on the cost in **predpushnormal** and **predpushforce**. **Note**: In rare scenarios, the **predpush** rewriting rule may fail to generate a valid plan, so thorough testing is recommended before enabling this option.
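To make the rewriting rules above easier to verify, the following sketch shows how a session could enable them and inspect the resulting plan. The database name, port and the tables `t1`/`t2` are placeholders; `rewrite_rule` is the USERSET parameter described above.

```bash
# Sketch only: enable predicate-push rewriting for the current session and inspect the plan.
# Database name, port and the tables t1/t2 are placeholders for illustration.
gsql -d postgres -p 5432 <<'SQL'
SET rewrite_rule = 'magicset,predpushnormal';
SHOW rewrite_rule;
-- A correlated subquery that predicate pushdown can potentially accelerate.
EXPLAIN SELECT *
FROM t1
WHERE t1.a = 1
  AND t1.b IN (SELECT t2.b FROM t2 WHERE t2.a = t1.a);
SQL
```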
**Default value**: **magicset** diff --git a/product/en/docs-mogdb/v5.0/about-mogdb/MogDB-compared-to-openGauss.md b/product/en/docs-mogdb/v5.0/about-mogdb/MogDB-compared-to-openGauss.md index 6a6e8e81dccc5e15f2b511773e6065b7ba42ae51..a09d0d0ddcd803a2b6610329a565e84398ffe4a6 100644 --- a/product/en/docs-mogdb/v5.0/about-mogdb/MogDB-compared-to-openGauss.md +++ b/product/en/docs-mogdb/v5.0/about-mogdb/MogDB-compared-to-openGauss.md @@ -77,10 +77,6 @@ For more details, please visit openGauss official website: - -![memory-optimized-storage-engine-within-opengauss](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-introduction-2.png) - -[Figure 1](#memoryoptimized) presents the Memory-Optimized Storage Engine component (in green) of MogDB database and is responsible for managing MOT and transactions. - -MOT tables are created side-by-side regular disk-based tables. MOT's effective design enables almost full SQL coverage and support for a full database feature-set, such as stored procedures and user-defined functions (excluding the features listed in **MOT SQL Coverage and Limitations** section). - -With data and indexes stored totally in-memory, a Non-Uniform Memory Access (NUMA)-aware design, algorithms that eliminate lock and latch contention and query native compilation, MOT provides faster data access and more efficient transaction execution. - -MOT's effective almost lock-free design and highly tuned implementation enable exceptional near-linear throughput scale-up on many-core servers - probably the best in the industry. - -Memory-Optimized Tables are fully ACID compliant, as follows: - -- **Atomicity -** An atomic transaction is an indivisible series of database operations that either all occur or none occur after a transaction has been completed (committed or aborted, respectively). -- **Consistency -** Every transaction leaves the database in a consistent (data integrity) state. -- **Isolation -** Transactions cannot interfere with each other. MOT supports repeatable-reads and read-committed isolation levels. In the next release, MOT will also support serializable isolation. See the **MOT Isolation Levels** section for more information. -- **Durability -** The effects of successfully completed (committed) transactions must persist despite crashes and failures. MOT is fully integrated with the WAL-based logging of MogDB. Both synchronous and asynchronous logging options are supported. MOT also uniquely supports synchronous + group commit with NUMA-awareness optimization. See the **MOT Durability Concepts** section for more information. - -The MOT Engine was published in the VLDB 2020 (an International Conference on ‘Very Large Data Bases" or VLDB): - -**Industrial-Strength OLTP Using Main Memory and Many Cores**, VLDB 2020 vol. 13 - [Paper](http://www.vldb.org/pvldb/vol13/p3099-avni.pdf), [Video on youtube](https://www.modb.pro/video/6676?slink), [Video on bilibili](https://www.bilibili.com/video/BV1MA411n7ef?p=97). 
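As noted above, MOT tables are created alongside regular disk-based tables and are reached with the same SQL. A minimal sketch is shown below; it assumes a reachable MogDB instance (database name, port and table names are placeholders) and uses the FOREIGN TABLE syntax that the MOT Usage section describes for MOT tables.

```bash
# Minimal sketch: a disk-based table and an MOT table side by side in one database.
# Connection settings and table names are illustrative only.
gsql -d postgres -p 5432 <<'SQL'
CREATE TABLE order_audit (o_id int, note varchar(100));     -- regular disk-based table
CREATE FOREIGN TABLE orders_mot (o_id int, o_total int);    -- MOT table (FOREIGN TABLE syntax)
INSERT INTO orders_mot VALUES (1, 100), (2, 250);
SELECT * FROM orders_mot ORDER BY o_id;                     -- same SQL for both engines
DROP FOREIGN TABLE orders_mot;
DROP TABLE order_audit;
SQL
```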
diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/2-mot-features-and-benefits.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/2-mot-features-and-benefits.md deleted file mode 100644 index 1713c463ebe20e1f1b972747b3cee069dc6de4ec..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/2-mot-features-and-benefits.md +++ /dev/null @@ -1,22 +0,0 @@ ---- -title: MOT Features and Benefits -summary: MOT Features and Benefits -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT Features and Benefits - -MOT provide users with significant benefits in performance (query and transaction latency), scalability (throughput and concurrency) and in some cases cost (high resource utilization) - - -- **Low Latency -** Provides fast query and transaction response time -- **High Throughput -** Supports spikes and constantly high user concurrency -- **High Resource Utilization -** Utilizes hardware to its full extent - -Using MOT, applications are able to achieve more 2.5 to 4 times (2.5x - 4x) higher throughput. For example, in our TPC-C benchmarks (interactive transactions and synchronous logging) performed both on Huawei Taishan Kunpeng-based (ARM) servers and on Dell x86 Intel Xeon-based servers, MOT provides throughput gains that vary from 2.5x on a 2-socket server to 3.7x on a 4-socket server, reaching 4.8M (million) tpmC on an ARM 4-socket 256-cores server. - -The lower latency provided by MOT reduces transaction speed by 3x to 5.5x, as observed in TPC-C benchmarks. - -Additionally, MOT enables extremely high utilization of server resources when running under high load and contention, which is a well-known problem for all leading industry databases. Using MOT, utilization reaches 99% on 4-socket server, compared with much lower utilization observed when testing other industry leading databases. - -This abilities are especially evident and important on modern many-core servers. diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/3-mot-key-technologies.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/3-mot-key-technologies.md deleted file mode 100644 index 3bf4108c17a46edc1a760b6734314a8939a4dea6..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/3-mot-key-technologies.md +++ /dev/null @@ -1,24 +0,0 @@ ---- -title: MOT Key Technologies -summary: MOT Key Technologies -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT Key Technologies - -The following key MOT technologies enable its benefits: - -- **Memory Optimized Data Structures -** With the objective of achieving optimal high concurrent throughput and predictable low latency, all data and indexes are in memory, no intermediate page buffers are used and minimal, short-duration locks are used. Data structures and all algorithms have been specialized and optimized for in-memory design. -- **Lock-free Transaction Management -** The MOT storage engine applies an optimistic approach to achieving data integrity versus concurrency and high throughput. During a transaction, an MOT table does not place locks on any version of the data rows being updated, thus significantly reducing contention in some high-volume systems. 
Optimistic Concurrency Control (OCC) statements within a transaction are implemented without locks, and all data modifications are performed in a part of the memory that is dedicated to private transactions (also called *Private Transaction Memory*). This means that during a transaction, the relevant data is updated in the Private Transaction Memory, thus enabling lock-less reads and writes; and a very short duration lock is only placed at the Commit phase. For more details, see the **MOT Concurrency Control Mechanism** section. -- **Lock-free Index -** Because database data and indexes stored totally in-memory, having an efficient index data structure and algorithm is essential. The MOT Index is based on state-of-the-art Masstree a fast and scalable Key Value (KV) store for multi-core systems, implemented as a Trie of B+ trees. In this way, excellent performance is achieved on many-core servers and during high concurrent workloads. This index applies various advanced techniques in order to optimize performance, such as an optimistic lock approach, cache-line awareness and memory prefetching. -- **NUMA-aware Memory Management -** MOT memory access is designed with Non-Uniform Memory Access (NUMA) awareness. NUMA-aware algorithms enhance the performance of a data layout in memory so that threads access the memory that is physically attached to the core on which the thread is running. This is handled by the memory controller without requiring an extra hop by using an interconnect, such as Intel QPI. MOT's smart memory control module with pre-allocated memory pools for various memory objects improves performance, reduces locks and ensures stability. Allocation of a transaction's memory objects is always NUMA-local. Deallocated objects are returned to the pool. Minimal usage of OS malloc during transactions circumvents unnecessary locks. -- **Efficient Durability - Logging and Checkpoint -** Achieving disk persistence (also known as *durability*) is a crucial requirement for being ACID compliant (the **D** stands for Durability). All current disks (including the SSD and NVMe) are significantly slower than memory and thus are always the bottleneck of a memory-based database. As an in-memory storage engine with full durability support, MOT's durability design must implement a wide variety of algorithmic optimizations in order to ensure durability, while still achieving the speed and throughput objectives for which it was designed. These optimizations include - - - Parallel logging, which is also available in all MogDB disk tables - - Log buffering per transaction and lock-less transaction preparation - - Updating delta records, meaning only logging changes - - In addition to synchronous and asynchronous, innovative NUMA-aware group commit logging - - State-of-the-art database checkpoints (CALC) enable the lowest memory and computational overhead. -- **High SQL Coverage and Feature Set -** By extending and relying on the PostgreSQL Foreign Data Wrappers (FDW) + Index support, the entire range of SQL is covered, including stored procedures, user-defined functions and system function calls. You may refer to the **MOT SQL Coverage and Limitations** section for a list of the features that are not supported. 
-- **Queries Native Compilation using PREPARE Statements -** Queries and transaction statements can be executed in an interactive manner by using PREPARE client commands that have been precompiled into a native execution format (which are also known as *Code-Gen* or *Just-in-Time [JIT]* compilation). This achieves an average of 30% higher performance. Compilation and Lite Execution are applied when possible, and if not, applicable queries are processed using the standard execution path. A Cache Plan module (that has been optimized for OLTP) re-uses compilation results throughout an entire session (even using different bind settings), as well as across different sessions. -- **Seamless Integration of MOT and MogDB Database -** The MOT operates side by side the disk-based storage engine within an integrated envelope. MOT's main memory engine and disk-based storage engines co-exist side by side in order to support multiple application scenarios, while internally reusing database auxiliary services, such as a Write-Ahead Logging (WAL) Redo Log, Replication, Checkpointing, Recovery, High Availability and so on. Users benefit from the unified deployment, configuration and access of both disk-based tables and MOT tables. This provides a flexible and cost-efficient choice of which storage engine to use according to specific requirements. For example, to place highly performance-sensitive data that causes bottlenecks into memory. diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/4-mot-usage-scenarios.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/4-mot-usage-scenarios.md deleted file mode 100644 index ef56b9ec8f19f87d2a511506afaed5bbae603f48..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/4-mot-usage-scenarios.md +++ /dev/null @@ -1,22 +0,0 @@ ---- -title: MOT Usage Scenarios -summary: MOT Usage Scenarios -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT Usage Scenarios - -MOT can significantly speed up an application's overall performance, depending on the characteristics of the workload. MOT improves the performance of transaction processing by making data access and transaction execution more efficient and minimizing redirections by removing lock and latch contention between concurrently executing transactions. - -MOT's extreme speed stems from the fact that it is optimized around concurrent in-memory usage management (not just because it is in memory). Data storage, access and processing algorithms were designed from the ground up to take advantage of the latest state of the art enhancements in in-memory and high-concurrency computing. - -MogDB enables an application to use any combination of MOT tables and standard disk-based tables. MOT is especially beneficial for enabling your most active, high-contention and performance-sensitive application tables that have proven to be bottlenecks and for tables that require a predictable low-latency access and high throughput. - -MOT tables can be used for a variety of application use cases, which include: - -- **High-throughput Transactions Processing -** This is the primary scenario for using MOT, because it supports large transaction volume that requires consistently low latency for individual transactions. Examples of such applications are real-time decision systems, payment systems, financial instrument trading, sports betting, mobile gaming, ad delivery and so on. 
-- **Acceleration of Performance Bottlenecks -** High contention tables can significantly benefit from using MOT, even when other tables are on disk. The conversion of such tables (in addition to related tables and tables that are referenced together in queries and transactions) result in a significant performance boost as the result of lower latencies, less contention and locks, and increased server throughput ability. -- **Elimination of Mid-Tier Cache -** Cloud and Mobile applications tend to have periodic or spikes of massive workload. Additionally, many of these applications have 80% or above read-workload, with frequent repetitive queries. To sustain the workload spikes, as well to provide optimal user experience by low-latency response time, applications sometimes deploy a mid-tier caching layer. Such additional layers increase development complexity and time, and also increase operational costs. MOT provides a great alternative, simplifying the application architecture with a consistent and high performance data store, while shortening development cycles and reducing CAPEX and OPEX costs. -- **Large-scale Data Streaming and Data Ingestion -** MOT tables enables large-scale streamlined data processing in the Cloud (for Mobile, M2M and IoT), Transactional Processing (TP), Analytical Processing (AP) and Machine Learning (ML). MOT tables are especially good at consistently and quickly ingesting large volumes of data from many different sources at the same time. The data can be later processed, transformed and moved in slower disk-based tables. Alternatively, MOT enables the querying of consistent and up-date data that enable real-time conclusions. In IoT and cloud applications with many real-time data streams, it is common to have special data ingestion and processing triers. For instance, an Apache Kafka cluster can be used to ingest data of 100,000 events/sec with a 10msec latency. A periodic batch processing task enriches and converts the collected data into an alternative format to be placed into a relational database for further analysis. MOT can support such scenarios (while eliminating the separate data ingestion tier) by ingesting data streams directly into MOT relational tables, ready for analysis and decisions. This enables faster data collection and processing, MOT eliminates costly tiers and slow batch processing, increases consistency, increases freshness of analyzed data, as well as lowers Total Cost of Ownership (TCO). -- **Lower TCO -** Higher resource efficiency and mid-tier elimination can save 30% to 90%. diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/5-mot-performance-benchmarks.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/5-mot-performance-benchmarks.md deleted file mode 100644 index 8d68d86f967ec34d3ba94c62705ede8c64d5e236..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/5-mot-performance-benchmarks.md +++ /dev/null @@ -1,189 +0,0 @@ ---- -title: MOT Performance Benchmarks -summary: MOT Performance Benchmarks -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT Performance Benchmarks - -Our performance tests are based on the TPC-C Benchmark that is commonly used both by industry and academia. - -Ours tests used BenchmarkSQL (see **MOT Sample TPC-C Benchmark**) and generates the workload using interactive SQL commands, as opposed to stored procedures. 
- -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** Using the stored procedures approach may produce even higher performance results because it involves significantly less networking roundtrips and database envelope SQL processing cycles. - -All tests that evaluated the performance of MogDB MOT vs DISK used synchronous logging and its optimized **group-commit=on** version in MOT. - -Finally, we performed an additional test in order to evaluate MOT's ability to quickly and ingest massive quantities of data and to serve as an alternative to a mid-tier data ingestion solutions. - -All tests were performed in June 2020. - -The following shows various types of MOT performance benchmarks. - -## MOT Hardware - -The tests were performed on servers with the following configuration and with 10Gbe networking - - -- ARM64/Kunpeng 920-based 2-socket servers, model Taishan 2280 v2 (total 128 Cores), 800GB RAM, 1TB NVMe disk. OS: openEuler - -- ARM64/Kunpeng 960-based 4-socket servers, model Taishan 2480 v2 (total 256 Cores), 512GB RAM, 1TB NVMe disk. OS: openEuler - -- x86-based Dell servers, with 2-sockets of Intel Xeon Gold 6154 CPU @ 3GHz with 18 Cores (72 Cores, with hyper-threading=on), 1TB RAM, 1TB SSD OS: CentOS 7.6 - -- x86-based SuperMicro server, with 8-sockets of Intel(R) Xeon(R) CPU E7-8890 v4 @ 2.20GHz 24 cores (total 384 Cores, with hyper-threading=on), 1TB RAM, 1.2TB SSD (Seagate 1200 SSD 200GB, SAS 12Gb/s). OS: Ubuntu 16.04.2 LTS - -- x86-based Huawei server, with 4-sockets of Intel(R) Xeon(R) CPU E7-8890 v4 2.2Ghz (total 96 Cores, with hyper-threading=on), 512GB RAM, SSD 2TB OS: CentOS 7.6 - -## MOT Results - Summary - -MOT provides higher performance than disk-tables by a factor of 2.5x to 4.1x and reaches 4.8 million tpmC on ARM/Kunpeng-based servers with 256 cores. The results clearly demonstrate MOT's exceptional ability to scale-up and utilize all hardware resources. Performance jumps as the quantity of CPU sockets and server cores increases. - -MOT delivers up to 30,000 tpmC/core on ARM/Kunpeng-based servers and up to 40,000 tpmC/core on x86-based servers. - -Due to a more efficient durability mechanism, in MOT the replication overhead of a Primary/Secondary High Availability scenario is 7% on ARM/Kunpeng and 2% on x86 servers, as opposed to the overhead in disk tables of 20% on ARM/Kunpeng and 15% on x86 servers. - -Finally, MOT delivers 2.5x lower latency, with TPC-C transaction response times of 2 to 7 times faster. - -## MOT High Throughput - -The following shows the results of various MOT table high throughput tests. - -### ARM/Kunpeng 2-Socket 128 Cores - -**Performance** - -The following figure shows the results of testing the TPC-C benchmark on a Huawei ARM/Kunpeng server that has two sockets and 128 cores. - -Four types of tests were performed - - -- Two tests were performed on MOT tables and another two tests were performed on MogDB disk-based tables. -- Two of the tests were performed on a Single node (without high availability), meaning that no replication was performed to a secondary node. The other two tests were performed on Primary/Secondary nodes (with high availability), meaning that data written to the primary node was replicated to a secondary node. - -MOT tables are represented in orange and disk-based tables are represented in blue. 
- -**Figure 1** ARM/Kunpeng 2-Socket 128 Cores - Performance Benchmarks - -![arm-kunpeng-2-socket-128-cores-performance-benchmarks](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-10.png) - -The results showed that: - -- As expected, the performance of MOT tables is significantly greater than of disk-based tables in all cases. -- For a Single Node - 3.8M tpmC for MOT tables versus 1.5M tpmC for disk-based tables -- For a Primary/Secondary Node - 3.5M tpmC for MOT tables versus 1.2M tpmC for disk-based tables -- For production grade (high-availability) servers (Primary/Secondary Node) that require replication, the benefit of using MOT tables is even more significant than for a Single Node (without high-availability, meaning no replication). -- The MOT replication overhead of a Primary/Secondary High Availability scenario is 7% on ARM/Kunpeng and 2% on x86 servers, as opposed to the overhead of disk tables of 20% on ARM/Kunpeng and 15% on x86 servers. - -**Performance per CPU core** - -The following figure shows the TPC-C benchmark performance/throughput results per core of the tests performed on a Huawei ARM/Kunpeng server that has two sockets and 128 cores. The same four types of tests were performed (as described above). - -**Figure 2** ARM/Kunpeng 2-Socket 128 Cores - Performance per Core Benchmarks - -![arm-kunpeng-2-socket-128-cores-performance-per-core-benchmarks](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-11.png) - -The results showed that as expected, the performance of MOT tables is significantly greater per core than of disk-based tables in all cases. It also shows that for production grade (high-availability) servers (Primary/Secondary Node) that require replication, the benefit of using MOT tables is even more significant than for a Single Node (without high-availability, meaning no replication). - -### ARM/Kunpeng 4-Socket 256 Cores - -The following demonstrates MOT's excellent concurrency control performance by showing the tpmC per quantity of connections. - -**Figure 3** ARM/Kunpeng 4-Socket 256 Cores - Performance Benchmarks - -![arm-kunpeng-4-socket-256-cores-performance-benchmarks](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-12.png) - -The results show that performance increases significantly even when there are many cores and that peak performance of 4.8M tpmC is achieved at 768 connections. - -### x86-based Servers - -- **8-Socket 384 Cores** - -The following demonstrates MOT’s excellent concurrency control performance by comparing the tpmC per quantity of connections between disk-based tables and MOT. This test was performed on an x86 server with eight sockets and 384 cores. The orange represents the results of the MOT table. - -**Figure 4** x86 8-Socket 384 Cores - Performance Benchmarks - -![x86-8-socket-384-cores-performance-benchmarks](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-13.png) - -The results show that MOT tables significantly outperform disk-based tables and have very highly efficient performance per core on a 386 core server, reaching over 3M tpmC / core. - -- **4-Socket 96 Cores** - -3.9 million tpmC was achieved by MOT on this 4-socket 96 cores server. The following figure shows a highly efficient MOT table performance per core reaching 40,000 tpmC / core. 
- -**Figure 5** 4-Socket 96 Cores - Performance Benchmarks - -![4-socket-96-cores-performance-benchmarks](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-14.png) - -## MOT Low Latency - -The following was measured on ARM/Kunpeng 2-socket server (128 cores). The numbers scale is milliseconds (ms). - -**Figure 1** Low Latency (90th%) - Performance Benchmarks - -![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-15.png) - -MOT's average transaction speed is 2.5x, with MOT latency of 10.5 ms, compared to 23-25ms for disk tables. - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The average was calculated by taking into account all TPC-C 5 transaction percentage distributions. For more information, you may refer to the description of TPC-C transactions in the **MOT Sample TPC-C Benchmark** section. - -**Figure 2** Low Latency (90th%, Transaction Average) - Performance Benchmarks - -![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-16.png) - -## MOT RTO and Cold-Start Time - -### High Availability Recovery Time Objective (RTO) - -MOT is fully integrated into MogDB, including support for high-availability scenarios consisting of primary and secondary deployments. The WAL Redo Log's replication mechanism replicates changes into the secondary database node and uses it for replay. - -If a Failover event occurs, whether it is due to an unplanned primary node failure or due to a planned maintenance event, the secondary node quickly becomes active. The amount of time that it takes to recover and replay the WAL Redo Log and to enable connections is also referred to as the Recovery Time Objective (RTO). - -**The RTO of MogDB, including the MOT, is less than 10 seconds.** - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The Recovery Time Objective (RTO) is the duration of time and a service level within which a business process must be restored after a disaster in order to avoid unacceptable consequences associated with a break in continuity. In other words, the RTO is the answer to the question: "How much time did it take to recover after notification of a business process disruption?" - -In addition, as shown in the **MOT High Throughput** section in MOT the replication overhead of a Primary/Secondary High Availability scenario is only 7% on ARM/Kunpeng servers and 2% on x86 servers, as opposed to the replication overhead of disk-tables, which is 20% on ARM/Kunpeng and 15% on x86 servers. - -### Cold-Start Recovery Time - -Cold-start Recovery time is the amount of time it takes for a system to become fully operational after a stopped mode. In memory databases, this includes the loading of all data and indexes into memory, thus it depends on data size, hardware bandwidth, and on software algorithms to process it efficiently. - -Our MOT tests using ARM servers with NVMe disks demonstrate the ability to load **100 GB of database checkpoint in 40 seconds (2.5 GB/sec)**. Because MOT does not persist indexes and therefore they are created at cold-start, the actual size of the loaded data + indexes is approximately 50% more. Therefore, can be converted to **MOT cold-start time of Data + Index capacity of 150GB in 40 seconds,** or **225 GB per minute (3.75 GB/sec)**. 
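The same arithmetic can be reused for a rough cold-start estimate in any environment. The sketch below simply restates the calculation above; the checkpoint size and read bandwidth are placeholders to be replaced with your own measurements.

```bash
# Back-of-the-envelope cold-start estimate (sketch; numbers are placeholders).
awk -v data_gb=100 -v read_gbs=2.5 'BEGIN {
    total = data_gb * 1.5                               # indexes are rebuilt at cold start (~50% extra)
    printf "~%.0f GB (data + index) loadable in ~%.0f seconds\n", total, data_gb / read_gbs
}'
```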
- -The following figure demonstrates cold-start process and how long it takes to load data into a MOT table from the disk after a cold start. - -**Figure 1** Cold-Start Time - Performance Benchmarks - -![cold-start-time-performance-benchmarks](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-17.png) - -- **Database Size -** The total amount of time to load the entire database (in GB) is represented by the blue line and the **TIME (sec)** Y axis on the left. -- **Throughput -** The quantity of database GB throughput per second is represented by the orange line and the **Throughput GB/sec** Y axis on the right. - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The performance demonstrated during the test is very close to the bandwidth of the SSD hardware. Therefore, it is feasible that higher (or lower) performance may be achieved on a different platform. - -## MOT Resource Utilization - -The following figure shows the resource utilization of the test performed on a x86 server with four sockets, 96 cores and 512GB RAM server. It demonstrates that a MOT table is able to efficiently and consistently consume almost all available CPU resources. For example, it shows that almost 100% CPU percentage utilization is achieved for 192 cores and 3.9M tpmC. - -- **tmpC -** Number of TPC-C transactions completed per minute is represented by the orange bar and the **tpmC** Y axis on the left. -- **CPU % Utilization -** The amount of CPU utilization is represented by the blue line and the **CPU %** Y axis on the right. - -**Figure 1** Resource Utilization - Performance Benchmarks - -![resource-utilization-performance-benchmarks](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-18.png) - -## MOT Data Ingestion Speed - -This test simulates realtime data streams arriving from massive IoT, cloud or mobile devices that need to be quickly and continuously ingested into the database on a massive scale. - -- The test involved ingesting large quantities of data, as follows - - - - 10 million rows were sent by 500 threads, 2000 rounds, 10 records (rows) in each insert command, each record was 200 bytes. - - The client and database were on different machines. Database server - x86 2-socket, 72 cores. - -- Performance Results - - - **Throughput - 10,000** Records/Core or **2** MB/Core. - - **Latency - 2.8ms per a 10 records** bulk insert (includes client-server networking) - - > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-caution.gif) **CAUTION:** We are projecting that multiple additional, and even significant, performance improvements will be made by MOT for this scenario. Click **MOT Usage Scenarios** for more information about large-scale data streaming and data ingestion. 
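To illustrate the ingestion pattern used in this test, a minimal sketch of a batched multi-row insert into an MOT table follows. The table definition, batch contents and connection settings are placeholders; the actual benchmark used 500 client threads each issuing 2000 rounds of 10-row inserts.

```bash
# Sketch of the ingestion pattern: small multi-row INSERT batches into an MOT table.
# Table name, columns and connection settings are illustrative only.
gsql -d postgres -p 5432 <<'SQL'
CREATE FOREIGN TABLE events_mot (
    device_id  int,
    event_ts   timestamp,
    payload    varchar(200)      -- roughly 200 bytes per record, as in the test above
);
-- One 10-row batch; the benchmark ran 2000 such rounds from each of 500 threads.
INSERT INTO events_mot VALUES
    (1, now(), 'sample payload 01'), (2, now(), 'sample payload 02'),
    (3, now(), 'sample payload 03'), (4, now(), 'sample payload 04'),
    (5, now(), 'sample payload 05'), (6, now(), 'sample payload 06'),
    (7, now(), 'sample payload 07'), (8, now(), 'sample payload 08'),
    (9, now(), 'sample payload 09'), (10, now(), 'sample payload 10');
SQL
```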
diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/introducing-mot.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/introducing-mot.md deleted file mode 100644 index e7210704b0a890d0f60291dbc0dab5c01e367b89..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/introducing-mot.md +++ /dev/null @@ -1,16 +0,0 @@ ---- -title: Introducing MOT -summary: Introducing MOT -author: Guo Huan -date: 2023-05-22 ---- - -# Introducing MOT - -This chapter introduces MogDB Memory-Optimized Tables (MOT), describes its features and benefits, key technologies, applicable scenarios, performance benchmarks and its competitive advantages. - -+ **[MOT Introduction](1-mot-introduction.md)** -+ **[MOT Features and Benefits](2-mot-features-and-benefits.md)** -+ **[MOT Key Technologies](3-mot-key-technologies.md)** -+ **[MOT Usage Scenarios](4-mot-usage-scenarios.md)** -+ **[MOT Performance Benchmarks](5-mot-performance-benchmarks.md)** \ No newline at end of file diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/1-using-mot-overview.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/1-using-mot-overview.md deleted file mode 100644 index a54764d553c93fc863a8b68b0005e8abd5bab3ce..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/1-using-mot-overview.md +++ /dev/null @@ -1,18 +0,0 @@ ---- -title: Using MOT Overview -summary: Using MOT Overview -author: Zhang Cuiping -date: 2021-03-04 ---- - -# Using MOT Overview - -MOT is automatically deployed as part of openGauss. You may refer to the **MOT Preparation** section for a description of how to estimate and plan required memory and storage resources in order to sustain your workload. The **MOT Deployment** section describes all the configuration settings in MOT, as well as non-mandatory options for server optimization. - -Using MOT tables is quite simple. The syntax of all MOT commands is the same as for disk-based tables and includes support for most of standard PostgreSQL SQL, DDL and DML commands and features, such as Stored Procedures. Only the create and drop table statements in MOT differ from the statements for disk-based tables in openGauss. You may refer to the **MOT Usage** section for a description of these two simple commands, to learn how to convert a disk-based table into an MOT table, to get higher performance using Query Native Compilation and PREPARE statements and for a description of external tool support and the limitations of the MOT engine. - -The **MOT Administration** section describes how to perform database maintenance, monitoring and analysis of logs and reported errors. Lastly, the **MOT Sample TPC-C Benchmark** section describes how to perform a standard TPC-C benchmark. 
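Because query native compilation is driven by prepared statements, even a simple PREPARE/EXECUTE round trip is enough to exercise it. The sketch below uses standard PostgreSQL-style prepared statements against a hypothetical MOT table; the table name and connection settings are placeholders.

```bash
# Sketch: prepared statements are the entry point for MOT query native compilation (JIT).
# The table orders_mot and all connection settings are placeholders.
gsql -d postgres -p 5432 <<'SQL'
PREPARE get_order (int) AS
    SELECT o_id, o_total FROM orders_mot WHERE o_id = $1;
EXECUTE get_order(1);     -- repeated executions reuse the cached plan
EXECUTE get_order(2);
DEALLOCATE get_order;
SQL
```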
- -- Read the following topics to learn how to use MOT - - - ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/using-mot-overview-2.png) diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/2-mot-preparation.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/2-mot-preparation.md deleted file mode 100644 index 3cceb8b2a61ef33fce7dd877b24daca2f1e19efb..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/2-mot-preparation.md +++ /dev/null @@ -1,206 +0,0 @@ ---- -title: MOT Preparation -summary: MOT Preparation -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT Preparation - -The following describes the prerequisites and the memory and storage planning to perform in order to prepare to use MOT. - -## MOT Prerequisites - -The following specifies the hardware and software prerequisites for using MogDB MOT. - -### Supported Hardware - -MOT can utilize state-of-the-art hardware, as well as support existing hardware platforms. Both x86 architecture and ARM by Huawei Kunpeng architecture are supported. - -MOT is fully aligned with the hardware supported by the MogDB database. For more information, see the *MogDB Installation Guide*. - -### CPU - -MOT delivers exceptional performance on many-core servers (scale-up). MOT significantly outperforms the competition in these environments and provides near-linear scaling and extremely high resource utilization. - -Even so, users can already start realizing MOT's performance benefits on both low-end, mid-range and high-end servers, starting from one or two CPU sockets, as well as four and even eight CPU sockets. Very high performance and resource utilization are also expected on very high-end servers that have 16 or even 32 sockets (for such cases, we recommend contacting Enmo support). - -### Memory - -MOT supports standard RAM/DRAM for its data and transaction management. All MOT tables’ data and indexes reside in-memory; therefore, the memory capacity must support the data capacity and still have space for further growth. For detailed information about memory requirements and planning, see the **MOT Memory and Storage Planning** section. - -### Storage IO - -MOT is a durable database and uses persistent storage (disk/SSD/NVMe drive[s]) for transaction log operations and periodic checkpoints. - -We recommend using a storage device with low latency, such as SSD with a RAID-1 configuration, NVMe or any enterprise-grade storage system. When appropriate hardware is used, the database transaction processing and contention are the bottleneck, not the IO. - -For detailed memory requirements and planning, see the **MOT Memory and Storage Planning** section. - -Supported Operating Systems - -MOT is fully aligned with the operating systems supported by MogDB. - -MOT supports both bare-metal and virtualized environments that run the following operating systems on a bare-metal server or virtual machine - - -- **x86 -** CentOS 7.6 and EulerOS 2.0 -- **ARM -** openEuler and EulerOS - -### OS Optimization - -MOT does not require any special modifications or the installation of new software. However, several optional optimizations can enhance performance. You may refer to the **MOT Server Optimization - x86** and **MOT Server Optimization - ARM Huawei Taishan 2P/4P** sections for a description of the optimizations that enable maximal performance. 
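Before moving on to capacity planning, it can help to capture a quick inventory of the items above (CPU sockets and cores, NUMA layout, available RAM and the active tuning profile). The following sketch uses common Linux utilities; none of it is required by MOT, and the commands may need adjustment for your distribution.

```bash
# Optional pre-deployment checklist (sketch; adjust for your distribution).
lscpu | grep -E 'Socket|Core|Model name'     # CPU sockets and cores per socket
numactl --hardware                           # NUMA node and memory layout
free -g                                      # available RAM in GB
cat /proc/sys/kernel/numa_balancing          # 0 is recommended for MOT (see server optimization)
tuned-adm active                             # throughput-performance profile is recommended
```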
- -## MOT Memory and Storage Planning - -This section describes the considerations and guidelines for evaluating, estimating and planning the quantity of memory and storage capacity to suit your specific application needs. This section also describes the various data aspects that affect the quantity of required memory, such as the size of data and indexes for the planned tables, memory to sustain transaction management and how fast the data is growing. - -### MOT Memory Planning - -MOT belongs to the in-memory database class (IMDB) in which all tables and indexes reside entirely in memory. - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** Memory storage is volatile, meaning that it requires power to maintain the stored information. Disk storage is persistent, meaning that it is written to disk, which is non-volatile storage. MOT uses both, having all data in memory, while persisting (by WAL logging) transactional changes to disk with strict consistency (in synchronous logging mode). - -Sufficient physical memory must exist on the server in order to maintain the tables in their initial state, as well as to accommodate the related workload and growth of data. All this is in addition to the memory that is required for the traditional disk-based engine, tables and sessions that support the workload of disk-based tables. Therefore, planning ahead for enough memory to contain them all is essential. - -Even so, you can get started with whatever amount of memory you have and perform basic tasks and evaluation tests. Later, when you are ready for production, the following issues should be addressed. - -- **Memory Configuration Settings** - - Similar to standard PG , the memory of the MogDB database process is controlled by the upper limit in its max_process_memory setting, which is defined in the postgres.conf file. The MOT engine and all its components and threads, reside within the MogDB process. Therefore, the memory allocated to MOT also operates within the upper boundary defined by max_process_memory for the entire MogDB database process. - - The amount of memory that MOT can reserve for itself is defined as a portion of max_process_memory. It is either a percentage of it or an absolute value that is less than it. This portion is defined in the mot.conf configuration file by the _mot__memory settings. - - The portion of max_process_memory that can be used by MOT must still leave at least 2 GB available for the PG (MogDB) envelope. Therefore, in order to ensure this, MOT verifies the following during database startup - - - ``` - (max_mot_global_memory + max_mot_local_memory) + 2GB < max_process_memory - ``` - - If this limit is breached, then MOT memory internal limits are adjusted in order to provide the maximum possible within the limitations described above. This adjustment is performed during startup and calculates the value of MOT max memory accordingly. - - > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** MOT max memory is a logically calculated value of either the configured settings or their adjusted values of (max_mot_global_memory + max_mot_local_memory). - - In this case, a warning is issued to the server log, as shown below - - - **Warning Examples** - - Two messages are reported - the problem and the solution. 
- - The following is an example of a warning message reporting the problem - - - ``` - [WARNING] MOT engine maximum memory definitions (global: 9830 MB, local: 1843 MB, session large store: 0 MB, total: 11673 MB) breach GaussDB maximum process memory restriction (12288 MB) and/or total system memory (64243 MB). MOT values shall be adjusted accordingly to preserve required gap (2048 MB). - ``` - - The following is an example of a warning message indicating that MOT is automatically adjusting the memory limits - - - ``` - [WARNING] Adjusting MOT memory limits: global = 8623 MB, local = 1617 MB, session large store = 0 MB, total = 10240 MB - ``` - - This is the only place that shows the new memory limits. - - Additionally, MOT does not allow the insertion of additional data when the total memory usage approaches the chosen memory limits. The threshold for determining when additional data insertions are no longer allowed, is defined as a percentage of MOT max memory (which is a calculated value, as described above). The default is 90, meaning 90%. Attempting to add additional data over this threshold returns an error to the user and is also registered in the database log file. - -- **Minimum and Maximum** - - In order to secure memory for future operations, MOT pre-allocates memory based on the minimum global and local settings. The database administrator should specify the minimum amount of memory required for the MOT tables and sessions to sustain their workload. This ensures that this minimal memory is allocated to MOT even if another excessive memory-consuming application runs on the same server as the database and competes with the database for memory resources. The maximum values are used to limit memory growth. - -- **Global and Local** - - The memory used by MOT is comprised of two parts - - - - **Global Memory -** Global memory is a long-term memory pool that contains the data and indexes of MOT tables. It is evenly distributed across NUMA-nodes and is shared by all CPU cores. - - **Local Memory -** Local memory is a memory pool used for short-term objects. Its primary consumers are sessions handling transactions. These sessions are storing data changes in the part of the memory dedicated to the relevant specific transaction (known as *transaction private memory*). Data changes are moved to the global memory at the commit phase. Memory object allocation is performed in NUMA-local manner in order to achieve the lowest possible latency. - - Deallocated objects are put back in the relevant memory pools. Minimal use of operating system memory allocation (malloc) functions during transactions circumvents unnecessary locks and latches. - - The allocation of these two memory parts is controlled by the dedicated **min/max_mot_global_memory** and **min/max_mot_local_memory** settings. If MOT global memory usage gets too close to this defined maximum, then MOT protects itself and does not accept new data. Attempts to allocate memory beyond this limit are denied and an error is reported to the user. - -- **Minimum Memory Requirements** - - To get started and perform a minimal evaluation of MOT performance, there are a few requirements. - - Make sure that the **max_process_memory** (as defined in **postgres.conf**) has sufficient capacity for MOT tables and sessions (configured by **mix/max_mot_global_memory** and **mix/max_mot_local_memory**), in addition to the disk tables buffer and extra memory. For simple tests, the default **mot.conf** settings can be used. 
- -- **Actual Memory Requirements During Production** - - In a typical OLTP workload, with 80:20 read:write ratio on average, MOT memory usage per table is 60% higher than in disk-based tables (this includes both the data and the indexes). This is due to the use of more optimal data structures and algorithms that enable faster access, with CPU-cache awareness and memory-prefetching. - - The actual memory requirement for a specific application depends on the quantity of data, the expected workload and especially on the data growth. - -- **Max Global Memory Planning - Data + Index Size** - - To plan for maximum global memory - - - 1. Determine the size of a specific disk table (including both its data and all its indexes). The following statistical query can be used to determine the data size of the **customer** table and the **customer_pkey** index size - - - **Data size -** select pg_relation_size(‘customer'); - - **Index -** select pg_relation_size('customer_pkey'); - 2. Add 60%, which is the common requirement in MOT relative to the current size of the disk-based data and index. - 3. Add an additional percentage for the expected growth of data. For example - - - 5% monthly growth = 80% yearly growth (1.05^12). Thus, in order to sustain a year's growth, allocate 80% more memory than is currently used by the tables. - - This completes the estimation and planning of the max_mot_global_memory value. The actual setting can be defined either as an absolute value or a percentage of the Postgres max_process_memory. The exact value is typically finetuned during deployment. - -- **Max Local Memory Planning - Concurrent Session Support** - - Local memory needs are primarily a function of the quantity of concurrent sessions. The typical OLTP workload of an average session uses up to 8 MB. This should be multiplied by the quantity of sessions and then a little bit extra should be added. - - A memory calculation can be performed in this manner and then finetuned, as follows - - - ``` - SESSION_COUNT * SESSION_SIZE (8 MB) + SOME_EXTRA (100MB should be enough) - ``` - - The default specifies 15% of Postgres's max_process_memory, which by default is 12 GB. This equals 1.8 GB, which is sufficient for 230 sessions, which is the requirement for the max_mot_local memory. The actual setting can be defined either in absolute values or as a percentage of the Postgres max_process_memory. The exact value is typically finetuned during deployment. - - **Unusually Large Transactions** - - Some transactions are unusually large because they apply changes to a large number of rows. This may increase a single session's local memory up to the maximum allowed limit, which is 1 GB. For example - - - ``` - delete from SOME_VERY_LARGE_TABLE; - ``` - - Take this scenario into consideration when configuring the max_mot_local_memory setting, as well as during application development. - - > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** You may refer to the **MEMORY (MOT)** section for more information about configuration settings. - -### Storage IO - -MOT is a memory-optimized, persistent database storage engine. A disk drive(s) is required for storing the Redo Log (WAL) and a periodic checkpoint. - -It is recommended to use a storage device with low latency, such as SSD with a RAID-1 configuration, NVMe or any enterprise-grade storage system. When appropriate hardware is used, the database transaction processing and contention are the bottleneck, not the IO. 
- -Since the persistent storage is much slower than RAM memory, the IO operations (logging and checkpoint) can create a bottleneck for both an in-memory and memory-optimized databases. However, MOT has a highly efficient durability design and implementation that is optimized for modern hardware (such as SSD and NVMe). In addition, MOT has minimized and optimized writing points (for example, by using parallel logging, a single log record per transaction and NUMA-aware transaction group writing) and has minimized the data written to disk (for example, only logging the delta or updated columns of the changed records and only logging a transaction at the commit phase). - -### Required Capacity - -The required capacity is determined by the requirements of checkpointing and logging, as described below - - -- **Checkpointing** - - A checkpoint saves a snapshot of all the data to disk. - - Twice the size of all data should be allocated for checkpointing. There is no need to allocate space for the indexes for checkpointing - - Checkpointing = 2x the MOT Data Size (rows only, index is not persistent). - - Twice the size is required because a snapshot is saved to disk of the entire size of the data, and in addition, the same amount of space should be allocated for the checkpoint that is in progress. When a checkpoint process finishes, the previous checkpoint files are deleted. - - > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** In the next MogDB release, MOT will have an incremental checkpoint feature, which will significantly reduce this storage capacity requirement. - -- **Logging** - - MOT table log records are written to the same database transaction log as the other records of disk-based tables. - - The size of the log depends on the transactional throughput, the size of the data changes and the time between checkpoints (at each time checkpoint the Redo Log is truncated and starts to expand again). - - MOT tables use less log bandwidth and have lower IO contention than disk-based tables. This is enabled by multiple mechanisms. - - For example, MOT does not log every operation before a transaction has been completed. It is only logged at the commit phase and only the updated delta record is logged (not full records like for disk-based tables). - - In order to ensure that the log IO device does not become a bottleneck, the log file must be placed on a drive that has low latency. - - > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** You may refer to the **STORAGE (MOT)** section for more information about configuration settings. diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/3-mot-deployment.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/3-mot-deployment.md deleted file mode 100644 index 1bf1320b9f6b91492427e40529ee6ffe2eb1c133..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/3-mot-deployment.md +++ /dev/null @@ -1,660 +0,0 @@ ---- -title: MOT Deployment -summary: MOT Deployment -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT Deployment - -The following sections describe various mandatory and optional settings for optimal deployment. - -## MOT Server Optimization - x86 - -Generally, databases are bounded by the following components - - -- **CPU -** A faster CPU speeds up any CPU-bound database. -- **Disk -** High-speed SSD/NVME speeds up any I/O-bound database. 
-- **Network -** A faster network speeds up any **SQL\\Net**-bound database. - -In addition to the above, the following general-purpose server settings are used by default and may significantly affect a database's performance. - -MOT performance tuning is a crucial step for ensuring fast application functionality and data retrieval. MOT can utilize state-of-the-art hardware, and therefore it is extremely important to tune each system in order to achieve maximum throughput. - -The following are optional settings for optimizing MOT database performance running on an Intel x86 server. These settings are optimal for high throughput workloads - - -### BIOS - -- Hyper Threading - ON - - Activation (HT=ON) is highly recommended. - - We recommend turning hyper threading ON while running OLTP workloads on MOT. When hyper-threading is used, some OLTP workloads demonstrate performance gains of up to40%. - -### OS Environment Settings - -- NUMA - - Disable NUMA balancing, as described below. MOT performs its own memory management with extremely efficient NUMA-awareness, much more than the default methods used by the operating system. - - ``` - echo 0 > /proc/sys/kernel/numa_balancing - ``` - -- Services - - Disable Services, as described below - - - ``` - service irqbalance stop # MANADATORY - service sysmonitor stop # OPTIONAL, performance - service rsyslog stop # OPTIONAL, performance - ``` - -- Tuned Service - - The following section is mandatory. - - The server must run the throughput-performance profile - - - ``` - [...]$ tuned-adm profile throughput-performance - ``` - - The **throughput-performance** profile is broadly applicable tuning that provides excellent performance across a variety of common server workloads. - - Other less suitable profiles for MogDB and MOT server that may affect MOT's overall performance are - balanced, desktop, latency-performance, network-latency, network-throughput and powersave. - -- Sysctl - - The following lists the recommended operating system settings for best performance. 
- - - Add the following settings to /etc/sysctl.conf and run sysctl -p - - ```bash - net.ipv4.ip_local_port_range = 9000 65535 - kernel.sysrq = 1 - kernel.panic_on_oops = 1 - kernel.panic = 5 - kernel.hung_task_timeout_secs = 3600 - kernel.hung_task_panic = 1 - vm.oom_dump_tasks = 1 - kernel.softlockup_panic = 1 - fs.file-max = 640000 - kernel.msgmnb = 7000000 - kernel.sched_min_granularity_ns = 10000000 - kernel.sched_wakeup_granularity_ns = 15000000 - kernel.numa_balancing=0 - vm.max_map_count = 1048576 - net.ipv4.tcp_max_tw_buckets = 10000 - net.ipv4.tcp_tw_reuse = 1 - net.ipv4.tcp_tw_recycle = 1 - net.ipv4.tcp_keepalive_time = 30 - net.ipv4.tcp_keepalive_probes = 9 - net.ipv4.tcp_keepalive_intvl = 30 - net.ipv4.tcp_retries2 = 80 - kernel.sem = 250 6400000 1000 25600 - net.core.wmem_max = 21299200 - net.core.rmem_max = 21299200 - net.core.wmem_default = 21299200 - net.core.rmem_default = 21299200 - #net.sctp.sctp_mem = 94500000 915000000 927000000 - #net.sctp.sctp_rmem = 8192 250000 16777216 - #net.sctp.sctp_wmem = 8192 250000 16777216 - net.ipv4.tcp_rmem = 8192 250000 16777216 - net.ipv4.tcp_wmem = 8192 250000 16777216 - net.core.somaxconn = 65535 - vm.min_free_kbytes = 26351629 - net.core.netdev_max_backlog = 65535 - net.ipv4.tcp_max_syn_backlog = 65535 - #net.sctp.addip_enable = 0 - net.ipv4.tcp_syncookies = 1 - vm.overcommit_memory = 0 - net.ipv4.tcp_retries1 = 5 - net.ipv4.tcp_syn_retries = 5 - ``` - - - Update the section of /etc/security/limits.conf to the following - - - ```bash - soft nofile 100000 - hard nofile 100000 - ``` - - The **soft** and a **hard** limit settings specify the quantity of files that a process may have opened at once. The soft limit may be changed by each process running these limits up to the hard limit value. - -- Disk/SSD - - The following describes how to ensure that disk R/W performance is suitable for database synchronous commit mode. - - To do so, test your disk bandwidth using the following - - ``` - [...]$ sync; dd if=/dev/zero of=testfile bs=1M count=1024; sync - 1024+0 records in - 1024+0 records out - 1073741824 bytes (1.1 GB) copied, 1.36034 s, 789 MB/s - ``` - - In case the disk bandwidth is significantly below the above number (789 MB/s), it may create a performance bottleneck for MogDB, and especially for MOT. - -### Network - -Use a 10Gbps network or higher. - -To verify, use iperf, as follows - - -``` -Server side: iperf -s -Client side: iperf -c -``` - -- rc.local - Network Card Tuning - - The following optional settings have a significant effect on performance - - - 1. Copy set_irq_affinity.sh from to /var/scripts/. - - 2. Put in /etc/rc.d/rc.local and run chmod in order to ensure that the following script is executed during boot - - - ```bash - chmod +x /etc/rc.d/rc.local - var/scripts/set_irq_affinity.sh -x all - ethtool -K gro off - ethtool -C adaptive-rx on adaptive-tx on - Replace with the network card, i.e. ens5f1 - ``` - -## MOT Server Optimization - ARM Huawei Taishan 2P/4P - -The following are optional settings for optimizing MOT database performance running on an ARM/Kunpeng-based Huawei Taishan 2280 v2 server powered by 2-sockets with a total of 256 Cores and Taishan 2480 v2 server powered by 4-sockets with a total of 256 Cores. - -Unless indicated otherwise, the following settings are for both client and server machines - - -### BIOS - -Modify related BIOS settings, as follows - - -1. Select **BIOS** - **Advanced** - **MISC Config**. Set **Support Smmu** to **Disabled**. - -2. 
-
-   ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-deployment-1.png)
-
-3. Select **BIOS** - **Advanced** - **Memory Config**. Set **Die Interleaving** to **Disabled**.
-
-   ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-deployment-2.png)
-
-4. Select **BIOS** - **Advanced** - **Performance Config**. Set **Power Policy** to **Performance**.
-
-   ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-deployment-3.png)
-
-### OS - Kernel and Boot
-
-- The following operating system kernel and boot parameters are usually configured by a sysadmin.
-
-  Configure the kernel parameters, as follows -
-
-  ```bash
-  net.ipv4.ip_local_port_range = 9000 65535
-  kernel.sysrq = 1
-  kernel.panic_on_oops = 1
-  kernel.panic = 5
-  kernel.hung_task_timeout_secs = 3600
-  kernel.hung_task_panic = 1
-  vm.oom_dump_tasks = 1
-  kernel.softlockup_panic = 1
-  fs.file-max = 640000
-  kernel.msgmnb = 7000000
-  kernel.sched_min_granularity_ns = 10000000
-  kernel.sched_wakeup_granularity_ns = 15000000
-  kernel.numa_balancing=0
-  vm.max_map_count = 1048576
-  net.ipv4.tcp_max_tw_buckets = 10000
-  net.ipv4.tcp_tw_reuse = 1
-  net.ipv4.tcp_tw_recycle = 1
-  net.ipv4.tcp_keepalive_time = 30
-  net.ipv4.tcp_keepalive_probes = 9
-  net.ipv4.tcp_keepalive_intvl = 30
-  net.ipv4.tcp_retries2 = 80
-  kernel.sem = 32000 1024000000 500 32000
-  kernel.shmall = 52805669
-  kernel.shmmax = 18446744073692774399
-  sys.fs.file-max = 6536438
-  net.core.wmem_max = 21299200
-  net.core.rmem_max = 21299200
-  net.core.wmem_default = 21299200
-  net.core.rmem_default = 21299200
-  net.ipv4.tcp_rmem = 8192 250000 16777216
-  net.ipv4.tcp_wmem = 8192 250000 16777216
-  net.core.somaxconn = 65535
-  vm.min_free_kbytes = 5270325
-  net.core.netdev_max_backlog = 65535
-  net.ipv4.tcp_max_syn_backlog = 65535
-  net.ipv4.tcp_syncookies = 1
-  vm.overcommit_memory = 0
-  net.ipv4.tcp_retries1 = 5
-  net.ipv4.tcp_syn_retries = 5
-  ##NEW
-  kernel.sched_autogroup_enabled=0
-  kernel.sched_min_granularity_ns=2000000
-  kernel.sched_latency_ns=10000000
-  kernel.sched_wakeup_granularity_ns=5000000
-  kernel.sched_migration_cost_ns=500000
-  vm.dirty_background_bytes=33554432
-  kernel.shmmax=21474836480
-  net.ipv4.tcp_timestamps = 0
-  net.ipv6.conf.all.disable_ipv6=1
-  net.ipv6.conf.default.disable_ipv6=1
-  net.ipv4.tcp_keepalive_time=600
-  net.ipv4.tcp_keepalive_probes=3
-  kernel.core_uses_pid=1
-  ```
-
-- Tuned Service
-
-  The following section is mandatory.
-
-  The server must run the throughput-performance profile -
-
-  ```
-  [...]$ tuned-adm profile throughput-performance
-  ```
-
-  The **throughput-performance** profile is broadly applicable tuning that provides excellent performance across a variety of common server workloads.
-
-  Other, less suitable profiles for the MogDB and MOT server that may affect MOT's overall performance are - balanced, desktop, latency-performance, network-latency, network-throughput and powersave.
-
-- Boot Tuning
-
-  Add **iommu.passthrough=1** to the **kernel boot arguments**.
-
-  When operating in **pass-through** mode, the adapter does not require **DMA translation to the memory**, which improves performance.
-
-## MOT Configuration Settings
-
-MOT is provided preconfigured for creating working MOT Tables.
For best results, it is recommended to customize the MOT configuration (defined in the file named mot.conf) according to your application's specific requirements and your preferences. - -This file is read-only upon server startup. If you edit this file while the system is running, then the server must be reloaded in order for the changes to take effect. - -The mot.conf file is located in the same folder as the postgres.conf configuration file. - -Read the **General Guidelines** section and then review and configure the following sections of the mot.conf file, as needed. - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The topics listed above describe each of the setting sections in the mot.conf file. In addition to the above topics, for an overview of all the aspects of a specific MOT feature (such as Recovery), you may refer to the relevant topic of this user manual. For example, the mot.conf file has a Recovery section that contains settings that affect MOT recovery and this is described in the **MOT Recovery** section that is listed above. In addition, for a full description of all aspects of Recovery, you may refer to the **MOT Recovery** section of the Administration chapter of this user manual. Reference links are also provided in each relevant section of the descriptions below. - -The following topics describe each section in the mot.conf file and the settings that it contains, as well as the default value of each. - -### General Guidelines - -The following are general guidelines for editing the mot.conf file. - -- Each setting appears with its default value as follows - - - ``` - # name = value - ``` - -- Blank/white space is acceptable. - -- Comments are indicated by placing a number sign (#) anywhere on a line. - -- The default values of each setting appear as a comment throughout this file. - -- In case a parameter is uncommented and a new value is placed, the new setting is defined. - -- Changes to the mot.conf file are applied only at the start or reload of the database server. - -Memory Units are represented as follows - - -- KB - Kilobytes -- MB - Megabytes -- GB - Gigabytes -- TB - Terabytes - -If no memory units are specified, then bytes are assumed. - -Some memory units are represented as a percentage of the **max_process_memory** setting that is configured in **postgresql.conf**. For example - **20%**. - -Time units are represented as follows - - -- us - microseconds (or micros) -- ms - milliseconds (or millis) -- s - seconds (or secs) -- min - minutes (or mins) -- h - hours -- d - days - -If no time units are specified, then microseconds are assumed. - -### REDO LOG (MOT) - -- **enable_group_commit = false** - - Specifies whether to use group commit. - - This option is only relevant when MogDB is configured to use synchronous commit, meaning only when the synchronous_commit setting in postgresql.conf is configured to any value other than off. - -- **group_commit_size = 16** - -- **group_commit_timeout = 10 ms** - - This option is only relevant when the MOT engine has been configured to **Synchronous Group Commit** logging. This means that the synchronous_commit setting in postgresql.conf is configured to true and the enable_group_commit parameter in the mot.conf configuration file is configured to true. - - Defines which of the following determines when a group of transactions is recorded in the WAL Redo Log - - - **group_commit_size** - The quantity of committed transactions in a group. 
For example, **16** means that when 16 transactions in the same group have been committed by their client application, then an entry is written to disk in the WAL Redo Log for each of the 16 transactions.
-
-  - **group_commit_timeout** - A timeout period in ms. For example, **10** means that after 10 ms, an entry is written to disk in the WAL Redo Log for each of the transactions in the same group that have been committed by their client application in the last 10 ms.
-
-  A commit group is closed after either the configured number of transactions has arrived or after the configured timeout period since the group was opened. After the group is closed, all the transactions in the group wait for a group flush to complete execution and then notify the client that each transaction has ended.
-
-  You may refer to the **MOT Logging - WAL Redo Log** section for more information about the WAL Redo Log and synchronous group commit logging.
-
-### CHECKPOINT (MOT)
-
-- **checkpoint_dir =**
-
-  Specifies the directory in which checkpoint data is to be stored. The default location is in the data folder of each data node.
-
-- **checkpoint_segsize = 16 MB**
-
-  Specifies the segment size used during checkpoint. Checkpoint is performed in segments. When a segment is full, it is serialized to disk and a new segment is opened for the subsequent checkpoint data.
-
-- **checkpoint_workers = 3**
-
-  Specifies the number of workers to use during checkpoint.
-
-  Checkpoint is performed in parallel by several MOT engine workers. The quantity of workers may substantially affect the overall performance of the entire checkpoint operation, as well as the operation of other running transactions. To achieve a shorter checkpoint duration, a larger number of workers should be used, up to the optimal number (which varies based on the hardware and workload). However, be aware that if this number is too large, it may negatively impact the execution time of other running transactions. Keep this number as low as possible to minimize the effect on the runtime of other running transactions, at the cost of a longer checkpoint duration.
-
-  > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** You may refer to the **MOT Checkpoints** section for more information about configuration settings.
-
-### RECOVERY (MOT)
-
-- **checkpoint_recovery_workers = 3**
-
-  Specifies the number of workers (threads) to use during checkpoint data recovery. Each MOT engine worker runs on its own core and can process a different table in parallel by reading it into memory. For example, while the default is 3, you might prefer to set this parameter to the number of cores that are available for processing. After recovery, these threads are stopped and killed.
-
-  > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** You may refer to the **MOT Recovery** section for more information about configuration settings.
-
-### STATISTICS (MOT)
-
-- **enable_stats = false**
-
-  Configures periodic statistics printing.
-
-- **print_stats_period = 10 minute**
-
-  Configures the time period for printing a summary statistics report.
-
-- **print_full_stats_period = 1 hours**
-
-  Configures the time period for printing a full statistics report.
-
-  The following settings configure the various sections included in the periodic statistics report. If none of them are configured, then the statistics report is suppressed.
- -- **enable_log_recovery_stats = false** - - Log recovery statistics contain various Redo Log recovery metrics. - -- **enable_db_session_stats = false** - - Database session statistics contain transaction events, such commits, rollbacks and so on. - -- **enable_network_stats = false** - - Network statistics contain connection/disconnection events. - -- **enable_log_stats = false** - - Log statistics contain details regarding the Redo Log. - -- **enable_memory_stats = false** - - Memory statistics contain memory-layer details. - -- **enable_process_stats = false** - - Process statistics contain total memory and CPU consumption for the current process. - -- **enable_system_stats = false** - - System statistics contain total memory and CPU consumption for the entire system. - -- **enable_jit_stats = false** - - JIT statistics contain information regarding JIT query compilation and execution. - -### ERROR LOG (MOT) - -- **log_level = INFO** - - Configures the log level of messages issued by the MOT engine and recorded in the Error log of the database server. Valid values are PANIC, ERROR, WARN, INFO, TRACE, DEBUG, DIAG1 and DIAG2. - -- **Log.COMPONENT.LOGGER.log_level=LOG_LEVEL** - - Configures specific loggers using the syntax described below. - - For example, to configure the TRACE log level for the ThreadIdPool logger in system component, use the following syntax - - - ``` - Log.System.ThreadIdPool.log_level=TRACE - ``` - - To configure the log level for all loggers under some component, use the following syntax - - - ``` - Log.COMPONENT.log_level=LOG_LEVEL - ``` - - For example - - - ``` - Log.System.log_level=DEBUG - ``` - -### MEMORY (MOT) - -- **enable_numa = true** - - Specifies whether to use NUMA-aware memory allocation. - - When disabled, all affinity configurations are disabled as well. - - MOT engine assumes that all the available NUMA nodes have memory. If the machine has some special configuration in which some of the NUMA nodes have no memory, then the MOT engine initialization and hence the database server startup will fail. In such machines, it is recommended that this configuration value be set to false, in order to prevent startup failures and let the MOT engine to function normally without using NUMA-aware memory allocation. - -- **affinity_mode = fill-physical-first** - - Configures the affinity mode of threads for the user session and internal MOT tasks. - - When a thread pool is used, this value is ignored for user sessions, as their affinity is governed by the thread pool. However, it is still used for internal MOT tasks. - - Valid values are **fill-socket-first**, **equal-per-socket**, **fill-physical-first** and **none** - - - - **Fill-socket-first** attaches threads to cores in the same socket until the socket is full and then moves to the next socket. - - **Equal-per-socket** spreads threads evenly among all sockets. - - **Fill-physical-first** attaches threads to physical cores in the same socket until all physical cores are employed and then moves to the next socket. When all physical cores are used, then the process begins again with hyper-threaded cores. - - **None** disables any affinity configuration and lets the system scheduler determine on which core each thread is scheduled to run. - -- **lazy_load_chunk_directory = true** - - Configures the chunk directory mode that is used for memory chunk lookup. 
- - **Lazy** mode configures the chunk directory to load parts of it on demand, thus reducing the initial memory footprint (from 1 GB to 1 MB approximately). However, this may result in minor performance penalties and errors in extreme conditions of memory distress. In contrast, using a **non-lazy** chunk directory allocates an additional 1 GB of initial memory, produces slightly higher performance and ensures that chunk directory errors are avoided during memory distress. - -- **reserve_memory_mode = virtual** - - Configures the memory reservation mode (either **physical** or **virtual**). - - Whenever memory is allocated from the kernel, this configuration value is consulted to determine whether the allocated memory is to be resident (**physical**) or not (**virtual**). This relates primarily to preallocation, but may also affect runtime allocations. For **physical** reservation mode, the entire allocated memory region is made resident by forcing page faults on all pages spanned by the memory region. Configuring **virtual** memory reservation may result in faster memory allocation (particularly during preallocation), but may result in page faults during the initial access (and thus may result in a slight performance hit) and more sever errors when physical memory is unavailable. In contrast, physical memory allocation is slower, but later access is both faster and guaranteed. - -- **store_memory_policy = compact** - - Configures the memory storage policy (**compact** or **expanding**). - - When **compact** policy is defined, unused memory is released back to the kernel, until the lower memory limit is reached (see **min_mot_memory** below). In **expanding** policy, unused memory is stored in the MOT engine for later reuse. A **compact** storage policy reduces the memory footprint of the MOT engine, but may occasionally result in minor performance degradation. In addition, it may result in unavailable memory during memory distress. In contrast, **expanding** mode uses more memory, but results in faster memory allocation and provides a greater guarantee that memory can be re-allocated after being de-allocated. - -- **chunk_alloc_policy = auto** - - Configures the chunk allocation policy for global memory. - - MOT memory is organized in chunks of 2 MB each. The source NUMA node and the memory layout of each chunk affect the spread of table data among NUMA nodes, and therefore can significantly affect the data access time. When allocating a chunk on a specific NUMA node, the allocation policy is consulted. - - Available values are **auto**, **local**, **page-interleaved**, **chunk-interleaved** and **native** - - - - **Auto** policy selects a chunk allocation policy based on the current hardware. - - **Local** policy allocates each chunk on its respective NUMA node. - - **Page-interleaved** policy allocates chunks that are composed of interleaved memory 4-kilobyte pages from all NUMA nodes. - - **Chunk-interleaved** policy allocates chunks in a round robin fashion from all NUMA nodes. - - **Native** policy allocates chunks by calling the native system memory allocator. - -- **chunk_prealloc_worker_count = 8** - - Configures the number of workers per NUMA node participating in memory preallocation. - -- **max_mot_global_memory = 80%** - - Configures the maximum memory limit for the global memory of the MOT engine. - - Specifying a percentage value relates to the total defined by **max_process_memory** configured in **postgresql.conf**. 
- - The MOT engine memory is divided into global (long-term) memory that is mainly used to store user data and local (short-term) memory that is mainly used by user sessions for local needs. - - Any attempt to allocate memory beyond this limit is denied and an error is reported to the user. Ensure that the sum of **max_mot_global_memory** and **max_mot_local_memory** do not exceed the **max_process_memory** configured in **postgresql.conf**. - -- **min_mot_global_memory = 0 MB** - - Configures the minimum memory limit for the global memory of the MOT engine. - - Specifying a percentage value relates to the total defined by the **max_process_memory** configured in **postgresql.conf**. - - This value is used for the preallocation of memory during startup, as well as to ensure that a minimum amount of memory is available for the MOT engine during its normal operation. When using **compact** storage policy (see **store_memory_policy** above), this value designates the lower limit under which memory is not released back to the kernel, but rather kept in the MOT engine for later reuse. - -- **max_mot_local_memory = 15%** - - Configures the maximum memory limit for the local memory of the MOT engine. - - Specifying a percentage value relates to the total defined by the **max_process_memory** configured in **postgresql.conf**. - - MOT engine memory is divided into global (long-term) memory that is mainly used to store user data and local (short-term) memory that is mainly used by user session for local needs. - - Any attempt to allocate memory beyond this limit is denied and an error is reported to the user. Ensure that the sum of **max_mot_global_memory** and **max_mot_local_memory** do not exceed the **max_process_memory** configured in **postgresql.conf**. - -- **min_mot_local_memory = 0 MB** - - Configures the minimum memory limit for the local memory of the MOT engine. - - Specifying a percentage value relates to the total defined by the **max_process_memory** configured in **postgresql.conf**. - - This value is used for preallocation of memory during startup, as well as to ensure that a minimum amount of memory is available for the MOT engine during its normal operation. When using compact storage policy (see **store_memory_policy** above), this value designates the lower limit under which memory is not released back to the kernel, but rather kept in the MOT engine for later reuse. - -- **max_mot_session_memory = 0 MB** - - Configures the maximum memory limit for a single session in the MOT engine. - - Typically, sessions in the MOT engine can allocate as much local memory as needed, so long as the local memory limit is not exceeded. To prevent a single session from taking too much memory, and thereby denying memory from other sessions, this configuration item is used to restrict small session-local memory allocations (up to 1,022 KB). - - Make sure that this configuration item does not affect large or huge session-local memory allocations. - - A value of zero denotes no restriction on any session-local small allocations per session, except for the restriction arising from the local memory allocation limit configured by **max_mot_local_memory**. - - Note: Percentage values cannot be set for this configuration item. - -- **min_mot_session_memory = 0 MB** - - Configures the minimum memory reservation for a single session in the MOT engine. 
- - This value is used to preallocate memory during session creation, as well as to ensure that a minimum amount of memory is available for the session to perform its normal operation. - - Note: Percentage values cannot be set for this configuration item. - -- **session_large_buffer_store_size = 0 MB** - - Configures the large buffer store for sessions. - - When a user session executes a query that requires a lot of memory (for example, when using many rows), the large buffer store is used to increase the certainty level that such memory is available and to serve this memory request more quickly. Any memory allocation for a session exceeding 1,022 KB is considered as a large memory allocation. If the large buffer store is not used or is depleted, such allocations are treated as huge allocations that are served directly from the kernel. - - Note: Percentage values cannot be set for this configuration item. - -- **session_large_buffer_store_max_object_size = 0 MB** - - Configures the maximum object size in the large allocation buffer store for sessions. - - Internally, the large buffer store is divided into objects of varying sizes. This value is used to set an upper limit on objects originating from the large buffer store, as well as to determine the internal division of the buffer store into objects of various size. - - This size cannot exceed 1⁄8 of the **session_large_buffer_store_size**. If it does, it is adjusted to the maximum possible. - - Note: Percentage values cannot be set for this configuration item. - -- **session_max_huge_object_size = 1 GB** - - Configures the maximum size of a single huge memory allocation made by a session. - - Huge allocations are served directly from the kernel and therefore are not guaranteed to succeed. - - This value also pertains to global (meaning not session-related) memory allocations. - - Note: Percentage values cannot be set for this configuration item. - -### GARBAGE COLLECTION (MOT) - -- **enable_gc = true** - - Specifies whether to use the Garbage Collector (GC). - -- **reclaim_threshold = 512 KB** - - Configures the memory threshold for the garbage collector. - - Each session manages its own list of to-be-reclaimed objects and performs its own garbage collection during transaction commitment. This value determines the total memory threshold of objects waiting to be reclaimed, above which garbage collection is triggered for a session. - - In general, the trade-off here is between un-reclaimed objects vs garbage collection frequency. Setting a low value keeps low levels of un-reclaimed memory, but causes frequent garbage collection that may affect performance. Setting a high value triggers garbage collection less frequently, but results in higher levels of un-reclaimed memory. This setting is dependent upon the overall workload. - -- **reclaim_batch_size = 8000** - - Configures the batch size for garbage collection. - - The garbage collector reclaims memory from objects in batches, in order to restrict the number of objects being reclaimed in a single garbage collection pass. The intent of this approach is to minimize the operation time of a single garbage collection pass. - -- **high_reclaim_threshold = 8 MB** - - Configures the high memory threshold for garbage collection. - - Because garbage collection works in batches, it is possible that a session may have many objects that can be reclaimed, but which were not. 
In such situations, in order to prevent garbage collection lists from becoming too bloated, this value is used to continue reclaiming objects within a single pass, even though the batch size limit has been reached, until the total size of the still-waiting-to-be-reclaimed objects is less than this threshold, or there are no more objects eligible for reclamation.
-
-### JIT (MOT)
-
-- **enable_mot_codegen = true**
-
-  Specifies whether to use JIT query compilation and execution for planned queries.
-
-  JIT query execution enables JIT-compiled code to be prepared for a prepared query during its planning phase. The resulting JIT-compiled function is executed whenever the prepared query is invoked. JIT compilation usually takes place in the form of LLVM. On platforms where LLVM is not natively supported, MOT provides a software-based fallback called Tiny Virtual Machine (TVM).
-
-- **force_mot_pseudo_codegen = false**
-
-  Specifies whether to use TVM (pseudo-LLVM) even though LLVM is supported on the current platform.
-
-  On platforms where LLVM is not natively supported, MOT automatically defaults to TVM.
-
-  On platforms where LLVM is natively supported, LLVM is used by default. This configuration item enables the use of TVM for JIT compilation and execution on platforms on which LLVM is supported.
-
-- **enable_mot_codegen_print = false**
-
-  Specifies whether to print emitted LLVM/TVM IR code for JIT-compiled queries.
-
-- **mot_codegen_limit = 100**
-
-  Limits the number of JIT queries allowed per user session.
-
-### Default mot.conf
-
-At minimum, the **postgresql.conf** file must point to the location of the **mot.conf** file -
-
-```
-# postgresql.conf
-mot_config_file = '/tmp/gauss/mot.conf'
-```
-
-Ensure that the value of the max_process_memory setting is sufficient to include the global (data and index) and local (sessions) memory of MOT tables.
-
-The default content of **mot.conf** is sufficient to get started. The settings can be optimized later.
diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/4-mot-usage.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/4-mot-usage.md
deleted file mode 100644
index 61762a9516d88539a3dd1a139e7a4d236bcc6d62..0000000000000000000000000000000000000000
--- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/4-mot-usage.md
+++ /dev/null
@@ -1,506 +0,0 @@
----
-title: MOT Usage
-summary: MOT Usage
-author: Zhang Cuiping
-date: 2021-03-04
----
-
-# MOT Usage
-
-Using MOT tables is quite simple and is described in the few short sections below.
-
-MogDB enables an application to use both MOT tables and standard disk-based tables. You can use MOT tables for your most active, high-contention and throughput-sensitive application tables, or you can use MOT tables for all your application's tables.
-
-The following commands describe how to create MOT tables and how to convert existing disk-based tables into MOT tables in order to accelerate an application's database-related performance. MOT is especially beneficial when applied to tables that have proven to be bottlenecks.
-
-The following is a simple overview of the tasks related to working with MOT tables:
-
-- Granting User Permissions
-- Creating/Dropping an MOT Table
-- Creating an Index for an MOT Table
-- Converting a Disk Table into an MOT Table
-- Query Native Compilation
-- Retrying an Aborted Transaction
-- MOT External Support Tools
-- MOT SQL Coverage and Limitations
-
-## Granting User Permissions
-
-The following describes how to assign a database user permission to access the MOT storage engine. This is performed only once per database user, and is usually done during the initial configuration phase.
-
-> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The granting of user permissions is required because MOT is integrated into the MogDB database by using and extending the Foreign Data Wrapper (FDW) mechanism, which requires granting user access permissions.
-
-To enable a specific user to create and access MOT tables (DDL, DML, SELECT) -
-
-Run the following statement only once -
-
-```sql
-GRANT USAGE ON FOREIGN SERVER mot_server TO <user>;
-```
-
-Keywords are not case-sensitive.
-
-## Creating/Dropping an MOT Table
-
-Creating a Memory Optimized Table (MOT) is very simple. Only the create and drop table statements in MOT differ from the statements for disk-based tables in MogDB. The syntax of **all other** commands for SELECT, DML and DDL is the same for MOT tables as for MogDB disk-based tables.
-
-- To create an MOT table -
-
-  ```sql
-  create FOREIGN table test(x int) [server mot_server];
-  ```
-
-- Always use the FOREIGN keyword to refer to MOT tables.
-
-- The [server mot_server] part is optional when creating an MOT table, because MOT is an integrated engine, not a separate server.
-
-- The above is an extremely simple example creating a table named **test** with a single integer column named **x**. In the next section (**Creating an Index**), a more realistic example is provided.
-
-- MOT tables cannot be created if incremental checkpoint is enabled in postgresql.conf, so set enable_incremental_checkpoint to off before creating an MOT table.
-
-- To drop an MOT table named test -
-
-  ```sql
-  drop FOREIGN table test;
-  ```
-
-For a description of the limitations of supported features for MOT tables, such as data types, see the **MOT SQL Coverage and Limitations** section.
-
-## Creating an Index for an MOT Table
-
-Standard PostgreSQL create and drop index statements are supported.
-
-For example -
-
-```sql
-create index text_index1 on test(x);
-```
-
-The following is a complete example of creating an index for the ORDER table in a TPC-C workload -
-
-```sql
-create FOREIGN table bmsql_oorder (
-  o_w_id       integer not null,
-  o_d_id       integer not null,
-  o_id         integer not null,
-  o_c_id       integer not null,
-  o_carrier_id integer,
-  o_ol_cnt     integer,
-  o_all_local  integer,
-  o_entry_d    timestamp,
-  primary key (o_w_id, o_d_id, o_id)
-);
-
-create index bmsql_oorder_index1 on bmsql_oorder(o_w_id, o_d_id, o_c_id, o_id);
-```
-
-> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** There is no need to specify the **FOREIGN** keyword before the MOT table name, because it is only required for the create and drop table commands.
-
-For MOT index limitations, see the Index subsection under the _SQL Coverage and Limitations_ section.
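-
-Dropping an MOT index uses the same standard syntax. As a minimal illustrative sketch (assuming the index created in the example above is no longer needed), it could be removed as follows -
-
-```sql
-drop index bmsql_oorder_index1;
-```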
- -## Converting a Disk Table into an MOT Table - -The direct conversion of disk tables into MOT tables is not yet possible, meaning that no ALTER TABLE statement yet exists that converts a disk-based table into an MOT table. - -The following describes how to manually perform a few steps in order to convert a disk-based table into an MOT table, as well as how the **gs_dump** tool is used to export data and the **gs_restore** tool is used to import data. - -### Prerequisite Check - -Check that the schema of the disk table to be converted into an MOT table contains all required columns. - -Check whether the schema contains any unsupported column data types, as described in the _Unsupported Data Types_ section. - -If a specific column is not supported, then it is recommended to first create a secondary disk table with an updated schema. This schema is the same as the original table, except that all the unsupported types have been converted into supported types. - -Afterwards, use the following script to export this secondary disk table and then import it into an MOT table. - -### Converting - -To covert a disk-based table into an MOT table, perform the following - - -1. Suspend application activity. -2. Use **gs_dump** tool to dump the table’s data into a physical file on disk. Make sure to use the **data only**. -3. Rename your original disk-based table. -4. Create an MOT table with the same table name and schema. Make sure to use the create FOREIGN keyword to specify that it will be an MOT table. -5. Use **gs_restore** to load/restore data from the disk file into the database table. -6. Visually/manually verify that all the original data was imported correctly into the new MOT table. An example is provided below. -7. Resume application activity. - -**IMPORTANT Note** **-** In this way, since the table name remains the same, application queries and relevant database stored-procedures will be able to access the new MOT table seamlessly without code changes. Please note that MOT does not currently support cross-engine multi-table queries (such as by using Join, Union and sub-query) and cross-engine multi-table transactions. Therefore, if an original table is accessed somewhere in a multi-table query, stored procedure or transaction, you must either convert all related disk-tables into MOT tables or alter the relevant code in the application or the database. - -### Conversion Example - -Let's say that you have a database name **benchmarksql** and a table named **customer** (which is a disk-based table) to be migrated it into an MOT table. - -To migrate the customer table into an MOT table, perform the following - - -1. Check your source table column types. Verify that all types are supported by MOT, refer to section *Unsupported Data Types*. - - ```sql - benchmarksql-# \d+ customer - Table "public.customer" - Column | Type | Modifiers | Storage | Stats target | Description - --------+---------+-----------+---------+--------------+------------- - x | integer | | plain | | - y | integer | | plain | | - Has OIDs: no - Options: orientation=row, compression=no - ``` - -2. Check your source table data. - - ```sql - benchmarksql=# select * from customer; - x | y - ---+--- - 1 | 2 - 3 | 4 - (2 rows) - ``` - -3. Dump table data only by using **gs_dump**. 
-
-   ```sql
-   $ gs_dump -Fc benchmarksql -a --table customer -f customer.dump
-   gs_dump[port='15500'][benchmarksql][2020-06-04 16:45:38]: dump database benchmarksql successfully
-   gs_dump[port='15500'][benchmarksql][2020-06-04 16:45:38]: total time: 332 ms
-   ```
-
-4. Rename the source table.
-
-   ```sql
-   benchmarksql=# alter table customer rename to customer_bk;
-   ALTER TABLE
-   ```
-
-5. Create the MOT table to be exactly the same as the source table.
-
-   ```sql
-   benchmarksql=# create foreign table customer (x int, y int);
-   CREATE FOREIGN TABLE
-   benchmarksql=# select * from customer;
-    x | y
-   ---+---
-   (0 rows)
-   ```
-
-6. Import the source dump data into the new MOT table.
-
-   ```sql
-   $ gs_restore -C -d benchmarksql customer.dump
-   restore operation successful
-   total time: 24 ms
-   ```
-
-   Check that the data was imported successfully.
-
-   ```sql
-   benchmarksql=# select * from customer;
-    x | y
-   ---+---
-    1 | 2
-    3 | 4
-   (2 rows)
-
-   benchmarksql=# \d
-                              List of relations
-    Schema |    Name     |     Type      | Owner  |             Storage
-   --------+-------------+---------------+--------+----------------------------------
-    public | customer    | foreign table | aharon |
-    public | customer_bk | table         | aharon | {orientation=row,compression=no}
-   (2 rows)
-   ```
-
-## Query Native Compilation
-
-An additional feature of MOT is the ability to prepare and parse *pre-compiled full queries* in a native format (using a PREPARE statement) before they are needed for execution.
-
-This native format can later be executed (using an EXECUTE command) more efficiently. This type of execution is much quicker because the native format bypasses multiple database processing layers during execution and thus enables better performance.
-
-This division of labor avoids repetitive parse analysis operations. In this way, queries and transaction statements are executed in an interactive manner. This feature is sometimes called *Just-In-Time (JIT)* query compilation.
-
-### Query Compilation - PREPARE Statement
-
-To use MOT's native query compilation, call the PREPARE client statement before the query is executed. This instructs MOT to pre-compile the query and/or to pre-load previously pre-compiled code from a cache.
-
-The following is an example of PREPARE syntax in SQL -
-
-```sql
-PREPARE name [ ( data_type [, ...] ) ] AS statement
-```
-
-PREPARE creates a prepared statement in the database server, which is a server-side object that can be used to optimize performance.
-
-### Execute Command
-
-When an EXECUTE command is subsequently issued, the prepared statement is parsed, analyzed, rewritten and executed. This division of labor avoids repetitive parse analysis operations, while enabling the execution plan to depend on the specific setting values provided.
-
-The following is an example of how to invoke a PREPARE and then an EXECUTE statement in a Java application.
-
-```java
-conn = DriverManager.getConnection(connectionUrl, connectionUser, connectionPassword);
-
-// Example 1: PREPARE without bind settings
-String query = "SELECT * FROM getusers";
-PreparedStatement prepStmt1 = conn.prepareStatement(query);
-ResultSet rs1 = prepStmt1.executeQuery();
-while (rs1.next()) {…}
-
-// Example 2: PREPARE with bind settings
-String sqlStmt = "SELECT * FROM employees where first_name=? and last_name like ?";
-PreparedStatement prepStmt2 = conn.prepareStatement(sqlStmt);
-prepStmt2.setString(1, "Mark"); // first name "Mark"
-prepStmt2.setString(2, "%n%"); // last name contains a letter "n"
-ResultSet rs2 = prepStmt2.executeQuery();
-while (rs2.next()) {…}
-```
-
-The following describes the supported and unsupported features of MOT compilation.
-
-### Supported Queries for Lite Execution
-
-The following query types are suitable for lite execution -
-
-- Simple point queries -
-  - SELECT (including SELECT for UPDATE)
-  - UPDATE
-  - DELETE
-- INSERT query
-- Range UPDATE queries that refer to a full prefix of the primary key
-- Range SELECT queries that refer to a full prefix of the primary key
-- JOIN queries where one or both parts collapse to a point query
-- JOIN queries that refer to a full prefix of the primary key in each joined table
-
-### Unsupported Queries for Lite Execution
-
-Any special query attribute disqualifies it for Lite Execution. In particular, if any of the following conditions apply, then the query is declared as unsuitable for Lite Execution. You may refer to the Unsupported Queries for Native Compilation and Lite Execution section for more information.
-
-It is important to emphasize that in case a query statement does not fit native compilation and lite execution, no error is reported to the client and the query will still be executed in a normal and standard manner.
-
-For more information about MOT native compilation capabilities, see either the section about Query Native Compilation or the more detailed information in the Query Native Compilation (JIT) section.
-
-## Retrying an Aborted Transaction
-
-In Optimistic Concurrency Control (OCC), such as the one used by MOT, no locks are placed on a record during a transaction (using any isolation level) until the COMMIT phase. This is a powerful advantage that significantly increases performance. Its drawback is that an update may fail if another session attempts to update the same record. This results in an entire transaction that must be aborted. These so-called *Update Conflicts* are detected by MOT at commit time by a version checking mechanism.
-
-> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** A similar abort happens on engines using pessimistic concurrency control, such as standard PG and the MogDB disk-based tables, when the SERIALIZABLE or REPEATABLE-READ isolation level is used.
-
-Such update conflicts are quite rare in common OLTP scenarios and are especially rare in our experience with MOT. However, because there is still a chance that they may happen, developers should consider resolving this issue using transaction retry code.
-
-The following describes how to retry a table command after multiple sessions attempt to update the same table simultaneously. You may refer to the OCC vs 2PL Differences by Example section for more detailed information. The following example is taken from a TPC-C payment transaction.
- -```java -int commitAborts = 0; - -while (commitAborts < RETRY_LIMIT) { - - try { - stmt =db.stmtPaymentUpdateDistrict; - stmt.setDouble(1, 100); - stmt.setInt(2, 1); - stmt.setInt(3, 1); - stmt.executeUpdate(); - - db.commit(); - - break; - } - catch (SQLException se) { - if(se != null && se.getMessage().contains("could not serialize access due to concurrent update")) { - log.error("commmit abort = " + se.getMessage()); - commitAborts++; - continue; - }else { - db.rollback(); - } - - break; - } -} -``` - -## MOT External Support Tools - -The following external MogDB tools have been modified in order to support MOT. Make sure to use the most recent version of each. An overview describing MOT-related usage is provided below. For a full description of these tools and their usage, refer to the MogDB Tools Reference document. - -### gs_ctl (Full and Incremental) - -This tool is used to create a standby server from a primary server, as well as to synchronize a server with another copy of the same server after their timelines have diverged. - -At the end of the operation, the latest MOT checkpoint is fetched by the tool, taking into consideration the **checkpoint_dir** configuration setting value. - -The checkpoint is fetched from the source server's **checkpoint_dir** to the destination server's **checkpoint_dir**. - -Currently, MOT does not support an incremental checkpoint. Therefore, the gs_ctl incremental build does not work in an incremental manner for MOT, but rather in FULL mode. The Postgres (disk-tables) incremental build can still be done incrementally. - -### gs_basebackup - -gs_basebackup is used to prepare base backups of a running server, without affecting other database clients. - -The MOT checkpoint is fetched at the end of the operation as well. However, the checkpoint's location is taken from **checkpoint_dir** in the source server and is transferred to the data directory of the source in order to back it up correctly. - -### gs_dump - -gs_dump is used to export the database schema and data to a file. It also supports MOT tables. - -### gs_restore - -gs_restore is used to import the database schema and data from a file. It also supports MOT tables. - -## MOT SQL Coverage and Limitations - -MOT design enables almost complete coverage of SQL and future feature sets. For example, standard Postgres SQL is mostly supported, as well common database features, such as stored procedures and user defined functions. - -The following describes the various types of SQL coverages and limitations - - -### Unsupported Features - -The following features are not supported by MOT - - -- Engine Interop - No cross-engine (Disk+MOT) queries, views or transactions. Planned for 2021. -- MVCC, Isolation - No snapshot/serializable isolation. Planned for 2021. -- Native Compilation (JIT) - Limited SQL coverage. Also, JIT compilation of stored procedures is not supported. -- LOCAL memory is limited to 1 GB. A transaction can only change data of less than 1 GB. -- Capacity (Data+Index) is limited to available memory. Anti-caching + Data Tiering will be available in the future. -- No full-text search index. -- Do not support Logical copy. - -In addition, the following are detailed lists of various general limitations of MOT tables, MOT indexes, Query and DML syntax and the features and limitations of Query Native Compilation. 
- -### MOT Table Limitations - -The following lists the functionality limitations of MOT tables - - -- Partitioning -- AES encryption, row-level access control, dynamic data masking -- Stream operations -- User-defined types -- Sub-transactions -- DML triggers -- DDL triggers -- Collations other than "C" or "POSIX" - -### Unsupported Table DDLs - -- Alter table -- Create table, like including -- Create table as select -- Partition by range -- Create table with no-logging clause -- DEFERRABLE primary key -- Reindex -- Tablespace -- Create schema with subcommands - -### Unsupported Data Types - -- UUID -- User-Defined Type (UDF) -- Array data type -- NVARCHAR2(n) -- Clob -- Name -- Blob -- Raw -- Path -- Circle -- Reltime -- Bit varying(10) -- Tsvector -- Tsquery -- JSON -- Box -- Text -- Line -- Point -- LSEG -- POLYGON -- INET -- CIDR -- MACADDR -- Smalldatetime -- BYTEA -- Bit -- Varbit -- OID -- Money -- Any unlimited varchar/character varying -- HSTORE -- XML -- Int16 -- Abstime -- Tsrange -- Tstzrange -- Int8range -- Int4range -- Numrange -- Daterange -- HLL - -### UnsupportedIndex DDLs and Index - -- Create index on decimal/numeric - -- Create index on nullable columns - -- Create index, index per table > 9 - -- Create index on key size > 256 - - The key size includes the column size in bytes + a column additional size, which is an overhead required to maintain the index. The below table lists the column additional size for different column types. - - Additionally, in case of non-unique indexes an extra 8 bytes is required. - - Thus, the following pseudo code calculates the **key size**: - - ```java - keySize =0; - - for each (column in index){ - keySize += (columnSize + columnAddSize); - } - if (index is non_unique) { - keySize += 8; - } - ``` - - | Column Type | Column Size | Column Additional Size | - | :---------- | :---------- | :--------------------- | - | varchar | N | 4 | - | tinyint | 1 | 1 | - | smallint | 2 | 1 | - | int | 4 | 1 | - | bigint | 8 | 1 | - | float | 4 | 2 | - | float8 | 8 | 3 | - - Types that are not specified in above table, the column additional size is zero (for instance timestamp). 
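-
-To illustrate the calculation, the following is a minimal, hypothetical sketch (not taken from the original document) for a non-unique index on an int column and a varchar(32) column, using the column sizes and additional sizes from the table above; the arithmetic can be verified directly in a SQL session -
-
-```sql
--- Hypothetical non-unique index on (int, varchar(32)):
--- int         -> 4 bytes  + 1 byte additional
--- varchar(32) -> 32 bytes + 4 bytes additional
--- non-unique  -> extra 8 bytes
-SELECT (4 + 1) + (32 + 4) + 8 AS key_size_bytes;  -- 49 bytes, well below the 256-byte limit
-```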
- -### Unsupported DMLs - -- Merge into -- Select into -- Lock table -- Copy from table -- Upsert - -### Unsupported Queries for Native Compilation and Lite Execution - -- The query refers to more than two tables -- The query has any one of the following attributes - - - Aggregation on non-primitive types - - Window functions - - Sub-query sub-links - - Distinct-ON modifier (distinct clause is from DISTINCT ON) - - Recursive (WITH RECURSIVE was specified) - - Modifying CTE (has INSERT/UPDATE/DELETE in WITH) - -In addition, the following clauses disqualify a query from lite execution - - -- Returning list -- Group By clause -- Grouping sets -- Having clause -- Windows clause -- Distinct clause -- Sort clause that does not conform to native index order -- Set operations -- Constraint dependencies diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/5-mot-administration.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/5-mot-administration.md deleted file mode 100644 index 5b2171fe07ada57ec189d6b63e0a420bb9c4c604..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/5-mot-administration.md +++ /dev/null @@ -1,419 +0,0 @@ ---- -title: MOT Administration -summary: MOT Administration -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT Administration - -The following describes various MOT administration topics. - -## MOT Durability - -Durability refers to long-term data protection (also known as *disk persistence*). Durability means that stored data does not suffer from any kind of degradation or corruption, so that data is never lost or compromised. Durability ensures that data and the MOT engine are restored to a consistent state after a planned shutdown (for example, for maintenance) or an unplanned crash (for example, a power failure). - -Memory storage is volatile, meaning that it requires power to maintain the stored information. Disk storage, on the other hand, is non-volatile, meaning that it does not require power to maintain stored information, thus, it can survive a power shutdown. MOT uses both types of storage - it has all data in memory, while persisting transactional changes to disk **MOT Durability** and by maintaining frequent periodic **MOT Checkpoints** in order to ensure data recovery in case of shutdown. - -The user must ensure sufficient disk space for the logging and Checkpointing operations. A separated drive can be used for the Checkpoint to improve performance by reducing disk I/O load. - -You may refer to the **MOT Key Technologies** section for an overview of how durability is implemented in the MOT engine. - -**To configure durability -** - -To ensure strict consistency, configure the synchronous_commit parameter to **On** in the postgres.conf configuration file. - -**MOTs WAL Redo Log and Checkpoints enable durability, as described below -** - -### MOT Logging - WAL Redo Log - -To ensure Durability, MOT is fully integrated with the MogDB's Write-Ahead Logging (WAL) mechanism, so that MOT persists data in WAL records using MogDB's XLOG interface. This means that every addition, update, and deletion to an MOT table’s record is recorded as an entry in the WAL. This ensures that the most current data state can be regenerated and recovered from this non-volatile log. For example, if three new rows were added to a table, two were deleted and one was updated, then six entries would be recorded in the log. 
-
-MOT log records are written to the same WAL as the other records of MogDB disk-based tables.
-
-MOT only logs an operation at the transaction commit phase.
-
-MOT only logs the updated delta record in order to minimize the amount of data written to disk.
-
-During recovery, data is loaded from the last known or a specific Checkpoint, and then the WAL Redo Log is used to complete the data changes that occurred from that point forward.
-
-The WAL (Redo Log) retains all the table row modifications until a Checkpoint is performed (as described above). The log can then be truncated in order to reduce recovery time and to save disk space.
-
-**Note** - In order to ensure that the log IO device does not become a bottleneck, the log file must be placed on a drive that has low latency.
-
-### MOT Logging Types
-
-Two synchronous transaction logging options and one asynchronous transaction logging option are supported (these are also supported by the standard MogDB disk engine). MOT also supports synchronous Group Commit logging with NUMA-awareness optimization, as described below.
-
-According to your configuration, one of the following types of logging is implemented -
-
-- **Synchronous Redo Logging**
-
-  The **Synchronous Redo Logging** option is the simplest and most strict redo logger. When a transaction is committed by a client application, the transaction redo entries are recorded in the WAL (Redo Log), as follows -
-
-  1. While a transaction is in progress, it is stored in MOT's memory.
-  2. After a transaction finishes and the client application sends a Commit command, the transaction is locked and then written to the WAL Redo Log on the disk. This means that while the transaction log entries are being written to the log, the client application is still waiting for a response.
-  3. As soon as the transaction's entire buffer is written to the log, the changes to the data in memory take place and then the transaction is committed. After the transaction has been committed, the client application is notified that the transaction is complete.
-
-  **Summary**
-
-  The **Synchronous Redo Logging** option is the safest and most strict because it ensures total synchronization of the client application and the WAL Redo Log entries for each transaction as it is committed, thus ensuring total durability and consistency with absolutely no data loss. This logging option prevents the situation where a client application might mark a transaction as successful when it has not yet been persisted to disk.
-
-  The downside of the **Synchronous Redo Logging** option is that it is the slowest logging mechanism of the three options. This is because a client application must wait until all data is written to disk and because of the frequent disk writes (which typically slow down the database).
-
-- **Group Synchronous Redo Logging**
-
-  The **Group Synchronous Redo Logging** option is very similar to the **Synchronous Redo Logging** option, because it also ensures total durability with absolutely no data loss and total synchronization of the client application and the WAL (Redo Log) entries. The difference is that the **Group Synchronous Redo Logging** option writes *groups of transaction redo entries* to the WAL Redo Log on the disk at the same time, instead of writing each and every transaction as it is committed. Using Group Synchronous Redo Logging reduces the amount of disk I/Os and thus improves performance, especially when running a heavy workload.
-
-  The MOT engine performs synchronous Group Commit logging with Non-Uniform Memory Access (NUMA)-awareness optimization by automatically grouping transactions according to the NUMA socket of the core on which the transaction is running.
-
-  You may refer to the **NUMA Awareness Allocation and Affinity** section for more information about NUMA-aware memory access.
-
-  When a transaction commits, a group of entries is recorded in the WAL Redo Log, as follows -
-
-  1. While a transaction is in progress, it is stored in memory. The MOT engine groups transactions in buckets according to the NUMA socket of the core on which the transaction is running. This means that all the transactions running on the same socket are grouped together and that multiple groups fill up in parallel, according to the cores on which the transactions are running.
-
-     Writing transactions to the WAL is more efficient in this manner because all the buffers from the same socket are written to disk together.
-
-     **Note** - Each thread runs on a single core/CPU which belongs to a single socket, and each thread only writes to the socket of the core on which it is running.
-
-  2. After a transaction finishes and the client application sends a Commit command, the transaction redo log entries are serialized together with other transactions that belong to the same group.
-
-  3. After the configured criteria are fulfilled for a specific group of transactions (quantity of committed transactions or timeout period, as described in the **REDO LOG (MOT)** section), the transactions in this group are written to the WAL on the disk. This means that while these log entries are being written to the log, the client applications that issued the commit are waiting for a response.
-
-  4. As soon as all the transaction buffers in the NUMA-aware group have been written to the log, all the transactions in the group perform the necessary changes to the memory store and the clients are notified that these transactions are complete.
-
-  **Summary**
-
-  The **Group Synchronous Redo Logging** option is an extremely safe and strict logging option because it ensures total synchronization of the client application and the WAL Redo Log entries, thus ensuring total durability and consistency with absolutely no data loss. This logging option prevents the situation where a client application might mark a transaction as successful when it has not yet been persisted to disk.
-
-  On the one hand, this option has fewer disk writes than the **Synchronous Redo Logging** option, which may mean that it is faster. The downside is that transactions are locked for longer, meaning that they are locked until after all the transactions in the same NUMA memory have been written to the WAL Redo Log on the disk.
-
-  The benefits of using this option depend on the type of transactional workload. For example, this option benefits systems that have many transactions (and less so systems that have few transactions, because there are few disk writes anyway).
-
-- **Asynchronous Redo Logging**
-
-  The **Asynchronous Redo Logging** option is the fastest logging method. However, it does not ensure that no data is lost, meaning that some data that is still in the buffer and has not yet been written to disk may be lost upon a power failure or database crash. When a transaction is committed by a client application, the transaction redo entries are recorded in internal buffers and written to disk at preconfigured intervals.
The client application does not wait for the data being written to disk. It continues to the next transaction. This is what makes asynchronous redo logging the fastest logging method. - - When a transaction is committed by a client application, the transaction redo entries are recorded in the WAL Redo Log, as follows - - - 1. While a transaction is in progress, it is stored in the MOT's memory. - 2. After a transaction finishes and the client application sends a Commit command, the transaction redo entries are written to internal buffers, but are not yet written to disk. Then changes to the MOT data memory take place and the client application is notified that the transaction is committed. - 3. At a preconfigured interval, a redo log thread running in the background collects all the buffered redo log entries and writes them to disk. - - **Summary** - - The Asynchronous Redo Logging option is the fastest logging option because it does not require the client application to wait for data being written to disk. In addition, it groups many transactions redo entries and writes them together, thus reducing the amount of disk I/Os that slow down the MOT engine. - - The downside of the Asynchronous Redo Logging option is that it does not ensure that data will not get lost upon a crash or failure. Data that was committed, but was not yet written to disk, is not durable on commit and thus cannot be recovered in case of a failure. The Asynchronous Redo Logging option is most relevant for applications that are willing to sacrifice data recovery (consistency) over performance. - -### Configuring Logging - -Two synchronous transaction logging options and one asynchronous transaction logging option are supported by the standard MogDB disk engine. - -To configure logging - - -1. The determination of whether synchronous or asynchronous transaction logging is performed is configured in the synchronous_commit **(On = Synchronous)** parameters in the postgres.conf configuration file. - -If a synchronous mode of transaction logging has been selected (synchronous_commit = **On**, as described above), then the enable_group_commit parameter in the mot.conf configuration file determines whether the **Group Synchronous Redo Logging** option or the **Synchronous Redo Logging** option is used. For **Group Synchronous Redo Logging**, you must also define in the mot.conf file which of the following thresholds determine when a group of transactions is recorded in the WAL - -- group_commit_size **-** The quantity of committed transactions in a group. For example, **16** means that when 16 transactions in the same group have been committed by a client application, then an entry is written to disk in the WAL Redo Log for all 16 transactions. - -- group_commit_timeout - A timeout period in ms. For example, **10** means that after 10 ms, an entry is written to disk in the WAL Redo Log for each of the transactions in the same group that have been committed by their client application in the last 10 ms. - - > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** You may refer to the **REDO LOG (MOT)** for more information about configuration settings. - -### MOT Checkpoints - -A Checkpoint is the point in time at which all the data of a table's rows is saved in files on persistent storage in order to create a full durable database image. It is a snapshot of the data at a specific point in time. 
- -### MOT Checkpoints - -A Checkpoint is the point in time at which all the data of a table's rows is saved in files on persistent storage in order to create a full durable database image. It is a snapshot of the data at a specific point in time. - -A Checkpoint is required in order to reduce a database's recovery time by reducing the quantity of WAL (Redo Log) entries that must be replayed in order to ensure durability. Checkpoints also reduce the storage space required to keep all the log entries. - -If there were no Checkpoints, then in order to recover a database, all the WAL redo entries would have to be replayed from the beginning of time, which could take days/weeks depending on the quantity of records in the database. Checkpoints record the current state of the database and enable old redo entries to be discarded. - -Checkpoints are essential during recovery scenarios (especially for a cold start). First, the data is loaded from the last known or a specific Checkpoint; and then the WAL is used to complete the data changes that occurred since then. - -For example - If the same table row is modified 100 times, then 100 entries are recorded in the log. When Checkpoints are used, then even if a specific table row was modified 100 times, it is recorded in the Checkpoint a single time. After the recording of a Checkpoint, recovery can be performed on the basis of that Checkpoint and only the WAL Redo Log entries that occurred since the Checkpoint need to be replayed. - -## MOT Recovery - -The main objective of MOT Recovery is to restore the data and the MOT engine to a consistent state after a planned shutdown (for example, for maintenance) or an unplanned crash (for example, after a power failure). - -MOT recovery is performed automatically with the recovery of the rest of the MogDB database and is fully integrated into the MogDB recovery process (also called a *Cold Start*). - -MOT recovery consists of two stages - - -**Checkpoint Recovery -** First, data must be recovered from the latest Checkpoint file on disk by loading it into memory rows and creating indexes. - -**WAL Redo Log Recovery -** Afterwards, the recent data (which was not captured in the Checkpoint) must be recovered from the WAL Redo Log by replaying records that were added to the log since the Checkpoint that was used in the Checkpoint Recovery (described above). - -The WAL Redo Log recovery is managed and triggered by MogDB. - -To configure recovery - - -- While WAL recovery is performed in a serial manner, the Checkpoint recovery can be configured to run in a multi-threaded manner (meaning in parallel by multiple workers). - -- Configure the **Checkpoint_recovery_workers** parameter in the **mot.conf** file, which is described in the **RECOVERY (MOT)** section. - -## MOT Replication and High Availability - -Since MOT is integrated into MogDB and uses/supports its replication and high availability, both synchronous and asynchronous replication are supported out of the box. - -The MogDB gs_ctl tool is used for availability control and to operate the cluster. This includes gs_ctl switchover, gs_ctl failover, gs_ctl build and so on. - -You may refer to the MogDB Tools Reference document for more information. - -To configure replication and high availability, refer to the relevant MogDB documentation. - -## MOT Memory Management - -For planning and fine-tuning, see the **MOT Memory and Storage Planning** and **MOT Configuration Settings** sections. - -## MOT Vacuum - -Use VACUUM for garbage collection and optionally to analyze a database, as follows - - -- [PG] - - In PostgreSQL (PG), the VACUUM reclaims storage occupied by dead tuples.
In normal PG operation, tuples that are deleted or that are made obsolete by an update are not physically removed from their table. They remain present until a VACUUM is done. Therefore, it is necessary to perform a VACUUM periodically, especially on frequently updated tables. - -- [MOT Extension] - - MOT tables do not need a periodic VACUUM operation, since dead/empty tuples are re-used by new ones. MOT tables require VACUUM operations only when their size is significantly reduced and they do not expect to grow to their original size in the near future. - - For example, an application that periodically (for example, once in a week) performs a large deletion of a table/tables data while inserting new data takes days and does not necessarily require the same quantity of rows. In such cases, it makes sense to activate the VACUUM. - - The VACUUM operation on MOT tables is always transformed into a VACUUM FULL with an exclusive table lock. - -- Supported Syntax and Limitations - - Activation of the VACUUM operation is performed in a standard manner. - - ```sql - VACUUM [FULL | ANALYZE] [ table ]; - ``` - - Only the FULL and ANALYZE VACUUM options are supported. The VACUUM operation can only be performed on an entire MOT table. - - The following PG vacuum options are not supported: - - - FREEZE - - VERBOSE - - Column specification - - LAZY mode (partial table scan) - - Additionally, the following functionality is not supported - - - AUTOVACUUM - -## MOT Statistics - -Statistics are intended for performance analysis or debugging. It is uncommon to turn them ON in a production environment (by default, they are OFF). Statistics are primarily used by database developers and to a lesser degree by database users. - -There is some impact on performance, particularly on the server. Impact on the user is negligible. - -The statistics are saved in the database server log. The log is located in the data folder and named **postgresql-DATE-TIME.log**. - -Refer to **STATISTICS (MOT)** for detailed configuration options. - -## MOT Monitoring - -All syntax for monitoring of PG-based FDW tables is supported. This includes Table or Index sizes (as described below). In addition, special functions exist for monitoring MOT memory consumption, including MOT Global Memory, MOT Local Memory and a single client session. - -### Table and Index Sizes - -The size of tables and indexes can be monitored by querying pg_relation_size. - -For example - -**Data Size** - -```sql -select pg_relation_size('customer'); -``` - -**Index** - -```sql -select pg_relation_size('customer_pkey'); -``` - -### MOT GLOBAL Memory Details - -Check the size of MOT global memory, which includes primarily the data and indexes. - -```sql -select * from mot_global_memory_detail(); -``` - -Result - - -```sql -numa_node | reserved_size | used_size -----------------+----------------+------------- --1 | 194716368896 | 25908215808 -0 | 446693376 | 446693376 -1 | 452984832 | 452984832 -2 | 452984832 | 452984832 -3 | 452984832 | 452984832 -4 | 452984832 | 452984832 -5 | 364904448 | 364904448 -6 | 301989888 | 301989888 -7 | 301989888 | 301989888 -``` - -Where - - -- -1 is the total memory. -- 0..7 are NUMA memory nodes. - -### MOT LOCAL Memory Details - -Check the size of MOT local memory, which includes session memory. 
- -```sql -select * from mot_local_memory_detail(); -``` - -Result - - -```sql -numa_node | reserved_size | used_size -----------------+----------------+------------- --1 | 144703488 | 144703488 -0 | 25165824 | 25165824 -1 | 25165824 | 25165824 -2 | 18874368 | 18874368 -3 | 18874368 | 18874368 -4 | 18874368 | 18874368 -5 | 12582912 | 12582912 -6 | 12582912 | 12582912 -7 | 12582912 | 12582912 -``` - -Where - - -- -1 is the total memory. -- 0..7 are NUMA memory nodes. - -### Session Memory - -Memory for session management is taken from the MOT local memory. - -Memory usage by all active sessions (connections) is possible using the following query - - -```sql -select * from mot_session_memory_detail(); -``` - -Result - - -```sql -sessid | total_size | free_size | used_size -----------------------------------------+-----------+----------+---------- -1591175063.139755603855104 | 6291456 | 1800704 | 4490752 - -``` - -Legend - - -- **total_size -** is allocated for the session -- **free_size -** not in use -- **used_size -** In actual use - -The following query enables a DBA to determine the state of local memory used by the current session - - -```sql -select * from mot_session_memory_detail() - where sessid = pg_current_sessionid(); -``` - -Result - - -![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-administration-1.png) - -## MOT Error Messages - -Errors may be caused by a variety of scenarios. All errors are logged in the database server log file. In addition, user-related errors are returned to the user as part of the response to the query, transaction or stored procedure execution or to database administration action. - -- Errors reported in the Server log include - Function, Entity, Context, Error message, Error description and Severity. -- Errors reported to users are translated into standard PostgreSQL error codes and may consist of an MOT-specific message and description. - -The following lists the error messages, error descriptions and error codes. The error code is actually an internal code and not logged or returned to users. - -### Errors Written the Log File - -All errors are logged in the database server log file. The following lists the errors that are written to the database server log file and are **not** returned to the user. The log is located in the data folder and named **postgresql-DATE-TIME.log**. - -**Table 1** Errors Written Only to the Log File - -| Message in the Log | Error Internal Code | -| :---------------------------------- | :------------------------------- | -| Error code denoting success | MOT_NO_ERROR 0 | -| Out of memory | MOT_ERROR_OOM 1 | -| Invalid configuration | MOT_ERROR_INVALID_CFG 2 | -| Invalid argument passed to function | MOT_ERROR_INVALID_ARG 3 | -| System call failed | MOT_ERROR_SYSTEM_FAILURE 4 | -| Resource limit reached | MOT_ERROR_RESOURCE_LIMIT 5 | -| Internal logic error | MOT_ERROR_INTERNAL 6 | -| Resource unavailable | MOT_ERROR_RESOURCE_UNAVAILABLE 7 | -| Unique violation | MOT_ERROR_UNIQUE_VIOLATION 8 | -| Invalid memory allocation size | MOT_ERROR_INVALID_MEMORY_SIZE 9 | -| Index out of range | MOT_ERROR_INDEX_OUT_OF_RANGE 10 | -| Error code unknown | MOT_ERROR_INVALID_STATE 11 | - -### Errors Returned to the User - -The following lists the errors that are written to the database server log file and are returned to the user. - -MOT returns PG standard error codes to the envelope using a Return Code (RC). Some RCs cause the generation of an error message to the user who is interacting with the database. 
- -The PG code (described below) is returned internally by MOT to the database envelope, which reacts to it according to standard PG behavior. - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** %s, %u and %lu in the message are replaced by relevant error information, such as a query, table name or other information. - %s - String - %u - Number - %lu - Number - -**Table 2** Errors Returned to the User and Logged to the Log File - -| Short and Long Description Returned to the User | PG Code | Internal Error Code | -| :---------------------------------------------------- | :------------------------------ | :------------------------------ | -| Success. Denotes success | ERRCODE_SUCCESSFUL_COMPLETION | RC_OK = 0 | -| Failure. Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_ERROR = 1 | -| Unknown error has occurred. Denotes aborted operation. | ERRCODE_FDW_ERROR | RC_ABORT | -| Column definition of %s is not supported. Column type %s is not supported yet. | ERRCODE_INVALID_COLUMN_DEFINITION | RC_UNSUPPORTED_COL_TYPE | -| Column definition of %s is not supported. Column type Array of %s is not supported yet. | ERRCODE_INVALID_COLUMN_DEFINITION | RC_UNSUPPORTED_COL_TYPE_ARR | -| Column size %d exceeds max tuple size %u. Column definition of %s is not supported. | ERRCODE_FEATURE_NOT_SUPPORTED | RC_EXCEEDS_MAX_ROW_SIZE | -| Column name %s exceeds max name size %u. Column definition of %s is not supported. | ERRCODE_INVALID_COLUMN_DEFINITION | RC_COL_NAME_EXCEEDS_MAX_SIZE | -| Column size %d exceeds max size %u. Column definition of %s is not supported. | ERRCODE_INVALID_COLUMN_DEFINITION | RC_COL_SIZE_INVLALID | -| Cannot create table. Cannot add column %s; as the number of declared columns exceeds the maximum declared columns. | ERRCODE_FEATURE_NOT_SUPPORTED | RC_TABLE_EXCEEDS_MAX_DECLARED_COLS | -| Cannot create index. Total column size is greater than maximum index size %u. | ERRCODE_FDW_KEY_SIZE_EXCEEDS_MAX_ALLOWED | RC_INDEX_EXCEEDS_MAX_SIZE | -| Cannot create index. Total number of indexes for table %s is greater than the maximum number of indexes allowed %u. | ERRCODE_FDW_TOO_MANY_INDEXES | RC_TABLE_EXCEEDS_MAX_INDEXES | -| Cannot execute statement. Maximum number of DDLs per transaction reached the maximum %u. | ERRCODE_FDW_TOO_MANY_DDL_CHANGES_IN_TRANSACTION_NOT_ALLOWED | RC_TXN_EXCEEDS_MAX_DDLS | -| Unique constraint violation. Duplicate key value violates unique constraint \"%s\". Key %s already exists. | ERRCODE_UNIQUE_VIOLATION | RC_UNIQUE_VIOLATION | -| Table \"%s\" does not exist. | ERRCODE_UNDEFINED_TABLE | RC_TABLE_NOT_FOUND | -| Index \"%s\" does not exist. | ERRCODE_UNDEFINED_TABLE | RC_INDEX_NOT_FOUND | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_LOCAL_ROW_FOUND | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_LOCAL_ROW_NOT_FOUND | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_LOCAL_ROW_DELETED | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_INSERT_ON_EXIST | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_INDEX_RETRY_INSERT | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_INDEX_DELETE | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_LOCAL_ROW_NOT_VISIBLE | -| Memory is temporarily unavailable. | ERRCODE_OUT_OF_LOGICAL_MEMORY | RC_MEMORY_ALLOCATION_ERROR | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_ILLEGAL_ROW_STATE | -| Null constraint violated. NULL value cannot be inserted into non-null column %s at table %s.
| ERRCODE_FDW_ERROR | RC_NULL_VIOLATION | -| Critical error.Critical error: %s. | ERRCODE_FDW_ERROR | RC_PANIC | -| A checkpoint is in progress - cannot truncate table. | ERRCODE_FDW_OPERATION_NOT_SUPPORTED | RC_NA | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_MAX_VALUE | -| <recovery message> | - | ERRCODE_CONFIG_FILE_ERROR | -| <recovery message> | - | ERRCODE_INVALID_TABLE_DEFINITION | -| Memory engine - Failed to perform commit prepared. | - | ERRCODE_INVALID_TRANSACTION_STATE | -| Invalid option <option name> | - | ERRCODE_FDW_INVALID_OPTION_NAME | -| Invalid memory allocation request size. | - | ERRCODE_INVALID_PARAMETER_VALUE | -| Memory is temporarily unavailable. | - | ERRCODE_OUT_OF_LOGICAL_MEMORY | -| Could not serialize access due to concurrent update. | - | ERRCODE_T_R_SERIALIZATION_FAILURE | -| Alter table operation is not supported for memory table.Cannot create MOT tables while incremental checkpoint is enabled.Re-index is not supported for memory tables. | - | ERRCODE_FDW_OPERATION_NOT_SUPPORTED | -| Allocation of table metadata failed. | - | ERRCODE_OUT_OF_MEMORY | -| Database with OID %u does not exist. | - | ERRCODE_UNDEFINED_DATABASE | -| Value exceeds maximum precision: %d. | - | ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE | -| You have reached a maximum logical capacity %lu of allowed %lu. | - | ERRCODE_OUT_OF_LOGICAL_MEMORY | diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/6-mot-sample-tpcc-benchmark.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/6-mot-sample-tpcc-benchmark.md deleted file mode 100644 index 6c0c4dd5d8d813459ce5f2357d9f96ffa825f80e..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/6-mot-sample-tpcc-benchmark.md +++ /dev/null @@ -1,116 +0,0 @@ ---- -title: MOT Sample TPC-C Benchmark -summary: MOT Sample TPC-C Benchmark -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT Sample TPC-C Benchmark - -## TPC-C Introduction - -The TPC-C Benchmark is an industry standard benchmark for measuring the performance of Online Transaction Processing (OLTP) systems. It is based on a complex database and a number of different transaction types that are executed on it. TPC-C is both a hardware-independent and a software-independent benchmark and can thus be run on every test platform. An official overview of the benchmark model can be found at the tpc.org website here - . - -The database consists of nine tables of various structures and thus also nine types of data records. The size and quantity of the data records varies per table. A mix of five concurrent transactions of varying types and complexities is executed on the database, which are largely online or in part queued for deferred batch processing. Because these tables compete for limited system resources, many system components are stressed and data changes are executed in a variety of ways. - -**Table 1** TPC-C Database Structure - -| Table | Number of Entries | -| :--------- | :--------------------------------------- | -| Warehouse | n | -| Item | 100,000 | -| Stock | n x 100,000 | -| District | n x 10 | -| Customer | 3,000 per district, 30,000 per warehouse | -| Order | Number of customers (initial value) | -| New order | 30% of the orders (initial value) | -| Order line | ~ 10 per order | -| History | Number of customers (initial value) | - -The transaction mix represents the complete business processing of an order - from its entry through to its delivery. 
More specifically, the provided mix is designed to produce an equal number of new-order transactions and payment transactions and to produce a single delivery transaction, a single order-status transaction and a single stock-level transaction for every ten new-order transactions. - -**Table 2** TPC-C Transactions Ratio - -| Transaction Level ≥ 4% | Share of All Transactions | -| :--------------------- | :------------------------ | -| TPC-C New order | ≤ 45% | -| Payment | ≥ 43% | -| Order status | ≥ 4% | -| Delivery | ≥ 4% (batch) | -| Stock level | ≥ 4% | - -There are two ways to execute the transactions - **as stored procedures** (which allow higher throughput) and in **standard interactive SQL mode**. - -**Performance Metric - tpm-C** - -The tpm-C metric is the number of new-order transactions executed per minute. Given the required mix and a wide range of complexity and types among the transactions, this metric most closely simulates a comprehensive business activity, not just one or two transactions or computer operations. For this reason, the tpm-C metric is considered to be a measure of business throughput. - -The tpm-C unit of measure is expressed as transactions-per-minute-C, whereas "C" stands for TPC-C specific benchmark. - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The official TPC-C Benchmark specification can be accessed at - . Some of the rules of this specification are generally not fulfilled in the industry, because they are too strict for industry reality. For example, Scaling rules - (a) tpm-C / Warehouse must be >9 and <12.86 (implying that a very high warehouses rate is required in order to achieve a high tpm-C rate, which also means that an extremely large database and memory capacity are required); and (b) 10x terminals x Warehouses (implying a huge quantity of simulated clients). - -## System-Level Optimization - -Follow the instructions in the **MOT Server Optimization - x86** section. The following section describes the key system-level optimizations for deploying the MogDB database on a Huawei Taishan server and on a Euler 2.8 operating system for ultimate performance. - -## BenchmarkSQL - An Open-Source TPC-C Tool - -For example, to test TPCC, the **BenchmarkSQL** can be used, as follows - - -- Download **benchmarksql** from the following link - -- The schema creation scripts in the **benchmarksql** tool need to be adjusted to MOT syntax and unsupported DDLs need to be avoided. The adjusted scripts can be directly downloaded from the following link - . The contents of this tar file includes sql.common.mogdb.mot folder and jTPCCTData.java file as well as a sample configuration file postgresql.conf and a TPCC properties file props.mot for reference. -- Place the sql.common.mogdb.mot folder in the same level as sql.common under run folder and replace the file src/client/jTPCCTData.java with the downloaded java file. -- Edit the file runDatabaseBuild.sh under run folder to remove **extraHistID** from **AFTER_LOAD** list to avoid unsupported alter table DDL. -- Replace the JDBC driver under lib/postgres folder with the MogDB JDBC driver available from the following link - . - -The only change done in the downloaded java file (compared to the original one) was to comment the error log printing for serialization and duplicate key errors. These errors are normal in case of MOT, since it uses Optimistic Concurrency Control (OCC) mechanism. 
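- -For orientation only, the following sketch shows the kind of entries that are typically adjusted in the properties file (props.mot or props.pg); the keys are standard BenchmarkSQL properties, while the values are placeholders that depend on your environment and on the MogDB JDBC driver you installed - - -``` -# Illustrative BenchmarkSQL properties sketch (placeholder values only) -db=postgres -driver=org.postgresql.Driver -conn=jdbc:postgresql://localhost:5432/tpcc -user=tpcc -password=changeme -warehouses=100 -terminals=100 -runMins=2 -```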
- -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The benchmark test is executed using a standard interactive SQL mode without stored procedures. - -## Running the Benchmark - -Anyone can run the benchmark by starting up the server and running the **benchmarksql** scripts. - -To run the benchmark - - -1. Go to the **benchmarksql** run folder and rename sql.common to sql.common.orig. -2. Create a link named sql.common pointing to sql.common.mogdb.mot in order to test MOT. -3. Start up the database server. -4. Configure the props.pg file in the client. -5. Run the benchmark. - -## Results Report - -- Results in CLI - - BenchmarkSQL results should appear as follows - - - ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-sample-tpcc-benchmark-1.jpg) - - Over time, the benchmark measures and averages the committed transactions. The example above benchmarks for two minutes. - - The score is **2.71M tpm-C** (new-orders per-minute), which is 45% of the total committed transactions, meaning the **tpmTOTAL**. - -- Detailed Result Report - - The following is an example of a detailed result report - - - **Figure 1** Detailed Result Report - - ![detailed-result-report](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-sample-tpcc-benchmark-2.png) - - ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-sample-tpcc-benchmark-3.png) - - BenchmarkSQL collects detailed performance statistics and operating system performance data (if configured). - - This information can show the latency of the queries, and thus expose bottlenecks related to storage/network/CPU. - -- Results of TPC-C of MOT on Huawei Taishan 2480 - - Our TPC-C benchmark dated 01-May-2020, with a MogDB database installed on a Taishan 2480 server (a 4-socket ARM/Kunpeng server), achieved a throughput of 4.79M tpm-C. - - Near-linear scalability was demonstrated, as shown below - - - **Figure 2** Results of TPC-C of MOT on Huawei Taishan 2480 - - ![results-of-tpc-c-of-mot-on-huawei-taishan-2480](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-sample-tpcc-benchmark-4.png) diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/using-mot.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/using-mot.md deleted file mode 100644 index 1de811f76f8cf0a04ce4e7a2f1595f11565d477d..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/using-mot.md +++ /dev/null @@ -1,17 +0,0 @@ ---- -title: Using MOT -summary: Using MOT -author: Guo Huan -date: 2023-05-22 ---- - -# Using MOT - -This chapter describes how to deploy, use and manage MogDB MOT. Using MOT tables is quite simple. The syntax of all MOT commands is the same as for MogDB disk-based tables. Only the create and drop table statements in MOT differ from the statements for disk-based tables in MogDB. You may refer to this chapter in order to learn how to get started, how to convert a disk-based table into an MOT table, how to use advanced MOT features, such as Native Compilation (JIT) for Queries and Stored Procedures, execution of Cross-engine Transactions, as well as MOT's limitations and coverage. MOT administration options are also described here. This chapter also describes how to perform a TPC-C benchmark.
- -+ **[Using MOT Overview](1-using-mot-overview.md)** -+ **[MOT Preparation](2-mot-preparation.md)** -+ **[MOT Deployment](3-mot-deployment.md)** -+ **[MOT Usage](4-mot-usage.md)** -+ **[MOT Administration](5-mot-administration.md)** -+ **[MOT Sample TPC-C Benchmark](6-mot-sample-tpcc-benchmark.md)** \ No newline at end of file diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-1.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-1.md deleted file mode 100644 index 3b0a39a439eda8d332f6527826490c6da449b2f3..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-1.md +++ /dev/null @@ -1,90 +0,0 @@ ---- -title: MOT Scale-up Architecture -summary: MOT Scale-up Architecture -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT Scale-up Architecture - -To **scale up** means to add additional cores to the *same machine* in order to add computing power. Scaling up is the most common traditional form of adding computing power in a machine that has a single pair of controllers and multiple cores. Scale-up architecture is limited by the scalability limits of a machine’s controller. - -## Technical Requirements - -MOT has been designed to achieve the following - - -- **Linear Scale-up -** MOT delivers a transactional storage engine that utilizes all the cores of a single NUMA architecture server in order to provide near-linear scale-up performance. This means that MOT is targeted to achieve a direct, near-linear relationship between the quantity of cores in a machine and the multiples of performance increase. - - > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The near-linear scale-up results achieved by MOT significantly outperform all other existing solutions, and come as close as possible to achieving optimal results, which are limited by the physical restrictions and limitations of hardware, such as wires. - -- **No Maximum Number of Cores Limitation -** MOT does not place any limits on the maximum quantity of cores. This means that MOT is scalable from a single core up to 1,000s of cores, with minimal degradation per additional core, even when crossing NUMA socket boundaries. - -- **Extremely High Transactional Throughput -** MOT delivers a transactional storage engine that can achieve extremely high transactional throughput compared with any other OLTP vendor on the market. - -- **Extremely Low Transactional Latency -** MOT delivers a transactional storage engine that can reach extremely low transactional latency compared with any other OLTP vendor on the market. - -- **Seamless Integration and Leveraging with/of MogDB -** MOT integrates its transactional engine in a standard and seamless manner with the MogDB product. In this way, MOT reuses maximum functionality from the MogDB layers that are situated on top of its transactional storage engine. - -## Design Principles - -To achieve the requirements described above (especially in an environment with many cores), our storage engine's architecture implements the following techniques and strategies - - -- **Data and indexes only reside in memory**. -- **Data and indexes are not laid out with physical partitions** (because these might achieve lower performance for certain types of applications). -- Transaction concurrency control is based on **Optimistic Concurrency Control (OCC)** without any centralized contention points.
See the **MOT Concurrency Control Mechanism** section for more information about OCC. -- **Parallel Redo Logs (ultimately per core)** are used to efficiently avoid a central locking point. -- **Indexes are lock-free**. See the **MOT Indexes** section for more information about lock-free indexes. -- **NUMA-awareness memory allocation** is used to avoid cross-socket access, especially for session lifecycle objects. See the **NUMA Awareness Allocation and Affinity** section for more information about NUMA-awareness. -- **A customized MOT memory management allocator** with pre-cached object pools is used to avoid expensive runtime allocation and extra points of contention. This dedicated MOT memory allocator makes memory allocation more efficient by pre-accessing relatively large chunks of memory from the operation system as needed and then divvying it out to the MOT as needed. - -## Integration using Foreign Data Wrappers (FDW) - -MOT complies with and leverages MogDB's standard extensibility mechanism - Foreign Data Wrapper (FDW), as shown in the following diagram. - -The PostgreSQL Foreign Data Wrapper (FDW) feature enables the creation of foreign tables in an MOT database that are proxies for some other data source, such as Oracle, MySQL, PostgreSQL and so on. When a query is made on a foreign table, the FDW queries the external data source and returns the results, as if they were coming from a table in your database. - -MogDB relies on the PostgreSQL Foreign Data Wrappers (FDW) and Index support so that SQL is entirely covered, including stored procedures, user defined functions, system functions calls. - -**Figure 1** MOT Architecture - -![mot-architecture](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-scale-up-architecture-2.png) - -In the diagram above, the MOT engine is represented in green, while the existing MogDB (based on Postgres) components are represented in the top part of this diagram in blue. As you can see, the Foreign Data Wrapper (FDW) mediates between the MOT engine and the MogDB components. - -**MOT-Related FDW Customizations** - -Integrating MOT through FDW enables the reuse of the most upper layer MogDB functionality and therefore significantly shortened MOT's time-to-market without compromising SQL coverage. - -However, the original FDW mechanism in MogDB was not designed for storage engine extensions, and therefore lacks the following essential functionalities - - -- Index awareness of foreign tables to be calculated in the query planning phase -- Complete DDL interfaces -- Complete transaction lifecycle interfaces -- Checkpoint interfaces -- Redo Log interface -- Recovery interfaces -- Vacuum interfaces - -In order to support all the missing functionalities, the SQL layer and FDW interface layer were extended to provide the necessary infrastructure in order to enable the plugging in of the MOT transactional storage engine. - -## Result - Linear Scale-up - -The following shows the results achieved by the MOT design principles and implementation described above. - -To the best of our knowledge, MOT outperforms all existing industry-grade OLTP databases in transactional throughput of ACID-compliant workloads. - -MogDB and MOT have been tested on the following many-core systems with excellent performance scalability results. The tests were performed both on x86 Intel-based and ARM/Kunpeng-based many-core servers. You may refer to the **MOT Performance Benchmarks** section for more detailed performance review. 
- -Our TPC-C benchmark dated June 2020 tested a MogDB MOT database on a Taishan 2480 server, a 4-socket ARM/Kunpeng server, and achieved a throughput of 4.8M tpmC. The following graph shows the near-linear nature of the results, meaning that it shows a significant increase in performance correlating to the increase of the quantity of cores - - -**Figure 2** TPC-C on ARM (256 Cores) - -![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-12.png) - -The following is an additional example that shows a test on an x86-based server, also showing CPU utilization. - -**Figure 3** tpmC vs CPU Usage - -![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-18.png) - -The chart shows that MOT demonstrates a significant performance increase that correlates with the increase of the quantity of cores. MOT consumes more and more of the CPU as the quantity of cores increases. Other industry solutions do not increase and sometimes show slightly degraded performance, which is a well-known problem in the database industry that affects customers’ CAPEX and OPEX expenses and operational efficiency. diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-2.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-2.md deleted file mode 100644 index aab31c2a03baeef5b1ebf3c8cc0b8fbbe8e77365..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-2.md +++ /dev/null @@ -1,179 +0,0 @@ ---- -title: MOT Concurrency Control Mechanism -summary: MOT Concurrency Control Mechanism -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT Concurrency Control Mechanism - -After investing in extensive research to find the best concurrency control mechanism, we concluded that SILO based on OCC is the best ACID-compliant OCC algorithm for MOT. SILO provides the best foundation for MOT's challenging requirements. - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** MOT is fully Atomicity, Consistency, Isolation, Durability (ACID)-compliant, as described in the **MOT Introduction** section. - -The following topics describe MOT's concurrency control mechanism - - -## MOT Local and Global Memory - -SILO manages both a local memory and a global memory, as shown in Figure 1. - -- **Global** memory is long-term memory that is shared by all cores and is used primarily to store all the table data and indexes -- **Local** memory is short-term memory that is used primarily by sessions for handling transactions and for storing data changes in memory that is private to the transaction until the commit phase. - -When a transaction change is required, SILO handles the copying of all that transaction's data from the global memory into the local memory. Minimal locks are placed on the global memory according to the OCC approach, so that the contention time in the global shared memory is extremely minimal. After the transaction's changes have been completed, this data is pushed back from the local memory to the global memory.
- -The basic interactive transactional flow with our SILO-enhanced concurrency control is shown in the figure below - - -**Figure 1** Private (Local) Memory (for each transaction) and a Global Memory (for all the transactions of all the cores) - -![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-concurrency-control-mechanism-2.png) - -For more details, refer to the Industrial-Strength OLTP Using Main Memory and Many-cores document [**Comparison - Disk vs. MOT**]. - -## MOT SILO Enhancements - -SILO in its basic algorithm flow outperformed many other ACID-compliant OCCs that we tested in our research experiments. However, in order to make it a product-grade mechanism, we had to enhance it with many essential functionalities that were missing in the original design, such as - - -- Added support for interactive mode transactions, where transactions are running SQL by SQL from the client side and not as a single step on the server side -- Added optimistic inserts -- Added support for non-unique indexes -- Added support for read-after-write in transactions so that users can see their own changes before they are committed -- Added support for lockless cooperative garbage collection -- Added support for lockless checkpoints -- Added support for fast recovery -- Added support for two-phase commit in a distributed deployment - -Adding these enhancements without breaking the scalable characteristic of the original SILO was very challenging. - -## MOT Isolation Levels - -Even though MOT is fully ACID-compliant (as described in the section), not all isolation levels are supported in MogDB 2.1. The following table describes all isolation levels, as well as what is and what is not supported by MOT. - -**Table 1** Isolation Levels - -| Isolation Level | Description | -| :--------------- | :----------------------------------------------------------- | -| READ UNCOMMITTED | **Not supported by MOT.** | -| READ COMMITTED | **Supported by MOT.**
The READ COMMITTED isolation level guarantees that any data that is read was already committed when it was read. It simply restricts the reader from seeing any intermediate, uncommitted or dirty reads. Data is free to be changed after it has been read, so READ COMMITTED does not guarantee that the same data will be found if the transaction re-issues the read. |
The SNAPSHOT isolation level makes the same guarantees as SERIALIZABLE, except that concurrent transactions can modify the data. Instead, it forces every reader to see its own version of the world (its own snapshot). This makes it very easy to program, plus it is very scalable, because it does not block concurrent updates. However, in many implementations this isolation level requires higher server resources. | -| REPEATABLE READ | **Supported by MOT.**
REPEATABLE READ is a higher isolation level that (in addition to the guarantees of the READ COMMITTED isolation level) guarantees that any data that is read cannot change. If a transaction reads the same data again, it will find the same previously read data in place, unchanged and available to be read.
Because of the optimistic model, concurrent transactions are not prevented from updating rows read by this transaction. Instead, at commit time this transaction validates that the REPEATABLE READ isolation level has not been violated. If it has, this transaction is rolled back and must be retried. | -| SERIALIZABLE | **Not supported by MOT**.
Serializable isolation makes an even stronger guarantee. In addition to everything that the REPEATABLE READ isolation level guarantees, it also guarantees that no new data can be seen by a subsequent read.
It is named SERIALIZABLE because the isolation is so strict that it is almost like having the transactions run in series rather than concurrently. | - -The following table shows the concurrency side effects enabled by the different isolation levels. - -**Table 2** Concurrency Side Effects Enabled by Isolation Levels - -| Isolation Level | Dirty Read | Non-repeatable Read | Phantom | -| :--------------- | :---------- | :------------------ | :------ | -| READ UNCOMMITTED | Yes | Yes | Yes | -| READ COMMITTED | No | Yes | Yes | -| REPEATABLE READ | No | No | Yes | -| SNAPSHOT | No | No | No | -| SERIALIZABLE | No | No | No | - -In a future release, MogDB MOT will also support the SNAPSHOT and SERIALIZABLE isolation levels. - -## MOT Optimistic Concurrency Control - -The Concurrency Control Module (CC Module for short) provides all the transactional requirements for the Main Memory Engine. The primary objective of the CC Module is to provide the Main Memory Engine with support for various isolation levels. - -### Optimistic OCC vs. Pessimistic 2PL - -The functional differences of Pessimistic 2PL (2-Phase Locking) vs. Optimistic Concurrency Control (OCC) involve pessimistic versus optimistic approaches to transaction integrity. - -Disk-based tables use a pessimistic approach, which is the most commonly used database method. The MOT Engine uses an optimistic approach. - -The primary functional difference between the pessimistic approach and the optimistic approach is that if a conflict occurs - - -- The pessimistic approach causes the client to wait. -- The optimistic approach causes one of the transactions to fail, so that the failed transaction must be retried by the client. - -**Optimistic Concurrency Control Approach (Used by MOT)** - -The **Optimistic Concurrency Control (OCC)** approach detects conflicts as they occur, and performs validation checks at commit time. - -The optimistic approach has less overhead and is usually more efficient, partly because transaction conflicts are uncommon in most applications. - -The functional differences between the optimistic and pessimistic approaches are larger when the REPEATABLE READ isolation level is enforced and are largest for the SERIALIZABLE isolation level. - -**Pessimistic Approaches (Not used by MOT)** - -The **Pessimistic Concurrency Control** (2PL or 2-Phase Locking) approach uses locks to block potential conflicts before they occur. A lock is applied when a statement is executed and released when the transaction is committed. Disk-based row-stores use this approach (with the addition of Multi-version Concurrency Control [MVCC]). - -In 2PL algorithms, while a transaction is writing a row, no other transaction can access it; and while a row is being read, no other transaction can overwrite it. Each row is locked at access time for both reading and writing; and the lock is released at commit time. These algorithms require a scheme for handling and avoiding deadlock. Deadlock can be detected by calculating cycles in a wait-for graph. Deadlock can be avoided by keeping time ordering using TSO or by some kind of back-off scheme. - -**Encounter Time Locking (ETL)** - -Another approach is Encounter Time Locking (ETL), where reads are handled in an optimistic manner, but writes lock the data that they access. As a result, writes from different ETL transactions are aware of each other and can decide to abort.
It has been empirically verified that ETL improves the performance of OCC in two ways - - -- First, ETL detects conflicts early on and often increases transaction throughput. This is because transactions do not perform useless operations, because conflicts discovered at commit time (in general) cannot be solved without aborting at least one transaction. -- Second, encounter-time locking Reads-After-Writes (RAW) are handled efficiently without requiring expensive or complex mechanisms. - -**Conclusion** - -OCC is the fastest option for most workloads. This finding has also been observed in our preliminary research phase. - -One of the reasons is that when every core executes multiple threads, a lock is likely to be held by a swapped thread, especially in interactive mode. Another reason is that pessimistic algorithms involve deadlock detection (which introduces overhead) and usually uses read-write locks (which are less efficient than standard spin-locks). - -We have chosen Silo because it was simpler than other existing options, such as TicToc, while maintaining the same performance for most workloads. ETL is sometimes faster than OCC, but it introduces spurious aborts which may confuse a user, in contrast to OCC which aborts only at commit. - -### OCC vs 2PL Differences by Example - -The following shows the differences between two user experiences - Pessimistic (for disk-based tables) and Optimistic (MOT tables) when sessions update the same table simultaneously. - -In this example, the following table test command is run - - -``` -table "TEST" - create table test (x int, y int, z int, primary key(x)); -``` - -This example describes two aspects of the same test - user experience (operations in the example) and retry requirements. - -**Example Pessimistic Approach - Used in Disk-based Tables** - -The following is an example of the Pessimistic approach (which is not Mot). Any Isolation Level may apply. - -The following two sessions perform a transaction that attempts to update a single table. - -A WAIT LOCK action occurs and the client experience is that session #2 is *stuck* until Session #1 has completed a COMMIT. Only afterwards, is Session #2 able to progress. - -However, when this approach is used, both sessions succeed and no abort occurs (unless SERIALIZABLE or REPEATABLE-READ isolation level is applied), which results in the entire transaction needing to be retried. - -**Table 1** Pessimistic Approach Code Example - -| | Session 1 | Session 2 | -| :--- | :------------------------------- | :----------------------------------------------------------- | -| t0 | Begin | Begin | -| t1 | update test set y=200 where x=1; | | -| t2 | y=200 | Update test set y=300 where x=1; - Wait on lock | -| t4 | Commit | | -| | | Unlock | -| | | Commit(in READ-COMMITTED this will succeed, in SERIALIZABLE it will fail) | -| | | y = 300 | - -**Example Optimistic Approach - Used in MOT** - -The following is an example of the Optimistic approach. - -It describes the situation of creating an MOT table and then having two concurrent sessions updating that same MOT table simultaneously - - -``` -create foreign table test (x int, y int, z int, primary key(x)); -``` - -- The advantage of OCC is that there are no locks until COMMIT. -- The disadvantage of using OCC is that the update may fail if another session updates the same record. If the update fails (in all supported isolation levels), an entire SESSION #2 transaction must be retried. 
-- Update conflicts are detected by the kernel at commit time by using a version checking mechanism. -- SESSION #2 will not wait in its update operation and will be aborted because of conflict detection at commit phase. - -**Table 2** Optimistic Approach Code Example - Used in MOT - -| | Session 1 | Session 2 | -| :--- | :------------------------------- | :------------------------------- | -| t0 | Begin | Begin | -| t1 | update test set y=200 where x=1; | | -| t2 | y=200 | Update test set y=300 where x=1; | -| t4 | Commit | y = 300 | -| | | Commit | -| | | ABORT | -| | | y = 200 | diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-3.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-3.md deleted file mode 100644 index 3af831038f8bfda7238c36d37d0f6be5afea3f97..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-3.md +++ /dev/null @@ -1,59 +0,0 @@ ---- -title: Extended FDW and Other MogDB Features -summary: Extended FDW and Other MogDB Features -author: Zhang Cuiping -date: 2021-03-04 ---- - -# Extended FDW and Other MogDB Features - -MogDB is based on PostgreSQL, which does not have a built-in storage engine adapter, such as MySQL handlerton. To enable the integration of the MOT storage engine into MogDB, we have leveraged and extended the existing Foreign Data Wrapper (FDW) mechanism. With the introduction of FDW into PostgreSQL 9.1, externally managed databases can now be accessed in a way that presents these foreign tables and data sources as united, locally accessible relations. - -In contrast, the MOT storage engine is embedded inside MogDB and its tables are managed by it. Access to tables is controlled by the MogDB planner and executor. MOT gets logging and checkpointing services from MogDB and participates in the MogDB recovery process in addition to other processes. - -We refer to all the components that are in use or are accessing the MOT storage engine as the *Envelope*. - -The following figure shows how the MOT storage engine is embedded inside MogDB and its bi-directional access to database functionality. - -**Figure 1** MOT Storage Engine Embedded inside MogDB - FDW Access to External Databases - -![mot-architecture](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-scale-up-architecture-2.png) - -We have extended the capabilities of FDW by extending and modifying the FdwRoutine structure in order to introduce features and calls that were not required before the introduction of MOT. For example, support for The following new features was added - Add Index, Drop Index/Table, Truncate, Vacuum and Table/Index Memory Statistics. A significant emphasis was put on integration with MogDB logging, replication and checkpointing mechanisms in order to provide consistency for cross-table transactions through failures. In this case, the MOT itself sometimes initiates calls to MogDB functionality through the FDW layer. - -## Creating Tables and Indexes - -In order to support the creation of MOT tables, standard FDW syntax was reused. - -For example, create FOREIGN table. - -The MOT FDW mechanism passes the instruction to the MOT storage engine for actual table creation. Similarly, we support index creation (create index …). This feature was not previously available in FDW, because it was not needed since its tables are managed externally. 
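- -As a minimal sketch (using the same hypothetical test table shown in the concurrency control examples earlier), both statements use standard syntax and are routed through this FDW layer to the MOT engine - - -```sql -/* The FOREIGN keyword directs the table to the MOT storage engine */ -create foreign table test (x int, y int, z int, primary key(x)); -/* Index creation on the MOT table is likewise passed through the FDW layer */ -create index test_y_idx on test (y); -```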
- -To support both in MOT FDW, the **ValidateTableDef** function actually creates the specified table. It also handles the index creation of that relation, as well as DROP TABLE and DROP INDEX, in addition to VACUUM and ALTER TABLE, which were not previously supported in FDW. - -## Index Usage for Planning and Execution - -A query has two phases - **Planning** and **Execution**. During the Planning phase (which may take place once per multiple executions), the best index for the scan is chosen. This choice is made based on the matching query's WHERE clauses, JOIN clauses and ORDER BY conditions. During execution, a query iterates over the relevant table rows and performs various tasks, such as update or delete, per iteration. An insert is a special case where the table adds the row to all indexes and no scanning is required. - -- **Planner -** In standard FDW, a query is passed for execution to a foreign data source. This means that index filtering and the actual planning (such as the choice of indexes) is not performed locally in the database, rather it is performed in the external data source. Internally, the FDW returns a general plan to the database planner. MOT tables are handled in a similar manner as disk tables. This means that relevant MOT indexes are filtered and matched, and the indexes that minimize the set of traversed rows are selected and are added to the plan. -- **Executor -** The Query Executor uses the chosen MOT index in order to iterate over the relevant rows of the table. Each row is inspected by the MogDB envelope, and according to the query conditions, an update or delete is called to handle the relevant row. - -## Durability, Replication and High Availability - -A storage engine is responsible for storing, reading, updating and deleting data in the underlying memory and storage systems. The logging, checkpointing and recovery are not handled by the storage engine, especially because some transactions encompass multiple tables with different storage engines. Therefore, in order to persist and replicate data, the high-availability facilities from the MogDB envelope are used as follows - - -- **Durability -** In order to ensure Durability, the MOT engine persists data by Write-Ahead Logging (WAL) records using the MogDB's XLOG interface. This also provides the benefits of MogDB's replication capabilities that use the same APIs. You may refer to the **MOT Durability Concepts** for more information. -- **Checkpointing -** A MOT Checkpoint is enabled by registering a callback to the MogDB Checkpointer. Whenever a general database Checkpoint is performed, the MOT Checkpoint process is called as well. MOT keeps the Checkpoint's Log Sequence Number (LSN) in order to be aligned with MogDB recovery. The MOT Checkpointing algorithm is highly optimized and asynchronous and does not stop concurrent transactions. You may refer to the **MOT Checkpoint Concepts** for more information. -- **Recovery -** Upon startup, MogDB first calls an MOT callback that recovers the MOT Checkpoint by loading into memory rows and creating indexes, followed by the execution of the WAL recovery by replaying records according to the Checkpoint's LSN. The MOT Checkpoint is recovered in parallel using multiple threads - each thread reads a different data segment. This makes MOT Checkpoint recovery quite fast on many-core hardware, though it is still potentially slower compared to disk-based tables where only WAL records are replayed. You may refer to the **MOT Recovery Concepts** for more information. 
- -## VACUUM and DROP - -In order to maximize MOT functionality, we added support for VACUUM, DROP TABLE and DROP INDEX. All three execute with an exclusive table lock, meaning without allowing concurrent transactions on the table. The system VACUUM calls a new FDW function to perform the MOT vacuuming, while DROP was added to the ValidateTableDef() function. - -## Deleting Memory Pools - -Each index and table tracks all the memory pools that it uses. A DROP INDEX command is used to remove metadata. Memory pools are deleted as a single consecutive block. The MOT VACUUM only compacts used memory, because memory reclamation is performed continuously in the background by the epoch-based Garbage Collector (GC). In order to perform the compaction, we switch the index or the table to new memory pools, traverse all the live data, delete each row and insert it using the new pools and finally delete the pools as is done for a drop. - -## Query Native Compilation (JIT) - -The FDW adapter to MOT engine also contains a lite execution path that employs Just-In-Time (JIT) compiled query execution using the LLVM compiler. More information about MOT Query Native Compilation can be found in the **Query Native Compilation (JIT)** section. diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-4.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-4.md deleted file mode 100644 index 0194a6b5c0ca52d4ddac1cb6ecc1fff8bd75f2b5..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-4.md +++ /dev/null @@ -1,22 +0,0 @@ ---- -title: NUMA Awareness Allocation and Affinity -summary: NUMA Awareness Allocation and Affinity -author: Zhang Cuiping -date: 2021-03-04 ---- - -# NUMA Awareness Allocation and Affinity - -Non-Uniform Memory Access (NUMA) is a computer memory design used in multiprocessing, where the memory access time depends on the memory location relative to the processor. Under NUMA, a processor can take advantage of NUMA by preferring to access its own local memory (which is faster), rather than accessing non-local memory (meaning that it will prefer **not** to access the local memory of another processor or memory shared between processors). - -MOT memory access has been designed with NUMA awareness. This means that MOT is aware that memory is not uniform and achieves best performance by accessing the quickest and most local memory. - -The benefits of NUMA are limited to certain types of workloads, particularly on servers where the data is often strongly associated with certain tasks or users. - -In-memory database systems running on NUMA platforms face several issues, such as the increased latency and the decreased bandwidth when accessing remote main memory. To cope with these NUMA-related issues, NUMA awareness must be considered as a major design principle for the fundamental architecture of a database system. - -To facilitate quick operation and make efficient use of NUMA nodes, MOT allocates a designated memory pool for rows per table and for nodes per index. Each memory pool is composed from 2 MB chunks. A designated API allocates these chunks from a local NUMA node, from pages coming from all nodes or in a round-robin fashion, where each chunk is allocated on the next node. By default, pools of shared data are allocated in a round robin fashion in order to balance access, while not splitting rows between different NUMA nodes. 
However, thread private memory is allocated from a local node. It must also be verified that a thread always operates in the same NUMA node. - -**Summary** - -MOT has a smart memory control module that has preallocated memory pools intended for various types of memory objects. This smart memory control improves performance, reduces locks and ensures stability. The allocation of the memory objects of a transaction is always NUMA-local, ensuring optimal performance for CPU memory access and resulting in low latency and reduced contention. Deallocated objects go back to the memory pool. Minimized use of OS malloc functions during transactions circumvents unnecessary locks. diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-5.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-5.md deleted file mode 100644 index 016a23dd0c18a64b8c7ced456d9edc851899fa95..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-5.md +++ /dev/null @@ -1,41 +0,0 @@ ---- -title: MOT Indexes -summary: MOT Indexes -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT Indexes - -MOT Index is a lock-free index based on state-of-the-art Masstree, which is a fast and scalable Key Value (KV) store for multicore systems, implemented as tries of B+ trees. It achieves excellent performance on many-core servers and high concurrent workloads. It uses various advanced techniques, such as an optimistic lock approach, cache-awareness and memory prefetching. - -After comparing various state-of-the-art solutions, we chose Masstree for the index because it demonstrated the best overall performance for point queries, iterations and modifications. Masstree is a combination of tries and a B+ tree that is implemented to carefully exploit caching, prefetching, optimistic navigation and fine-grained locking. It is optimized for high contention and adds various optimizations to its predecessors, such as OLFIT. However, the downside of a Masstree index is its higher memory consumption. While row data consumes the same memory size, the memory per row per each index (primary or secondary) is higher on average by 16 bytes - 29 bytes in the lock-based B-Tree used in disk-based tables vs. 45 bytes in MOT's Masstree. - -Our empirical experiments showed that the combination of the mature lock-free Masstree implementation and our robust improvements to Silo have provided exactly what we needed in that regard. - -Another challenge was making an optimistic insertion into a table with multiple indexes. - -The Masstree index is at the core of MOT memory layout for data and index management. Our team enhanced and significantly improved Masstree and submitted some of the key contributions to the Masstree open source. These improvements include - - -- Dedicated memory pools per index - Efficient allocation and fast index drop -- Global GC for Masstree - Fast, on-demand memory reclamation -- Masstree iterator implementation with access to an insertion key -- ARM architecture support - -We contributed our Masstree index improvements to the Masstree open-source implementation, which can be found here - . - -MOT's main innovation was to enhance the original Masstree data structure and algorithm, which did not support Non-Unique Indexes (as a Secondary index). You may refer to the **Non-unique Indexes** section for the design details. 
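- -For illustration only (hypothetical table and index names), a non-unique secondary index on an MOT table is declared with standard syntax, while internally the engine applies the design described in the **Non-unique Indexes** section below - - -```sql -/* Hypothetical MOT table; the primary key produces a unique primary index */ -create foreign table person (id int, family_name varchar(64), first_name varchar(64), primary key(id)); -/* Non-unique secondary index: many rows may share the same family_name */ -create index person_family_name_idx on person (family_name); -```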
- -MOT supports both Primary, Secondary and Keyless indexes (subject to the limitations specified in the **Unsupported Index DDLs and Index**section). - -## Non-unique Indexes - -A non-unique index may contain multiple rows with the same key. Non-unique indexes are used solely to improve query performance by maintaining a sorted order of data values that are used frequently. For example, a database may use a non-unique index to group all people from the same family. However, the Masstree data structure implementation does not allow the mapping of multiple objects to the same key. Our solution for enabling the creation of non-unique indexes (as shown in the figure below) is to add a symmetry-breaking suffix to the key, which maps the row. This added suffix is the pointer to the row itself, which has a constant size of 8 bytes and a value that is unique to the row. When inserting into a non-unique index, the insertion of the sentinel always succeeds, which enables the row allocated by the executing transaction to be used. This approach also enable MOT to have a fast, reliable, order-based iterator for a non-unique index. - -**Figure 1** Non-unique Indexes - -![non-unique-indexes](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-indexes-2.png) - -The structure of an MOT table T that has three rows and two indexes is depicted in the figure above. The rectangles represent data rows, and the indexes point to sentinels (the elliptic shapes) which point to the rows. The sentinels are inserted into unique indexes with a key and into non-unique indexes with a key + a suffix. The sentinels facilitate maintenance operations so that the rows can be replaced without touching the index data structure. In addition, there are various flags and a reference count embedded in the sentinel in order to facilitate optimistic inserts. - -When searching a non-unique secondary index, the required key (for example, the family name) is used. The fully concatenated key is only used for insert and delete operations. Insert and delete operations always get a row as a parameter, thereby making it possible to create the entire key and to use it in the execution of the deletion or the insertion of the specific row for the index. diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-6.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-6.md deleted file mode 100644 index 11005768cad07acd696a0939dec94cf1650b5d34..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-6.md +++ /dev/null @@ -1,204 +0,0 @@ ---- -title: MOT Durability Concepts -summary: MOT Durability Concepts -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT Durability Concepts - -Durability refers to long-term data protection (also known as *disk persistence*). Durability means that stored data does not suffer from any kind of degradation or corruption, so that data is never lost or compromised. Durability ensures that data and the MOT engine are restored to a consistent state after a planned shutdown (for example, for maintenance) or an unplanned crash (for example, a power failure). - -Memory storage is volatile, meaning that it requires power to maintain the stored information. Disk storage, on the other hand, is non-volatile, meaning that it does not require power to maintain stored information, thus, it can survive a power shutdown. 
MOT uses both types of storage - it has all data in memory, while persisting transactional changes to disk **MOT Durability** and by maintaining frequent periodic **MOT Checkpoints** in order to ensure data recovery in case of shutdown. - -The user must ensure sufficient disk space for the logging and Checkpointing operations. A separated drive can be used for the Checkpoint to improve performance by reducing disk I/O load. - -You may refer to **MOT Key Technologies** section__for an overview of how durability is implemented in the MOT engine. - -MOTs WAL Redo Log and checkpoints enabled durability, as described below - - -- **MOT Logging - WAL Redo Log Concepts** -- **MOT Checkpoint Concepts** - -## MOT Logging - WAL Redo Log Concepts - -### Overview - -Write-Ahead Logging (WAL) is a standard method for ensuring data durability. The main concept of WAL is that changes to data files (where tables and indexes reside) are only written after those changes have been logged, meaning only after the log records that describe the changes have been flushed to permanent storage. - -The MOT is fully integrated with the MogDB envelope logging facilities. In addition to durability, another benefit of this method is the ability to use the WAL for replication purposes. - -Three logging methods are supported, two standard Synchronous and Asynchronous, which are also supported by the standard MogDB disk-engine. In addition, in the MOT a Group-Commit option is provided with special NUMA-Awareness optimization. The Group-Commit provides the top performance while maintaining ACID properties. - -To ensure Durability, MOT is fully integrated with the MogDB's Write-Ahead Logging (WAL) mechanism, so that MOT persists data in WAL records using MogDB's XLOG interface. This means that every addition, update, and deletion to an MOT table's record is recorded as an entry in the WAL. This ensures that the most current data state can be regenerated and recovered from this non-volatile log. For example, if three new rows were added to a table, two were deleted and one was updated, then six entries would be recorded in the log. - -- MOT log records are written to the same WAL as the other records of MogDB disk-based tables. - -- MOT only logs an operation at the transaction commit phase. - -- MOT only logs the updated delta record in order to minimize the amount of data written to disk. - -- During recovery, data is loaded from the last known or a specific Checkpoint; and then the WAL Redo log is used to complete the data changes that occur from that point forward. - -- The WAL (Redo Log) retains all the table row modifications until a Checkpoint is performed (as described above). The log can then be truncated in order to reduce recovery time and to save disk space. - - > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** In order to ensure that the log IO device does not become a bottleneck, the log file must be placed on a drive that has low latency. - -### Logging Types - -Two synchronous transaction logging options and one asynchronous transaction logging option are supported (these are also supported by the standard MogDB disk engine). MOT also supports synchronous Group Commit logging with NUMA-awareness optimization, as described below. - -According to your configuration, one of the following types of logging is implemented: - -- **Synchronous Redo Logging** - - The **Synchronous Redo Logging** option is the simplest and most strict redo logger. 
When a transaction is committed by a client application, the transaction redo entries are recorded in the WAL (Redo Log), as follows - - - 1. While a transaction is in progress, it is stored in the MOT’s memory. - 2. After a transaction finishes and the client application sends a **Commit** command, the transaction is locked and then written to the WAL Redo Log on the disk. This means that while the transaction log entries are being written to the log, the client application is still waiting for a response. - 3. As soon as the transaction's entire buffer is written to the log, the changes to the data in memory take place and then the transaction is committed. After the transaction has been committed, the client application is notified that the transaction is complete. - -- **Technical Description** - - When a transaction ends, the SynchronousRedoLogHandler serializes its transaction buffer and write it to the XLOG iLogger implementation. - - **Figure 1** Synchronous Logging - - ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-durability-concepts-6.png) - - **Summary** - - The **Synchronous Redo Logging** option is the safest and most strict because it ensures total synchronization of the client application and the WAL Redo log entries for each transaction as it is committed; thus ensuring total durability and consistency with absolutely no data loss. This logging option prevents the situation where a client application might mark a transaction as successful, when it has not yet been persisted to disk. - - The downside of the **Synchronous Redo Logging** option is that it is the slowest logging mechanism of the three options. This is because a client application must wait until all data is written to disk and because of the frequent disk writes (which typically slow down the database). - -- **Group Synchronous Redo Logging** - - The **Group Synchronous Redo Logging** option is very similar to the **Synchronous Redo Logging** option, because it also ensures total durability with absolutely no data loss and total synchronization of the client application and the WAL (Redo Log) entries. The difference is that the **Group Synchronous Redo Logging** option writes _groups of transaction_redo entries to the WAL Redo Log on the disk at the same time, instead of writing each and every transaction as it is committed. Using Group Synchronous Redo Logging reduces the amount of disk I/Os and thus improves performance, especially when running a heavy workload. - - The MOT engine performs synchronous Group Commit logging with Non-Uniform Memory Access (NUMA)-awareness optimization by automatically grouping transactions according to the NUMA socket of the core on which the transaction is running. - - You may refer to the **NUMA Awareness Allocation and Affinity** section for more information about NUMA-aware memory access. - - When a transaction commits, a group of entries are recorded in the WAL Redo Log, as follows - - - 1. While a transaction is in progress, it is stored in the memory. The MOT engine groups transactions in buckets according to the NUMA socket of the core on which the transaction is running. This means that all the transactions running on the same socket are grouped together and that multiple groups will be filling in parallel according to the core on which the transaction is running. 
- - > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** - > - > Each thread runs on a single core/CPU which belongs to a single socket and each thread only writes to the socket of the core on which it is running. - - 2. After a transaction finishes and the client application sends a Commit command, the transaction redo log entries are serialized together with other transactions that belong to the same group. - - 3. After the configured criteria are fulfilled for a specific group of transactions (quantity of committed transactions or timeout period as describes in the **REDO LOG (MOT)** section), the transactions in this group are written to the WAL on the disk. This means that while these log entries are being written to the log, the client applications that issued the commit are waiting for a response. - - 4. As soon as all the transaction buffers in the NUMA-aware group have been written to the log, all the transactions in the group are performing the necessary changes to the memory store and the clients are notified that these transactions are complete. - - Writing transactions to the WAL is more efficient in this manner because all the buffers from the same socket are written to disk together. - - **Technical Description** - - The four colors represent 4 NUMA nodes. Thus each NUMA node has its own memory log enabling a group commit of multiple connections. - - **Figure 2** Group Commit - with NUMA-awareness - - ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-durability-concepts-7.png) - - **Summary** - - The **Group Synchronous Redo Logging** option is a an extremely safe and strict logging option because it ensures total synchronization of the client application and the WAL Redo log entries; thus ensuring total durability and consistency with absolutely no data loss. This logging option prevents the situation where a client application might mark a transaction as successful, when it has not yet been persisted to disk. - - On one hand this option has fewer disk writes than the **Synchronous Redo Logging** option, which may mean that it is faster. The downside is that transactions are locked for longer, meaning that they are locked until after all the transactions in the same NUMA memory have been written to the WAL Redo Log on the disk. - - The benefits of using this option depend on the type of transactional workload. For example, this option benefits systems that have many transactions (and less so for systems that have few transactions, because there are few disk writes anyway). - -- **Asynchronous Redo Logging** - - The **Asynchronous Redo Logging** option is the fastest logging method, However, it does not ensure no data loss, meaning that some data that is still in the buffer and was not yet written to disk may get lost upon a power failure or database crash. When a transaction is committed by a client application, the transaction redo entries are recorded in internal buffers and written to disk at preconfigured intervals. The client application does not wait for the data being written to disk. It continues to the next transaction. This is what makes asynchronous redo logging the fastest logging method. - - When a transaction is committed by a client application, the transaction redo entries are recorded in the WAL Redo Log, as follows - - - 1. While a transaction is in progress, it is stored in the MOT's memory. - 2. 
After a transaction finishes and the client application sends a Commit command, the transaction redo entries are written to internal buffers, but are not yet written to disk. Then changes to the MOT data memory take place and the client application is notified that the transaction is committed. - 3. At a preconfigured interval, a redo log thread running in the background collects all the buffered redo log entries and writes them to disk. - - **Technical Description** - - Upon transaction commit, the transaction buffer is moved (pointer assignment - not a data copy) to a centralized buffer and a new transaction buffer is allocated for the transaction. The transaction is released as soon as its buffer is moved to the centralized buffer and the transaction thread is not blocked. The actual write to the log uses the Postgres walwriter thread. When the walwriter timer elapses, it first calls the AsynchronousRedoLogHandler (via registered callback) to write its buffers and then continues with its logic and flushes the data to the XLOG. - - **Figure 3** Asynchronous Logging - - ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-durability-concepts-8.png) - - **Summary** - - The Asynchronous Redo Logging option is the fastest logging option because it does not require the client application to wait for data to be written to disk. In addition, it groups many transactions' redo entries and writes them together, thus reducing the number of disk I/Os that slow down the MOT engine. - - The downside of the Asynchronous Redo Logging option is that it does not ensure that data will not get lost upon a crash or failure. Data that was committed, but was not yet written to disk, is not durable on commit and thus cannot be recovered in case of a failure. The Asynchronous Redo Logging option is most relevant for applications that are willing to sacrifice data recovery (consistency) for performance. - - **Logging Design Details** - - The following describes the design details of each persistence-related component in the In-Memory Engine Module. - - **Figure 4** Three Logging Options - - ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-durability-concepts-9.png) - - The RedoLog component is used both by backend threads that use the In-Memory Engine and by the WAL writer in order to persist their data. Checkpoints are performed using the Checkpoint Manager, which is triggered by the Postgres checkpointer. - -- **Logging Design Overview** - - Write-Ahead Logging (WAL) is a standard method for ensuring data durability. WAL's central concept is that changes to data files (where tables and indexes reside) are only written after those changes have been logged, meaning after the log records that describe these changes have been flushed to permanent storage. - - The MOT Engine uses the existing MogDB logging facilities, enabling it also to participate in the replication process. - -- **Per-transaction Logging** - - In the In-Memory Engine, the transaction log records are stored in a transaction buffer which is part of the transaction object (TXN). The transaction buffer is logged during the calls to addToLog() - if the buffer exceeds a threshold it is then flushed and reused. When a transaction commits and passes the validation phase (OCC SILO validation - see the **Comparison - Disk vs. MOT** section) or aborts for some reason, the appropriate message is saved in the log as well in order to make it possible to determine the transaction's state during a recovery.
- - **Figure 5** Per-transaction Logging - - ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-durability-concepts-10.png) - - Parallel logging is performed by both the MOT and disk engines. However, the MOT engine enhances this design with a log-buffer per transaction, lockless preparation and a single log record. - -- **Exception Handling** - - The persistence module handles exceptions by using the Postgres error reporting infrastructure (ereport). An error message is recorded in the system log for each error condition. In addition, the error is reported to the envelope using Postgres’s built-in error reporting infrastructure. - - The following exceptions are reported by this module - - - **Table 1** Exception Handling - - | Exception Condition | Exception Code | Scenario | Resulting Outcome | - | :----------------------------------- | :----------------------------- | :----------------------------------------------------------- | :--------------------- | - | WAL write failure | ERRCODE_FDW_ERROR | Any case in which the WAL write fails | Transaction terminates | - | File IO error: write, open and so on | ERRCODE_IO_ERROR | Checkpoint - Called on any file access error | FATAL - process exits | - | Out of Memory | ERRCODE_INSUFFICIENT_RESOURCES | Checkpoint - Local memory allocation failures | FATAL - process exits | - | Logic, DB errors | ERRCODE_INTERNAL_ERROR | Checkpoint: algorithm fails or failure to retrieve table data or indexes. | FATAL - process exits | - -## MOT Checkpoint Concepts - -In MogDB, a Checkpoint is a snapshot of a point in the sequence of transactions at which it is guaranteed that the heap and index data files have been updated with all information written before the checkpoint. - -At the time of a Checkpoint, all dirty data pages are flushed to disk and a special checkpoint record is written to the log file. - -MOT data is stored directly in memory. MOT does not store its data in the same way as MogDB, so the concept of dirty pages does not exist. - -For this reason, we have researched and implemented the CALC algorithm, which is described in the paper *Low-Overhead Asynchronous Checkpointing in Main-Memory Database Systems* (SIGMOD 2016, Yale University). - -### CALC Checkpoint Algorithm - Low Overhead in Memory and Compute - -The checkpoint algorithm provides the following benefits - - -- **Reduced Memory Usage -** At most two copies of each record are stored at any time. Memory usage is minimized by storing only a single physical copy of a record while its live and stable versions are equal, or when no checkpoint is actively being recorded. -- **Low Overhead -** CALC's overhead is smaller than other asynchronous checkpointing algorithms. -- **Uses Virtual Points of Consistency -** CALC does not require quiescing of the database in order to achieve a physical point of consistency. - -### Checkpoint Activation - -MOT checkpoints are integrated into the MogDB envelope's Checkpoint mechanism. The Checkpoint process can be triggered manually by executing the **CHECKPOINT;** command or automatically according to the envelope's Checkpoint triggering settings (time/size). - -Checkpoint configuration is performed in the mot.conf file - see the **CHECKPOINT (MOT)** section.
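As noted above, the MOT checkpoint can also be exercised manually together with the envelope checkpoint, for example from any SQL session:

```
-- Triggers an envelope checkpoint; the MOT checkpoint runs as part of it.
CHECKPOINT;
```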
diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-7.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-7.md deleted file mode 100644 index 1cb8bcb94e9f27a28c384fe15562565c8595b614..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-7.md +++ /dev/null @@ -1,24 +0,0 @@ ---- -title: MOT Recovery Concepts -summary: MOT Recovery Concepts -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT Recovery Concepts - -The MOT Recovery Module provides all the required functionality for recovering MOT table data. The main objective of the Recovery module is to restore the data and the MOT engine to a consistent state after a planned shutdown (for example, for maintenance) or an unplanned crash (for example, a power failure). - -MogDB database recovery, which is also sometimes called a *Cold Start*, includes MOT tables and is performed automatically with the recovery of the rest of the database. The MOT Recovery Module is seamlessly and fully integrated into the MogDB recovery process. - -MOT recovery has two main stages - Checkpoint Recovery and WAL Recovery (Redo Log). - -MOT checkpoint recovery is performed before the envelope's recovery takes place. This is done only at cold-start events (start of a PG process). It recovers the metadata first (schema) and then inserts all the rows from the current valid checkpoint, which is done in parallel by checkpoint_recovery_workers, each working on a different table. The indexes are created during the insert process. - -When checkpointing a table, it is divided into 16MB chunks, so that multiple recovery workers can recover the table in parallel. To speed up checkpoint recovery, it is implemented as a multi-threaded procedure in which each thread is responsible for recovering a different segment. There are no dependencies between different segments; therefore, there is no contention between the threads and no need to use locks when updating tables or inserting new rows. - -WAL records are recovered as part of the envelope's WAL recovery. The MogDB envelope iterates through the XLOG and performs the necessary operation based on the xlog record type. In the case of an entry with record type MOT, the envelope forwards it to the MOT RecoveryManager for handling. The xlog entry is ignored by MOT recovery if it is 'too old', that is, if its LSN is older than the checkpoint's LSN (Log Sequence Number). - -In an active-standby deployment, the standby server is always in a Recovery state for an automatic WAL recovery process. - -The MOT recovery parameters are set in the mot.conf file, as explained in the **[MOT Recovery](../../../administrator-guide/mot-engine/2-using-mot/5-mot-administration.md#mot-recovery)** section.
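For illustration only, a recovery-related setting in mot.conf might look like the sketch below. The checkpoint_recovery_workers parameter is the one mentioned above; the section name and exact syntax are assumptions, so verify them against the **MOT Recovery** and **MOT Administration** sections before use:

```
# Illustrative mot.conf fragment - section name and syntax are assumptions.
[Recovery]
checkpoint_recovery_workers = 4   # workers recovering checkpoint segments in parallel
```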
diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-8.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-8.md deleted file mode 100644 index 661cac1f9337f39acb9c6093215b9d02c18de05e..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-8.md +++ /dev/null @@ -1,73 +0,0 @@ ---- -title: MOT Query Native Compilation (JIT) -summary: MOT Query Native Compilation (JIT) -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT Query Native Compilation (JIT) - -MOT enables you to prepare and parse *pre-compiled full queries* in a native format (using a **PREPARE** statement) before they are needed for execution. - -This native format can later be executed (using an **EXECUTE** command) more efficiently. This type of execution is much more efficient because during execution the native format bypasses multiple database processing layers. This division of labor avoids repetitive parse analysis operations. The Lite Executor module is responsible for executing **prepared** queries and has a much faster execution path than the regular generic plan performed by the envelope. This is achieved using Just-In-Time (JIT) compilation via LLVM. In addition, a similar solution with potentially comparable performance is provided in the form of pseudo-LLVM. - -The following is the **PREPARE** syntax in SQL: - -``` -PREPARE name [ ( data_type [, ...] ) ] AS statement -``` - -The following is an example of how to invoke a PREPARE and then an EXECUTE statement in a Java application - - -``` -Connection conn = DriverManager.getConnection(connectionUrl, connectionUser, connectionPassword); - -// Example 1: PREPARE without bind settings -String query = "SELECT * FROM getusers"; -PreparedStatement prepStmt1 = conn.prepareStatement(query); -ResultSet rs1 = prepStmt1.executeQuery(); -while (rs1.next()) {…} - -// Example 2: PREPARE with bind settings -String sqlStmt = "SELECT * FROM employees where first_name=? and last_name like ?"; -PreparedStatement prepStmt2 = conn.prepareStatement(sqlStmt); -prepStmt2.setString(1, "Mark"); // first name "Mark" -prepStmt2.setString(2, "%n%"); // last name contains a letter "n" -ResultSet rs2 = prepStmt2.executeQuery(); -while (rs2.next()) {…} -``` - -## Prepare - -**PREPARE** creates a prepared statement. A prepared statement is a server-side object that can be used to optimize performance. When the **PREPARE** statement is executed, the specified statement is parsed, analyzed and rewritten. - -If the tables mentioned in the query statement are MOT tables, the MOT compilation takes charge of the object preparation and performs a special optimization by compiling the query into IR byte code based on LLVM. - -Whenever a new query compilation is required, the query is analyzed and properly tailored IR byte code is generated for the query using the utility GsCodeGen object and standard LLVM JIT API (IRBuilder). After byte-code generation is completed, the code is JIT-compiled into a separate LLVM module. The compiled code results in a C function pointer that can later be invoked for direct execution. Note that this C function can be invoked concurrently by many threads, as long as each thread provides a distinct execution context (details are provided below). Each such execution context is referred to as *JIT Context*.
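The same prepare-then-execute flow can also be driven directly in SQL. The sketch below uses a hypothetical employees table; a query over MOT tables that qualifies for JIT is compiled once, at PREPARE time, and reused by every subsequent EXECUTE:

```
PREPARE get_employee (INT) AS
    SELECT first_name, last_name FROM employees WHERE id = $1;

EXECUTE get_employee(42);
```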
- -To improve performance further, MOT JIT applies a caching policy for its LLVM code results, enabling them to be reused for the same queries across different sessions. - -## Execute - -When an EXECUTE command is issued, the prepared statement (described above) is planned and executed. This division of labor avoids repetitive parse analysis work, while enabling the execution plan to depend on the specific setting values supplied. - -When the resulting execute query command reaches the database, it uses the corresponding IR byte code which is executed directly and more efficiently within the MOT engine. This is referred to as *Lite Execution*. - -In addition, for availability, the Lite Executor maintains a preallocated pool of JIT sources. Each session preallocates its own session-local pool of JIT context objects (used for repeated executions of precompiled queries). - -For more details you may refer to the Supported Queries for Lite Execution and Unsupported Queries for Lite Execution sections. - -## JIT Compilation Comparison - MogDB Disk-based vs. MOT Tables - -Currently, MogDB contains two main forms of JIT / CodeGen query optimizations for its disk-based tables - - -- Accelerating expression evaluation, such as in WHERE clauses, target lists, aggregates and projections. -- Inlining small function invocations. - -These optimizations are partial (in the sense they do not optimize the entire interpreted operator tree or replace it altogether) and are targeted mostly at CPU-bound complex queries, typically seen in OLAP use cases. The execution of queries is performed in a pull-model (Volcano-style processing) using an interpreted operator tree. When activated, the compilation is performed at each query execution. At the moment, caching of the generated LLVM code and its reuse across sessions and queries is not yet provided. - -In contrast, MOT JIT optimization provides LLVM code for entire queries that qualify for JIT optimization by MOT. The resulting code is used for direct execution over MOT tables, while the interpreted operator model is abandoned completely. The result is *practically* handwritten LLVM code that has been generated for an entire specific query execution. - -Another significant conceptual difference is that MOT LLVM code is only generated for prepared queries during the PREPARE phase of the query, rather than at query execution. This is especially important for OLTP scenarios due to the rather short runtime of OLTP queries, which cannot allow for code generation and relatively long query compilation time to be performed during each query execution. - -Finally, in PostgreSQL the activation of a PREPARE implies the reuse of the resulting plan across executions with different parameters in the same session. Similarly, the MOT JIT applies a caching policy for its LLVM code results, and extends it for reuse across different sessions. Thus, a single query may be compiled just once and its LLVM code may be reused across many sessions, which again is beneficial for OLTP scenarios. diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-9.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-9.md deleted file mode 100644 index c8cfa4760e20d748b411b87bb05548a6b1e7552f..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-9.md +++ /dev/null @@ -1,33 +0,0 @@ ---- -title: Comparison - Disk vs. MOT -summary: Comparison - Disk vs. 
MOT -author: Zhang Cuiping -date: 2021-03-04 ---- - -# Comparison - Disk vs. MOT - -The following table briefly compares the various features of the MogDB disk-based storage engine and the MOT storage engine. - -**Table 1** Comparison - Disk-based vs. MOT - -| Feature | MogDB Disk Store | MogDB MOT Engine | -| :--------------------------- | :---------------------------------- | :---------------------------------------- | -| Intel x86 + Kunpeng ARM | Yes | Yes | -| SQL and Feature-set Coverage | 100% | 98% | -| Scale-up (Many-cores, NUMA) | Low Efficiency | High Efficiency | -| Throughput | High | Extremely High | -| Latency | Low | Extremely Low | -| Distributed (Cluster Mode) | Yes | Yes | -| Isolation Levels | - RC+SI
- RR
- Serializable | - RC
- RR
- RC+SI (in V2 release) | -| Concurrency Control | Pessimistic | Optimistic | -| Data Capacity (Data + Index) | Unlimited | Limited to DRAM | -| Native Compilation | No | Yes | -| Replication, Recovery | Yes | Yes | -| Replication Options | 2 (sync, async) | 3 (sync, async, group-commit) | - -**Legend -** - -- RR = Repeatable Reads -- RC = Read Committed -- SI = Snapshot Isolation diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/concepts-of-mot.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/concepts-of-mot.md deleted file mode 100644 index 6960df5305689753939bdb594e0e9a378c2aa184..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/concepts-of-mot.md +++ /dev/null @@ -1,20 +0,0 @@ ---- -title: Concepts of MOT -summary: Concepts of MOT -author: Guo Huan -date: 2023-05-22 ---- - -# Concepts of MOT - -This chapter describes how MogDB MOT is designed and how it works. It also sheds light on its advanced features and capabilities and how to use them. This chapter serves to educate the reader about various technical details of how MOT operates, details of important MOT features and innovative differentiators. The content of this chapter may be useful for decision-making regarding MOT's suitability to specific application requirements and for using and managing it most efficiently. - -+ **[MOT Scale-up Architecture](3-1.md)** -+ **[MOT Concurrency Control Mechanism](3-2.md)** -+ **[Extended FDW and Other openGauss Features](3-3.md)** -+ **[NUMA Awareness Allocation and Affinity](3-4.md)** -+ **[MOT Indexes](3-5.md)** -+ **[MOT Durability Concepts](3-6.md)** -+ **[MOT Recovery Concepts](3-7.md)** -+ **[MOT Query Native Compilation (JIT)](3-8.md)** -+ **[Comparison – Disk vs. MOT](3-9.md)** \ No newline at end of file diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/1-references.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/1-references.md deleted file mode 100644 index 5354e91374b4ee65554416c0dc04c990d40eccb5..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/1-references.md +++ /dev/null @@ -1,36 +0,0 @@ ---- -title: References -summary: References -author: Zhang Cuiping -date: 2021-05-18 ---- - -# References - -[1] Y. Mao, E. Kohler, and R. T. Morris. Cache craftiness for fast multicore key-value storage. In Proc. 7th ACM European Conference on Computer Systems (EuroSys), Apr. 2012. - -[2] K. Ren, T. Diamond, D. J. Abadi, and A. Thomson. Low-overhead asynchronous checkpointing in main-memory database systems. In Proceedings of the 2016 ACM SIGMOD International Conference on Management of Data, 2016. - -[5] Tu, S., Zheng, W., Kohler, E., Liskov, B., and Madden, S. Speedy transactions in multicore in-memory databases. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles (New York, NY, USA, 2013), SOSP ’13, ACM, pp. 18-32. - -[6] H. Avni at al. Industrial-Strength OLTP Using Main Memory and Many-cores, VLDB 2020. - -[7] Bernstein, P. A., and Goodman, N. Concurrency control in distributed database systems. ACM Comput. Surv. 13, 2 (1981), 185-221. - -[8] Felber, P., Fetzer, C., and Riegel, T. Dynamic performance tuning of word-based software transactional memory. 
In Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2008, Salt Lake City, UT, USA, February 20-23, 2008 (2008), - -pp. 237-246. - -[9] Appuswamy, R., Anadiotis, A., Porobic, D., Iman, M., and Ailamaki, A. Analyzing the impact of system architecture on the scalability of OLTP engines for high-contention workloads. PVLDB 11, 2 (2017), - -121-134. - -[10] R. Sherkat, C. Florendo, M. Andrei, R. Blanco, A. Dragusanu, A. Pathak, P. Khadilkar, N. Kulkarni, C. Lemke, S. Seifert, S. Iyer, S. Gottapu, R. Schulze, C. Gottipati, N. Basak, Y. Wang, V. Kandiyanallur, S. Pendap, D. Gala, R. Almeida, and P. Ghosh. Native store extension for SAP HANA. PVLDB, 12(12): - -2047-2058, 2019. - -[11] X. Yu, A. Pavlo, D. Sanchez, and S. Devadas. Tictoc: Time traveling optimistic concurrency control. In Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 26 - July 01, 2016, pages 1629-1642, 2016. - -[12] V. Leis, A. Kemper, and T. Neumann. The adaptive radix tree: Artful indexing for main-memory databases. In C. S. Jensen, C. M. Jermaine, and X. Zhou, editors, 29th IEEE International Conference on Data Engineering, ICDE 2013, Brisbane, Australia, April 8-12, 2013, pages 38-49. IEEE Computer Society, 2013. - -[13] S. K. Cha, S. Hwang, K. Kim, and K. Kwon. Cache-conscious concurrency control of main-memory indexes on shared-memory multiprocessor systems. In P. M. G. Apers, P. Atzeni, S. Ceri, S. Paraboschi, K. Ramamohanarao, and R. T. Snodgrass, editors, VLDB 2001, Proceedings of 27th International Conference on Very Large Data Bases, September 11-14, 2001, Roma, Italy, pages 181-190. Morga Kaufmann, 2001. diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/2-glossary.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/2-glossary.md deleted file mode 100644 index f2d1576038d266439c9d9238fcf119133a52ce34..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/2-glossary.md +++ /dev/null @@ -1,59 +0,0 @@ ---- -title: Glossary -summary: Glossary -author: Zhang Cuiping -date: 2021-05-18 ---- - -# Glossary - -| Acronym | Definition/Description | -| :------ | :----------------------------------------------------------- | -| 2PL | 2-Phase Locking | -| ACID | Atomicity, Consistency, Isolation, Durability | -| AP | Analytical Processing | -| ARM | Advanced RISC Machine, a hardware architecture alternative to x86 | -| CC | Concurrency Control | -| CPU | Central Processing Unit | -| DB | Database | -| DBA | Database Administrator | -| DBMS | Database Management System | -| DDL | Data Definition Language. 
Database Schema management language | -| DML | Data Modification Language | -| ETL | Extract, Transform, Load or Encounter Time Locking | -| FDW | Foreign Data Wrapper | -| GC | Garbage Collector | -| HA | High Availability | -| HTAP | Hybrid Transactional-Analytical Processing | -| IoT | Internet of Things | -| IM | In-Memory | -| IMDB | In-Memory Database | -| IR | Intermediate Representation of a source code, used in compilation and optimization | -| JIT | Just In Time | -| JSON | JavaScript Object Notation | -| KV | Key Value | -| LLVM | Low-Level Virtual Machine, refers to a compilation code or queries to IR | -| M2M | Machine-to-Machine | -| ML | Machine Learning | -| MM | Main Memory | -| MO | Memory Optimized | -| MOT | Memory Optimized Tables storage engine (SE), pronounced as /em/ /oh/ /tee/ | -| MVCC | Multi-Version Concurrency Control | -| NUMA | Non-Uniform Memory Access | -| OCC | Optimistic Concurrency Control | -| OLTP | Online Transaction Processing | -| PG | PostgreSQL | -| RAW | Reads-After-Writes | -| RC | Return Code | -| RTO | Recovery Time Objective | -| SE | Storage Engine | -| SQL | Structured Query Language | -| TCO | Total Cost of Ownership | -| TP | Transactional Processing | -| TPC-C | An On-Line Transaction Processing Benchmark | -| Tpm-C | Transactions-per-minute-C. A performance metric for TPC-C benchmark that counts new-order transactions. | -| TVM | Tiny Virtual Machine | -| TSO | Time Sharing Option | -| UDT | User-Defined Type | -| WAL | Write Ahead Log | -| XLOG | A PostgreSQL implementation of transaction logging (WAL - described above) | diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/mot-appendix.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/mot-appendix.md deleted file mode 100644 index b12952408d583c29861663e09e6db99a1d921a93..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/mot-appendix.md +++ /dev/null @@ -1,11 +0,0 @@ ---- -title: Appendix -summary: Appendix -author: Guo Huan -date: 2023-05-22 ---- - -# Appendix - -+ **[References](1-references.md)** -+ **[Glossary](2-glossary.md)** \ No newline at end of file diff --git a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/mot-engine.md b/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/mot-engine.md deleted file mode 100644 index 7f56d056406ba671e0385602b52444d41162d798..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/administrator-guide/mot-engine/mot-engine.md +++ /dev/null @@ -1,13 +0,0 @@ ---- -title: MOT -summary: MOT -author: Guo Huan -date: 2023-05-22 ---- - -# MOT - -- **[Introducing MOT](1-introducing-mot/introducing-mot.md)** -- **[Using MOT](2-using-mot/using-mot.md)** -- **[Concepts of MOT](3-concepts-of-mot/concepts-of-mot.md)** -- **[Appendix](4-appendix/mot-appendix.md)** \ No newline at end of file diff --git a/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/guc-parameter-list.md b/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/guc-parameter-list.md index 760442c5562b297fd1f148a6c58b327ffb9d9a1d..a3fb1fd85b7671afb8198f829dc41643a5c276e4 100644 --- a/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/guc-parameter-list.md +++ b/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/guc-parameter-list.md @@ -104,7 +104,6 @@ date: 2022-05-26 | [client_min_messages](../../reference-guide/guc-parameters/error-reporting-and-logging/logging-time.md#client_min_messages) | | 
[cn_send_buffer_size](fault-tolerance.md#cn_send_buffer_size) | | [codegen_cost_threshold](../../reference-guide/guc-parameters/query-planning/other-optimizer-options.md#codegen_cost_threshold) | -| [codegen_mot_limit](mot.md#codegen_mot_limit) | | [codegen_strategy](../../reference-guide/guc-parameters/query-planning/other-optimizer-options.md#codegen_strategy) | | [comm_proxy_attr](../../reference-guide/guc-parameters/connection-and-authentication/communication-library-parameters.md#comm_proxy_attr) | | [commit_delay](../../reference-guide/guc-parameters/write-ahead-log/settings.md#commit_delay) | @@ -212,8 +211,6 @@ date: 2022-05-26 | [enable_cbm_tracking](backup-and-restoration-parameter.md#enable_cbm_tracking) | | [enable_change_hjcost](../../reference-guide/guc-parameters/query-planning/optimizer-method-configuration.md#enable_change_hjcost) | | [enable_codegen](../../reference-guide/guc-parameters/query-planning/other-optimizer-options.md#enable_codegen) | -| [enable_codegen_mot](mot.md#enable_codegen_mot) | -| [enable_codegen_mot_print](mot.md#enable_codegen_mot_print) | | [enable_codegen_print](../../reference-guide/guc-parameters/query-planning/other-optimizer-options.md#enable_codegen_print) | | [enable_compress_spill](developer-options.md#enable_compress_spill) | | [enable_consider_usecount](../../reference-guide/guc-parameters/resource-consumption/background-writer.md#enable_consider_usecount) | @@ -308,7 +305,6 @@ date: 2022-05-26 | [fault_mon_timeout](lock-management.md#fault_mon_timeout) | | [FencedUDFMemoryLimit](guc-user-defined-functions.md#fencedudfmemorylimit) | | [force_bitmapand](../../reference-guide/guc-parameters/query-planning/optimizer-method-configuration.md#force_bitmapand) | -| [force_pseudo_codegen_mot](mot.md#force_pseudo_codegen_mot) | | [force_promote](../../reference-guide/guc-parameters/write-ahead-log/settings.md#force_promote) | | [from_collapse_limit](../../reference-guide/guc-parameters/query-planning/other-optimizer-options.md#from_collapse_limit) | | [fsync](../../reference-guide/guc-parameters/write-ahead-log/settings.md#fsync) | @@ -441,8 +437,6 @@ date: 2022-05-26 | [minimum_pool_size](connection-pool-parameters.md#minimum_pool_size) | | [modify_initial_password](../../reference-guide/guc-parameters/connection-and-authentication/security-and-authentication.md#modify_initial_password) | | [most_available_sync](../../reference-guide/guc-parameters/ha-replication/primary-server.md#most_available_sync) | -| [mot_allow_index_on_nullable_column](mot.md#mot_allow_index_on_nullable_column) | -| [mot_config_file](mot.md#mot_config_file) | | [ngram_gram_size](../../reference-guide/guc-parameters/query-planning/other-optimizer-options.md#ngram_gram_size) | | [ngram_grapsymbol_ignore](../../reference-guide/guc-parameters/query-planning/other-optimizer-options.md#ngram_grapsymbol_ignore) | | [ngram_punctuation_ignore](../../reference-guide/guc-parameters/query-planning/other-optimizer-options.md#ngram_punctuation_ignore) | diff --git a/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/mot.md b/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/mot.md deleted file mode 100644 index 234c505c0ddf7890f1e63611980ff183743af77d..0000000000000000000000000000000000000000 --- a/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/mot.md +++ /dev/null @@ -1,73 +0,0 @@ ---- -title: Memory Table -summary: Memory Table -author: Zhang Cuiping -date: 2021-04-20 ---- - -# Memory Table - -This section describes the parameters in the memory table. 
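All of the parameters below are POSTMASTER parameters, so changes take effect only after a restart. Their current values can be checked from any SQL session, for example:

```
-- Check the current values of the MOT-related parameters described below.
SHOW enable_codegen_mot;
SHOW mot_config_file;
```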
- -## enable_codegen_mot - -**Parameter description**: Specifies whether to enable the native LLVM Lite to perform simple queries. If native LLVM is not supported on the current platform, pseudo LLVM will be used. - -This parameter is a POSTMASTER parameter. Set it based on instructions provided in Table 1 [GUC parameters](appendix.md). - -**Value range**: Boolean - -**Default value**: **true** - -## force_pseudo_codegen_mot - -**Parameter description**: Specifies whether to force pseudo LLVM Lite to perform simple queries, even if the current platform supports native LLVM. - -This parameter is a POSTMASTER parameter. Set it based on instructions provided in Table 1 [GUC parameters](appendix.md). - -**Value range**: Boolean - -**Default value**: **true** - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** -> Even if **force_pseudo_codegen_mot** is set to **true**, the current platform does not support the native LLVM. In this case, the pseudo LLVM is still used. - -## enable_codegen_mot_print - -**Parameter description**: Specifies whether to print the IR byte code of the generation function. (If pseudo LLVM is used, the pseudo IR byte code is printed.) - -This parameter is a POSTMASTER parameter. Set it based on instructions provided in Table 1 [GUC parameters](appendix.md). - -**Value range**: Boolean - -**Default value**: **true** - -## codegen_mot_limit - -**Parameter description**: Specifies the maximum number of global cache plan sources and the clone plan of each session. - -This parameter is a POSTMASTER parameter. Set it based on instructions provided in Table 1 [GUC parameters](appendix.md). - -**Value range:** uint32 - -**Default value**: **100** - -## mot_allow_index_on_nullable_column - -**Parameter description**: Specifies whether indexes can be created on the **nullable** column in the memory table. - -This parameter is a POSTMASTER parameter. Set it based on instructions provided in Table 1 [GUC parameters](appendix.md). - -**Value range**: Boolean - -**Default value**: **true** - -## mot_config_file - -**Parameter description**: Specifies the main configuration file of the MOT. - -This parameter is a POSTMASTER parameter. Set it based on instructions provided in Table 1 [GUC parameters](appendix.md). - -**Value range**: a string - -**Default value**: **NULL** diff --git a/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/query-planning/other-optimizer-options.md b/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/query-planning/other-optimizer-options.md index ab1421ee35b60018a66e9e6b20d83270c4994c47..e204455812c8c36e6249f6c59500b5455ab8fe5e 100644 --- a/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/query-planning/other-optimizer-options.md +++ b/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/query-planning/other-optimizer-options.md @@ -152,7 +152,7 @@ This parameter is a USERSET parameter. Set it based on instructions provided in - **intargetlist**: Uses the In Target List query rewriting rules (subquery optimization in the target column). - **predpushnormal**: Use the Predicate Push query rewriting rule (push the predicate condition to the subquery). - **predpushforce**: Uses the Predicate Push query rewriting rules. Push down predicate conditions to subqueries and use indexes as much as possible for acceleration. -- **predpush**: Selects the optimal plan based on the cost in **predpushnormal** and **predpushforce**. 
+- **predpush**: Selects the optimal plan based on the cost in **predpushnormal** and **predpushforce**. **Note**: The rewriting rule for the **predpush** can in rare scenarios result in failure to generate a legal plan, and full testing is recommended before enabling the parameter. - **reduce_orderby**:Uses the reduce orderby query rewriting rule (remove unnecessary sorting in subquery). **Default value**: **magicset, reduce_orderby** @@ -255,7 +255,7 @@ The restrictions on simple query are as follows: - **on**: enabled. - **off**: disabled. -**Default value**: **on** +**Default value**: **on** (when using PTK to install MogDB, PTK will optimize this parameter and the default value is **off** after optimization) ## enable_partition_opfusion diff --git a/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/reference-guide-guc-parameters.md b/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/reference-guide-guc-parameters.md index 4d0720996e70adc5db3486918ff314ba82a56a79..562fffe14141b0e0bd1b1533b44c1c44ffaab518 100644 --- a/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/reference-guide-guc-parameters.md +++ b/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/reference-guide-guc-parameters.md @@ -14,7 +14,6 @@ date: 2023-04-07 - **[Resource Consumption](./resource-consumption/resource-consumption.md)** - **[Write Ahead Log](./write-ahead-log/write-ahead-log.md)** - **[HA Replication](./ha-replication/ha-replication.md)** -- **[Memory Table](mot.md)** - **[Query Planning](./query-planning/query-planning.md)** - **[Error Reporting and Logging](./error-reporting-and-logging/error-reporting-and-logging.md)** - **[Alarm Detection](alarm-detection.md)** diff --git a/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/write-ahead-log/log-replay.md b/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/write-ahead-log/log-replay.md index a12acaa6cd7d1c5f1ad1495a62161a47e86e0b04..3e31ff20771204ac39fc515827c6e834a14f650d 100644 --- a/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/write-ahead-log/log-replay.md +++ b/product/en/docs-mogdb/v5.0/reference-guide/guc-parameters/write-ahead-log/log-replay.md @@ -21,7 +21,7 @@ This parameter is a SIGHUP parameter. Set it based on instructions provided in T **Parameter description**: Specifies whether to count information required by **redo_time_detail()**. -This parameter is a SIGHUP parameter. Set it based on instructions provided in Table 1 [GUC parameters](../../../reference-guide/guc-parameters/appendix.md). +This parameter is a POSTMASTER parameter. Set it based on instructions provided in Table 1 [GUC parameters](../../../reference-guide/guc-parameters/appendix.md). 
**Value range**: boolean, true/false diff --git a/product/en/docs-mogdb/v5.0/toc.md b/product/en/docs-mogdb/v5.0/toc.md index aebb2b871055ec4fe841aca4c10fdcf1c167d7fb..dead7552ab0e3d2360a2cae1ade7cc04aa7de1db 100644 --- a/product/en/docs-mogdb/v5.0/toc.md +++ b/product/en/docs-mogdb/v5.0/toc.md @@ -229,33 +229,6 @@ + [Slow SQL Diagnosis](/administrator-guide/routine-maintenance/slow-sql-diagnosis.md) + [Log Reference](/administrator-guide/routine-maintenance/11-log-reference.md) + [Primary and Standby Management](/administrator-guide/primary-and-standby-management.md) - + [MOT Engine](/administrator-guide/mot-engine/mot-engine.md) - + [Introducing MOT](/administrator-guide/mot-engine/1-introducing-mot/introducing-mot.md) - + [MOT Introduction](/administrator-guide/mot-engine/1-introducing-mot/1-mot-introduction.md) - + [MOT Features and Benefits](/administrator-guide/mot-engine/1-introducing-mot/2-mot-features-and-benefits.md) - + [MOT Key Technologies](/administrator-guide/mot-engine/1-introducing-mot/3-mot-key-technologies.md) - + [MOT Usage Scenarios](/administrator-guide/mot-engine/1-introducing-mot/4-mot-usage-scenarios.md) - + [MOT Performance Benchmarks](/administrator-guide/mot-engine/1-introducing-mot/5-mot-performance-benchmarks.md) - + [Using MOT](/administrator-guide/mot-engine/2-using-mot/using-mot.md) - + [Using MOT Overview](/administrator-guide/mot-engine/2-using-mot/1-using-mot-overview.md) - + [MOT Preparation](/administrator-guide/mot-engine/2-using-mot/2-mot-preparation.md) - + [MOT Deployment](/administrator-guide/mot-engine/2-using-mot/3-mot-deployment.md) - + [MOT Usage](/administrator-guide/mot-engine/2-using-mot/4-mot-usage.md) - + [MOT Administration](/administrator-guide/mot-engine/2-using-mot/5-mot-administration.md) - + [MOT Sample TPC-C Benchmark](/administrator-guide/mot-engine/2-using-mot/6-mot-sample-tpcc-benchmark.md) - + [Concepts of MOT](/administrator-guide/mot-engine/3-concepts-of-mot/concepts-of-mot.md) - + [MOT Scale-up Architecture](/administrator-guide/mot-engine/3-concepts-of-mot/3-1.md) - + [MOT Concurrency Control Mechanism](/administrator-guide/mot-engine/3-concepts-of-mot/3-2.md) - + [Extended FDW and Other MogDB Features](/administrator-guide/mot-engine/3-concepts-of-mot/3-3.md) - + [NUMA Awareness Allocation and Affinity](/administrator-guide/mot-engine/3-concepts-of-mot/3-4.md) - + [MOT Indexes](/administrator-guide/mot-engine/3-concepts-of-mot/3-5.md) - + [MOT Durability Concepts](/administrator-guide/mot-engine/3-concepts-of-mot/3-6.md) - + [MOT Recovery Concepts](/administrator-guide/mot-engine/3-concepts-of-mot/3-7.md) - + [MOT Query Native Compilation (JIT)](/administrator-guide/mot-engine/3-concepts-of-mot/3-8.md) - + [Comparison - Disk vs. 
MOT](/administrator-guide/mot-engine/3-concepts-of-mot/3-9.md) - + [Appendix](/administrator-guide/mot-engine/4-appendix/mot-appendix.md) - + [References](/administrator-guide/mot-engine/4-appendix/1-references.md) - + [Glossary](/administrator-guide/mot-engine/4-appendix/2-glossary.md) + [Column-store Tables Management](/administrator-guide/column-store-tables-management.md) + [Backup and Restoration](/administrator-guide/backup-and-restoration/backup-and-restoration.md) + [Overview](/administrator-guide/backup-and-restoration/backup-and-restoration-overview.md) @@ -1384,7 +1357,6 @@ + [Sending Server](./reference-guide/guc-parameters/ha-replication/sending-server.md) + [Primary Server](./reference-guide/guc-parameters/ha-replication/primary-server.md) + [Standby Server](./reference-guide/guc-parameters/ha-replication/standby-server.md) - + [Memory Table](./reference-guide/guc-parameters/mot.md) + [Query Planning](./reference-guide/guc-parameters/query-planning/query-planning.md) + [Optimizer Method Configuration](./reference-guide/guc-parameters/query-planning/optimizer-method-configuration.md) + [Optimizer Cost Constants](./reference-guide/guc-parameters/query-planning/optimizer-cost-constants.md) diff --git a/product/en/docs-mogdb/v5.0/toc_manage.md b/product/en/docs-mogdb/v5.0/toc_manage.md index 45b3e6bd12495d20e701212c99b317f5c6ae24c7..2319e36aa32c030d1bf5305a99687f1727bc10f8 100644 --- a/product/en/docs-mogdb/v5.0/toc_manage.md +++ b/product/en/docs-mogdb/v5.0/toc_manage.md @@ -25,33 +25,6 @@ + [Slow SQL Diagnosis](/administrator-guide/routine-maintenance/slow-sql-diagnosis.md) + [Log Reference](/administrator-guide/routine-maintenance/11-log-reference.md) + [Primary and Standby Management](/administrator-guide/primary-and-standby-management.md) -+ [MOT Engine](/administrator-guide/mot-engine/mot-engine.md) - + [Introducing MOT](/administrator-guide/mot-engine/1-introducing-mot/introducing-mot.md) - + [MOT Introduction](/administrator-guide/mot-engine/1-introducing-mot/1-mot-introduction.md) - + [MOT Features and Benefits](/administrator-guide/mot-engine/1-introducing-mot/2-mot-features-and-benefits.md) - + [MOT Key Technologies](/administrator-guide/mot-engine/1-introducing-mot/3-mot-key-technologies.md) - + [MOT Usage Scenarios](/administrator-guide/mot-engine/1-introducing-mot/4-mot-usage-scenarios.md) - + [MOT Performance Benchmarks](/administrator-guide/mot-engine/1-introducing-mot/5-mot-performance-benchmarks.md) - + [Using MOT](/administrator-guide/mot-engine/2-using-mot/using-mot.md) - + [Using MOT Overview](/administrator-guide/mot-engine/2-using-mot/1-using-mot-overview.md) - + [MOT Preparation](/administrator-guide/mot-engine/2-using-mot/2-mot-preparation.md) - + [MOT Deployment](/administrator-guide/mot-engine/2-using-mot/3-mot-deployment.md) - + [MOT Usage](/administrator-guide/mot-engine/2-using-mot/4-mot-usage.md) - + [MOT Administration](/administrator-guide/mot-engine/2-using-mot/5-mot-administration.md) - + [MOT Sample TPC-C Benchmark](/administrator-guide/mot-engine/2-using-mot/6-mot-sample-tpcc-benchmark.md) - + [Concepts of MOT](/administrator-guide/mot-engine/3-concepts-of-mot/concepts-of-mot.md) - + [MOT Scale-up Architecture](/administrator-guide/mot-engine/3-concepts-of-mot/3-1.md) - + [MOT Concurrency Control Mechanism](/administrator-guide/mot-engine/3-concepts-of-mot/3-2.md) - + [Extended FDW and Other MogDB Features](/administrator-guide/mot-engine/3-concepts-of-mot/3-3.md) - + [NUMA Awareness Allocation and 
Affinity](/administrator-guide/mot-engine/3-concepts-of-mot/3-4.md) - + [MOT Indexes](/administrator-guide/mot-engine/3-concepts-of-mot/3-5.md) - + [MOT Durability Concepts](/administrator-guide/mot-engine/3-concepts-of-mot/3-6.md) - + [MOT Recovery Concepts](/administrator-guide/mot-engine/3-concepts-of-mot/3-7.md) - + [MOT Query Native Compilation (JIT)](/administrator-guide/mot-engine/3-concepts-of-mot/3-8.md) - + [Comparison - Disk vs. MOT](/administrator-guide/mot-engine/3-concepts-of-mot/3-9.md) - + [Appendix](/administrator-guide/mot-engine/4-appendix/mot-appendix.md) - + [References](/administrator-guide/mot-engine/4-appendix/1-references.md) - + [Glossary](/administrator-guide/mot-engine/4-appendix/2-glossary.md) + [Column-store Tables Management](/administrator-guide/column-store-tables-management.md) + [Backup and Restoration](/administrator-guide/backup-and-restoration/backup-and-restoration.md) + [Overview](/administrator-guide/backup-and-restoration/backup-and-restoration-overview.md) diff --git a/product/en/docs-mogdb/v5.0/toc_parameters-and-tools.md b/product/en/docs-mogdb/v5.0/toc_parameters-and-tools.md index 4506d9c41f8133dcdb8b41d9eeb7547f83ef08a6..e3af3e6bbfedb3b824de12cd35ebdd0e6ae3f698 100644 --- a/product/en/docs-mogdb/v5.0/toc_parameters-and-tools.md +++ b/product/en/docs-mogdb/v5.0/toc_parameters-and-tools.md @@ -29,7 +29,6 @@ + [Sending Server](./reference-guide/guc-parameters/ha-replication/sending-server.md) + [Primary Server](./reference-guide/guc-parameters/ha-replication/primary-server.md) + [Standby Server](./reference-guide/guc-parameters/ha-replication/standby-server.md) - + [Memory Table](./reference-guide/guc-parameters/mot.md) + [Query Planning](./reference-guide/guc-parameters/query-planning/query-planning.md) + [Optimizer Method Configuration](./reference-guide/guc-parameters/query-planning/optimizer-method-configuration.md) + [Optimizer Cost Constants](./reference-guide/guc-parameters/query-planning/optimizer-cost-constants.md) diff --git a/product/en/docs-mogdb/v5.2/reference-guide/guc-parameters/query-planning/other-optimizer-options.md b/product/en/docs-mogdb/v5.2/reference-guide/guc-parameters/query-planning/other-optimizer-options.md index ab1421ee35b60018a66e9e6b20d83270c4994c47..acd61b4da4a3b62c3068c36b5f144410700e329b 100644 --- a/product/en/docs-mogdb/v5.2/reference-guide/guc-parameters/query-planning/other-optimizer-options.md +++ b/product/en/docs-mogdb/v5.2/reference-guide/guc-parameters/query-planning/other-optimizer-options.md @@ -255,7 +255,7 @@ The restrictions on simple query are as follows: - **on**: enabled. - **off**: disabled. -**Default value**: **on** +**Default value**: **on** (when MogDB is installed with PTK, PTK tunes this parameter and sets its default value to **off**) ## enable_partition_opfusion diff --git a/product/en/docs-mogdb/v5.2/reference-guide/guc-parameters/write-ahead-log/log-replay.md b/product/en/docs-mogdb/v5.2/reference-guide/guc-parameters/write-ahead-log/log-replay.md index a12acaa6cd7d1c5f1ad1495a62161a47e86e0b04..3e31ff20771204ac39fc515827c6e834a14f650d 100644 --- a/product/en/docs-mogdb/v5.2/reference-guide/guc-parameters/write-ahead-log/log-replay.md +++ b/product/en/docs-mogdb/v5.2/reference-guide/guc-parameters/write-ahead-log/log-replay.md @@ -21,7 +21,7 @@ This parameter is a SIGHUP parameter. Set it based on instructions provided in T **Parameter description**: Specifies whether to count information required by **redo_time_detail()**. -This parameter is a SIGHUP parameter.
Set it based on instructions provided in Table 1 [GUC parameters](../../../reference-guide/guc-parameters/appendix.md). +This parameter is a POSTMASTER parameter. Set it based on instructions provided in Table 1 [GUC parameters](../../../reference-guide/guc-parameters/appendix.md). **Value range**: boolean, true/false diff --git a/product/en/docs-mogdb/v6.0/reference-guide/guc-parameters/query-planning/other-optimizer-options.md b/product/en/docs-mogdb/v6.0/reference-guide/guc-parameters/query-planning/other-optimizer-options.md index ab1421ee35b60018a66e9e6b20d83270c4994c47..e204455812c8c36e6249f6c59500b5455ab8fe5e 100644 --- a/product/en/docs-mogdb/v6.0/reference-guide/guc-parameters/query-planning/other-optimizer-options.md +++ b/product/en/docs-mogdb/v6.0/reference-guide/guc-parameters/query-planning/other-optimizer-options.md @@ -152,7 +152,7 @@ This parameter is a USERSET parameter. Set it based on instructions provided in - **intargetlist**: Uses the In Target List query rewriting rules (subquery optimization in the target column). - **predpushnormal**: Use the Predicate Push query rewriting rule (push the predicate condition to the subquery). - **predpushforce**: Uses the Predicate Push query rewriting rules. Push down predicate conditions to subqueries and use indexes as much as possible for acceleration. -- **predpush**: Selects the optimal plan based on the cost in **predpushnormal** and **predpushforce**. +- **predpush**: Selects the optimal plan based on the cost in **predpushnormal** and **predpushforce**. **Note**: In rare scenarios, the **predpush** rewriting rules may fail to generate a valid plan; thorough testing is recommended before enabling this option. - **reduce_orderby**:Uses the reduce orderby query rewriting rule (remove unnecessary sorting in subquery). **Default value**: **magicset, reduce_orderby** @@ -255,7 +255,7 @@ The restrictions on simple query are as follows: - **on**: enabled. - **off**: disabled. -**Default value**: **on** +**Default value**: **on** (when MogDB is installed with PTK, PTK tunes this parameter and sets its default value to **off**) ## enable_partition_opfusion diff --git a/product/en/docs-mogdb/v6.0/reference-guide/guc-parameters/write-ahead-log/log-replay.md b/product/en/docs-mogdb/v6.0/reference-guide/guc-parameters/write-ahead-log/log-replay.md index a12acaa6cd7d1c5f1ad1495a62161a47e86e0b04..3e31ff20771204ac39fc515827c6e834a14f650d 100644 --- a/product/en/docs-mogdb/v6.0/reference-guide/guc-parameters/write-ahead-log/log-replay.md +++ b/product/en/docs-mogdb/v6.0/reference-guide/guc-parameters/write-ahead-log/log-replay.md @@ -21,7 +21,7 @@ This parameter is a SIGHUP parameter. Set it based on instructions provided in T **Parameter description**: Specifies whether to count information required by **redo_time_detail()**. -This parameter is a SIGHUP parameter. Set it based on instructions provided in Table 1 [GUC parameters](../../../reference-guide/guc-parameters/appendix.md). +This parameter is a POSTMASTER parameter. Set it based on instructions provided in Table 1 [GUC parameters](../../../reference-guide/guc-parameters/appendix.md).
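A minimal sketch of how the **rewrite_rule** options in the hunk above might be exercised at session level (the parameter is USERSET); the table and column names are hypothetical, and the final SET restores the documented v6.0 default:

```sql
-- rewrite_rule is a USERSET parameter, so it can be changed for the current session.
-- The tables t_orders and t_hot_items are hypothetical and used only for illustration.
SET rewrite_rule = 'magicset,predpushnormal';

-- Inspect whether the outer predicate is pushed down into the subquery.
EXPLAIN
SELECT o.order_id, o.amount
FROM   t_orders o
WHERE  o.item_id IN (SELECT h.item_id FROM t_hot_items h WHERE h.on_sale = 1);

-- Restore the documented v6.0 default before continuing.
SET rewrite_rule = 'magicset,reduce_orderby';
```

Given the note on **predpush**, plans produced with these rules are best verified with EXPLAIN on a test system before the setting is applied more broadly.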
**Value range**: boolean, true/false diff --git a/product/zh/docs-mogdb/v3.0/reference-guide/guc-parameters/9-query-planning/4-other-optimizer-options.md b/product/zh/docs-mogdb/v3.0/reference-guide/guc-parameters/9-query-planning/4-other-optimizer-options.md index 8b95eb36f88b386f7e60d936f65844c115d569a0..c103ffa1eaaa737583f65187377aa8561f77fdc4 100644 --- a/product/zh/docs-mogdb/v3.0/reference-guide/guc-parameters/9-query-planning/4-other-optimizer-options.md +++ b/product/zh/docs-mogdb/v3.0/reference-guide/guc-parameters/9-query-planning/4-other-optimizer-options.md @@ -104,7 +104,7 @@ set rewrite_rule=none; --关闭所有可选查询重写规则 - intargetlist:使用In Target List查询重写规则(提升目标列中的子查询)。 - predpushnormal:使用Predicate Push查询重写规则(下推谓词条件到子查询中)。 - predpushforce:使用Predicate Push查询重写规则(下推谓词条件到子查询中,尽可能的利用索引加速)。 -- predpush:在predpushnormal和predpushforce中根据代价选择最优计划。 +- predpush:在predpushnormal和predpushforce中根据代价选择最优计划。**注**:predpush类的重写规则在极个别场景会导致无法生成合法的计划,在启用参数前建议进行充分测试。 **默认值**: magicset diff --git a/product/zh/docs-mogdb/v3.1/reference-guide/guc-parameters/9-query-planning/4-other-optimizer-options.md b/product/zh/docs-mogdb/v3.1/reference-guide/guc-parameters/9-query-planning/4-other-optimizer-options.md index 6565477f58ccc27b95af8d827bf297effa029520..77244479d7418dcb042c275d14f140268d58324a 100644 --- a/product/zh/docs-mogdb/v3.1/reference-guide/guc-parameters/9-query-planning/4-other-optimizer-options.md +++ b/product/zh/docs-mogdb/v3.1/reference-guide/guc-parameters/9-query-planning/4-other-optimizer-options.md @@ -156,7 +156,7 @@ set rewrite_rule=none; --关闭所有可选查询重写规则 - intargetlist:使用In Target List查询重写规则(提升目标列中的子查询)。 - predpushnormal:使用Predicate Push查询重写规则(下推谓词条件到子查询中)。 - predpushforce:使用Predicate Push查询重写规则(下推谓词条件到子查询中,尽可能的利用索引加速)。 -- predpush:在predpushnormal和predpushforce中根据代价选择最优计划。 +- predpush:在predpushnormal和predpushforce中根据代价选择最优计划。**注**:predpush类的重写规则在极个别场景会导致无法生成合法的计划,在启用参数前建议进行充分测试。 **默认值**: magicset diff --git a/product/zh/docs-mogdb/v5.0/about-mogdb/MogDB-compared-to-openGauss.md b/product/zh/docs-mogdb/v5.0/about-mogdb/MogDB-compared-to-openGauss.md index 8f84f3105c9c455ca7ebd50599e8af5516e344ea..2f3428c858533be962add66b40b170c81aa238f3 100644 --- a/product/zh/docs-mogdb/v5.0/about-mogdb/MogDB-compared-to-openGauss.md +++ b/product/zh/docs-mogdb/v5.0/about-mogdb/MogDB-compared-to-openGauss.md @@ -85,10 +85,6 @@ openGauss是一个单机数据库,具备关系型数据库的基本功能, - 通过将密钥掌握在用户自己手上,实现公有云、消费者云以及开发用户的用户信任问题; - 让云数据库借助全密态能力更好的遵守个人隐私保护方面的法律法规。 -- 内存表 - - 内存表把数据全部缓存在内存中,所有数据访问实现免锁并发,实现数据处理的极致性能,满足实时性严苛要求场景。 - - 主备双机 主备双机支持同步和异步复制,应用可以根据业务场景选择合适的部署方式。同步复制保证数据的高可靠,一般需要一主两备部署,同时对性能有一定影响。异步复制一主一备部署即可,对性能影响小,但异常时可能存在数据丢失。openGauss支持页面损坏的自动修复,在主机页面发生损坏时,能够自动从备机修复损坏页面。openGauss支持备机并行日志恢复,尽量降低主机故障时业务不可用的时间。 diff --git a/product/zh/docs-mogdb/v5.0/about-mogdb/mogdb-new-feature/5.0.6.md b/product/zh/docs-mogdb/v5.0/about-mogdb/mogdb-new-feature/5.0.6.md index 80c595f7dfa2c3512db88557bb5bd3cc9e82ab2c..6650d4a9c83191304dc5818d7728739ebd30d19d 100644 --- a/product/zh/docs-mogdb/v5.0/about-mogdb/mogdb-new-feature/5.0.6.md +++ b/product/zh/docs-mogdb/v5.0/about-mogdb/mogdb-new-feature/5.0.6.md @@ -21,7 +21,7 @@ Append-Update模式的存储引擎Astore在大规模高并发更新场景下, 支持Inplace-Update的新存储引擎Ustore正式发布,在具备与Astore较一致的读写性能的同时,在空间管理、高热点更新等方面均具备优异性能表现,较好的解决了客户痛点。 -**相关页面**:[配置Ustore](../../performance-tuning/system-tuning/configuring-ustore.md) +**相关页面**:[In-place Update存储引擎Ustore](../../performance-tuning/system-tuning/configuring-ustore.md) ### 2.2 Select自动提交 diff --git a/product/zh/docs-mogdb/v5.0/about-mogdb/mogdb-new-feature/5.0.8.md 
b/product/zh/docs-mogdb/v5.0/about-mogdb/mogdb-new-feature/5.0.8.md new file mode 100644 index 0000000000000000000000000000000000000000..ed119f4100e230f4bfea56d6b6e9eaa97c132529 --- /dev/null +++ b/product/zh/docs-mogdb/v5.0/about-mogdb/mogdb-new-feature/5.0.8.md @@ -0,0 +1,142 @@ +--- +title: MogDB 5.0.8 +summary: MogDB 5.0.8 +author: Guo Huan +date: 2024-07-03 +--- + +# MogDB 5.0.8 + +## 1. 版本说明 + +MogDB 5.0.8是MogDB 5.0.0的补丁版本,于2024-07-31发布,其在MogDB 5.0.7的基础上新增部分特性并修复了部分缺陷,内容如下: + +
+ +## 2. 新增特性 + +### 2.1 顺序扫描预读 + +MogDB针对大数据量的顺序扫描场景进行优化,实现了扫描过程中CPU计算和I/O的并行化,充分发挥CPU执行效率,带来了更高的吞吐量。顺序扫描预读支持AStore和UStore两种存储引擎,开启扫描预读后,TPCH场景下 ,SeqScan算子有20% - 80%的性能提升,端到端有10% - 30%的性能提升。 + +**相关页面**:[顺序扫描预读](../../characteristic-description/high-performance/seqscan-prefetch.md) + +### 2.2 UStore支持SMP并行执行 + +SMP并行技术是一种利用计算机多核CPU架构来实现多线程并行计算,以充分利用CPU资源来提高查询性能的技术。本特性新增了对于Ustore存储引擎的并行能力支持。其中全表顺序扫描场景下,在存储带宽内随着并行度增加,查询性能达到了近乎线性倍数提升的效果;在索引查询下,SMP也获得了显著性能提升。 + +**相关页面**:[Ustore SMP并行扫描](../../characteristic-description/high-performance/ustore-smp.md)、[In-place Update存储引擎Ustore](../../performance-tuning/system-tuning/configuring-ustore.md) + +### 2.3 导入导出能力增强 + +1. gs_dump支持分区表的并行导入/导出,获得了存储带宽内随并行度的增加而几近线性倍数提升的效果。 + + **相关页面**:[逻辑备份恢复效率增强](../../characteristic-description/high-availability/enhanced-efficiency-of-logical-backup-and-restore.md) + +2. gs_dump支持在备机上进行数据的备份导出。 + + **相关页面**:[gs_dump](../../reference-guide/tool-reference/server-tools/gs_dump.md) + +3. gs_dump和gs_restore支持指定导入/导出function、trigger、type、package、procedure五种基本对象。 + + **相关页面**:[支持指定导入导出五类基本对象](../../characteristic-description/enterprise-level-features/import-export-specific-objects.md) + +### 2.4 兼容性增强 + +1. 在B兼容性模式(MySQL兼容性)下,支持除法中,除数为0的时候,错误信息"division by zero"不以error级别报错,而是以warning级别报错,避免SQL执行中遇到这种情况时候执行中断,在这种情况下,计算结果返回NULL,行为与MySQL保持一致。 + +2. 在使用如JDBC等PBE连接方式的情况下,支持匿名块中的左值与OUT/INOUT类型的数据在存储过程中返回值到对应驱动端。该功能的使用需要设置behavior_compat_options参数中proc_outparam_override选项。 + + **相关页面**:[PBE模式支持存储过程out出参](../../characteristic-description/compatibility/stored-procedure-out-parameters-in-pbe-mode.md) + +3. 对原生ECPG(嵌入式SQL预处理器)进行改造以适配ORACLE PRO\*C的用法与功能,便于用户平滑使用ECPG替换PRO\*C实现业务逻辑。 + + **相关页面**:[支持嵌入式SQL预处理器(ECPG)](../../characteristic-description/application-development-interfaces/ECPG.md) + +### 2.5 逻辑解码增强 + +逻辑解码支持生成常用表操作的DDL日志,以及DDL日志的逻辑解码,方便数据迁移同步软件捕获MogDB数据字典变更的行为,增强MogDB和异构数据库的双轨并行能力。 + +**相关页面**:[逻辑解码支持DDL操作](../../developer-guide/logical-replication/logical-decoding/logical-decoding-support-for-DDL.md) + +### 2.6 checkpoint能力增强 + +在几乎不影响数据库性能的前提下,优化了MogDB的脏页刷盘能力,减小系统内脏页数量,这将大幅减小数据库switchover操作的完成时间。另外在发生数据意外掉电重启的场景下,也会降低重启所需回放的wal数量,从而减小数据库启动时间。 + +**相关页面**:[极致刷脏](../../characteristic-description/high-performance/enhancement-of-dirty-pages-flushing-performance.md) + +
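A minimal sketch of the B-compatibility division behavior described in item 1 of section 2.4 above; it assumes a database created in B (MySQL) compatibility mode and only restates the documented behavior (a warning instead of an error, with a NULL result):

```sql
-- Assumed setup (hypothetical database name):
--   CREATE DATABASE db_b DBCOMPATIBILITY 'B';
-- Per the note above, a zero divisor in B compatibility mode raises a
-- WARNING ("division by zero") instead of an ERROR, the statement is not
-- aborted, and the expression evaluates to NULL, matching MySQL behavior.
SELECT 10 / 0 AS div_result;  -- expected: a warning is reported and div_result is NULL
```

As described above, this relaxed behavior is scoped to B compatibility mode.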
+ +## 3. 修复缺陷 + +1. 【4061】修复了由于多线程并发写入wal日志导致的低概率数据库宕机问题。 + +2. 【4571】修复了MogDB 5.0.4版本升级到高版本时,tidrangescan插件没有自动创建的问题。 + +3. 【5157】修复了gs_probackup restore还原全量备份时-i指定不存在id,会导致数据库宕机的问题。 + +4. 【5234】修复了gs_dump导出procedure,gs_restore导入procedure后,procedure不存在的问题。 + +5. 【5291】修复了gs_dump、gs_restore导出导入function,function包含两个参数,提示function不存在的问题。 + +6. 【3961】修复了group by rollup语法查询结果和oracle不一致的问题。 + +7. 【4543】修复了低版本已经使用的auto_increment无法重置为1的问题。 + +8. 【4550】修复了B模式下uuid_short数值重复的问题。 + +9. 【4570】修复了cm自动挂载vip失败的问题。 + +10. 【4610】修复了窗口函数包含partition子句时执行报错的问题。 + +11. 【4861】修复了select decode行为异常的问题。 + +12. 【4890】修复了MogDB 5.0.6版本开始,执行SQL报错bitmapset has multiple members的问题。 + +13. 【4894】修复了drop触发器的function时出现报错ERROR: could not find tuple for trigger 437851的问题。 + +14. 【4907】修复了单网段场景下,CM支持多个VIP功能异常的问题。 + +15. 【4918】修复了gs_dump在设置用户密码时开启并行导出会出错的问题。 + +16. 【4928】修复了gs_probackup基于增量备份的恢复操作失败的问题。 + +17. 【4952】修复了在开启增量排序的情况下,带有distinct的SQL会选择较差的计划的问题。 + +18. 【4989】修复了reload protect_standby为on ,protect_standby值存在异常的问题。 + +19. 【4992】修复了表实际行数为0,analyze后估算行数为非零值且误差较大的问题。 + +20. 【5138】修复了whale插件执行SELECT to_timestamp(0) FROM dual; 输出结果与预期不符的问题。 + +21. 【5143】修复了whale插件dbms_random.normal()函数返回的值每次一样的问题。 + +22. 【5144】修复了gs_dumpall导出导入时create view语句结尾缺少部分内容的问题。 + +23. 【5146】修复了树形查询过滤条件错误下推的问题。 + +24. 【5242】修复了存储过程游标参数返回为null,但实际有数据的问题。 + +25. 【5244】B模式下UNION ALL之后 '' 返回为0的问题。 + +26. 【5293】修复了执行存储过程插入表数据报错的问题。 + +27. 【5300】修复了gs_dumpall导出导入问题。 + +28. 【5400】修复了update分区表导致内存泄漏的问题。 + +29. 【5642】修复了B模式下order by结果集为空时,使用聚合函数报错的问题。 + +30. 【5680】修复了普通用户访问远程dblink的表会报ERROR: permission denied for relation (null)的问题。 + +31. 【5698】修复了更新dblink远程表使用别名识别为字段的问题。 + +32. 【5825】修复了rename用户和schema之后,gs_dump导出带有自增主键的表出现报错的问题。 + +33. 【1073】支持了外部分区表的访问更新。 + +34. 【6147】修复了在开启select - O自动事务提交功能下,fetchsize功能报错的问题。 + +35. 【6141】修复了select - O自动事务提交功能出现select for update非预期的自动提交的问题。 + +36. 
【4470】修复了拥有外键约束的ustore表在删除表时出现tuple already updated by self的报错问题。 \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/about-mogdb/mogdb-new-feature/release-note.md b/product/zh/docs-mogdb/v5.0/about-mogdb/mogdb-new-feature/release-note.md index b355d249901538e4d20062e4b31d3069a224313e..b848dc2d3a3444892436a2693b3ef9ef7fd9c8ab 100644 --- a/product/zh/docs-mogdb/v5.0/about-mogdb/mogdb-new-feature/release-note.md +++ b/product/zh/docs-mogdb/v5.0/about-mogdb/mogdb-new-feature/release-note.md @@ -9,6 +9,7 @@ date: 2022-09-27 | 版本 | 发布日期 | 概述 | | ------------------- | ---------- | ------------------------------------------------------------ | +| [5.0.8](./5.0.8.md) | 2024/07/31 | MogDB 5.0.8版本在MogDB 5.0.7的基础上修复了部分缺陷,新增顺序扫描预读、UStore SMP并行执行等特性,同时对兼容性、性能、易用性均做了提升。 | | [5.0.7](./5.0.7.md) | 2024/05/30 | MogDB 5.0.7在MogDB 5.0.6的基础上修复了部分缺陷。 | | [5.0.6](./5.0.6.md) | 2024/03/30 | MogDB 5.0.6版本在MogDB 5.0.5的基础上修复了部分缺陷,新增Ustore存储引擎商用、Select自动提交、导入导出性能增强,同时对兼容性、性能、易用性均做了提升。 | | [5.0.5](./5.0.5.md) | 2023/12/30 | MogDB 5.0.5版本在MogDB 5.0.4的基础上新增B模式下允许使用数字开头作为表名、列名,having子句中允许别名和列名重复,IFNULL函数参数为时间相关类型等特性,并修复了部分缺陷。 | diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/administrator-guide.md b/product/zh/docs-mogdb/v5.0/administrator-guide/administrator-guide.md index 9f1d5bc0b4dc6b7690ad045a2252948a58599f4f..94993ff6303985032d138c465d483c829ad9f9d7 100644 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/administrator-guide.md +++ b/product/zh/docs-mogdb/v5.0/administrator-guide/administrator-guide.md @@ -10,7 +10,6 @@ date: 2023-05-22 - **[本地化](localization/localization.md)** - **[日常运维](routine-maintenance/routine-maintenance.md)** - **[主备管理](primary-and-standby-management.md)** -- **[MOT内存表管理](mot-engine/mot-engine.md)** - **[列存表管理](column-store-tables-management.md)** - **[备份与恢复](backup-and-restoration/backup-and-restoration.md)** - **[数据库部署方案](database-deployment-scenario/database-deployment-scenario.md)** diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/1-mot-introduction.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/1-mot-introduction.md deleted file mode 100644 index 7258ac470f1b947af0ba7575e30ec1a61ecd7535..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/1-mot-introduction.md +++ /dev/null @@ -1,29 +0,0 @@ ---- -title: MOT简介 -summary: MOT简介 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT简介 - -MogDB引入了MOT存储引擎,它是一种事务性行存储,针对多核和大内存服务器进行了优化。MOT是MogDB数据库最先进的生产级特性(Beta版本),它为事务性工作负载提供更高的性能。MOT完全支持ACID特性,并包括严格的持久性和高可用性支持。企业可以在关键任务、性能敏感的在线事务处理(OLTP)中使用MOT,以实现高性能、高吞吐、可预测低延迟以及多核服务器的高利用率。MOT尤其适合在多路和多核处理器的现代服务器上运行,例如基于Arm/鲲鹏处理器的华为TaiShan服务器,以及基于x86的戴尔或类似服务器。 - -**图 1** MogDB内存优化存储引擎 - -![mogdb内存优化存储引擎](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-introduction-1.png) - -如[图1](#neicun)所示,MogDB数据库内存优化存储引擎组件(绿色部分)负责管理MOT和事务。 - -MOT与基于磁盘的普通(堆)表并排创建。MOT的有效设计实现了几乎完全的SQL覆盖,并且支持完整的数据库功能集,如存储过程和自定义函数(限制参见[MOT SQL覆盖和限制](../../../administrator-guide/mot-engine/2-using-mot/4-mot-usage.md#mot-sql覆盖和限制))。自MogDB 3.1.1版本以来,在单查询(如JOIN)的MOT表和堆表(基于磁盘的普通表)以及多步骤和多表事务中,MOT还支持MVCC和跨引擎事务(Cross-TX)。 - -通过完全存储在内存中的数据和索引、非统一内存访问感知(NUMA-aware)设计、消除锁和锁存争用的算法以及查询原生编译,MOT可提供更快的数据访问和更高效的事务执行。 - -MOT有效的几乎无锁的设计和高度调优的实现,使其在多核服务器上实现了卓越的近线性吞吐量扩展,这可能是业界最好的。 - -MOT完全支持ACID特性: - -- 原子性(Atomicity):原子事务是一系列不可分割的数据库操作。在事务完成(分别提交或中止)之后,这些操作要么全部发生,要么全部不发生。 -- 一致性(Consistency):事务结束后,数据库处于一致状态,保留数据完整性。 -- 
隔离性(Isolation):事务之间不能相互干扰。MOT支持可重复读、读已提交和快照隔离级别。更多信息,请参见[MOT隔离级别](../../../administrator-guide/mot-engine/3-concepts-of-mot/3-2.md#mot隔离级别)。 -- 持久性(Durability):即使发生崩溃和失败,成功完成(提交)的事务效果持久保存。MOT完全集成了MogDB的基于WAL的日志记录。同时支持同步和异步日志记录选项。MOT还支持同步+面向NUMA优化的组提交。更多信息,请参见[MOT持久性概念](../../../administrator-guide/mot-engine/3-concepts-of-mot/3-6.md)。 \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/2-mot-features-and-benefits.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/2-mot-features-and-benefits.md deleted file mode 100644 index f46cfdd2fa4bd716ac0f08ecbf120f56eaffb5af..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/2-mot-features-and-benefits.md +++ /dev/null @@ -1,22 +0,0 @@ ---- -title: MOT特性及价值 -summary: MOT特性及价值 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT特性及价值 - -MOT在高性能(查询和事务延迟)、高可扩展性(吞吐量和并发量)以及高资源利用率(某些程度上节约成本)方面拥有显著优势。 - -- 低延迟(Low Latency):提供快速的查询和事务响应时间。 -- 高吞吐量(High Throughput):支持峰值和持续高用户并发。 -- 高资源利用率(High Resource Utilization):充分利用硬件。 - -使用了MOT的应用程序可以达到普通表2.5到4倍的吞吐量。例如,在基于Arm/鲲鹏的华为TaiShan服务器和基于英特尔至强的戴尔x86服务器上,执行TPC-C基准测试(交互事务和同步日志),MOT提供的吞吐率增益在2路服务器上达到2.5倍,4路服务器上达到3.7倍,在4路256核Arm服务器上达到480万tpmC。 - -在TPC-C基准测试中可观察到,MOT提供更低的延迟,将事务处理速度提升了3至5.5倍。 - -此外,高负载和高争用的场景是所有领先的行业数据库都会遇到的公认问题,而MOT能够在这种情况下极高地利用服务器资源。使用MOT后,4路服务器的资源利用率达到99%,远远领先其他行业数据库。 - -这种能力在现代的多核服务器上尤为明显和重要。 diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/3-mot-key-technologies.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/3-mot-key-technologies.md deleted file mode 100644 index b851fd1b157b12479f99c53dce033558c9a3efa6..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/3-mot-key-technologies.md +++ /dev/null @@ -1,25 +0,0 @@ ---- -title: MOT关键技术 -summary: MOT关键技术 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT关键技术 - -MOT的关键技术如下: - -- 内存优化数据结构:以实现高并发吞吐量和可预测的低延迟为目标,所有数据和索引都在内存中,不使用中间页缓冲区,并使用持续时间最短的锁。数据结构和所有算法都是专门为内存设计而优化的。 -- 免锁事务管理:MOT在保证严格一致性和数据完整性的前提下,采用乐观的策略实现高并发和高吞吐。在事务过程中,MOT不会对正在更新的数据行的任何版本加锁,从而大大降低了一些大内存系统中的争用。事务中的乐观并发控制(Optimistic Concurrency Control,OCC)语句是在没有锁的情况下实现的,所有的数据修改都是在内存中专门用于私有事务的部分(也称为私有事务内存)中进行的。这就意味着在事务过程中,相关数据在私有事务内存中更新,从而实现了无锁读写;而且只有在提交阶段才会短时间加锁。更多详细信息,请参见[MOT并发控制机制](../../../administrator-guide/mot-engine/3-concepts-of-mot/3-2.md)。 -- 免锁索引:由于内存表的数据和索引完全存储在内存中,因此拥有一个高效的索引数据结构和算法非常重要。MOT索引机制基于最先进的Masstree,这是一种用于多核系统的快速和可扩展的键值(Key Value,KV)存储索引,以B+树的Trie实现。通过这种方式,高并发工作负载在多核服务器上可以获得卓越的性能。同时MOT应用了各种先进的技术以优化性能,如优化锁方法、高速缓存感知和内存预取。 -- NUMA-aware的内存管理:MOT内存访问的设计支持非统一内存访问(NUMA)感知。NUMA-aware算法增强了内存中数据布局的性能,使线程访问物理上连接到线程运行的核心的内存。这是由内存控制器处理的,不需要通过使用互连(如英特尔QPI)进行额外的跳转。MOT的智能内存控制模块,为各种内存对象预先分配了内存池,提高了性能,减少了锁,保证了稳定性。事务的内存对象的分配始终是NUMA本地的。本地处理的对象会返回到池中。同时在事务中尽量减少系统内存分配(OS malloc)的使用,避免不必要的锁。 -- 高效持久性:日志和检查点是实现磁盘持久化的关键能力,也是ACID的关键要求之一(D代表持久性)。目前所有的磁盘(包括SSD和NVMe)都明显慢于内存,因此持久化是基于内存数据库引擎的瓶颈。作为一个基于内存的存储引擎,MOT的持久化设计必须实现各种各样的算法优化,以确保持久化的同时还能达到设计时的速度和吞吐量目标。这些优化包括: - - 并行日志,所有MogDB磁盘表都支持。 - - 每个事务的日志缓冲和无锁事务准备。 - - 增量更新记录,即只记录变化。 - - 除了同步和异步之外,创新的NUMA感知组提交日志记录。 - - 最先进的数据库检查点(CALC)使内存和计算开销降到最低。 -- 高SQL覆盖率和功能集:MOT通过扩展的PostgreSQL外部数据封装(FDW)以及索引,几乎支持完整的SQL范围,包括存储过程、用户定义函数和系统函数调用。有关不支持的功能的列表,请参阅[MOT SQL覆盖和限制](../../../administrator-guide/mot-engine/2-using-mot/4-mot-usage.md#mot-sql覆盖和限制)。 -- 
使用PREPARE语句的查询原生编译:通过使用PREPARE客户端命令,可以以交互方式执行查询和事务语句。这些命令已被预编译成原生执行格式,也称为Code-Gen或即时(Just-in-Time,JIT)编译。这样可以实现平均30%的性能提升。在可能的情况下,应用编译和轻量级执行;否则,使用标准执行路径处理适用的查询。Cache Plan模块已针对OLTP进行了优化,在整个会话中甚至使用不同的绑定设置以及在不同的会话中重用编译结果。 -- JIT存储过程(JIT SP):加快性能。JIT SP是指通过LLVM运行时代码生成和编译库来生成代码、编译和执行存储过程。JIT SP仅对访问MOT表的存储过程可用,对用户完全透明。加速级别取决于存储过程逻辑。例如,一个真实的客户应用程序为不同的存储过程实现了20%、44%、300%和500%的加速,减少了存储过程延迟。 -- 无缝集成MVCC和跨引擎事务:MOT在集成封装中并排运行基于磁盘的(堆)存储引擎。在MogDB 5.0版本中,MOT支持MVCC和跨引擎事务,以及堆和MOT表之间的连接。 diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/4-mot-usage-scenarios.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/4-mot-usage-scenarios.md deleted file mode 100644 index 2e890b323526cb4d1c36926b7787d47737fcc925..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/4-mot-usage-scenarios.md +++ /dev/null @@ -1,22 +0,0 @@ ---- -title: MOT应用场景 -summary: MOT应用场景 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT应用场景 - -MOT可以根据负载的特点,显著加快应用程序的整体性能。MOT通过提高数据访问和事务执行的效率,并通过消除并发执行事务之间的锁和锁存争用,最大程度地减少重定向,从而提高了事务处理的性能。 - -MOT的极速不仅因为它在内存中,还因为它围绕并发内存使用管理进行了优化。数据存储、访问和处理算法从头开始设计,以利用内存和高并发计算的最新先进技术。 - -MogDB允许应用程序随意组合MOT和基于标准磁盘的表。对于启用已证明是瓶颈的最活跃、高争用和对性能敏感的应用程序表,以及需要可预测的低延迟访问和高吞吐量的表来说,MOT特别有用。 - -MOT可用于各种应用,例如: - -- 高吞吐事务处理:这是使用MOT的主要场景,因为它支持海量事务,同时要求单个事务的延迟较低。这类应用的例子有实时决策系统、支付系统、金融工具交易、体育博彩、移动游戏、广告投放等。 -- 性能瓶颈加速:存在高争用现象的表可以通过使用MOT受益,即使该表是磁盘表。由于延迟更低、竞争和锁更少以及服务器吞吐量能力增加,此类表(除了相关表和在查询和事务中一起引用的表之外)的转换使得性能显著提升。 -- 消除中间层缓存:云计算和移动应用往往会有周期性或峰值的高工作负载。此外,许多应用都有80%以上负载是读负载,并伴有频繁的重复查询。为了满足峰值负载单独要求,以及降低响应延迟提供最佳的用户体验,应用程序通常会部署中间缓存层。这样的附加层增加了开发的复杂性和时间,也增加了运营成本。 MOT提供了一个很好的替代方案,通过一致的高性能数据存储来简化应用架构,缩短开发周期,降低CAPEX和OPEX成本。 -- 大规模流数据提取:MOT可以满足云端(针对移动、M2M和物联网)、事务处理(Transactional Processing,TP)、分析处理(Analytical Processing,AP)和机器学习(Machine Learning,ML)的大规模流数据的提取要求。MOT尤其擅长持续快速地同时提取来自许多不同来源的大量数据。这些数据可以在以后进行处理、转换,并在速度较慢的基于磁盘的表中进行移动。另外,MOT还可以查询到一致的、最新的数据,从而得出实时结果。在有许多实时数据流的物联网和云计算应用中,通常会有专门的数据摄取和处理。例如,一个Apache Kafka集群可以用来提取10万个事件/秒的数据,延迟为10ms。一个周期性的批处理任务会将收集到的数据收集起来,并将转换格式,放入关系型数据库中进行进一步分析。MOT可以通过将数据流直接存储在MOT关系表中,为分析和决策做好准备,从而支持这样的场景(同时消除单独的数据处理层)。这样可以更快地收集和处理数据,MOT避免了代价高昂的分层和缓慢的批处理,提高了一致性,增加了分析数据的实时性,同时降低了总拥有成本(Total Cost of Ownership,TCO)。 -- 降低TCO:提高资源利用率和消除中间层可以节省30%到90%的TCO。 diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/5-mot-performance-benchmarks.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/5-mot-performance-benchmarks.md deleted file mode 100644 index 4ea480d2d0f3d59f9448372eb34c4c5e754d8500..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/5-mot-performance-benchmarks.md +++ /dev/null @@ -1,202 +0,0 @@ ---- -title: MOT性能基准 -summary: MOT性能基准 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT性能基准 - -我们的性能测试是基于业界和学术界通用的TPC-C基准。 - -测试使用了BenchmarkSQL(请参见[MOT样例TPC-C基准](../../../administrator-guide/mot-engine/2-using-mot/6-mot-sample-tpcc-benchmark.md)),并且使用交互式SQL命令而不是存储过程来生成工作负载。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 使用存储过程方法可能会产生更高的性能结果,因为它需要大大减少网络往返和数据库封装SQL处理周期。 - -评估MogDB MOT性能和磁盘性能的所有测试都使用了同步日志记录和在MOT中优化的group-commit=on版本。 - -最后我们进行了额外测试,评估MOT快速采集大量数据的能力,并将其作为中间层数据采集解决方案的替代方案。 - -2020年6月完成全部测试。 - -下面是各种类型的MOT性能基准。 - -
- -## MOT硬件 - -本次测试使用的服务器满足10GbE组网和以下配置: - -- 基于Arm64/鲲鹏920的2路服务器,型号为TaiShan 2280 v2(128核),800GB RAM,1TB NVMe盘。操作系统为openEuler。 -- 基于Arm64/鲲鹏960的4路服务器,型号为TaiShan 2480 v2(256核),512GB RAM,1TB NVMe盘。操作系统为openEuler。 -- 戴尔x86服务器,2路英特尔至强金牌6154 CPU @ 3Ghz,18核(超线程开启时共72核),1TB RAM,1TB SSD。操作系统为CentOS 7.6。 -- x86超微服务器,8路英特尔(R)至强(R) CPU E7-8890 v4 @ 2.20GHz,24核(超线程开启共384核),1TB RAM,1.2 TB SSD(希捷1200 SSD 200GB,SAS 12Gb/s)。操作系统为Ubuntu 16.04.2 LTS。 -- 华为x86服务器,4路英特尔(R)至强(R) CPU E7-8890 v4 @ 2.2Ghz(超线程开启共96核),512GB RAM,SSD 2TB。操作系统为CentOS 7.6。 - -
- -## MOT测试总结 - -MOT比磁盘表性能提升2.5至4.1倍,在Arm/鲲鹏256核服务器上达到480万tpmC。测试结果清楚表明MOT在扩展和利用所有硬件资源方面的卓越能力。随着CPU槽位和服务器核数增加,性能会随之跃升。 - -MOT在Arm/鲲鹏架构下最高可达3万tpmC/核,在x86架构下最高可达4万tpmC/核。 - -由于持久性机制更高效,MOT中的复制开销在Arm/鲲鹏主备高可用场景下为7%,在x86服务器中为2%。而磁盘表的开销在Arm/鲲鹏中为20%,在x86中为15%。 - -最终,MOT延迟降低2.5倍,TPC-C事务响应速度提升2至7倍。 - -
- -## MOT高吞吐量 - -MOT高吞吐量测试结果如下。 - -
- -### Arm/鲲鹏2路128核 - -- **性能** - - 下图是华为Arm/鲲鹏2路128核服务器TPC-C基准测试的结果。 - - 一共进行了四类测试: - - - MOT和MogDB基于磁盘的表各进行了2次测试。 - - 其中两项测试是在单节点(无高可用性)上执行,这意味着没有向备节点执行复制。其余两个测试在主备节点(有高可用性)上执行,即写入主节点的数据被复制到备节点。 - - MOT用橙色表示,基于磁盘的表用蓝色表示。 - - **图 1** Arm/鲲鹏2路128核性能基准 - - ![Arm-鲲鹏2路128核性能基准](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-1.png) - - 结果表明: - - - 正如预期的那样,在所有情况下,MOT的性能明显高于基于磁盘的表。 - - 单节点:MOT性能为380万tpmC,而基于磁盘的表为150万tpmC。 - - 主备节点:MOT性能为350万tpmC,而基于磁盘的表为120万tpmC。 - - 相比单节点(无高可用性,无复制),在有复制需求的生产级(高可用性)服务器(主备节点)上,使用MOT的好处更显著。 - - 同在主备高可用场景下,MOT复制开销:Arm/鲲鹏为7%,x86为2%;而基于磁盘的表复制开销:Arm/鲲鹏为20%;x86为15%。 - -- **单CPU核性能** - - 下图是华为Arm/鲲鹏服务器2路128核的单核TPC-C基准性能/吞吐量测试结果。同样地,一共进行了四类测试: - - **图 2** Arm/鲲鹏2路128核的单核性能标杆 - - ![Arm-鲲鹏2路128核的单核性能标杆](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-2.png) - - 结果表明,正如预期的那样,在所有情况下,MOT的单核性能明显高于基于磁盘的表。相比单节点(无高可用性,无复制),在有复制需求的生产级(高可用性)服务器(主备节点)上,使用MOT的好处更显著。 - -
- -### Arm/鲲鹏4路256核 - -下面通过单连接数的tpmC来展示MOT出色的并发控制性能。 - -**图 3** Arm/鲲鹏4路256核性能基准 - -![Arm-鲲鹏4路256核性能基准](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-3.png) - -结果表明,随着核数增多,性能也显著提高,在768核时性能达到480万tpmC的峰值。 - -
- -### x86服务器 - -- **8路384核** - -下面通过比较基于磁盘的表和MOT之间单连接数的tpmC,来展示MOT出色的并发控制性能。本次测试以8路384核x86服务器为例。橙色表示MOT的结果。 - -**图 4** 8路384核x86服务器性能基准 - -![8路384核x86服务器性能基准](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-4.png) - -结果表明,在386核服务器上,MOT的性能明显优于基于磁盘的表,并且单核性能非常高,达到300万tpmC/核。 - -- **4路96核** - -在4路96核服务器上,MOT实现了390万tpmC。下图展示了高效MOT的单核性能达到4万tpmC/核。 - -**图 5** 4路96核服务器性能基准 - -![4路96核服务器性能基准](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-5.png) - -
- -## MOT低延迟 - -以下是在Arm/鲲鹏两路服务器(128核)上进行测试的结果。单位为毫秒(ms)。 - -**图 1** 低延迟(90th%)性能基准 - -![低延迟(90th-)性能基准](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-6.png) - -MOT的平均事务速度为2.5倍,MOT延迟为10.5ms,而基于磁盘的表延迟为23至25ms。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 计算平均数时,已考虑TPC-C的5个事务分布占比。有关更多信息,请参阅[MOT样例TPC-C基准](../../../administrator-guide/mot-engine/2-using-mot/6-mot-sample-tpcc-benchmark.md)中关于TPC-C事务的说明。 - -**图 2** 低延迟(90th%,事务平均)性能基准 - -![低延迟(90th-事务平均)性能基准](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-7.png) - -
- -## MOT恢复时间目标(RTO)和冷启动时间 - -### 高可用RTO - -MOT完全集成到MogDB中,包括支持主备部署的高可用场景。WAL重做日志的复制机制将把复制更改到数据库备节点并使用备节点进行重放。 - -如果故障转移事件发生,无论是由于计划外的主节点故障还是由于计划内的维护事件,备节点都会迅速活跃。恢复和重放WAL重做日志以及启用连接所需的时间也称为恢复时间目标(RTO)。 - -**MogDB(包括MOT)的RTO小于10秒,这要归功于其并行恢复机制。** - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 灾难发生后必须恢复业务流程,避免导致连续性中断相关的不可接受的后果,而RTO表示的就是这段流程的持续时间和业务级别。换句话说,RTO就是在回答这个问题:在通知业务流程中断后,需要多长时间才能恢复? - -
- -### 冷启动恢复时间 - -冷启动恢复时间是指系统从停止模式到能够完全运行所需的时间。在内存数据库中,这包括将所有数据和索引加载到内存中的时间,因此它取决于数据大小、硬件带宽和软件算法能否高效地处理这些数据。 - -MOT测试使用40 GB/s的ARM磁盘测试,可以在100 GB/s的时间内加载数据库。MOT的索引非持久化,因此它们是在冷启动时创建的。实际加载的数据加索引大小约多50%。因此,可以转换为MOT冷启动时间的数据和索引容量为40秒内150GB,或225 GB/分钟(3.75 GB/秒)。 - -冷启动过程和从磁盘加载数据到MOT所需时间如下图所示。 - -**图 1** 冷启动时间性能基准 - -![冷启动时间性能基准](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-8.png) - -- 数据库大小:加载整个数据库(每数据库GB)的总时间由蓝色线条和左侧的Y轴“时间(秒)”表示。 -- 吞吐量:数据库每秒GB吞吐量由橙色线和右侧的Y轴“吞吐量GB/秒”表示。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 测试过程中表现的性能与SSD硬件的带宽非常接近。因此,可以在不同的平台上实现更高(或更低)的性能。 - -
- -## MOT资源利用率 - -在4路96核512GB RAM的x86服务器上测试的资源利用率如下所示。MOT能够高效持续消耗几乎所有可用的CPU资源。例如,192核390万tpmC的CPU利用率几乎达到100%。 - -- tmpC:每分钟完成的TPC-C事务数以橙色条柱和左侧的Y轴“tpmC”表示。 -- CPU利用率(%):CPU利用率由蓝色线条和右侧的Y轴“CPU%”表示。 - -**图 1** 资源利用率性能基准 - -![资源利用率性能基准](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-9.png) - -
- -## MOT数据采集速度 - -该测试模拟海量物联网、云端或移动端接入的实时数据流,快速持续地把海量数据注入到数据库。 - -- 本次测试涉及大量数据采集,具体如下: - - 1000万行数据由500个线程发送,2000轮,每个insert命令有10条记录(行),每条记录占200字节。 - - 客户端和数据库位于不同的机器上。 数据库服务器为2路72核x86服务器。 -- 性能结果 - - 吞吐量:10000个记录/核,或2MB/核。 - - 延迟:2.8ms每10条记录批量插入(包括客户端-服务器组网)。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-caution.gif) **注意:** 预计MOT将针对这一场景进行多项额外的甚至重大的性能改进。更多关于大规模数据流和数据采集的信息,请参阅[MOT应用场景](4-mot-usage-scenarios.md)。 diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/introducing-mot.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/introducing-mot.md deleted file mode 100644 index d46565991caafb9b3f5a8743b2fe3d5e8af89a6f..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/1-introducing-mot/introducing-mot.md +++ /dev/null @@ -1,16 +0,0 @@ ---- -title: MOT介绍 -summary: MOT介绍 -author: Guo Huan -date: 2023-05-22 ---- - -# MOT介绍 - -本章介绍了MogDB内存优化表(Memory-Optimized Table,MOT)的特性及价值、关键技术、应用场景、性能基准和竞争优势。 - -+ **[MOT简介](1-mot-introduction.md)** -+ **[MOT特性及价值](2-mot-features-and-benefits.md)** -+ **[MOT关键技术](3-mot-key-technologies.md)** -+ **[MOT应用场景](4-mot-usage-scenarios.md)** -+ **[MOT性能基准](5-mot-performance-benchmarks.md)** \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/1-using-mot-overview.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/1-using-mot-overview.md deleted file mode 100644 index fdfdef5042d4b36532b94d9d27e362eec41cd292..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/1-using-mot-overview.md +++ /dev/null @@ -1,18 +0,0 @@ ---- -title: MOT使用概述 -summary: MOT使用概述 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT使用概述 - -MOT作为MogDB的一部分自动部署。有关如何计算和规划所需的内存和存储资源以维持工作负载的说明,请参阅[MOT准备](2-mot-preparation.md)。参考[MOT部署](3-mot-deployment.md)了解MOT中所有的配置,以及服务器优化的非必须选项。 - -使用MOT的方法非常简单。MOT命令的语法与基于磁盘的表的语法相同,并支持大多数标准,如PostgreSQL SQL、DDL和DML命令和功能,如存储过程。只有MOT中的创建和删除表语句与MogDB中基于磁盘的表的语句不同。您可以参考[MOT使用](4-mot-usage.md)了解这两个简单命令的说明,如何将基于磁盘的表转换为MOT,如何使用查询原生编译和PREPARE语句获得更高的性能,以及了解外部工具支持和MOT引擎的限制。 - -[MOT管理](5-mot-administration.md)介绍了如何进行数据库维护,以及监控和分析日志和错误报告。最后,[MOT样例TPC-C基准](6-mot-sample-tpcc-benchmark.md)介绍了如何执行标准TPC-C基准测试。 - -- 阅读以下内容了解如何使用MOT: - - ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/using-mot-overview-1.png) diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/2-mot-preparation.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/2-mot-preparation.md deleted file mode 100644 index 30eb93b65977b8ec95556f4368cc93ffda0bffde..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/2-mot-preparation.md +++ /dev/null @@ -1,226 +0,0 @@ ---- -title: MOT准备 -summary: MOT准备 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT准备 - -下文介绍了使用MOT的前提条件以及内存和存储规划。 - -
- -## 前提条件 - -以下是使用MogDB MOT的软硬件前提条件。 - -
- -### 硬件支持 - -MOT支持最新硬件和现有硬件平台,支持x86架构和华为鲲鹏Arm架构。 - -MOT与MogDB数据库支持的硬件完全对齐。 - -
- -### CPU - -MOT在多核服务器(扩容)上提供卓越的性能。在这些环境中,MOT的性能明显优于友商,并提供近线性扩展和极高的资源利用率。 - -用户也可以开始在低端、中端和高端服务器上实现MOT的性能优势,无论CPU槽位是1或2个,还是4个,甚至是8个也没问题。在16路甚至32路的高端服务器上,性能和资源利用率也非常高(建议与云和恩墨技术支持联系)。 - -
- -### 内存 - -MOT支持标准RAM/DRAM用于其数据和事务管理。所有MOT数据和索引都驻留在内存中,因此内存容量必须能够支撑数据容量,并且还有进一步增长的空间。内存需求和规划请参见[MOT内存和存储规划](#mot内存和存储规划)。 - -
- -### 存储IO - -MOT是一个持久的数据库,使用永久性存储设备(磁盘/SSD/NVMe驱动器)进行事务日志操作和存储定期检查点。 - -推荐采用低延迟的存储设备,如配置RAID-1的SSD、NVMe或者任何企业级存储系统。当使用适当的硬件时,数据库事务处理和竞争将成为瓶颈,而非IO。 - -详细的内存要求和规划请参见[MOT内存和存储规划](#mot内存和存储规划)。 - -操作系统支持 - -MOT与MogDB支持的操作系统完全对齐。 - -MOT支持裸机和虚拟化环境,可以在裸机或虚拟机上运行以下操作系统: - -- x86:CentOS 7.6和EulerOS 2.0 -- Arm:openEuler和EulerOS - -
- -### 操作系统优化 - -MOT不需要任何特殊修改或安装新软件。但是,一些优化可以提高性能。有关实现最大性能的优化说明,请参阅[MOT服务器优化:x86](3-mot-deployment.md#mot服务器优化x86)和[MOT服务器优化:基于Arm的华为TaiShan2P/4P服务器](3-mot-deployment.md#mot服务器优化基于arm的华为taishan2p4p服务器)。 - -
- -## MOT内存和存储规划 - -本节描述了为满足特定应用程序需求,在评估、估计和规划内存和存储容量数量时,需要注意的事项和准则,以及影响所需内存数量的各种数据,例如计划表的数据和索引大小、维持事务管理的内存以及数据增长的速度。 - -
- -### MOT内存规划 - -MOT是一种内存数据库存储引擎(IMDB),其中所有表和索引完全驻留在内存中。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 内存存储是易失的,需要电力来维护所存储的信息。磁盘存储是持久的,写入磁盘是非易失性存储。MOT使用两种存储,既把所有数据保存在内存中,也把事务性更改同步(通过WAL日志记录)到磁盘上以保持严格一致性(使用同步日志记录模式)。 - -服务器上必须有足够的物理内存以维持内存表的状态,并满足工作负载和数据的增长。所有这些都是在传统的基于磁盘的引擎、表和会话所需的内存之外的要求。因此,提前规划好足够的内存来容纳这些内容是非常有必要的。 - -开始可以使用任何数量的内存并执行基本任务和评估测试。但当准备好生产时,应解决以下问题: - -- **内存配置** - - MogDB数据库和标准Postgres类似,其内存上限是由max_process_memory设置的,该上限在postgres.conf文件中定义。MOT及其所有组件和线程,都驻留在MogDB进程中。因此,分配给MOT的内存也是在整个MogDB数据库进程的max_process_memory定义的上限内分配。 - - MOT为自己保留的内存是max_process_memory的一部分。可以通过百分比或通过小于max_process_memory的绝对值定义。这个部分在mot.conf配置文件中由_mot__memory配置项定义。 - - max_process_memory中可以除了被MOT使用的部分之外,必须为Postgres(MogDB)封装留下至少2GB的可用空间。为了确保这一点,MOT在数据库启动过程中会进行如下校验: - - ``` - (max_mot_global_memory + max_mot_local_memory) + 2GB < max_process_memory - ``` - - 如果违反此限制,则调整MOT内存内部限制,最大可能地满足上述限制范围。该调整在启动时进行,并据此计算MOT最大内存值。 - - > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: MOT最大内存值是配置或调整值(max_mot_global_memory + max_mot_local_memory)的逻辑计算值。 - - 此时,会向服务器日志发出警告,如下所示: - - 以下是报告问题的警告消息示例: - - ``` - [WARNING] MOT engine maximum memory definitions (global: 9830 MB, local: 1843 MB, session large store: 0 MB, total: 11673 MB) breach GaussDB maximum process memory restriction (12288 MB) and/or total system memory (64243 MB). MOT values shall be adjusted accordingly to preserve required gap (2048 MB). - ``` - - 以下警告消息示例提示MOT正在自动调整内存限制: - - ``` - [WARNING] Adjusting MOT memory limits: global = 8623 MB, local = 1617 MB, session large store = 0 MB, total = 10240 MB - ``` - - 新内存限制仅在此处显示。 - - 此外,当总内存使用量接近所选内存限制时,MOT不再允许插入额外数据。不再允许额外数据插入的阈值即是MOT最大内存百分比(如上所述,这是一个计算值)。MOT最大内存百分比默认值为90,即90%。尝试添加超过此阈值的额外数据时,会向用户返回错误,并且也会注册到数据库日志文件中。 - -- **最小值和最大值** - - 为了确保内存安全,MOT根据最小的全局和本地设置预先分配内存。数据库管理员应指定MOT和会话维持工作负载所需的最小内存量。这样可以确保即使另一个消耗内存的应用程序与数据库在同一台服务器上运行,并且与数据库竞争内存资源,也能够将这个最小的内存分配给MOT。最大值用于限制内存增长。 - -- **全局和本地** - - MOT使用的内存由两部分组成: - - - 全局内存:全局内存是一个长期内存池,包含MOT的数据和索引。它平均分布在NUMA节点,由所有CPU核共享。 - - - 本地内存:本地内存是用于短期对象的内存池。它的主要使用者是处理事务的会话。这些会话将数据更改存储在专门用于相关特定事务的内存部分(称为事务专用内存)。在提交阶段,数据更改将被移动到全局内存中。内存对象分配以NUMA-local方式执行,以实现尽可能低的延迟。 - - 被释放的对象被放回相关的内存池中。在事务期间尽量少使用操作系统内存分配(malloc)函数,避免不必要的锁和锁存。 - - 这两个内存的分配由专用的min/max_mot_global_memory和min/max_mot_local_memory设置控制。如果MOT全局内存使用量太接近最大值,则MOT会保护自身,不接受新数据。超出此限制的内存分配尝试将被拒绝,并向用户报告错误。 - -- **最低内存要求** - - 在开始执行对MOT性能的最小评估前,请确保: - - 除了磁盘表缓冲区和额外的内存,max_process_memory(在postgres.conf中定义)还有足够的容量用于MOT和会话(由mix/max_mot_global_memory和mix/max_mot_local_memory配置)。对于简单的测试,可以使用mot.conf的默认设置。 - -- **生产过程中实际内存需求** - - 在典型的OLTP工作负载中,平均读写比例为80:20,每个表的MOT内存使用率比基于磁盘的表高60%(包括数据和索引)。这是因为使用了更优化的数据结构和算法,使得访问速度更快,并具有CPU缓存感知和内存预取功能。 - - 特定应用程序的实际内存需求取决于数据量、预期工作负载,特别是数据增长。 - -- **最大全局内存规划:数据和索引大小** - - 要规划最大全局内存,需满足: - - 1. 确定特定磁盘表(包括其数据和所有索引)的大小。如下统计查询可以确定customer表的数据大小和customer_pkey索引大小: - - - 数据大小:选择pg_relation_size('customer'); - - 索引:选择pg_relation_size('customer_pkey'); - - 2. 额外增加60%的内存,相对于基于磁盘的数据和索引的当前大小,这是MOT中的常见要求。 - - 3. 
额外增加数据预期增长百分比。例如: - - 5%月增长率 = 80%年增长率(1.05^12)。因此,为了维持年增长,需分配比表当前使用的还多80%的内存。 - - 至此,max_mot_global_memory值的估计和规划就完成了。实际设置可以用绝对值或Postgres max_process_memory的百分比来定义。具体的值通常在部署期间进行微调。 - -- **最大本地内存规划:并发会话支持** - - 本地内存需求主要是并发会话数量的函数。平均会话的典型OLTP工作负载最大占用8MB。此值乘以会话的数量,再加一点额外的值。 - - 可以通过这种方式进行内存计算,然后进行微调: - - ```bash - SESSION_COUNT * SESSION_SIZE (8 MB) + SOME_EXTRA (100MB should be enough) - ``` - - 默认指定Postgres最大进程内存(默认为12GB)的15%。相当于1.8GB可满足230个会话,即max_mot_local内存需求。实际设置可以用绝对值或Postgres max_process_memory的百分比来定义。具体的值通常在部署期间进行微调。 - -- **异常大事务** - - 某些事务非常大,因为它们将更改应用于大量行。这可能导致单个会话的本地内存增加到允许的最大限制,即1GB。例如: - - ```sql - delete from SOME_VERY_LARGE_TABLE; - ``` - - 在配置max_mot_local_memory设置和应用程序开发时,请考虑此场景。 - - > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 有关配置的更多信息,请参阅[内存(MOT)](3-mot-deployment.md#内存mot)部分。 - -
- -### 存储IO - -MOT是一个内存优化的持久化数据库存储引擎。需要磁盘驱动器来存储WAL重做日志和定期检查点。 - -推荐采用低延迟的存储设备,如配置RAID-1的SSD、NVMe或者任何企业级存储系统。当使用适当的硬件时,数据库事务处理和竞争将成为瓶颈,而非IO。 - -由于持久性存储比RAM内存慢得多,因此IO操作(日志和检查点)可能成为内存中数据库和内存优化数据库的瓶颈。但是,MOT具有针对现代硬件(如SSD、NVMe)进行优化的高效持久性设计和实现。此外,MOT最小化和优化了写入点(例如,使用并行日志记录、每个事务的单日志记录和NUMA-aware事务组写入),并且最小化了写入磁盘的数据(例如,只把更改记录的增量或更新列记录到日志,并且只记录提交阶段的事务)。 - -
- -### 容量需求 - -所需容量取决于检查点和记录的要求,如下所述: - -- **检查点** - - 检查点将所有数据的快照保存到磁盘。 - - 需要给检查点分配两倍数据大小的容量。不需要为检查点索引分配空间。 - - 检查点 = 2 x MOT数据大小(仅表示行,索引非持久)。 - - 检查点之所以需要两倍大小,是因为快照会保存数据的全部大小到磁盘上,此外还应该为正在进行的检查点分配同样数量的空间。当检查点进程结束时,以前的检查点文件将被删除。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 在下一个MogDB版本中,MOT将有一个增量检查点特性,这将大大降低存储容量需求。 - -- **日志记录** - - MOT日志记录与基于磁盘的表的其它记录写入同一个数据库事务日志。 - - 日志的大小取决于事务吞吐量、数据更改的大小和检查点之间的时间(每次检查点,重做日志被截断并重新开始扩展)。 - - 与基于磁盘的表相比,MOT使用较少的日志带宽和较低的IO争用。这由多种机制实现。 - - 例如,MOT不会在事务完成之前记录每个操作。它只在提交阶段记录,并且只记录更新的增量记录(不像基于磁盘的表那样的完整记录)。 - - 为了确保日志IO设备不会成为瓶颈,日志文件必须放在具有低延迟的驱动器上。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 有关配置的更多信息,请参阅[存储(MOT)](3-mot-deployment.md#存储mot)。 diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/3-mot-deployment.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/3-mot-deployment.md deleted file mode 100644 index fd7d9b044a7905ca4bf0695b2ecb035eba96e09d..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/3-mot-deployment.md +++ /dev/null @@ -1,714 +0,0 @@ ---- -title: MOT部署 -summary: MOT部署 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT部署 - -以下各小节介绍了各种必需和可选的设置,以达到最佳部署效果。 - -
- -## MOT服务器优化:x86 - -通常情况下,数据库由以下组件绑定: - -- CPU:更快的CPU可以加速任何CPU绑定的数据库。 -- 磁盘:高速SSD/NVME可加速任何I/O绑定数据库。 -- 网络:更快的网络可以加速任何SQL*Net绑定数据库。 - -除以上内容外,以下通用服务器设置默认使用,可能会明显影响数据库的性能。 - -MOT性能调优是确保快速的应用程序功能和数据检索的关键步骤。MOT支持最新的硬件,因此调整每个系统以达到最大吞吐量是极为重要的。 - -以下是用于优化在英特尔x86服务器上运行MOT时的建议配置。这些设置是高吞吐量工作负载的最佳选择。 - -
- -### BIOS - -- Hyper Threading设置为ON。 - - 强烈建议打开超线程(HT=ON)。 - - 建议在MOT上运行OLTP工作负载时打开超线程。当使用超线程时,某些OLTP工作负载显示高达40%的性能增益。 - -
- -### 操作系统环境设置 - -- NUMA - - 禁用NUMA平衡,如下所示。MOT以极其高效的NUMA-aware方式进行内存管理,远远超过操作系统使用的默认方法。 - - ``` - echo 0 > /proc/sys/kernel/numa_balancing - ``` - -- 服务 - - 禁用如下服务: - - ``` - service irqbalance stop # MANADATORY - service sysmonitor stop # OPTIONAL, performance - service rsyslog stop # OPTIONAL, performance - ``` - -- 调优服务 - - 以下为必填项。 - - 服务器必须运行throughput-performance配置文件。 - - ``` - [...]$ tuned-adm profile throughput-performance - ``` - - throughput-performance配置文件是广泛适用的调优,它为各种常见服务器工作负载提供卓越的性能。 - - 其他不太适合MogDB和MOT服务器的配置可能会影响MOT的整体性能,包括:平衡配置、桌面配置、延迟性能配置、网络延迟配置、网络吞吐量配置和节能配置。 - -- 系统命令 - - 推荐使用下列操作系统设置以获得最佳性能。 - - - 在/etc/sysctl.conf文件中添加如下配置,然后执行sysctl -p命令: - - ``` - net.ipv4.ip_local_port_range = 9000 65535 - kernel.sysrq = 1 - kernel.panic_on_oops = 1 - kernel.panic = 5 - kernel.hung_task_timeout_secs = 3600 - kernel.hung_task_panic = 1 - vm.oom_dump_tasks = 1 - kernel.softlockup_panic = 1 - fs.file-max = 640000 - kernel.msgmnb = 7000000 - kernel.sched_min_granularity_ns = 10000000 - kernel.sched_wakeup_granularity_ns = 15000000 - kernel.numa_balancing=0 - vm.max_map_count = 1048576 - net.ipv4.tcp_max_tw_buckets = 10000 - net.ipv4.tcp_tw_reuse = 1 - net.ipv4.tcp_tw_recycle = 1 - net.ipv4.tcp_keepalive_time = 30 - net.ipv4.tcp_keepalive_probes = 9 - net.ipv4.tcp_keepalive_intvl = 30 - net.ipv4.tcp_retries2 = 80 - kernel.sem = 250 6400000 1000 25600 - net.core.wmem_max = 21299200 - net.core.rmem_max = 21299200 - net.core.wmem_default = 21299200 - net.core.rmem_default = 21299200 - #net.sctp.sctp_mem = 94500000 915000000 927000000 - #net.sctp.sctp_rmem = 8192 250000 16777216 - #net.sctp.sctp_wmem = 8192 250000 16777216 - net.ipv4.tcp_rmem = 8192 250000 16777216 - net.ipv4.tcp_wmem = 8192 250000 16777216 - net.core.somaxconn = 65535 - vm.min_free_kbytes = 26351629 - net.core.netdev_max_backlog = 65535 - net.ipv4.tcp_max_syn_backlog = 65535 - #net.sctp.addip_enable = 0 - net.ipv4.tcp_syncookies = 1 - vm.overcommit_memory = 0 - net.ipv4.tcp_retries1 = 5 - net.ipv4.tcp_syn_retries = 5 - ``` - - - 按如下方式修改/etc/security/limits.conf对应部分: - - ``` - soft nofile 100000 - hard nofile 100000 - ``` - - 软限制和硬限制设置可指定一个进程同时打开的文件数量。软限制可由各自运行这些限制的进程进行更改,直至达到硬限制值。 - -- 磁盘/SSD - - 下面以数据库同步提交模式为例,介绍如何保证磁盘读写性能适合数据库同步提交模式。 - - 按如下方式运行磁盘/SSD性能测试: - - ``` - [...]$ sync; dd if=/dev/zero of=testfile bs=1M count=1024; sync - 1024+0 records in - 1024+0 records out - 1073741824 bytes (1.1 GB) copied, 1.36034 s, 789 MB/s - ``` - - 当磁盘带宽明显低于789MB/s时,可能会造成MogDB性能瓶颈,尤其是造成MOT性能瓶颈。 - -
- -### 网络 - -需要使用10Gbps以上网络。 - -运行iperf命令进行验证: - -``` -Server side: iperf -s -Client side: iperf -c -``` - -rc.local:网卡调优 - -以下可选设置对性能有显著影响: - -1. 将 下的set_irq_privacy.sh文件拷贝到/var/scripts/目录下。 - -2. 进入/etc/rc.d/rc.local,执行chmod命令,确保在boot时执行以下脚本: - - ``` - 'chmod +x /etc/rc.d/rc.local' - var/scripts/set_irq_affinity.sh -x all - ethtool -K gro off - ethtool -C adaptive-rx on adaptive-tx on - Replace with the network card, i.e. ens5f1 - ``` - -
- -## MOT服务器优化:基于Arm的华为TaiShan2P/4P服务器 - -以下是基于Arm/鲲鹏架构的华为TaiShan 2280 v2服务器(2路256核)和TaiShan 2480 v2服务器(4路256核)上运行MOT时的建议配置。 - -除非另有说明,以下设置适用于客户端和服务器的机器。 - -
- -### BIOS - -修改BIOS相关设置: - -1. 选择**BIOS**> **Advanced** > **MISC Config**。设置**Support Smmu**为**Disabled**。 - -2. 选择**BIOS**> **Advanced** > **MISC Config**。设置**CPU Prefetching Configuration**为**Disabled**。 - - ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-deployment-1.png) - -3. 选择**BIOS**> **Advanced** > **Memory Config**。设置**Die Interleaving**为**Disabled**。 - - ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-deployment-2.png) - -4. 选择**BIOS**> **Advanced** > **Performance Config**。设置**Power Policy**为**Performance**。 - - ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-deployment-3.png) - -
- -### 操作系统:内核和启动 - -- 以下操作系统内核和启动参数通常由sysadmin配置。 - - 配置内核参数,如下所示。 - - ```bash - net.ipv4.ip_local_port_range = 9000 65535 - kernel.sysrq = 1 - kernel.panic_on_oops = 1 - kernel.panic = 5 - kernel.hung_task_timeout_secs = 3600 - kernel.hung_task_panic = 1 - vm.oom_dump_tasks = 1 - kernel.softlockup_panic = 1 - fs.file-max = 640000 - kernel.msgmnb = 7000000 - kernel.sched_min_granularity_ns = 10000000 - kernel.sched_wakeup_granularity_ns = 15000000 - kernel.numa_balancing=0 - vm.max_map_count = 1048576 - net.ipv4.tcp_max_tw_buckets = 10000 - net.ipv4.tcp_tw_reuse = 1 - net.ipv4.tcp_tw_recycle = 1 - net.ipv4.tcp_keepalive_time = 30 - net.ipv4.tcp_keepalive_probes = 9 - net.ipv4.tcp_keepalive_intvl = 30 - net.ipv4.tcp_retries2 = 80 - kernel.sem = 32000 1024000000 500 32000 - kernel.shmall = 52805669 - kernel.shmmax = 18446744073692774399 - sys.fs.file-max = 6536438 - net.core.wmem_max = 21299200 - net.core.rmem_max = 21299200 - net.core.wmem_default = 21299200 - net.core.rmem_default = 21299200 - net.ipv4.tcp_rmem = 8192 250000 16777216 - net.ipv4.tcp_wmem = 8192 250000 16777216 - net.core.somaxconn = 65535 - vm.min_free_kbytes = 5270325 - net.core.netdev_max_backlog = 65535 - net.ipv4.tcp_max_syn_backlog = 65535 - net.ipv4.tcp_syncookies = 1 - vm.overcommit_memory = 0 - net.ipv4.tcp_retries1 = 5 - net.ipv4.tcp_syn_retries = 5 - ##NEW - kernel.sched_autogroup_enabled=0 - kernel.sched_min_granularity_ns=2000000 - kernel.sched_latency_ns=10000000 - kernel.sched_wakeup_granularity_ns=5000000 - kernel.sched_migration_cost_ns=500000 - vm.dirty_background_bytes=33554432 - kernel.shmmax=21474836480 - net.ipv4.tcp_timestamps = 0 - net.ipv6.conf.all.disable_ipv6=1 - net.ipv6.conf.default.disable_ipv6=1 - net.ipv4.tcp_keepalive_time=600 - net.ipv4.tcp_keepalive_probes=3 - kernel.core_uses_pid=1 - ``` - -- 调优服务 - - 以下为必填项。 - - 服务器必须运行throughput-performance配置文件: - - ```bash - [...]$ tuned-adm profile throughput-performance - ``` - - throughput-performance配置文件是广泛适用的调优,它为各种常见服务器工作负载提供卓越的性能。 - - 其他不太适合MogDB和MOT服务器的配置可能会影响MOT的整体性能,包括:平衡配置、桌面配置、延迟性能配置、网络延迟配置、网络吞吐量配置和节能配置。 - -- 启动调优 - - 在内核启动参数中添加iommu.passthrough=1。 - - 在pass-through模式下运行时,适配器需要DMA转换到内存,从而提高性能。 - -
- -## MOT配置 - -预置MOT用于创建工作MOT。为了获得最佳效果,建议根据应用程序的特定要求和偏好自定义MOT配置(在mot.conf文件中定义)。 - -该文件在服务器启动时只读。如果在系统运行中编辑此文件,则必须重新加载服务器才能使修改内容生效。 - -mot.conf文件与postgres.conf配置文件在同一文件夹下。 - -阅读[总体原则](#总体原则),根据需要查看和配置mot.conf文件。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 以上描述了mot.conf文件中的各个设置。除上述内容外,要了解特定MOT功能(如恢复),可参考本用户手册的相关章节。例如,[MOT恢复](5-mot-administration.md#mot恢复)说明了mot.conf文件的恢复,包含影响MOT恢复的设置。此外,有关恢复的完整说明,请参阅“MOT管理”章节的[MOT恢复](5-mot-administration.md#mot恢复)。下文各相关章节中还提供了参考链接。 - -以下介绍了mot.conf文件中的各个部分,其包含的设置以及默认值。 - -
- -### 总体原则 - -以下是编辑mot.conf文件的总体原则。 - -- 每个设置项都带有默认值,如下所示: - - ``` - # name = value - ``` - -- 可以接受空格或留空。 - -- 在各行添加#号可进行注释。 - -- 每个设置项的默认值将作为注释显示在整个文件中。 - -- 如果参数没有注释并且置入了新值,则定义新设置。 - -- 对mot.conf文件的更改仅在数据库服务器启动或重装时生效。 - -内存单元的表示如下: - -- KB:千字节 -- MB:兆字节 -- GB:吉字节 -- TB:太字节 - -某些内存单位为postgresql.conf中的max_process_memory的百分比值。例如,20%。 - -时间单位表示如下: - -- us:微秒 -- ms:毫秒 -- s:秒 -- min:分钟 -- h:小时 -- d:天 - -
- -### 重做日志(MOT) - -- **enable_redo_log = true** - - 指定是否使用重做日志以获得持久性。有关重做日志的详细信息,请参阅[MOT日志记录:WAL重做日志](5-mot-administration.md#mot日志记录wal重做日志)。 - -- **enable_group_commit = false** - - 是否使用组提交。 - - 该选项仅在MogDB配置为使用同步提交时相关,即仅当postgresql.conf中的synchronization_commit设置为除off以外的任何值时相关。 - - 有关WAL重做日志的详细信息,请参阅[MOT日志记录:WAL重做日志](5-mot-administration.md#mot日志记录wal重做日志)。 - -- **group_commit_size = 16** - -- **group_commit_timeout = 10 ms** - - 只有当MOT引擎配置为同步组提交日志记录时,此选项才相关。即postgresql.conf中的synchronization_commit配置为true,mot.conf配置文件中的enable_group_commit配置为true。 - - 当一组事务记录在WAL重做日志中时,需确定以下设置项取值: - - group_commit_size:一组已提交的事务数。例如,16表示当同一组中的16个事务已由它们的客户端应用程序提交时,则针对16个事务中的每个事务,在磁盘的WAL重做日志中写入一个条目。 - - group_commit_timeout:超时时间,单位为毫秒。例如,10表示在10毫秒之后,为同一组由客户端应用程序在最近10毫秒内提交的每个事务,在磁盘的WAL重做日志中写入一个条目。 - - 提交组在到达配置的事务数后或者在超时后关闭。组关闭后,组中的所有事务等待一个组落盘完成执行,然后通知客户端每个事务都已经结束。 - - >![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif)**说明**:有关同步组提交日志记录的详细信息,请参阅[MOT日志类型](5-mot-administration.md#mot日志类型)。 - -
- -### 检查点(MOT) - -- **enable_checkpoint = true** - - 是否使用周期检查点。 - -- **checkpoint_dir =** - - 指定检查点数据存放目录。默认位置在每个数据节点的data文件夹中。 - -- **checkpoint_segsize = 16 MB** - - 指定检查点时使用的段大小。分段执行检查点。当一个段已满时,它将被序列化到磁盘,并为后续的检查点数据打开一个新的段。 - -- **checkpoint_workers = 3** - - 指定在检查点期间要使用的工作线程数。 - - 检查点由多个MOT引擎工作线程并行执行。工作线程的数量可能会大大影响整个检查点操作的整体性能,以及其它正在运行的事务的操作。为了实现较短的检查点持续时间,应使用更多线程,直至达到最佳数量(根据硬件和工作负载的不同而不同)。但请注意,如果这个数目太大,可能会对其他正在运行的事务的执行时间产生负面影响。尽可能低这个数字,以最小化对其他运行事务的运行时的影响。当此数目过高时,检查点持续时间会较长。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 有关配置的更多信息,请参阅[MOT检查点](5-mot-administration.md#mot检查点)。 - -
- -### 恢复(MOT) - -- **checkpoint_recovery_workers = 3** - - 指定在检查点数据恢复期间要使用的工作线程数。每个MOT引擎工作线程在自己的核上运行,通过将不同的表读入内存,可以并行处理不同的表。缺省值为3,可将此参数设置为可处理的核数。恢复后,将停止并杀死这些线程。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 有关配置的详细信息,请参阅[MOT恢复](5-mot-administration.md#mot恢复)。 - -
- -- **parallel_recovery_workers = 5** - - 指定在重做恢复/回放期间使用的工作线程数。 - -- **parallel_recovery_workers = 5** - - 指定恢复期间用于保存重做日志段的队列大小。此参数还限制并行恢复期间处于活动状态(进行中)的最大事务数。如果达到此限制,重做回放将等待某些事务提交,然后再处理新事务的重做日志。 - -### 统计(MOT) - -- **enable_stats = false** - - 设置周期性统计打印信息。 - -- **print_stats_period = 10 minute** - - 设置汇总统计报表打印的时间范围。 - -- **print_full_stats_period = 1 hours** - - 设置全量统计报表打印的时间范围。 - - 以下设置为周期性统计报表中的各个部分。如果没有配置,则抑制统计报表。 - -- **enable_log_recovery_stats = false** - - 日志恢复统计信息包含各种重做日志的恢复指标。 - -- **enable_db_session_stats = false** - - 数据库会话统计信息包含事务事件,如提交、回滚等。 - -- **enable_network_stats = false** - - 网络统计信息包括连接/断连事件。 - -- **enable_log_stats = false** - - 日志统计信息包含重做日志详情。 - -- **enable_memory_stats = false** - - 内存统计信息包含内存层详情。 - -- **enable_process_stats = false** - - 进程统计信息包含当前进程的内存和CPU消耗总量。 - -- **enable_system_stats = false** - - 系统统计信息包含整个系统的内存和CPU消耗总量。 - -- **enable_jit_stats = false** - - JIT统计信息包含有关JIT查询编译和执行的信息。 - -
- -### 错误日志(MOT) - -- **log_level = INFO** - - 设置MOT引擎下发的消息在数据库服务器的错误日志中记录的日志级别。有效值为PANIC、ERROR、WARN、INFO、TRACE、DEBUG、DIAG1、DIAG2。 - -- **Log/COMPONENT/LOGGER=LOG_LEVEL** - - 使用以下语法设置特定的日志记录器。 - - 例如,要为系统组件中的ThreadIdPool日志记录器配置TRACE日志级别,请使用以下语法: - - ``` - Log/System/ThreadIdPool=TRACE - ``` - - 要为某个组件下的所有记录器配置日志级别,请使用以下语法: - - ``` - Log/COMPONENT=LOG_LEVEL - ``` - - 例如: - - ``` - Log/System=DEBUG - ``` - -
- -### 内存(MOT) - -- **enable_numa = true** - - 指定是否使用可识别NUMA的内存。 禁用时,所有亲和性配置也将被禁用。 MOT引擎假定所有可用的NUMA节点都有内存。 如果计算机具有某些特殊配置,其中某些NUMA节点没有内存,则MOT引擎初始化将因此失败,因此数据库服务器启动将失败。 在此类计算机中,建议将此配置值设置为false,以防止启动失败并让MOT引擎在不使用可识别NUMA的内存分配的情况下正常运行。 - -- **affinity_mode = equal-per-socket** - - 设置用户会话和内部MOT任务的线程亲和模式。 - - 使用线程池时,用户会话将忽略此值,因为它们的亲和性由线程池控制。但内部MOT任务仍然使用。 - - 有效值为fill-socket-first、equal-per-socket、fill-physical-first、none。 - - - Fill-socket-first将线程连接到同一个槽位的核上,直到槽位已满,然后移动到下一个槽位。 - - Equal-per-socket使线程均匀分布在所有槽位中。 - - Fill-physical-first将线程连接到同一个槽位中的物理核,直到用尽所有物理核,然后移动到下一个槽位。当所有物理核用尽时,该过程再次从超线程核开始。 - - None禁用任何亲和配置,并让系统调度程序确定每个线程调度在哪个核上运行。 - -- **lazy_load_chunk_directory = true** - - 设置块目录模式,用于内存块查找。 - - Lazy模式将块目录设置为按需加载部分目录,从而减少初始内存占用(大约从1GB减少到1MB)。然而,这可能会导致轻微的性能损失和极端情况下的内存损坏。相反,使用non-lazy块目录会额外分配1GB的初始内存,产生略高的性能,并确保在内存损坏期间避免块目录错误。 - -- **reserve_memory_mode = virtual** - - 设置内存预留模式(取值为physical或virtual)。 - - 每当从内核分配内存时,都会参考此配置值来确定所分配的内存是常驻(physical)还是非常驻(virtual)。这主要与预分配有关,但也可能影响运行时分配。对于physical保留模式,通过强制内存区域所跨越的所有页出现页错误,使整个分配的内存区域常驻。配置virtual内存预留可加速内存分配(特别是在预分配期间),但可能在初始访问期间出现页错误(因此导致轻微的性能影响),并在物理内存不可用时出现更多服务器错误。相反,物理内存分配速度较慢,但后续访问速度更快且有保障。 - -- **store_memory_policy = compact** - - 设置内存存储策略(取值为compact或expanding)。 - - 当定义了compact策略时,未使用的内存会释放回内核,直到达到内存下限(请参见下面的min_mot_memory)。在expanding策略中,未使用的内存存储在MOT引擎中,以便后续再使用。compact存储策略可以减少MOT引擎的内存占用,但偶尔会导致性能轻微下降。此外,在内存损坏时,它还可能导致内存不可用。相反,expanding模式会占用更多的内存,但是会更快地分配内存,并且能够更好地保证在解分配后能够重新分配内存。 - -- **chunk_alloc_policy = auto** - - 设置全局内存的块分配策略。 - - MOT内存以2MB的块为单位组织。源NUMA节点和每个块的内存布局会影响表数据在NUMA节点间的分布,因此对数据访问时间有很大影响。在特定NUMA节点上分配块时,会参考分配策略。 - - 可用值包括auto、local、page-interleaved、chunk-interleaved、native。 - - - Auto策略根据当前硬件情况选择块分配策略。 - - Local策略在各自的NUMA节点上分配每个数据块。 - - Page-interleaved策略从所有NUMA节点分配由交插内存4千字节页组成的数据块。 - - Chunk-interleaved策略以轮循调度方式从所有NUMA节点分配数据块。 - - Native策略通过调用原生系统内存分配器来分配块。 - -- **chunk_prealloc_worker_count = 8** - - 设置每个NUMA节点参与内存预分配的工作线程数。 - -- **max_mot_global_memory = 80%** - - 设置MOT引擎全局内存的最大限制。 - - 指定百分比值与postgresql.conf中max_process_memory定义的总量有关。 - - MOT引擎内存分为全局(长期)内存,主要用于存储用户数据,以及本地(短期)内存,主要用于用户会话,以满足本地需求。 - - 任何试图分配超出此限制的内存的尝试将被拒绝,并向用户报告错误。请确保max_mot_global_memory与max_mot_local_memory之和不超过postgresql.conf中配置的max_process_memory。 - -- **min_mot_global_memory = 0 MB** - - 设置MOT引擎全局内存的最小限制。 - - 指定百分比值与postgresql.conf中max_process_memory定义的总量有关。 - - 此值用于启动期间的内存预分配,以及确保MOT引擎在正常运行期间有最小的内存可用量。当使用compact存储策略时(参阅上文store_memory_policy),该值指定了下限,超过下限的内存不会释放回内核,而是保留在MOT引擎中以便后续重用。 - -- **max_mot_local_memory = 15%** - - 设置MOT引擎本地内存的最大限制。 - - 指定百分比值与postgresql.conf中max_process_memory定义的总量有关。 - - MOT引擎内存分为全局(长期)内存,主要用于存储用户数据,以及本地(短期)内存,主要用于用户会话,以满足本地需求。 - - 任何试图分配超出此限制的内存的尝试将被拒绝,并向用户报告错误。请确保max_mot_global_memory与max_mot_local_memory之和不超过postgresql.conf中配置的max_process_memory。 - -- **min_mot_local_memory = 0 MB** - - 设置MOT引擎本地内存的最小限制。 - - 指定百分比值与postgresql.conf中max_process_memory定义的总量有关。 - - 此值用于在启动期间预分配内存,以及确保MOT引擎在正常运行期间有最小的可用内存。当使用compact存储策略时(参阅上文store_memory_policy),该值指定了下限,超过下限的内存不会释放回内核,而是保留在MOT引擎中以便后续重用。 - -- **max_mot_session_memory = 0 MB** - - 设置MOT引擎中单个会话的最大内存限制。 - - 指定百分比值与postgresql.conf中max_process_memory定义的总量有关。 - - 通常,MOT引擎中的会话可以根据需要分配尽可能多的本地内存,只要没有超出本地内存限制即可。为了避免单个会话占用过多的内存,从而拒绝其他会话的内存,通过该配置项限制小会话的本地内存分配(最大1022KB)。 - - 请确保该配置项不影响大会话的本地内存分配。 - - 0表示不会限制每个小会话的本地分配,除非是由max_mot_local_memory配置的本地内存分配限制引起的。 - -- **min_mot_session_memory = 0 MB** - - 设置MOT引擎中单个会话的最小内存预留。 - - 指定百分比值与postgresql.conf中max_process_memory定义的总量有关。 - - 此值用于在会话创建期间预分配内存,以及确保会话有最小的可用内存量来执行其正常操作。 - -- **high_red_mark_percent = 90** - - 设置内存分配的高红标记。 - - 
这是按照由max_mot_memory设置的MOT引擎的最大值百分比计算的。默认值为90,即90%。当MOT占用内存总量达到此值时,只允许进行破坏性操作。其它操作都向用户报告错误。 - -- **session_large_buffer_store_size = 0 MB** - - 设置会话的大缓冲区存储。 - - 当用户会话执行需要大量内存的查询时(例如,使用许多行),大缓冲区存储用于增加此类内存可用的确定级别,并更快地为这个内存请求提供服务。对于超过1022KB的会话,任何内存分配都是大内存分配。如果未使用或耗尽了大缓冲区存储,则这些分配将被视为直接从内核提供的巨大分配。 - -- **session_large_buffer_store_max_object_size = 0 MB** - - 设置会话的大分配缓冲区存储中的最大对象大小。 - - 大缓冲区存储内部被划分为不同大小的对象。此值用于对源自大缓冲区存储的对象设置上限,以及确定缓冲区存储内部划分为不同大小的对象。 - - 此大小不能超过session_large_buffer_store_size的1/8。如果超过,则将其调整到最大可能。 - -- **session_max_huge_object_size = 1 GB** - - 设置会话单个大内存分配的最大尺寸。 - - 巨大分配直接从内核中提供,因此不能保证成功。 - - 此值也适用于全局(非会话相关)内存分配。 - -
- -### 垃圾收集(MOT) - -- **reclaim_threshold = 512 KB** - - 设置垃圾收集器的内存阈值。 - - 每个会话管理自己的待回收对象列表,并在事务提交时执行自己的垃圾回收。此值决定了等待回收的对象的总内存阈值,超过该阈值,会话将触发垃圾回收。 - - 一般来说,这里是在权衡未回收对象与垃圾收集频率。设置低值会使未回收的内存保持在较少的水平,但会导致频繁的垃圾回收,从而影响性能。设置高值可以减少触发垃圾回收的频率,但会导致未回收的内存过多。此设置取决于整体工作负载。 - -- **reclaim_batch_size = 8000** - - 设置垃圾回收的批次大小。 - - 垃圾收集器从对象中批量回收内存,以便限制在一次垃圾收集传递中回收的对象数量。此目的是最小化单个垃圾收集传递的操作时间。 - -- **high_reclaim_threshold = 8 MB** - - 设置垃圾回收的高内存阈值。 - - 由于垃圾收集是批量工作的,因此会话可能有许多可以回收的对象,但这些对象不能回收。在这种情况下,为了防止垃圾收集列表变得过于膨胀,尽管已经达到批处理大小限制,此值继续单独回收对象,直到待回收对象小于该阈值,或者没有更多符合回收条件的对象。 - -
- -### JIT(MOT) - -- **enable\_mot\_codegen = false** - - 指定是否对计划查询使用JIT查询编译和执行。 - - JIT查询执行为在计划阶段准备好的查询准备了JIT编译的代码。每当调用准备好的查询时,都会执行生成的JIT编译函数。JIT编译通常以LLVM的形式进行。 - -- **enable\_mot\_codegen\_print = false** - - 是否为JIT编译的查询打印发出的LLVM代码。 - -- **mot\_codegen\_limit = 50000** - - 限制每个用户会话允许的JIT查询数量。 - -- **enable_mot_query_codegen = true** - - 计划查询是否使用JIT查询编译和执行。JIT查询执行允许在规划阶段为预处理查询提供即时编译代码。每当调用预处理查询时,就会执行生成的JIT编译函数。JIT编译以LLVM的形式进行。 - -- **enable_mot_sp_codegen = true** - - 存储过程是否使用JIT查询编译和执行。JIT查询执行允许在编译阶段为存储过程提供即时编译代码。每当调用存储过程时,就会执行生成的JIT编译函数。 - -- **enable_mot_codegen_profile = true** - - 是否使用JIT分析。使用此选项时,mot_jit_profile()函数可用于获取JIT存储过程和查询的运行时配置数据。 - -
- -### 存储(MOT) - -**allow_index_on_nullable_column = true** - -指定是否允许在可空列上定义索引。 - -
- -### 默认MOT.conf文件 - -最小设置和配置指定将Postgresql.conf文件指向MOT.conf文件的位置: - -``` -Postgresql.conf -mot_config_file = '/tmp/gauss/ MOT.conf' -``` - -确保max_process_memory设置的值足够包含MOT的全局(数据和索引)和本地(会话)内存。 - -MOT.conf的默认内容满足开始使用的需求。设置内容后续可以优化。 diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/4-mot-usage.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/4-mot-usage.md deleted file mode 100644 index 6c383faa58b9950cbf8eef80a80b1d9a7528b11d..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/4-mot-usage.md +++ /dev/null @@ -1,748 +0,0 @@ ---- -title: MOT使用 -summary: MOT使用 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT使用 - -使用MOT非常简单,以下几个小节将会进行描述。 - -MogDB允许应用程序使用MOT和基于标准磁盘的表。MOT适用于最活跃、高竞争和对吞吐量敏感的应用程序表,也可用于所有应用程序的表。 - -以下命令介绍如何创建MOT,以及如何将现有的基于磁盘的表转换为MOT,以加速应用程序的数据库相关性能。MOT尤其有利于已证明是瓶颈的表。 - -以下是与使用MOT相关的任务的简单概述: - -- 授予用户权限 -- 创建/删除MOT -- 为MOT创建索引 -- 将磁盘表转换为MOT -- 查询原生编译 -- 重试中止事务 -- MOT外部支持工具 -- MOT SQL覆盖和限制 - -
- -## 授予用户权限 - -以授予数据库用户对MOT存储引擎的访问权限为例。每个数据库用户仅执行一次,通常在初始配置阶段完成。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: MOT通过外部数据封装器(Foreign Data Wrapper,FDW)机制与MogDB数据库集成,所以需要授权用户权限。 - -要使特定用户能够创建和访问MOT(DDL、DML、SELECT),以下语句只执行一次: - -```sql -GRANT USAGE ON FOREIGN SERVER mot_server TO ; -``` - -所有关键字不区分大小写。 - -
- -## 创建/删除MOT - -创建MOT非常简单。只有MOT中的创建和删除表语句与MogDB中基于磁盘的表的语句不同。SELECT、DML和DDL的所有其他命令的语法对于MOT表和MogDB基于磁盘的表是一样的。 - -- 创建MOT: - - ```sql - create FOREIGN table test(x int) [server mot_server]; - ``` - -- 以上语句中: - - - 始终使用FOREIGN关键字引用MOT。 - - 在创建MOT表时,[server mot_server]部分是可选的,因为MOT是一个集成的引擎,而不是一个独立的服务器。 - - 上文以创建一个名为test的内存表(表中有一个名为x的整数列)为例。在下一节(创建索引)中将提供一个更现实的例子。 - - 如果postgresql.conf中开启了增量检查点,则无法创建MOT。因此请在创建MOT前将enable_incremental_checkpoint设置为off。 - -- 删除名为test的MOT: - - ```sql - drop FOREIGN table test; - ``` - -- ALTER TABLE - - 支持添加列、删除列和重命名列。 - -有关MOT的功能限制(如数据类型),请参见[MOT SQL覆盖和限制](#mot-sql覆盖和限制)。 - -
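-
-上文提到ALTER TABLE支持添加列、删除列和重命名列,下面给出一组仅作示意的写法(假设MOT外表沿用标准的ALTER FOREIGN TABLE语法,列名y、z为示例):
-
-```sql
-ALTER FOREIGN TABLE test ADD COLUMN y int;
-ALTER FOREIGN TABLE test RENAME COLUMN y TO z;
-ALTER FOREIGN TABLE test DROP COLUMN z;
-```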
- -## 为MOT创建索引 - -支持标准的PostgreSQL创建和删除索引语句。 - -例如: - -```sql -create index text_index1 on test(x) ; -``` - -创建一个用于TPC-C的ORDER表,并创建索引: - -```sql -create FOREIGN table bmsql_oorder ( - o_w_id integer not null, - o_d_id integer not null, - o_id integer not null, - o_c_id integer not null, - o_carrier_id integer, - o_ol_cnt integer, - o_all_local integer, - o_entry_d timestamp, - primary key (o_w_id, o_d_id, o_id) -); -create index bmsql_oorder_index1 on bmsql_oorder(o_w_id, o_d_id, o_c_id, o_id) ; -``` - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 在MOT名字之前不需要指定FOREIGN关键字,因为它仅用于创建和删除表的命令。 - -有关MOT索引限制,请参见[MOT SQL覆盖和限制](#mot-sql覆盖和限制)的索引部分内容。 - -
- -## 将磁盘表转换为MOT - -磁盘表直接转换为MOT尚不能实现,这意味着尚不存在将基于磁盘的表转换为MOT的ALTER TABLE语句。 - -下面介绍如何手动将基于磁盘的表转换为MOT,如何使用gs_dump工具导出数据,以及如何使用gs_restore工具导入数据。 - -
- -### 前置条件检查 - -检查待转换为MOT的磁盘表的模式是否包含所有需要的列。 - -检查架构是否包含任何不支持的列数据类型,具体参见[不支持的数据类型](#不支持的数据类型)章节。 - -如果不支持特定列,则建议首先创建一个更新了模式的备磁盘表。此模式与原始表相同,只是所有不支持的类型都已转换为支持的类型。 - -使用以下脚本导出该备磁盘表,然后导入到MOT中。 - -
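-
-对于上述列类型检查,也可以通过information_schema查询待转换表的列定义(以customer表为例,仅为示意):
-
-```sql
-SELECT column_name, data_type
-  FROM information_schema.columns
- WHERE table_name = 'customer';
-```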
- -### 转换 - -要将基于磁盘的表转换为MOT,请执行以下步骤: - -1. 暂停应用程序活动。 -2. 使用gs_dump工具将表数据转储到磁盘的物理文件中。请确保使用data only。 -3. 重命名原始基于磁盘的表。 -4. 创建同名同模式的MOT。请确保使用创建FOREIGN关键字指定该表为MOT。 -5. 使用gs_restore将磁盘文件的数据加载/恢复到数据库表中。 -6. 浏览或手动验证所有原始数据是否正确导入到新的MOT中。下面将举例说明。 -7. 恢复应用程序活动。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **须知**: 由于表名称保持不变,应用程序查询和相关数据库存储过程将能够无缝访问新的MOT,而无需更改代码。另一种方法是通过INSERT INTO SELECT语句将数据从普通(堆)表复制到新的MOT表。 -> -> ```sql -> INSERT INTO [MOT_table] SELECT * FROM [PG_table] WHERE condition; -> ``` -> -> 此方法受MOT事务大小限制,小于1GB。 - -
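-
-若选择上述说明中提到的INSERT INTO SELECT方式,并假设已按步骤3、步骤4将原表重命名为customer_bk、新建了同名MOT表customer(表名与下一节示例一致,仅为示意),则可参考如下写法,注意单个事务的数据量需小于1GB:
-
-```sql
-INSERT INTO customer SELECT * FROM customer_bk;
-```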
- -### 转换示例 - -假设要将数据库benchmarksql中一个基于磁盘的表customer迁移到MOT中。 - -将customer表迁移到MOT,操作步骤如下: - -1. 检查源表列类型。验证MOT支持所有类型,详情请参阅[不支持的数据类型](#不支持的数据类型)章节。 - - ```sql - benchmarksql-# \d+ customer - Table "public.customer" - Column | Type | Modifiers | Storage | Stats target | AboutopenGauss - --------+---------+-----------+---------+--------------+------------- - x | integer | | plain | | - y | integer | | plain | | - Has OIDs: no - Options: orientation=row, compression=no - ``` - -2. 请检查源表数据。 - - ```sql - benchmarksql=# select * from customer; - x | y - ---+--- - 1 | 2 - 3 | 4 - (2 rows) - ``` - -3. 只能使用gs_dump转储表数据。 - - ```sql - $ gs_dump -Fc benchmarksql -a --table customer -f customer.dump - gs_dump[port='15500'][benchmarksql][2020-06-04 16:45:38]: dump database benchmarksql successfully - gs_dump[port='15500'][benchmarksql][2020-06-04 16:45:38]: total time: 332 ms - ``` - -4. 重命名源表。 - - ```sql - benchmarksql=# alter table customer rename to customer_bk; - ALTER TABLE - ``` - -5. 创建与源表完全相同的MOT。 - - ```sql - benchmarksql=# create foreign table customer (x int, y int); - CREATE FOREIGN TABLE - benchmarksql=# select * from customer; - x | y - ---+--- - (0 rows) - ``` - -6. 将源转储数据导入到新MOT中。 - - ```sql - $ gs_restore -C -d benchmarksql customer.dump - restore operation successful - total time: 24 ms - Check that the data was imported successfully. - benchmarksql=# select * from customer; - x | y - ---+--- - 1 | 2 - 3 | 4 - (2 rows) - - benchmarksql=# \d - List of relations - Schema | Name | Type | Owner | Storage - --------+-------------+---------------+--------+---------------------------------- - public | customer | foreign table | aharon | - public | customer_bk | table | aharon | {orientation=row,compression=no} - (2 rows) - ``` - -
- -## 查询原生编译 - -MOT的另一个特性是,在预编译的完整查询需要执行之前,能够以原生格式(使用PREPARE语句)准备并解析这些查询。 - -这种原生格式方便后续更有效地执行(使用EXECUTE命令)。这种执行类型速度要快得多,因为原生格式在执行期间绕过多个数据库处理层,从而获得更好的性能。 - -这种分工避免了重复的解析分析操作。查询和事务语句可以交互执行。此功能有时称为即时(Just-In-Time,JIT)查询编译。 - -
- -### 查询编译:PREPARE语句 - -若要使用MOT的原生查询编译,请在执行查询之前调用PREPARE客户端语句。MOT将预编译查询和(或)从缓存预加载先前预编译的代码。 - -下面是SQL中PREPARE语法的示例: - -```sql -PREPARE name [ ( data_type [, ...] ) ] AS statement -``` - -PREPARE在数据库服务器中创建一个预处理语句,该语句是一个可用于优化性能的服务器端对象。 - -
- -### 运行命令 - -发出EXECUTE命令时,将解析、分析、重写和执行预处理语句。这种分工避免了重复的解析分析操作,同时使执行计划依赖于特定的设置值。 - -下面是在Java应用程序中调用PREPARE和EXECUTE语句的示例。 - -```java -conn = DriverManager.getConnection(connectionUrl, connectionUser, connectionPassword); - -// Example 1: PREPARE without bind settings -String query = "SELECT * FROM getusers"; -PreparedStatement prepStmt1 = conn.prepareStatement(query); -ResultSet rs1 = prepStmt1.executeQuery(); -while (rs1.next()) {…} - -// Example 2: PREPARE with bind settings -String sqlStmt = "SELECT * FROM employees where first_name=? and last_name like ?"; -PreparedStatement prepStmt2 = conn.prepareStatement(sqlStmt); -prepStmt2.setString(1, "Mark"); // first name “Mark” -prepStmt2.setString(2, "%n%"); // last name contains a letter “n” -ResultSet rs2 = prepStmt2.executeQuery(); -while (rs2.next()) {…} -``` - -MOT编译支持的特性和不支持的特性见下文。 - -
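-
-除了通过JDBC,也可以直接在SQL层使用PREPARE/EXECUTE。下面以前文创建的test表为例给出一个简单示意(语句名get_test为假设名称):
-
-```sql
-PREPARE get_test (int) AS SELECT * FROM test WHERE x = $1;
-EXECUTE get_test(1);
-DEALLOCATE get_test;
-```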
- -### 轻量执行支持的查询 - -以下查询类型适合轻量执行: - -- 简单点查询 - - SELECT (including SELECT for UPDATE) - - UPDATE - - DELETE -- INSERT查询 -- 引用主键的完整前缀的范围UPDATE查询 -- 引用主键的完整前缀的范围SELECT查询 -- JOIN查询,其中一部分或两部分重叠为点查询 -- 引用每个连接表中主键的完整前缀的JOIN查询 - -
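-
-例如,下面的预编译查询引用了主键的完整前缀,属于适合轻量执行的简单点查询(以上文的bmsql_oorder表为例,参数取值仅为示意):
-
-```sql
-PREPARE order_by_pk (int, int, int) AS
-  SELECT * FROM bmsql_oorder WHERE o_w_id = $1 AND o_d_id = $2 AND o_id = $3;
-EXECUTE order_by_pk(1, 1, 100);
-```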
- -### 轻量执行不支持的查询 - -任何特殊的查询属性都不适用于轻量执行。特别是如果以下条件中的任何一项适用,则该查询不适合轻量执行。有关更多信息,请参阅[原生编译和轻量执行不支持的查询](#原生编译和轻量执行不支持的查询)。 - -需要强调一点,如果查询语句不适用原生编译和轻量执行,不向客户端报告错误,查询仍以正常和规范的方式执行。 - -有关MOT原生编译功能的详细信息,请参阅[查询原生编译](#查询原生编译)的有关内容。 - -
- -### JIT存储过程 - -JIT存储过程(JIT SP)由MogDB MOT引擎(从5.0版本开始)支持,其目标是提供更高的性能和更低的延迟。 - -JIT SP是指通过LLVM运行时代码生成和执行库来生成代码、编译和执行存储过程。JIT SP仅对访问MOT表的存储过程可用,对用户完全透明。跨引擎事务的存储过程将由标准的PL/pgSQL执行。加速级别取决于存储过程逻辑复杂度。例如,一个真实的客户应用程序为不同的存储过程实现了20%、44%、300%和500%的加速,将存储过程延迟减少到数十毫秒。 - -在调用存储过程的查询PREPARE阶段或第一次执行存储过程时,JIT模块尝试将存储过程SQL转换为基于C的函数,并在运行时(使用LLVM)编译。如果成功,连续存储过程调用,MOT将执行编译函数,从而获得性能增益。如果无法生成编译函数,存储过程将由标准的PL/pgSQL执行。这两种情况对用户完全透明。 - -您可以参考[MOT JIT诊断](#MOT-JIT诊断)了解有用的诊断信息。 - -
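-
-下面给出一个通过PL/pgSQL定义、仅访问MOT表的简单函数示例(表名jit_demo、函数名jit_demo_insert均为假设名称,存储过程的写法类似);其是否被JIT编译,可通过下一节介绍的mot_jit_detail()查看:
-
-```sql
-create foreign table jit_demo (id int not null, val int, primary key(id));
-
-CREATE OR REPLACE FUNCTION jit_demo_insert(p_id int, p_val int) RETURNS void AS $$
-BEGIN
-    -- 仅访问MOT表的简单插入逻辑
-    INSERT INTO jit_demo VALUES (p_id, p_val);
-END;
-$$ LANGUAGE plpgsql;
-
-SELECT jit_demo_insert(1, 100);
-```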
- -### MOT JIT诊断 - -#### mot_jit_detail - -该内置函数用于查询JIT编译(代码生成)的详细信息。 - -##### 使用示例 - -```sql -select * from mot_jit_detail(); - -select proc_oid, substr(query, 0, 50), namespace, jittable_status, valid_status, last_updated, plan_type, codegen_time from mot_jit_detail(); -``` - -##### 输出说明 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

-| 字段 | 说明 |
-| :--- | :--- |
-| proc_oid | 过程OID(数据库中过程的真实对象ID)。0表示查询。 |
-| query | 查询字符串或存储过程名称。 |
-| namespace | 查询或过程所属的命名空间。对于过程和顶级查询,值为GLOBAL。对于所有调用查询、子查询,此字段将显示父信息。 |
-| jittable_status | 是否为JIT查询或过程:<br>- jittable:JIT查询或过程<br>- unjittable:不是JIT查询或过程<br>- invalid:无效状态(DDL或JIT编译进行中导致失效后的临时状态) |
-| valid_status | 查询或过程是否有效:<br>- valid:查询或过程有效<br>- unavailable:JIT编译进行中<br>- error:错误状态<br>- dropped:过程已删除<br>- replaced:过程已替换 |
-| last_updated | 上次更新状态时的时间戳。 |
-| plan_type | 表示存储过程或查询类型。 |
-| codegen_time | 代码生成(JIT编译)所需的总时间,单位为微秒。 |
-| verify_time | LLVM验证时间(内部),单位为微秒。 |
-| finalize_time | LLVM完成时间(内部),单位为微秒。 |
-| compile_time | LLVM编译时间(内部),单位为微秒。 |
- -#### mot\_jit\_profile - -此内置函数用于查找查询或存储过程执行的分析数据(性能数据)。 - -##### 使用示例 - -```sql -select * from mot_jit_profile(); - -select proc_oid, id, parent_id, substr(query, 0, 50), namespace, weight, total, self, child_gross, child_net from mot_jit_profile(); -``` - -##### 输出说明 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

-| 字段 | 说明 |
-| :--- | :--- |
-| proc_oid | 过程OID(数据库中过程的真实对象ID)。0表示查询。 |
-| id | 用于操控输出的内部ID。 |
-| parent_id | 父ID(内部ID)。仅适用于子查询和子过程。-1用于顶级查询和过程。 |
-| query | 查询字符串或存储过程名称。 |
-| namespace | 查询或过程所属的命名空间。对于过程和顶级查询,值为GLOBAL。对于所有调用查询、子查询,此字段将显示父信息。 |
-| weight | 执行子查询或子过程的平均次数(每执行一次父存储过程)。 |
-| total | 执行查询或过程所需的总时间,单位为微秒。 |
-| self | 查询或过程所花费的时间,不包括子查询和子过程所花费的时间,单位为微秒。 |
-| child_gross | 执行所有子查询和子过程所花费的总时间(child_net+准备执行所有子查询和子过程所花费的时间),单位为微秒。 |
-| child_net | 所有子查询和子过程所花费的总时间,即∑(child总数×weight),单位为微秒。 |
-| def_vars | 定义变量(内部)所需的时间,单位为微秒。 |
-| init_vars | 初始化变量(内部)所需的时间,单位为微秒。 |
- -#### 其他 - -另外,[PG_PROC](../../../reference-guide/system-catalogs-and-system-views/system-catalogs/PG_PROC.md)系统表也可用于获取存储过程和函数的有关信息。 - -例如,存储过程内容的查询如下: - -```sql -select proname,prosrc from pg_proc where proname='sp_call_filter_rules_100_1'; -``` - -
- -## 重试中止事务 - -在乐观并发控制(OCC)中,在COMMIT阶段前的事务期间(使用任何隔离级别)不会对记录进行锁定。这是一个能显著提高性能的强大优势。它的缺点是,如果另一个会话尝试更新相同的记录,则更新可能会失败。所以必须中止整个事务。这些所谓的更新冲突是由MOT在提交时通过版本检查机制检测到的。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 使用悲观并发控制的引擎,如标准Postgres和MogDB基于磁盘的表,当使用SERIALIZABLE或REPEATABLE-READ隔离级别时,也会发生类似的异常中止。 - -这种更新冲突在常见的OLTP场景中非常少见,在使用MOT时尤其少见。但是,由于仍有可能发生这种情况,开发人员应该考虑使用事务重试代码来解决此问题。 - -下面以多个会话同时尝试更新同一个表为例,说明如何重试表命令。有关更多详细信息,请参阅[OCC与2PL的区别举例](../../../administrator-guide/mot-engine/3-concepts-of-mot/3-2.md#occ与2pl的区别举例)部分。下面以TPC-C支付事务为例。 - -```sql -int commitAborts = 0; - -while (commitAborts < RETRY_LIMIT) { - - try { - stmt =db.stmtPaymentUpdateDistrict; - stmt.setDouble(1, 100); - stmt.setInt(2, 1); - stmt.setInt(3, 1); - stmt.executeUpdate(); - - db.commit(); - - break; - } - catch (SQLException se) { - if(se != null && se.getMessage().contains("could not serialize access due to concurrent update")) { - log.error("commmit abort = " + se.getMessage()); - commitAborts++; - continue; - }else { - db.rollback(); - } - - break; - } -} -``` - -
- -## MOT外部支持工具 - -为了支持MOT,修改了以下外部MogDB工具。请确保使用的工具是最新版本。下面将介绍与MOT相关的用法。有关这些工具及其使用方法的完整说明,请参阅《参考指南》中的“[工具参考](../../../reference-guide/tool-reference/tool-overview.md)”章节。 - -
- -### gs_ctl(全量和增量) - -此工具用于从主服务器创建备服务器,以及当服务器的时间线偏离后,将服务器与其副本进行同步。 - -在操作结束时,工具将获取最新的MOT检查点,同时考虑checkpoint_dir配置值。 - -检查点从源服务器的checkpoint_dir读取到目标服务器的checkpoint_dir。 - -目前MOT不支持增量检查点。因此,gs_ctl增量构建对于MOT来说不是以增量方式工作,而是以全量方式工作。Postgres磁盘表仍然可以增量构建。 - -
- -### gs_basebackup - -gs_basebackup用于准备运行中服务器的基础备份,不影响其他数据库客户端。 - -MOT检查点也会在操作结束时获取。但是,检查点的位置是从源服务器中的checkpoint_dir获取的,并传输到源数据目录中,以便正确备份。 - -
- -### gs_dump - -gs_dump用于将数据库模式和数据导出到文件中。支持MOT。 - -
- -### gs_restore - -gs_restore用于从文件中导入数据库模式和数据。支持MOT。 - -
- -## MOT SQL覆盖和限制 - -MOT设计几乎能够覆盖SQL和未来特性集。例如,大多数支持标准的Postgres SQL,也支持常见的数据库特性,如存储过程、自定义函数等。 - -下面介绍各种SQL覆盖和限制。 - -
- -### 不支持的特性 - -MOT不支持以下特性: - -- 隔离性:不支持SERIALIZABLE隔离。 -- 查询原生编译(JIT):SQL覆盖范围有限。 -- 本地内存限制为1GB。一个事务只能更改小于1GB的数据。 -- 容量(数据+索引)受限于可用内存。 -- 不支持全文检索索引。 -- 不支持逻辑复制特性。 -- 不支持保存点。 - -此外,下面详细列出了MOT、MOT索引、查询和DML语法的各种通用限制,以及查询原生编译的特点和限制。 - -
- -### MOT限制 - -MOT功能限制: - -- 分区 -- AES加密、数据动态脱敏、行级访问控制 -- 流操作 -- 自定义类型 -- 子事务:仅支持存储过程的语句块上下文,且有以下限制:MOT恢复支持仅包含SELECT操作的子事务,且仅允许只读回滚。在这种情况下,父事务将中止。 -- DML触发器 -- DDL触发器 -- “C”或“POSIX”以外的排序规则 - -
- -### 不支持的DDL操作 - -- CREATE FOREIGN table LIKE:有限支持,LIKE可以用于任何表(MOT和堆表),但不带任何选项、数据或索引。 -- 创建as select表 -- 按范围分区 -- 创建无日志记录子句(no-logging clause)的表 -- 创建可延迟约束主键(DEFERRABLE) -- 重新索引 -- 表空间 -- 使用子命令创建架构 - -
- -### 不支持的数据类型 - -- UUID -- User-Defined Type (UDF) -- Array data type -- NVARCHAR2(n) -- Clob -- Name -- Blob -- Raw -- Path -- Circle -- Reltime -- Bit varying(10) -- Tsvector -- Tsquery -- JSON -- Box -- Text -- Line -- Point -- LSEG -- POLYGON -- INET -- CIDR -- MACADDR -- Smalldatetime -- BYTEA -- Bit -- Varbit -- OID -- Money -- Any unlimited varchar/character varying -- HSTORE -- XML -- Int16 -- Abstime -- Tsrange -- Tstzrange -- Int8range -- Int4range -- Numrange -- Daterange -- HLL - -
- -### 不支持的索引DDL和索引 - -- 在小数和数值类型上创建索引 - -- 在可空列上创建索引 - -- 单表创建索引总数>9 - -- 在键大小>256的表上创建索引 - - 键大小包括以字节为单位的列大小+列附加大小,这是维护索引所需的开销。下表列出了不同列类型的列附加大小。 - - 此外,如果索引不是唯一的,额外需要8字节。 - - 下面是伪代码计算键大小: - - ```sql - keySize =0; - - for each (column in index){ - keySize += (columnSize + columnAddSize); - } - if (index is non_unique) { - keySize += 8; - } - ``` - - | 列类型 | 列大小 | 列附加大小 | - | :------- | :----- | :--------- | - | varchar | N | 4 | - | tinyint | 1 | 1 | - | smallint | 2 | 1 | - | int | 4 | 1 | - | longint | 8 | 1 | - | float | 4 | 2 | - | double | 8 | 3 | - -上表中未指定的类型,列附加大小为零(例如时间戳)。 - -
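-
-下面按上述伪代码给出一个计算键大小的简单算例(列组合仅为示意):假设一个非唯一索引包含一列varchar(10)和一列int,则keySize = (10 + 4) + (4 + 1) + 8 = 27字节,未超过256的限制。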
- -### 不支持的DML - -- Merge into -- Lock table -- Copy from table -- Upsert - -
- -### 不支持的JIT功能(原生编译和执行) - -- 存储过程编译:仅访问MOT表的存储过程可用。 -- 查询涉及两个以上的表 -- 查询有以下任何一个情况: - - 非原生类型的聚合 - - 窗口功能 - - 子查询子链接 - - Distinct-ON修饰语(distinct子句来自DISTINCT ON) - - 递归(已指定WITH RECURSIVE) - - 修改CTE(WITH中有INSERT/UPDATE/DELETE) - -以下子句不支持轻量执行: - -- Returning list -- Group By clause -- Grouping sets -- Having clause -- Windows clause -- Distinct clause -- Sort clause that does not conform to native index order:支持,但所有排序列都必须存在于SELECT中。 -- Set operations -- Constraint dependencies \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/5-mot-administration.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/5-mot-administration.md deleted file mode 100644 index 008636010ab619de728b49e9aefeabd8a36ace14..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/5-mot-administration.md +++ /dev/null @@ -1,460 +0,0 @@ ---- -title: MOT管理 -summary: MOT管理 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT管理 - -下面介绍MOT管理。 - -
- -## MOT持久性 - -持久性是指长期的数据保护(也称为磁盘持久化)。持久性意味着存储的数据不会遭受任何形式的退化或损坏,因此数据不会丢失或损坏。持久性可确保在有计划停机(例如维护)或计划外崩溃(例如电源故障)后数据和MOT引擎恢复到一致状态。 - -内存存储是易失的,需要电力来维护所存储的信息。另一方面,磁盘存储是非易失性的,这意味着它不需要电源来维护存储的信息,因此不用担心停电。MOT同时使用这两种类型的存储。内存中存储了所有数据,同时将事务性更改持久化到磁盘,并保持频繁的定期[MOT检查点](#mot检查点),以确保在关机时恢复数据。 - -用户必须保证有足够的磁盘空间用于日志记录和检查点操作。检查点使用单独的驱动器,通过减少磁盘I/O负载来提高性能。 - -参考[MOT关键技术](../../../administrator-guide/mot-engine/1-introducing-mot/3-mot-key-technologies.md)了解如何在MOT引擎中实现持久化。 - -若要设置持久性: - -为保证严格一致性,请在postgres.conf配置文件中将参数sync_commit配置为On。 - -MOT的WAL重做日志和检查点开启持久性,下面将详细介绍。 - -
- -### MOT日志记录:WAL重做日志 - -为保证持久性,MOT全面集成MogDB的WAL机制,通过MogDB的XLOG接口持久化WAL记录。这意味着,每次MOT记录的添加、更新和删除都记录在WAL中。确保了可以从这个非易失性日志重新生成和恢复最新的数据状态。例如,如果向表中添加了3行,删除了2行,更新了1行,那么日志中将记录6个条目。 - -MOT日志记录和MogDB磁盘表的其他记录写入同一个WAL中。 - -MOT只记录事务提交阶段的操作。 - -MOT只记录更新的增量记录,以便最小化写入磁盘的数据量。 - -在恢复期间,从最后一个已知或特定检查点加载数据;然后使用WAL重做日志完成从该点开始发生的数据更改。 - -WAL重做日志将保留所有表行修改,直到执行检查点(如上所述)。然后可以截断日志,以减少恢复时间和节省磁盘空间。 - -注意:为确保日志IO设备不会成为瓶颈,日志文件必须放在具有低延迟的驱动器上。 - -
- -### MOT日志类型 - -支持两个同步事务日志选项和一个异步事务日志选项(标准MogDB磁盘引擎也支持这些选项)。MOT还支持同步的组提交日志记录与NUMA-aware优化,如下所述。 - -根据您的配置,实现以下类型的日志记录: - -- **同步重做日志记录** - - 同步重做日志记录选项是最简单、最严格的重做日志记录器。当客户端应用程序提交事务时,事务重做条目记录在WAL重做日志中,如下所示: - - 1. 当事务正在进行时,它存储在MOT内存中。 - - 2. 事务完成后,客户端应用程序发送Commit命令,该事务被锁定,然后写入磁盘上的WAL重做日志。当事务日志条目写入日志时,客户端应用程序仍在等待响应。 - - 3. 一旦事务的整个缓冲区被写入日志,就更改内存中的数据,然后提交事务。事务提交后,通知客户端应用程序事务完成。 - - **总结** - - 同步重做日志记录选项是最安全、最严格的,因为它确保了客户端应用程序和每个事务提交时的WAL重做日志条目的完全同步,从而确保了总的持久性和一致性,并且绝对不会丢失数据。此日志记录选项可防止客户端应用程序在事务尚未持久化到磁盘时将事务标记为成功的情况。 - - 同步重做日志记录选项的缺点是,它是三个选项中最慢的日志机制。因为客户端应用程序必须等待所有数据都写入磁盘,并且磁盘写入过于频繁导致数据库变慢。 - -- **组同步重做日志记录** - - 组同步重做日志记录选项与同步重做日志记录选项非常相似,它确保完全持久性,绝对不会丢失数据,并保证客户端应用程序和WAL重做日志条目的完全同步。不同的是,组同步重做日志记录选项将事务重做条目组同时写入磁盘上的WAL重做日志,而不是在提交时写入每个事务。使用组同步重做日志记录可以减少磁盘I/O数量,从而提高性能,特别是在运行繁重的工作负载时。 - - MOT引擎通过根据运行事务的核的NUMA槽位自动对事务进行分组,使用NUMA感知优化来执行同步的组提交记录。 - - 有关NUMA-aware内存访问的更多信息,请参阅[NUMA-aware分配和亲和性](../../../administrator-guide/mot-engine/3-concepts-of-mot/3-4.md)。 - - 当一个事务提交时,一组条目记录在WAL重做日志中,如下所示: - - 1. 当事务正在进行时,它存储在内存中。MOT引擎根据运行事务的核的NUMA槽位对桶中的事务进行分组。在同一槽位上运行的所有事务都被分组在一起,并且多个组将根据事务运行的核并行填充。 - - 这样,将事务写入WAL更为有效,因为来自同一个槽位的所有缓冲区都一起写入磁盘。 - - 注意:每个线程在属于单槽位的单核/CPU上运行,每个线程只写在各自运行的核的槽位上。 - - 2. 在事务完成并且客户端应用程序发送Commit命令之后,事务重做日志条目将与同组的其他事务一起序列化。 - - 3. 当特定一组事务满足配置条件后,如[重做日志(MOT)](3-mot-deployment.md#重做日志mot)小节中描述的已提交的事务数或超时时间,该组中的事务将被写入磁盘的WAL中。当这些日志条目被写入日志时,发出提交请求的客户端应用程序正在等待响应。 - - 4. 一旦NUMA-aware组中的所有事务缓冲区都写入日志,该组中的所有事务都将对内存存储执行必要的更改,并且通知客户端这些事务已完成。 - - **总结** - - 组同步重做日志记录选项是一个极其安全和严格的日志记录选项,因为它确保了客户端应用程序和WAL重做日志条目的完全同步,从而确保了总的持久性和一致性,而且绝对不会丢失数据。此日志记录选项可防止客户端应用程序在事务尚未持久化到磁盘时将事务标记为成功的情况。 - - 该选项的磁盘写入次数比同步重做日志记录选项少,这可能意味着它更快。缺点是事务被锁定的时间更长。在同一NUMA内存中的所有事务都写入磁盘的WAL重做日志之前,它们一直处于锁定状态。 - - 是否使用此选项取决于事务工作负载的类型。例如,此选项有利于事务较多的系统。而对于事务少的系统而言,磁盘写入量也很少,因此不建议使用。 - -- **异步重做日志记录** - - 异步重做日志记录选项是最快的日志记录方法,但是它不能确保数据不会丢失,某些仍位于缓冲区中且尚未写入磁盘的数据在电源故障或数据库崩溃时可能会丢失。当客户端应用程序提交事务时,事务重做条目将记录在内部缓冲区中,并按预先配置的时间间隔写入磁盘。客户端应用程序不会等待数据写入磁盘。它将继续进行下一个事务。这就是异步重做日志记录最快的原因。 - - 当客户端应用程序提交事务时,事务重做条目记录在WAL重做日志中,如下所示: - - 1. 当事务正在进行时,它存储在MOT内存中。 - - 2. 在事务完成并且客户端应用程序发送Commit命令后,事务重做条目将被写入内部缓冲区,但尚未写入磁盘。然后更改MOT数据内存,并通知客户端应用程序事务已提交。 - - 3. 后台运行的重做日志线程按预先配置的时间间隔收集所有缓存的重做日志条目,并将它们写入磁盘。 - - **总结** - - 异步重做日志记录选项是最快的日志记录选项,因为它不需要客户端应用程序等待数据写入磁盘。此外,它将许多事务重做条目分组并把它们写入一起,从而减少降低MOT引擎速度的磁盘I/O数量。 - - 异步重做日志记录选项的缺点是它不能确保在崩溃或失败时数据不会丢失。已提交但尚未写入磁盘的数据在提交时是不持久的,因此在出现故障时无法恢复。异步重做日志记录选项对于愿意牺牲数据恢复(一致性)而不是性能的应用程序来说最为适用。 - -
- -### 配置日志 - -标准MogDB磁盘引擎支持两个同步事务日志选项和一个异步事务日志选项。 - -配置日志记录 - -1. 在postgres.conf配置文件中的sync_commit (On = Synchronous)参数中指定是否执行同步或异步事务日志记录。 -2. 在重做日志章节中的mot.conf配置文件里,将enable_redo_log参数设置为True。 - -如果已选择事务日志记录的同步模式(如上文所述,synchronous_commit = On),则在mot.conf配置文件中的enable_group_commit参数中指定Group Synchronous Redo Logging选项或Synchronous Redo Logging选项。如果选择Group Synchronous Redo Logging,必须在mot.conf文件中定义以下阈值,决定何时将一组事务记录在WAL中。 - -- group_commit_size:一组已提交的事务数。例如,16表示当同一组中的16个事务已由它们的客户端应用程序提交时,则针对16个事务中的每个事务,在磁盘的WAL重做日志中写入一个条目。 -- group_commit_timeout:超时时间,单位为毫秒。例如,10表示在10毫秒之后,为同一组由客户端应用程序在最近10毫秒内提交的每个事务,在磁盘的WAL重做日志中写入一个条目。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 有关配置的详细信息,请参阅[重做日志(MOT)](3-mot-deployment.md#重做日志mot)。 - -
- -### MOT检查点 - -检查点是一个时间点。在这个时间点,表行的所有数据都保存在持久存储上的文件中,以便创建完整持久的数据库镜像。这是一个数据在某个时间点的快照。 - -检查点减少了为确保持久性而必须重放的WAL重做日志条目的数量,以此缩短数据库的恢复时间。检查点还减少了保存所有日志条目所需的存储空间。 - -如果没有检查点,那么为了恢复数据库,所有WAL重做条目必须从开始时间进行重放,可能需要几天或几周的时间,这取决于数据库中的记录数量。检查点记录数据库的当前状态,并允许丢弃旧的重做条目。 - -检查点在恢复方案(特别是冷启动)中是必不可少的。首先,从最后一个已知或特定检查点加载数据;然后使用WAL完成此后发生的数据更改。 - -例如,如果同一表行被修改100次,则日志中将记录100个条目。当使用检查点后,即使表行被修改了100次,检查点也可以一次性记录。在记录检查点之后,可以基于该检查点执行恢复,并且只需要播放自该检查点之后发生的WAL重做日志条目。 - -
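-
-如需手动触发一次检查点(MOT检查点会随MogDB检查点一并执行),可以执行如下命令:
-
-```sql
-CHECKPOINT;
-```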
- -## MOT恢复 - -MOT恢复的主要目标是在有计划停机(例如维护)或计划外崩溃(例如电源故障后)后,将数据和MOT引擎恢复到一致状态。 - -MOT恢复是随着MogDB数据库其余部分的恢复而自动执行的,并且完全集成到MogDB恢复过程(也称为冷启动)。 - -MOT恢复包括两个阶段: - -检查点恢复:必须通过将数据加载到内存行并创建索引,从磁盘上的最新检查点文件恢复数据。 - -WAL重做日志恢复:从检查点恢复中使用检查点后,必须通过重放之后添加到日志中的记录,从WAL重做日志中恢复最近的数据(在检查点中未捕获)。 - -MogDB管理和触发WAL重做日志恢复。 - -为了缩短RTO(故障切换事件的备节点恢复)并加快冷启动,MOT引擎支持重做日志恢复和检查点恢复的进程并行。 - -- 配置parallel_recovery_workers和parallel_recovery_queue_size以更改重做日志恢复(日志回放)并行工作方式。此设置将影响RTO。 - -- 配置checkpoint_recovery_workers以更改检查点恢复并行工作方式。此设置主要影响冷启动时间。 - -更多说明和默认值,参见[恢复(MOT)](./3-mot-deployment.md#恢复mot)中的描述。 - -
- -## MOT复制和高可用 - -由于MOT集成到MogDB中,并且使用或支持其复制和高可用,因此,MOT原厂功能即支持同步复制和异步复制。 - -MogDB gs_ctl工具用于可用性控制和数据库操作。这包括gs_ctl切换、gs_ctl故障切换、gs_ctl构建等等。 - -有关更多信息,请参见参考指南 -> [gs_ctl](../../../reference-guide/tool-reference/tools-used-in-the-internal-system/gs_ctl.md)章节。 - -- 配置复制和高可用性。 -- 请参考MogDB相关文档。 - -
- -## MOT内存管理 - -规划和微调请参见[MOT内存和存储规划](2-mot-preparation.md#mot内存和存储规划)和[MOT配置](3-mot-deployment.md#mot配置)。 - -
- -## MOT VACUUM清理 - -使用VACUUM进行垃圾收集,并有选择地分析数据库,如下所示。 - -- 【Postgres】 - - 在Postgres中,VACUUM用于回收死元组占用的存储空间。在正常的Postgres操作中,删除的元组或因更新而作废的元组不会从表中物理删除。只能由VACUUM清理。因此,需要定期执行VACUUM,特别是在频繁更新的表上。 - -- 【MOT扩展】 - - MOT不需要周期性的VACUUM操作,因为新元组会重用失效元组和空元组。只有当MOT的大小急剧减少,并且不计划恢复到原来大小时,才需要VACUUM操作。 - - 例如,应用程序定期(如每周一次)大量删除表数据的同时插入新数据,这需要几天时间,并且不一定是相同数量的行。在这种情况下,可以使用VACUUM。 - - 对MOT的VACUUM操作总是被转换为带有排他表锁的VACUUM FULL。 - -- 支持的语法和限制 - - 按规范激活VACUUM操作。 - - ```sql - VACUUM [FULL | ANALYZE] [ table ]; - ``` - - 只支持FULL和ANALYZE VACUUM两种类型。VACUUM操作只能对整个MOT进行。 - - 不支持以下Postgres VACUUM选项: - - - FREEZE - - VERBOSE - - Column specification - - LAZY模式(部分表扫描) - - 此外,不支持以下功能: - - - AUTOVACUUM - -
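-
-例如,按上述语法对前文示例中的MOT表customer执行VACUUM(表名仅为示意):
-
-```sql
-VACUUM FULL customer;
-VACUUM ANALYZE customer;
-```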
- -## MOT统计 - -统计信息主要用于性能分析或调试。在生产环境中,通常不打开它们(默认是关闭的)。统计信息主要由数据库开发人员使用,数据库用户较少使用。 - -对性能有一定影响,特别是对服务器。对用户的影响可以忽略不计。 - -统计信息保存在数据库服务器日志中。该日志位于data文件夹中,命名为postgresql-DATE-TIME.log。 - -有关详细的配置选项,请参阅[统计(MOT)](3-mot-deployment.md#统计mot)。 - -
- -## MOT监控 - -监控的所有语法支持基于Postgres的FDW表,包括下面的表或索引大小。此外,还存在用于监控MOT内存消耗的特殊函数,包括MOT全局内存、MOT本地内存和单个客户端会话。 - -
- -### 表和索引大小 - -可以通过查询pg_relation_size来监控表和索引的大小。 - -例如: - -**数据大小** - -```sql -select pg_relation_size('customer'); -``` - -**索引** - -```sql -select pg_relation_size('customer_pkey'); -``` - -
- -### MOT全局内存详情 - -检查MOT全局内存大小,主要是数据和索引。 - -```sql -select * from mot_global_memory_detail(); -``` - -结果如下。 - -```sql -numa_node | reserved_size | used_size -----------------+----------------+------------- --1 | 194716368896 | 25908215808 -0 | 446693376 | 446693376 -1 | 452984832 | 452984832 -2 | 452984832 | 452984832 -3 | 452984832 | 452984832 -4 | 452984832 | 452984832 -5 | 364904448 | 364904448 -6 | 301989888 | 301989888 -7 | 301989888 | 301989888 -``` - -其中, - -- -1为总内存。 -- 0-7为NUMA内存节点。 - -
- -### MOT本地内存详情 - -检查MOT本地内存大小,包括会话内存。 - -```sql -select * from mot_local_memory_detail(); -``` - -结果如下。 - -```sql -numa_node | reserved_size | used_size -----------------+----------------+------------- --1 | 144703488 | 144703488 -0 | 25165824 | 25165824 -1 | 25165824 | 25165824 -2 | 18874368 | 18874368 -3 | 18874368 | 18874368 -4 | 18874368 | 18874368 -5 | 12582912 | 12582912 -6 | 12582912 | 12582912 -7 | 12582912 | 12582912 -``` - -其中, - -- -1为总内存。 -- 0-7为NUMA内存节点。 - -
- -### 会话内存 - -会话管理的内存从MOT本地内存中获取。 - -所有活动会话(连接)的内存使用量可以通过以下查询。 - -```sql -select * from mot_session_memory_detail(); -``` - -结果如下。 - -```sql -sessid | total_size | free_size | used_size -----------------------------------------+-----------+----------+---------- -1591175063.139755603855104 | 6291456 | 1800704 | 4490752 -``` - -其中, - -- total_size:分配给会话的内存。 -- free_size:未使用的内存。 -- used_size:使用中的内存。 - -DBA可以通过以下查询确定当前会话使用的本地内存状态。 - -```sql -select * from mot_session_memory_detail() - where sessid = pg_current_sessionid(); -``` - -结果如下。 - -![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-administration-1.png) - -
- -## MOT错误消息 - -错误可能由多种场景引起。所有错误都记录在数据库服务器日志文件中。此外,与用户相关的错误作为对查询、事务或存储过程执行或数据库管理操作的响应的一部分返回给用户。 - -- 服务器日志中报告的错误包括函数、实体、上下文、错误消息、错误描述和严重性。 -- 向用户报告的错误被翻译成标准PostgreSQL错误码,可能由MOT特定的消息和描述组成。 - -错误提示、错误描述和错误码见下文。该错误码实际上是内部代码,不记录也不返回给用户。 - -
- -### 写入日志文件的错误 - -所有错误都记录在数据库服务器日志文件中。以下列出了写入数据库服务器日志文件但未返回给用户的错误。该日志位于data文件夹中,命名为postgresql-DATE-TIME.log。 - -**表 1** 只写入日志文件的错误 - -| 日志消息 | 内部错误代码 | -| :---------------------------------- | :------------------------------- | -| Error code denoting success | MOT_NO_ERROR 0 | -| Out of memory | MOT_ERROR_OOM 1 | -| Invalid configuration | MOT_ERROR_INVALID_CFG 2 | -| Invalid argument passed to function | MOT_ERROR_INVALID_ARG 3 | -| System call failed | MOT_ERROR_SYSTEM_FAILURE 4 | -| Resource limit reached | MOT_ERROR_RESOURCE_LIMIT 5 | -| Internal logic error | MOT_ERROR_INTERNAL 6 | -| Resource unavailable | MOT_ERROR_RESOURCE_UNAVAILABLE 7 | -| Unique violation | MOT_ERROR_UNIQUE_VIOLATION 8 | -| Invalid memory allocation size | MOT_ERROR_INVALID_MEMORY_SIZE 9 | -| Index out of range | MOT_ERROR_INDEX_OUT_OF_RANGE 10 | -| Error code unknown | MOT_ERROR_INVALID_STATE 11 | - -
- -### 返回给用户的错误 - -下面列出了写入数据库服务器日志文件并返回给用户的错误。 - -MOT使用返回码(Return Code,RC)返回Postgres标准错误代码至封装。某些RC会导致向正在与数据库交互的用户生成错误消息。 - -MOT从内部返回Postgres代码(见下文)到数据库包,数据库封装根据标准的Postgres行为对其做出反应。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 提示信息中的%s、%u、%lu指代相应的错误信息(如查询、表名或其他信息)。 - %s:字符串 - %u:数字 - %lu:数字 - -**表 2** 返回给用户并记录到日志文件的错误 - -| 返回给用户的短/长描述 | Postgres代码 | 内部错误码 | -| :---------------------------------- | :------------------------------ | :------------------------------ | -| Success.Denotes success | ERRCODE*SUCCESSFUL*COMPLETION | RC_OK = 0 | -| FailureUnknown error has occurred. | ERRCODE_FDW_ERROR | RC_ERROR = 1 | -| Unknown error has occurred.Denotes aborted operation. | ERRCODE_FDW_ERROR | RC_ABORT | -| Column definition of %s is not supported.Column type %s is not supported yet. | ERRCODE_INVALID_COLUMN_DEFINITION | RC_UNSUPPORTED_COL_TYPE | -| Column definition of %s is not supported.Column type Array of %s is not supported yet. | ERRCODE_INVALID_COLUMN_DEFINITION | RC_UNSUPPORTED_COL_TYPE_ARR | -| Column size %d exceeds max tuple size %u.Column definition of %s is not supported. | ERRCODE_FEATURE_NOT_SUPPORTED | RC_EXCEEDS_MAX_ROW_SIZE | -| Column name %s exceeds max name size %u.Column definition of %s is not supported. | ERRCODE_INVALID_COLUMN_DEFINITION | RC_COL_NAME_EXCEEDS_MAX_SIZE | -| Column size %d exceeds max size %u.Column definition of %s is not supported. | ERRCODE_INVALID_COLUMN_DEFINITION | RC_COL_SIZE_INVLALID | -| Cannot create table.Cannot add column %s; as the number of declared columns exceeds the maximum declared columns. | ERRCODE_FEATURE*NOT*SUPPORTED | RC_TABLE_EXCEEDS*MAX*DECLARED_COLS | -| Cannot create index.Total column size is greater than maximum index size %u. | ERRCODE_FDW_KEY*SIZE*EXCEEDS_MAX_ALLOWED | RC_INDEX_EXCEEDS_MAX_SIZE | -| Cannot create index.Total number of indexes for table %s is greater than the maximum number of indexes allowed %u. | ERRCODE_FDW_TOO*MANY*INDEXES | RC_TABLE_EXCEEDS_MAX_INDEXES | -| Cannot execute statement.Maximum number of DDLs per transaction reached the maximum %u. | ERRCODE_FDW_TOO*MANY*DDL_CHANGES*IN*TRANSACTION*NOT*ALLOWED | RC_TXN_EXCEEDS_MAX_DDLS | -| Unique constraint violationDuplicate key value violates unique constraint \“%s\“”.Key %s already exists. | ERRCODE*UNIQUE*VIOLATION | RC_UNIQUE_VIOLATION | -| Table \“%s\” does not exist. | ERRCODE_UNDEFINED_TABLE | RC_TABLE_NOT_FOUND | -| Index \“%s\” does not exist. | ERRCODE_UNDEFINED_TABLE | RC_INDEX_NOT_FOUND | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_LOCAL_ROW_FOUND | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_LOCAL_ROW_NOT_FOUND | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_LOCAL_ROW_DELETED | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_INSERT_ON_EXIST | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_INDEX_RETRY_INSERT | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_INDEX_DELETE | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_LOCAL_ROW_NOT_VISIBLE | -| Memory is temporarily unavailable. | ERRCODE_OUT_OF_LOGICAL_MEMORY | RC_MEMORY_ALLOCATION_ERROR | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_ILLEGAL_ROW_STATE | -| Null constraint violated.NULL value cannot be inserted into non-null column %s at table %s. | ERRCODE_FDW_ERROR | RC_NULL_VIOLATION | -| Critical error.Critical error: %s. | ERRCODE_FDW_ERROR | RC_PANIC | -| A checkpoint is in progress - cannot truncate table. 
| ERRCODE_FDW_OPERATION_NOT_SUPPORTED | RC_NA | -| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_MAX_VALUE | -| <recovery message> | | ERRCODE_CONFIG_FILE_ERROR | -| <recovery message> | | ERRCODE_INVALID*TABLE*DEFINITION | -| Memory engine - Failed to perform commit prepared. | | ERRCODE_INVALID*TRANSACTION*STATE | -| Invalid option <option name> | | ERRCODE_FDW_INVALID*OPTION*NAME | -| Invalid memory allocation request size. | | ERRCODE_INVALID*PARAMETER*VALUE | -| Memory is temporarily unavailable. | | ERRCODE_OUT_OF*LOGICAL*MEMORY | -| Could not serialize access due to concurrent update. | | ERRCODE_T_R*SERIALIZATION*FAILURE | -| Alter table operation is not supported for memory table.Cannot create MOT tables while incremental checkpoint is enabled.Re-index is not supported for memory tables. | | ERRCODE_FDW_OPERATION*NOT*SUPPORTED | -| Allocation of table metadata failed. | | ERRCODE_OUT_OF_MEMORY | -| Database with OID %u does not exist. | | ERRCODE_UNDEFINED_DATABASE | -| Value exceeds maximum precision: %d. | | ERRCODE_NUMERIC_VALUE*OUT*OF_RANGE | -| You have reached a maximum logical capacity %lu of allowed %lu. | | ERRCODE_OUT_OF*LOGICAL*MEMORY | diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/6-mot-sample-tpcc-benchmark.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/6-mot-sample-tpcc-benchmark.md deleted file mode 100644 index 5be939a923725d4c30811bd28fbf75c672122018..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/6-mot-sample-tpcc-benchmark.md +++ /dev/null @@ -1,116 +0,0 @@ ---- -title: MOT样例TPC-C基准 -summary: MOT样例TPC-C基准 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT样例TPC-C基准 - -## TPC-C简介 - -TPC-C基准是衡量联机事务处理(OLTP)系统性能的行业标准基准。它基于一个复杂的数据库和许多不同的事务类型。这些事务类型在此基准上执行。TPC-C基准测试既不依赖硬件,也不依赖软件,因此可以在每个测试平台上运行。基准模型的官方概述,见[tpc.org网站](http://www.tpc.org/default5.asp)。 - -该数据库由9个不同结构的表组成,因此也包括9种类型的数据。每个表的数据大小和数量不同。在数据库上混合执行五种不同类型和复杂性的并发事务。这些大部分是在线事务或者部分排队等待延迟批处理。由于这些表竞争有限的系统资源,许多系统组件都有压力,数据更改以各种方式执行。 - -**表 1** TPC-C数据库结构 - -| 表 | 条目数 | -| :------- | :------------------- | -| 仓库 | n | -| 供货商品 | 100,000 | -| 库存 | n x 100,000 | -| 地区 | n x 10 | -| 客户 | 3000/区,30,000/仓库 | -| 订单 | 客户数量(初始值) | -| 新增订单 | 30%订单(初始值) | -| 定单分录 | ~10/单 | -| 历史记录 | 客户数量(初始值) | - -事务组合代表从订单输入到订单交付的完整业务处理。具体来说,所提供的组合旨在产生相等数量的新订单事务和支付事务,并且为每十个新订单事务产生一个交付事务、一个订单状态事务和一个库存水平事务。 - -**表 2** TPC-C事务比例 - -| 事务级别≥4% | 占所有事务份额 | -| :---------- | :------------- | -| TPC-C新订单 | ≤ 45% | -| 支付 | ≥ 43% | -| 订单状态 | ≥ 4% | -| 交付 | ≥4%(批次) | -| 库存水平 | ≥ 4% | - -有两种方法来执行事务:作为存储过程(允许更高的吞吐量)和以标准交互式SQL模式执行。 - -**性能指标:tpm-C** - -tpm-C指标是每分钟执行的新订单事务数。考虑到事务中所需的组合以及广泛的复杂性和类型,此指标最接近地模拟一个全面的业务活动,而不仅仅是一个或两个事务或计算机操作。因此,tpm-C指标被认为是业务吞吐量的指标。 - -tpm-C指标单位表示为每分钟事务数-C,而C表示TPC-C特定基准。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 官方TPC-C基准规范可访问[此页面](http://www.tpc.org/tpc_documents_current_versions/pdf/tpc-c_v5.11.0.pdf)。本规范中的一些规则在行业中难以实现,因为对行业现状来说这些规则太严格了。例如:扩容规则(a) tpm-C/Warehouse必须大于9且小于12.86(要达到较高的tpm-C率,需要很高的仓库费率。这就意味着需要非常大的数据库和内存容量)以及规则(b)10倍终端*仓库(意味着大量的模拟客户端)。 - -## 系统级优化 - -请按照[MOT部署](3-mot-deployment.md)中的说明进行操作。下面介绍MogDB数据库在华为TaiShan服务器和Euler 2.8操作系统上部署时系统级的关键优化点,以达到极致性能。 - -## BenchmarkSQL:开源TPC-C工具 - -可以使用BenchmarkSQL测试TPCC,如下所示: - -- 下载benchmarksql:[https://osdn.net/frs/g_redir.php?m=kent&f=benchmarksql%2Fbenchmarksql-5.0.zip](https://osdn.net/frs/g_redir.php?m=kent&f=benchmarksql/benchmarksql-5.0.zip) -- 
benchmarksql工具中的模式创建脚本需要调整为MOT语法,避免使用不支持的DDL。下载调整后的脚本:。该tar文件的内容包括sql.common.mogdb.mot文件夹和jTPCCTData.java文件,以及一个示例配置文件postgresql.conf和TPCC属性文件props.mot供参考。 -- 将sql.common.mogdb.mot文件夹放在run文件夹下与sql.common同级的文件夹,用下载的Java文件替换src/client/jTPCCTData.java文件。 -- 编辑run文件夹下的runDatabaseBuild.sh文件,将extraHistID从AFTER_LOAD列表中删除,以避免不支持的ALTER表DDL。 -- 将lib/postgres文件夹下的JDBC驱动替换为openGauss JDBC。驱动下载链接:。 - -在下载的Java文件(与原始文件相比)中所做的唯一更改是注释错误日志打印,以进行序列化和重复键错误。这些错误在MOT中是正常的,因为MOT使用的是乐观并发控制(OCC)机制。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 基准测试使用标准交互式SQL模式执行,没有存储过程。 - -## 运行基准 - -任何人都可以启动服务器,运行benchmarksql脚本。 - -运行基准测试: - -1. 进入benchmarksql运行文件夹,将sql.common重命名为sql.common.orig。 -2. 创建sql.common到sql.common.mogdb.mot的链接,用于测试MOT。 -3. 启动数据库服务器。 -4. 配置客户端props.pg文件。 -5. 运行基准测试。 - -## 结果报告 - -- CLI结果 - - BenchmarkSQL结果应如下所示: - - ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-sample-tpcc-benchmark-1.jpg) - - 随着时间的推移,基准测量并平均已提交的事务。上面的例子是两分钟的基准测试。 - - 得分为271万tmp-C(每分钟新增订单数),占总承诺事务数的45%,即tpmTOTAL。 - -- 详细结果报告 - - 详细结果报告示例: - -**图 1** 详细结果报告 - -![详细结果报告](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-sample-tpcc-benchmark-2.png) - -![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-sample-tpcc-benchmark-3.png) - -BenchmarkSQL收集详细的性能统计数据和操作系统性能数据(如果配置了的话)。 - -这些信息可以显示查询的延迟,从而暴露与存储/网络/CPU相关的瓶颈。 - -华为TaiShan 2480 MOT TPC-C测试结果 - -2020年5月1日TPC-C基准测试,TaiShan 2480服务器(Arm/鲲鹏4路服务器)安装MogDB数据库,吞吐量达到479万tpmC。 - -下图展示了近乎线性的可扩展性: - -**图 2** 华为TaiShan 2480 MOT TPC-C测试结果 - -![华为TaiShan-2480-MOT-TPC-C测试结果](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-sample-tpcc-benchmark-4.png) diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/using-mot.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/using-mot.md deleted file mode 100644 index 27fb00548b6fd94e4ffbab897ffc575c33744bff..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/2-using-mot/using-mot.md +++ /dev/null @@ -1,17 +0,0 @@ ---- -title: 使用MOT -summary: 使用MOT -author: Guo Huan -date: 2023-05-22 ---- - -# 使用MOT - -本章介绍如何部署、使用和管理MogDB MOT。使用MOT的方法非常简单。MOT命令的语法与MogDB基于磁盘的表相同。只有MOT中的创建和删除表语句与MogDB中基于磁盘的表的语句不同。您可以参考本章了解如何入门、如何将基于磁盘的表转换为MOT、如何使用高级MOT功能,如查询和存储过程的原生编译(JIT)、执行跨引擎事务,以及MOT的限制和覆盖范围。本章还将介绍MOT管理选项,以及如何进行TPC-C基准测试。 - -+ **[MOT使用概述](1-using-mot-overview.md)** -+ **[MOT准备](2-mot-preparation.md)** -+ **[MOT部署](3-mot-deployment.md)** -+ **[MOT使用](4-mot-usage.md)** -+ **[MOT管理](5-mot-administration.md)** -+ **[MOT样例TPC-C基准](6-mot-sample-tpcc-benchmark.md)** \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-1.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-1.md deleted file mode 100644 index 2f653b29c6f00083b552964f8d43e8c60a7a475b..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-1.md +++ /dev/null @@ -1,95 +0,0 @@ ---- -title: MOT纵向扩容架构 -summary: MOT纵向扩容架构 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT纵向扩容架构 - -纵向扩容即为同一台机器添加额外的核以增加算力。纵向扩容是传统上为单对控制器和多核的机器增加算力的常见形式。纵向扩容架构受限于控制器的可扩展性。 - -
- -## 技术要求 - -MOT旨在实现以下目标: - -- **线性扩容**:MOT提供事务性存储引擎,利用单个NUMA架构服务器的所有核,以提供近线性的扩容性能。这意味着MOT的目标是在机器的核数和性能提升倍数之间实现直接的、近线性的关系。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: MOT的近线性扩容效果明显优于所有现有方案,并且尽可能接近于获得最佳效果,因现有方案皆受限于硬件(如电线)的物理限制和局限性。 - -- **无最大核数限制**:MOT对最大核数不做任何限制。这意味着MOT可从单核扩展到高达1000秒的多核,并且新增的核退化速度最小,即便是在跨NUMA槽位边界的情况下。 -- **极高的事务性吞吐量**:MOT提供了一个事务性存储引擎,与市场上任何其他OLTP供应商相比,它能够实现极高的事务性吞吐量。 -- **极低的事务性时延**:与市场上任何其他OLTP供应商相比,MOT提供事务性存储引擎,可以达到极低的事务时延。 -- **无缝集成和利用MogDB产品**:MOT事务引擎与MogDB产品标准无缝集成。通过这种方式,MOT最大限度地重用了位于其事务性存储引擎顶部的MogDB层功能。 - -
- -## 设计原则 - -为了实现上述要求(特别是在多核的环境中),我们存储引擎的体系结构实施了以下技术和策略: - -- 数据和索引只存在于内存中。 -- 数据和索引不用物理分区来布局(因为对于某些类型的应用程序,这些分区的性能可能会降低)。 -- 事务并发控制基于乐观并发控制(OCC),没有集中的争用点。有关OCC的详细信息,请参见[MOT并发控制机制](3-2.md)。 -- 使用平行重做日志(最后单位为核)来有效避免中央锁定点。 -- 使用免锁索引。有关免锁索引的详细信息,请参见[MOT索引](3-5.md)。 -- 使用NUMA感知内存分配,避免跨槽位访问,特别是会话生命周期对象。有关NUMA感知的更多信息,请参见[NUMA-aware分配和亲和性](3-4.md)。 -- 使用带有预缓存对象池的自定义MOT内存管理分配器,避免昂贵的运行时间分配和额外的争用点。这种专用的MOT内存分配器按需预先访问操作系统中较大的内存块,然后按需将内存分配给MOT,使内存分配更加高效。 - -
- -## 使用外部数据封装(FDW)进行集成 - -MOT遵循并利用了MogDB的标准扩展机制 - 外部数据封装(FDW),如下图所示。 - -在PostgreSQL外部数据封装特性的支持下,作为其他数据源的代理的MOT数据库可以创建外表,如MySQL、Oracle、PostgreSQL等。当对外表执行查询时,FDW将查询外部数据源并返回结果,就像查询内表一样。 - -MogDB依赖PostgreSQL外部数据封装和索引支持,因此SQL完全覆盖,包括存储过程、用户定义函数、系统函数调用。 - -**图 1** MOT架构 - -![MOT架构](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-scale-up-architecture-1.png) - -上图中绿色表示MOT引擎,蓝色表示现有的MogDB(基于Postgres)组件。由此可见,FDW在MOT引擎和MogDB组件之间进行中介。 - -**与MOT相关的FDW定制** - -通过FDW集成MOT可以重用最上层的MogDB功能,从而显著缩短MOT的上市时间,同时不影响SQL的覆盖范围。 - -但是,MogDB中原有的FDW机制并不是为存储引擎扩展而设计的,因此缺少以下基本功能: - -- 查询规划阶段待计算的外表的索引感知 -- 完整的DDL接口 -- 完整的事务生命周期接口 -- 检查点接口 -- 重做日志接口 -- 恢复接口 -- 真空接口 - -为了支持所有缺失的功能,SQL层和FDW接口层已扩展,从而为插入MOT事务存储引擎提供必要的基础设施。 - -
- -## 结果:线性扩容 - -以下是上述MOT设计原则和实现的结果: - -MOT在符合ACID工作负载的事务吞吐量方面优于所有现有的工业级OLTP数据库。 - -MogDB和MOT在以下多核系统上进行了测试,性能可扩展性良好。在x86架构Intel和ARM/鲲鹏架构的多核服务器上进行了测试。详细的性能评估请参见[MOT性能基准](../../../administrator-guide/mot-engine/1-introducing-mot/5-mot-performance-benchmarks.md)。 - -以2020年6月的TPC-C基准测试了一台泰山2480服务器上的MogDB MOT数据库(4路ARM/鲲鹏服务器,吞吐量:480万tpmC)。下图显示了MOT数据库的近线性性质,即MOT数据库通过增加核数显著提高性能。 - -**图 2** ARM上的TPC-C(256核) - -![ARM上的TPC-C(256核)](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-3.png) - -下面是另一个测试示例,一台基于x86的服务器上也显示了CPU使用率。 - -**图 3** tpmC 对比CPU使用率 - -![tpmC-对比CPU使用率](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-9.png) - -图表显示,MOT性能提高与核数增加有显著的相关性。随着核数的增加,MOT对CPU的消耗也越来越大。其他行业解决方案不能提高MOT性能,有时性能甚至略有下降,影响客户的CAPEX和OPEX支出以及运营效率。这是数据库行业的公认问题。 diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-2.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-2.md deleted file mode 100644 index aa77978a7e9d2089781c14da434b305b9a0f1ead..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-2.md +++ /dev/null @@ -1,191 +0,0 @@ ---- -title: MOT并发控制机制 -summary: MOT并发控制机制 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT并发控制机制 - -通过大量研究,我们找到了最佳的并发控制机制,结论为:基于SILO的OCC算法是MOT中最符合ACID特性的OCC算法。SILO为满足MOT的挑战性需求提供了最好的基础。 - -随着MogDB 5.0的发布,MOT现已支持MVCC,其中包括减少了读取和更新事务之间的争用,从而减少了OCC方法导致的事务中止。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: MOT完全符合原子性、一致性、隔离性、持久性(ACID)特性,如[MOT简介](../../../administrator-guide/mot-engine/1-introducing-mot/1-mot-introduction.md)所述。 - -下面介绍MOT的并发控制机制。 - -
- -## MOT本地内存和全局内存 - -SILO管理本地内存和全局内存,如图1所示。 - -- 全局内存是所有核共享的长期内存,主要用于存储所有的表数据和索引。 -- 本地内存是短期内存,主要由会话使用,用于处理事务及将数据更改存储到事务内存中,直到提交阶段。 - -当事务需要更改时,SILO将该事务的所有数据从全局内存复制到本地内存。使用OCC方法,全局内存中放置的是最小的锁,因此争用时间极短。事务更改完成后,该数据从本地内存回推到全局内存中。 - -本地内存与SILO增强并发控制的基本交互式事务流如下所示: - -**图 1** 私有(本地)内存(每个事务)和全局内存(所有核的所有事务) - -![私有(本地)内存(每个事务)和全局内存(所有核的所有事务)](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-concurrency-control-mechanism-1.png) - -具体请参见[对比:磁盘与MOT](3-9.md)。 - -
- -## MOT SILO增强特性 - -SILO凭借其基本算法流程,优于我们在研究实验中测试的许多其他符合ACID的OCC算法。然而,为了使SILO成为产品级机制,我们必须用许多在最初设计中缺失的基本功能来增强它,例如: - -- 新增对交互式事务的支持,其中事务的SQL运行在客户端实现,而不是作为服务器端的单个步骤运行。 -- 新增乐观插入 -- 新增对非唯一索引的支持 -- 新增对事务中写后读校验(RAW)的支持,使用户能够在提交之前查看更改 -- 新增对无锁协同垃圾回收的支持 -- 新增对无锁检查点的支持 -- 新增对快速恢复的支持 -- 新增对多版本并发控制(MVCC)的支持(MogDB 5.0)。 - -在不破坏原始SILO的可扩展特性的前提下添加这些增强是非常具有挑战性的。 - -
- -## MOT隔离级别 - -即使MOT完全兼容ACID,MogDB 2.1并非支持所有的隔离级别。下表介绍了各隔离级别,以及MOT支持和不支持的内容。 - -**表 1** 隔离级别 - -| 隔离级别 | 说明 | -| :--------------- | :----------------------------------------------------------- | -| READ UNCOMMITTED | **MOT不支持** | -| READ COMMITTED | **MOT支持**
READ COMMITTED(读已提交)隔离级别保证任何正在读取的数据在上一次读取时都已提交。它只是限制读者看到任何中间数据、未提交数据,或脏读。数据被读取后可以自由更改,因此,读已提交隔离级别并不保证事务再次读取时能找到相同的数据。 | -| SNAPSHOT | **MOT不支持**
SNAPSHOT(快照)隔离级别提供与SERIALIZABLE(可序列化)相同的保证,除了支持并发事务修改数据。相反,它迫使每个读者看到自己的世界版本(自己的快照)。不阻止并发更新使得编程非常容易,且可扩展性很强。 | -| REPEATABLE READ | **MOT支持**
REPEATABLE READ(可重复读)是一个更高的隔离级别,除了READ COMMITTED隔离级别的保证之外,它还保证任何读取的数据都不能更改。如果一个事务再次读取相同的数据,它将找出该数据,不做更改,并且保证它可读取。乐观模型使得并发事务能更新该事务读取的行。在提交时,该事务将验证REPEATABLE READ隔离级别是否被违反。若违反,则回滚该事务,必须重试。 | -| SERIALIZABLE | **MOT不支持**
SERIALIZABLE(可序列化)隔离提供了更强的保证。除了REPEATABLE READ隔离级别保证的所有内容外,它还保证后续读取不会看到新数据。它之所以被命名为SERIALIZABLE,是因为隔离非常严格,几乎有点像事务串行运行,而不是并行运行。 | - -下表显示了不同隔离级别启用的并发副作用。 - -**表 2** 隔离级别启用的并发副作用 - -| 隔离级别 | 说明 | 不可重复读 | 幻影 | -| :--------------- | :--- | :--------- | :--- | -| READ UNCOMMITTED | 是 | 是 | 是 | -| READ COMMITTED | 否 | 是 | 是 | -| REPEATABLE READ | 否 | 否 | 是 | -| SNAPSHOT | 否 | 否 | 否 | -| SERIALIZABLE | 否 | 否 | 否 | - -
- -## MOT乐观并发控制 - -并发控制模块(简称CC模块)提供了主内存引擎的所有事务性需求。CC模块的主要目标是为主内存引擎提供各种隔离级别的支持。 - -
- -### 乐观OCC与悲观2PL - -悲观2PL(2阶段锁定)和乐观并发控制(OCC)的功能差异在于对事务完整性分别采用悲观和乐观方法。 - -基于磁盘的表使用悲观方法,这是最常用的数据库方法。MOT引擎使用的是乐观方法。 - -悲观方法和乐观方法的主要功能区别在于,如果冲突发生, - -- 悲观的方法会导致客户端等待; -- 而乐观方法会导致其中一个事务失败,使得客户端必须重试失败的事务。 - -**乐观并发控制方法(MOT使用)** - -乐观并发控制(OCC)方法在冲突发生时检测冲突,并在提交时执行验证检查。 - -乐观方法开销较小,而且通常效率更高,原因之一是事务冲突在大多数应用程序中并不常见。 - -当强制执行REPEATABLE READ隔离级别时,乐观方法与悲观方法之间的函数差异更大,而当强制执行SERIALIZABLE隔离级别时,函数差异最大。 - -**悲观方法(MOT未使用)** - -悲观并发控制(2PL,或称2阶段锁定)方法使用锁阻止在潜在冲突的发生。执行语句时应用锁,提交事务时释放锁。基于磁盘的行存储使用这种方法,并且添加了多版本并发控制(Multi-version Concurrency Control,MVCC)。 - -在2PL算法中,当一个事务正在写入行时,其他事务不能访问该行;当一个行正在读取时,其他事务不能覆盖该行。在访问时锁定每个行,以进行读写;在提交时释放锁。这些算法需要一个处理和避免死锁的方案。死锁可以通过计算等待图中的周期来检测。死锁可以通过使用TSO保持时序或使用某种回退方案来避免。 - -**遇时锁定(ETL)** - -另一种方法是遇时锁定(ETL),它以乐观的方式处理读取,但写入操作锁定它们访问的数据。因此,来自不同ETL事务的写入操作相互感知,并可以决定中止。实验证明,ETL通过两种方式提高OCC的性能: - -- 首先,ETL会在早期检测冲突,并通常能增加事务吞吐量。这是因为事务不会执行无用的操作。(通常)在提交时发现的冲突无法在不中止至少一个事务的情况下解决。 -- 其次,ETL写后读校验(RAW)运行高效,无需昂贵或复杂的机制。 - -**结论**: - -OCC是大多数工作负载最快的选项。这一点我们在初步研究阶段已经发现。 - -其中一个原因是,当每个核执行多个线程时,锁很可能被交换线程持有,特别是在交互模式下。另一个原因是悲观算法涉及死锁检测(产生开销),并通常使用读写锁(比标准自旋锁效率低)。 - -我们选择Silo是因为它比其他现有选项(如TicToc)简单,同时对大多数工作负载保持相同的性能。ETL有时比OCC更快,但它引入了假中止,可能会使用户混淆,而OCC则只在提交时中止。 - -
- -### OCC与2PL的区别举例 - -下面是会话同时更新同一个表时,两种用户体验的区别:悲观(针对基于磁盘的表)和乐观(针对MOT表)。 - -本例中,使用如下表测试命令: - -``` -table “TEST” - create table test (x int, y int, z int, primary key(x)); -``` - -本示例描述同一测试的两个方面:用户体验(本示例中的操作)和重试要求。 - -**悲观方法示例 - 用于基于磁盘的表** - -下面是一个悲观方法例子(非MOT)。任何隔离级别都可能适用。 - -以下两个会话执行尝试更新单个表的事务。 - -WAIT LOCK操作发生,客户端体验是:会话2卡住,直到会话1完成COMMIT,会话2才能进行。 - -但是,使用这种方法时,两个会话都成功,并且不会发生异常中止(除非应用了SERIALIZABLE或REPEATABLE-READ隔离级别),这会导致整个事务需要重试。 - -**表 3** 悲观方法代码示例 - -| | 会话1 | 会话2 | -| :--- | :------------------------------- | :----------------------------------------------------------- | -| t0 | Begin | Begin | -| t1 | update test set y=200 where x=1; | | -| t2 | y=200 | Update test set y=300 where x=1; - Wait on lock | -| t4 | Commit | | -| | | Unlock | -| | | Commit(in READ-COMMITTED this will succeed, in SERIALIZABLE it will fail) | -| | | y = 300 | - -**乐观方法示例 - 用于MOT** - -下面是一个乐观方法的例子。 - -它描述了创建一个MOT表,然后有两个并发会话同时更新同一个MOT表的情况。 - -``` -create foreign table test (x int, y int, z int, primary key(x)); -``` - -- OCC的优点是,在COMMIT之前没有锁。 -- OCC的缺点是,如果另一个会话更新了相同的记录,则更新可能会失败。如果更新失败(在所有支持的隔离级别中),则必须重试整个会话#2事务。 -- 更新冲突由内核在提交时通过版本检查机制检测。 -- 会话2将不会等待其更新操作,并且由于在提交阶段检测到冲突而中止。 - -**表 4** 乐观方法代码示例--用于MOT - -| | 会话1 | 会话2 | -| :--- | :------------------------------- | :------------------------------- | -| t0 | Begin | Begin | -| t1 | update test set y=200 where x=1; | | -| t2 | y=200 | Update test set y=300 where x=1; | -| t4 | Commit | y = 300 | -| | | Commit | -| | | ABORT | -| | | y = 200 | diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-3.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-3.md deleted file mode 100644 index 3975056b08a9d129a3abc330475c1d3b3376565c..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-3.md +++ /dev/null @@ -1,71 +0,0 @@ ---- -title: 扩展FDW与其他MogDB特性 -summary: 扩展FDW与其他MogDB特性 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# 扩展FDW与其他MogDB特性 - -MogDB基于PostgreSQL,而PostgreSQL没有内置存储引擎适配器,如MySQL的handlerton。为了使MOT存储引擎能够集成到MogDB中,我们利用并扩展了现有的FDW机制。随着FDW引入PostgreSQL 9.1,现在可以将这些外表和数据源呈现为统一、本地可访问的关系来访问外部管理的数据库。 - -和PostgreSQL不同的是,MOT存储引擎是嵌入在MogDB内部的,表由MogDB管理。MogDB规划器和执行器控制表的访问。MOT从MogDB获取日志和检查点服务,并参与MogDB恢复过程和其他过程。 - -我们把正在使用或正在访问MOT存储引擎的所有组件称为封装。 - -下图显示了MOT存储引擎如何嵌入到MogDB中及其对数据库功能的双向访问。 - -**图 1** MogDB内置MOT存储引擎-外部数据库的FDW访问 - -![MOT架构](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-scale-up-architecture-1.png) - -我们通过扩展和修改FdwRoutine结构来扩展FDW的能力,以便引入在MOT引入之前不需要的特性和调用。例如,新增了对以下功能的支持:添加索引、删除索引/表、截断、真空和表/索引内存统计。重点放在了FdwRoutine结构与MogDB日志、复制和检查点机制的集成,以便通过故障为跨表事务提供一致性。在这种情况下,MOT本身有时会通过FDW层发起对MogDB功能的调用。 - -
- -## 创建表和索引 - -为了支持MOT表的创建,重用了标准的FDW语法。 - -例如,创建FOREIGN表。 - -MOT DW机制将指令传递给MOT存储引擎,用于实际建表。同样,我们支持创建索引(create index …)。此功能以前在FDW中不可用,因为表由外部管理,不需要此功能。 - -为了在MOT FDW中支持两者,ValidateTableDef函数实际上创建了指定的表。它还处理该关系的索引创建,DROP TABLE和DROP INDEX,以及先前在FDW中不支持的VACUUM和ALTER TABLE。 - -
- -## 索引规划与执行的使用方法 - -查询分为两个阶段:规划和执行。在规划阶段(可能在多次执行中才出现一次),选择扫描的最佳索引。该选择基于匹配查询的WHERE子句、JOIN子句和ORDER BY条件。在执行期间,查询迭代相关的表行,并执行各种任务,如每次迭代的更新或删除。插入是一种特殊情况-表将行添加到所有索引中,且不需要扫描。 - -- **规划器**:在标准FDW中,将查询传递给外部数据源执行。这意味着索引过滤和实际规划(例如索引的选择)不在数据库中本地执行,而是在外部数据源中执行。在内部,FDW向数据库规划器返回总体计划。MOT表的处理方式与磁盘表类似。这意味着相关的MOT索引得到过滤和匹配,最小化遍历行集的索引被选择并添加到计划中。 -- **执行器**:查询执行器使用所选的MOT索引来迭代表的相关行。每个行都由MogDB封装检查,根据查询条件调用update或delete处理相应的行。 - -
- -## 持久性、复制性和高可用性 - -存储引擎负责存储、读取、更新和删除底层内存和存储系统中的数据。存储引擎不处理日志、检查点和恢复,特别是因为某些事务包含多个不同存储引擎的表。因此,为了数据持久化和复制,MogDB封装使用如下高可用性设施: - -- **持久性**:MOT引擎通过WAL记录使数据持久化,WAL记录使用MogDB的XLOG接口。这为MogDB提供了使用相同API进行复制的好处。具体请参见[MOT持久性概念](3-6.md)。 -- **检查点设定**:通过向MogDB Checkpointer注册回调来启用MOT检查点每当执行通用数据库检查点时,MOT检查点也被调用。MOT保留了检查点的日志序列号(LSN),以便与MogDB恢复对齐。MOT Checkpointing算法是高度优化的异步算法,不会停止并发事务。具体请参见[MOT检查点概念](3-6.md#mot检查点概念)。 -- **恢复**:启动时,MogDB首先调用MOT回调,通过加载到内存行并创建索引来恢复MOT检查点,然后根据检查点的LSN重放记录来执行WAL恢复。MOT检查点使用多线程并行恢复,每个线程读取不同的数据段。这使MOT检查点在多核硬件上的恢复速度相当快,尽管可能比仅重放WAL记录的基于磁盘的表慢一些。具体请参见[MOT恢复概念](3-7.md)。 - -
- -## VACUUM和DROP - -为了最大化MOT功能,我们增加了对VACUUM、DROP TABLE和DROP INDEX的支持。这三个操作都使用排他表锁执行,这意味着不允许在表上并发事务。系统VACUUM调用一个新的FDW函数执行MOT真空,而ValidateTableDef()函数中增加了DROP。 - -
- -## 删除内存池 - -每个索引和表都跟踪它使用的所有内存池。DROP INDEX命令用于删除元数据。内存池作为单个连续块被删除。MOT VACUUM只对已使用的内存进行压缩,因为内存回收由基于epoch的垃圾收集器(GC)在后台持续进行。为了执行压缩,我们将索引或表切换到新的内存池,遍历所有实时数据,删除每行并使用新池插入数据,最后删除池,就像执行DROP那样。 - -
- -## 查询本机编译(JIT) - -MOT引擎的FDW适配器还包含一个轻量级执行路径,该路径使用LLVM编译器执行JIT编译查询。 \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-4.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-4.md deleted file mode 100644 index 3c6a0fd939cdd5620b48376774a5a7c88e8dbe09..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-4.md +++ /dev/null @@ -1,22 +0,0 @@ ---- -title: NUMA-aware分配和亲和性 -summary: NUMA-aware分配和亲和性 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# NUMA-aware分配和亲和性 - -非统一内存访问(NUMA)是一种计算机内存设计,用于多重处理,其中内存访问时间取决于内存相对于处理器的位置。处理器可以利用NUMA的优势,优先访问本地内存(速度更快),而不是访问非本地内存(这意味着它不会访问另一个处理器的本地内存或处理器之间共享的内存)。 - -MOT内存访问设计时采用了NUMA感知。即MOT意识到内存不是统一的,而是通过访问最快和最本地的内存来获得最佳性能。 - -NUMA的优点仅限于某些类型的工作负载,特别是数据通常与某些任务或用户强相关的服务器上的工作负载。 - -在NUMA平台上运行的内储存数据库系统面临一些问题,例如访问远程主内存时,时延增加和带宽降低。为了应对这些NUMA相关问题,NUMA感知必须被看作是数据库系统基本架构的主要设计原则。 - -为了便于快速操作和高效利用NUMA节点,MOT为每个表的行分配一个指定的内存池,同时为索引的节点分配一个指定的内存池。每个内存池由多个2MB的块组成。指定API从本地NUMA节点、来自所有节点的页面或通过轮询分配这些块,每个块在下一个节点上分配。默认情况下,共享数据池以轮询方式分配,以保持访问均衡,同时避免在不同NUMA节点之间拆分行。但是,线程专用内存是从一个本地节点分配的,必须验证线程始终运行在同一个NUMA节点中。 - -**总结** - -MOT有一个智能内存控制模块,它预先为各种类型的内存对象分配了内存池。这种智能内存控制可以提高性能,减少锁并保证稳定性。事务的内存对象分配始终是NUMA-local,从而保证了CPU内存访问的最佳性能,降低时延和争用。被释放的对象返回到内存池中。在事务期间最小化使用操作系统的malloc函数可以避免不必要的锁。 diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-5.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-5.md deleted file mode 100644 index 7b117089c6effad853ff9f8eb437a5bef1a6341c..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-5.md +++ /dev/null @@ -1,43 +0,0 @@ ---- -title: MOT索引 -summary: MOT索引 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT索引 - -MOT索引基于最先进的Masstree的免锁索引,用于多核系统的快速和可扩展的键值(KV)存储,通过B+树的Trie实现。在多核服务器和高并发工作负载上,性能优异。它使用各种先进的技术,如乐观锁方法、缓存感知和内存预取。 - -在比较了各种最先进的解决方案之后,我们选择了Masstree作为索引,因为它显示了点查询、迭代和修改的最佳总体性能。Masstree是Trie和B+树的组合,用以谨慎利用缓存、预取、乐观导航和细粒度锁定。它针对高争用进行了优化,并对其前代产品增加了许多优化,如OLFIT。然而,Masstree索引的缺点是它的内存消耗更高。虽然行数据占用相同的内存大小,但每个索引(主索引或辅助索引)的每行内存平均高了16字节-基于磁盘的表使用基于锁的B树,大小为29字节,而MOT的Masstree大小为45字节。 - -我们的实证研究表明,成熟的免锁Masstree实现与我们对Silo的强大改进相结合,恰能为我们解决这一方面的问题。 - -另一个挑战是对具有多个索引的表使用乐观插入。 - -Masstree索引是用于数据和索引管理的MOT内存布局的核心。我们的团队增强并显著改进了Masstree,同时提交了一些关键贡献给Masstree开源。这些改进包括: - -- 每个索引都有专用内存池:高效分配和快速索引下移 -- Masstree全球GC:快速按需内存回收 -- 具有插入键访问的大众树迭代器实现 -- ARM架构支持 - -我们为Masstree开放源码实现贡献了我们的Masstree索引改进,可以在[https://github.com/kohler/masstree-beta](https://github.com/kohler/masstree-beta)找到。 - -MOT的主要创新是增强了原有的Masstree数据结构和算法,它不支持非唯一索引(作为二级索引)。设计细节请参见[非唯一索引](#非唯一索引)。 - -MOT支持主索引、辅助索引和无键索引(限制参见[不支持的索引DDL和索引](../../../administrator-guide/mot-engine/2-using-mot/4-mot-usage.md#不支持的索引DDL和索引))。 - -
- -## 非唯一索引 - -一个非唯一索引可以包含多个具有相同键的行。非唯一索引仅用于通过维护频繁使用的数据值的排序来提高查询性能。例如,数据库可能使用非唯一索引对来自同一家庭的所有人员进行分组。但是,Masstree数据结构实现不允许将多个对象映射到同一个键。我们用于创建非唯一索引的解决方案(如下图所示)是为映射行的键添加一个打破对称的后缀。这个添加的后缀是指向行本身的指针,该行具有8个字节的常量大小,并且值对该行是唯一的。当插入到非唯一索引时,哨兵的插入总是成功的,这使执行事务分配的行能够被使用。这种方法还使MOT能够为非唯一索引提供一个快速、可靠、基于顺序的迭代器。 - -**图 1** 非唯一索引 - -![非唯一索引](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-indexes-1.png) - -上图描述了一个MOT的T表的结构,它有三个行和两个索引。矩形表示数据行,索引指向指向行的哨兵(椭圆形)。哨兵用键插入唯一索引,用键+后缀插入非唯一索引。哨兵可以方便维护操作,无需接触索引数据结构就可替换行。此外,在哨兵中嵌入了各种标志和参考计数,以便于乐观插入。 - -查找非唯一辅助索引时,会使用所需的键(如姓氏)。全串联键只用于插入和删除操作。插入和删除操作总是将行作为参数获取,从而可以创建整个键,并在执行删除或插入索引的特定行时使用它。 diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-6.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-6.md deleted file mode 100644 index c08ca42df02c7bce100d1021e6a26016383663c8..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-6.md +++ /dev/null @@ -1,210 +0,0 @@ ---- -title: MOT持久性概念 -summary: MOT持久性概念 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT持久性概念 - -持久性是指长期的数据保护(也称为磁盘持久性)。持久性意味着存储的数据不会遭受任何形式的退化或破坏,因此数据不会丢失或损坏。持久性可确保在有计划停机(例如维护)或计划外崩溃(例如电源故障)后数据和MOT引擎恢复到一致状态。 - -内存存储是易失的,需要电源来维护所存储的信息。另一方面,磁盘存储是非易失性的,这意味着它不需要电源来维护存储的信息,因此它不用担心停电。MOT使用这两种类型的存储,它拥有内存中的所有数据,同时将事务性更改持久化到磁盘,并保持频繁的定期**MOT检查点**,以确保在关机时恢复数据。 - -用户必须保证有足够的磁盘空间用于日志记录和检查点操作。检查点使用单独的驱动器,通过减少磁盘I/O负载来提高性能。 - -有关如何在MOT引擎中实现持久性的概述,请参见[MOT关键技术](../../../administrator-guide/mot-engine/1-introducing-mot/3-mot-key-technologies.md)。 - -MOT的WAL重做日志和检查点启用了持久性,如下所述。 - -- [MOT日志记录:WAL重做日志概念](#mot日志记录wal重做日志概念) -- [MOT检查点概念](#mot检查点概念) - -
- -## MOT日志记录:WAL重做日志概念 - -### 概述 - -预写日志记录(WAL)是确保数据持久性的标准方法。WAL的主要概念是,数据文件(表和索引所在的位置)的更改只有在记录这些更改之后才会写入,即只有在描述这些更改的日志记录被刷新到永久存储之后才会写入。 - -MOT全面集成MogDB的封装日志记录设施。除持久性外,这种方法的另一个好处是能够将WAL用于复制目的。 - -支持三种日志记录方式:两种标准同步和一种异步方式。标准MogDB磁盘引擎也支持这三种日志记录方式。此外,在MOT中,组提交(Group-Commit)选项还提供了特殊的NUMA感知优化。Group-Commit在维护ACID属性的同时提供最高性能。 - -为保证持久性,MOT全面集成MogDB的WAL机制,通过MogDB的XLOG接口持久化WAL记录。这意味着,每次MOT记录的添加、更新和删除都记录在WAL中。这确保了可以从这个非易失性日志中重新生成和恢复最新的数据状态。例如,如果向表中添加了3个新行,删除了2个,更新了1个,那么日志中将记录6个条目。 - -- MOT日志记录和MogDB磁盘表的其他记录写入同一个WAL中。 -- MOT只记录事务提交阶段的操作。 -- MOT只记录更新的增量记录,以便最小化写入磁盘的数据量。 -- 在恢复期间,从最后一个已知或特定检查点加载数据;然后使用WAL重做日志完成从该点开始发生的数据更改。 -- WAL重做日志将保留所有表行修改,直到执行一个检查点为止(如上所述)。然后可以截断日志,以减少恢复时间和节省磁盘空间。 - -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 为了确保日志IO设备不会成为瓶颈,日志文件必须放在具有低时延的驱动器上。 - -- 并行重做恢复:自MogDB 5.0版本发布以来,MOT引擎支持并行恢复机制。多线程执行恢复,单线程完成事务提交,确保事务一致性。这为单秒级的MOT表提供了恢复时间目标(RTO)。 - -
- -### 日志类型 - -支持两个同步事务日志选项和一个异步事务日志选项(标准MogDB磁盘引擎也支持这些选项)。MOT还支持同步的组提交日志记录与NUMA感知优化,如下所述。 - -根据您的配置,实现以下类型的日志记录: - -- **同步重做日志记录** - - 同步重做日志记录选项是最简单、最严格的重做日志记录器。当客户端应用程序提交事务时,事务重做条目记录在WAL重做日志中,如下所示: - - 1. 当事务正在进行时,它存储在MOT内存中。 - 2. 事务完成后,客户端应用程序发送提交命令,该事务被锁定,然后写入磁盘上的WAL重做日志。这意味着,当事务日志条目写入日志时,客户端应用程序仍在等待响应。 - 3. 一旦事务的整个缓冲区被写入日志,就更改内存中的数据,然后提交事务。事务提交后,客户端应用程序收到事务完成通知。 - - **技术说明** - - 当事务结束时,同步重做日志处理程序(SynchronousRedoLogHandler)序列化事务缓冲区,并写入XLOG iLogger实现。 - - **图 1** 同步日志记录 - - ![同步日志记录](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-durability-concepts-1.png) - - **总结** - - **同步重做日志记录**选项是最安全、最严格的,因为它确保了客户端应用程序和每个事务提交时的WAL重做日志条目的完全同步,从而确保了总的持久性和一致性,并且绝对不会丢失数据。此日志记录选项可防止客户端应用程序在事务尚未持久化到磁盘时将事务标记为成功的情况。 - - 同步重做日志记录选项的缺点是,它是三个选项中最慢的日志机制。这是因为客户端应用程序必须等到所有数据都写入磁盘,并且磁盘频繁写入(这通常使数据库变慢)。 - -- **组同步重做日志记录** - - **组同步重做日志记录**选项与**同步重做日志记录**选项非常相似,因为它还确保完全持久性,绝对不会丢失数据,并保证客户端应用程序和WAL重做日志条目的完全同步。不同的是,**组同步重做日志记录**选项将事务重做条目组同时写入磁盘上的WAL重做日志,而不是在提交时写入每个事务。使用组同步重做日志记录可以减少磁盘I/O数量,从而提高性能,特别是在运行繁重的工作负载时。 - - MOT引擎通过根据运行事务的核的NUMA槽位自动对事务进行分组,使用非统一内存访问(NUMA)感知优化来执行同步的组提交记录。 - - 有关NUMA感知内存访问的更多信息,请参见[NUMA-aware分配和亲和性](3-4.md)。 - - 当一个事务提交时,一组条目记录在WAL重做日志中,如下所示: - - 1. 当事务正在进行时,它存储在内存中。MOT引擎根据运行事务的核的NUMA槽位对桶中的事务进行分组。这意味着在同一槽位上运行的所有事务都被分在一组,并且多个组将根据事务运行的核心并行填充。 - - 这样,将事务写入WAL更为有效,因为来自同一个槽位的所有缓冲区都一起写入磁盘。 - - > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: - 每个线程在属于单个槽位的单核/CPU上运行,每个线程只写运行于其上的核的槽位。 - - 2. 在事务完成并且客户端应用程序发送Commit命令之后,事务重做日志条目将与属于同一组的其他事务一起序列化。 - - 3. 当特定一组事务满足配置条件后,如《重做日志(MOT)》小节中描述的已提交的事务数或超时时间,该组中的事务将被写入磁盘的WAL中。这意味着,当这些日志条目被写入日志时,发出提交请求的客户端应用程序正在等待响应。 - - 4. 一旦NUMA-aware组中的所有事务缓冲区都写入日志,该组中的所有事务都将对内存存储执行必要的更改,并且通知客户端这些事务已完成。 - - **技术说明** - - 4种颜色分别代表4个NUMA节点。因此,每个NUMA节点都有自己的内存日志,允许多个连接的组提交。 - - **图 2** 组提交-具有NUMA感知 - - ![组提交-具有NUMA感知](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-durability-concepts-2.png) - - **总结** - - 组同步重做日志记录选项是一个极其安全和严格的日志记录选项,因为它保证了客户端应用程序和WAL重做日志条目的完全同步,从而确保总的持久性和一致性,并且绝不会丢失数据。此日志记录选项可防止客户端应用程序在事务尚未持久化到磁盘时将事务标记为成功的情况。 - - 一方面,该选项的磁盘写入次数比同步重做日志记录选项少,这可能意味着它更快。缺点是事务被锁定的时间更长,这意味着它们被锁定,直到同一NUMA内存中的所有事务都写入磁盘上的WAL重做日志为止。 - - 使用此选项的好处取决于事务工作负载的类型。例如,此选项有利于事务多的系统(而对于事务少的系统而言,则较少使用,因为磁盘写入量也很少)。 - -- **异步重做日志记录** - - **异步重做日志记录**选项是最快的日志记录方法,但是,它不能确保数据不会丢失。也就是说,某些仍位于缓冲区且尚未写入磁盘的数据在电源故障或数据库崩溃时可能会丢失。当客户端应用程序提交事务时,事务重做条目将记录在内部缓冲区中,并按预先配置的时间间隔写入磁盘。客户端应用程序不会等待数据写入磁盘,而是继续到下一个事务。因此异步重做日志记录的速度最快。 - - 当客户端应用程序提交事务时,事务重做条目记录在WAL重做日志中,如下所示: - - 1. 当事务正在进行时,它存储在MOT内存中。 - 2. 在事务完成并且客户端应用程序发送Commit命令后,事务重做条目将被写入内部缓冲区,但尚未写入磁盘。然后更改MOT数据内存,并通知客户端应用程序事务已提交。 - 3. 
后台运行的重做日志线程按预先配置的时间间隔收集所有缓存的重做日志条目,并将它们写入磁盘。 - - **技术说明** - - 在事务提交时,事务缓冲区被移到集中缓冲区(指针分配,而不是数据副本),并为事务分配一个新的事务缓冲区。一旦事务缓冲区移动到集中缓冲区,且事务线程不被阻塞,事务就会被释放。实际写入日志使用Postgres WALWRITER线程。当WALWRITER计时器到期时,它首先调用异步重做日志处理程序(通过注册的回调)来写缓冲区,然后继续其逻辑,并将数据刷新到XLOG中。 - - **图 3** 异步日志记录 - - ![异步日志记录](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-durability-concepts-3.png) - - **总结** - - 异步重做日志记录选项是最快的日志记录选项,因为它不需要客户端应用程序等待数据写入磁盘。此外,它将许多事务重做条目分组并把它们写入一起,从而减少降低MOT引擎速度的磁盘I/O数量。 - - 异步重做日志记录选项的缺点是它不能确保在崩溃或失败时数据不会丢失。已提交但尚未写入磁盘的数据在提交时是不持久的,因此在出现故障时无法恢复。异步重做日志记录选项对于愿意牺牲数据恢复(一致性)而不是性能的应用程序来说最为相关。 - - 日志记录设计细节 - - 下面将详细介绍内储存引擎模块中与持久化相关的各个组件的设计细节。 - - **图 4** 三种日志记录选项 - - ![三种日志记录选项](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-durability-concepts-4.png) - -重做日志组件由使用内储存引擎的后端线程和WAL编写器使用,以便持久化其数据。检查点通过Checkpoint管理器执行,由Postgres的Checkpointer触发。 - -- **日志记录设计概述** - - 预写日志记录(WAL)是确保数据持久性的标准方法。WAL的核心概念是,数据文件(表和索引所在的位置)的更改只有在记录了这些更改之后才会写入,这意味着在描述这些更改的日志记录被刷新到永久存储之后。 - - 在内储存引擎中,我们使用现有的MogDB日志设施,并没有从头开始开发低级别的日志API,以减少开发时间并使其可用于复制目的。 - -- **单事务日志记录** - - 在内储存引擎中,事务日志记录存储在事务缓冲区中,事务缓冲区是事务对象(TXN)的一部分。在调用addToLog()时记录事务缓冲区-如果缓冲区超过阈值,则将其刷新并重新使用。当事务提交并通过验证阶段或由于某种原因中止时,相应的消息也会保存在日志中,以便能够在恢复期间确定事务的状态。 - -**图 5** 单事务日志记录 - -![单事务日志记录](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-durability-concepts-5.png) - -并行日志记录由MOT和磁盘引擎执行。但是,MOT引擎通过每个事务的日志缓冲区、无锁准备和单个日志记录增强了这种设计。 - -- **异常处理** - - 持久化模块通过Postgres错误报告基础设施(ereport)处理异常。系统日志中会记录每个错误情况的错误信息。此外,使用Postgres内置的错误报告基础设施将错误报告到封装。 - - 该模块上报有如下异常: - -**表 1** 异常处理 - -| 异常条件 | 异常码 | 场景描述 | 最终结果 | -| :----------------------- | :----------------------------- | :------------------------------------- | :------------- | -| WAL写入失败 | ERRCODE_FDW_ERROR | 在任何情况下,WAL写入失败 | 事务终止 | -| 文件IO错误:写入、打开等 | ERRCODE_IO_ERROR | 检查点:在任何文件访问错误时调用 | 严重:进程存在 | -| 内存不足 | ERRCODE_INSUFFICIENT_RESOURCES | 检查点:本地内存分配失败 | 严重:进程存在 | -| 逻辑、DB错误 | ERRCODE*INTERNAL*错误 | 检查点:算法失败或无法检索表数据或索引 | 严重:进程存在 | - -
- -## MOT检查点概念 - -在MogDB中,检查点是事务序列中一个点的快照,在该点上,可以保证堆和索引数据文件已经同步了检查点之前写入的所有信息。 - -在执行检查点时,所有脏数据页都会刷新到磁盘,并将一个特殊的检查点记录写入日志文件。 - -数据直接存储在内存中。MOT没有像MogDB那样存储数据,因此不存在脏页的概念。 - -为此,我们研究并实现了CALC算法,该算法在耶鲁大学发布的Low-Overhead Asynchronous Checkpointing in Main-Memory Database Systems, SIGMOD 2016中得到了描述。 - -主内存数据库系统中的低开销异步检查点。 - -
- -### CALC检查点算法:内存和计算开销低 - -检查点算法具有以下优点: - -- 降低内存使用量-每条记录在任何时候最多存储两个副本。在记录处于活动且稳定版本相同或没有记录任何检查点时,仅存储记录的一个物理副本,可以最大限度地减少内存使用。 -- **低开销**:CALC的开销比其他异步检查点算法小。 -- **使用虚拟一致性点**:CALC不需要静默数据库以实现物理一致性点。 - -
- -### 检查点激活 - -MOT检查点被集成到MogDB的封装的检查点机制中。检查点流程可以通过执行**CHECKPOINT**;命令手动触发,也可以根据封装的检查点触发设置(时间/大小)自动触发。 - -检查点配置在mot.conf文件中执行,请参见[检查点(MOT)](../../../administrator-guide/mot-engine/2-using-mot/3-mot-deployment.md#检查点mot)部分。 diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-7.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-7.md deleted file mode 100644 index e445a17bb0a2751257b4c32c753dcf4fd8fe202d..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-7.md +++ /dev/null @@ -1,24 +0,0 @@ ---- -title: MOT恢复概念 -summary: MOT恢复概念 -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT恢复概念 - -MOT恢复模块提供了恢复MOT表数据所需的所有功能。恢复模块的主要目标是在计划(例如维护)关闭或计划外(例如电源故障)崩溃后,将数据和MOT引擎恢复到一致的状态。 - -MogDB数据库恢复(有时也称为冷启动)包括MOT表,并且随着数据库其余部分的恢复而自动执行。MOT恢复模块无缝、全面地集成到MogDB恢复过程中。 - -MOT恢复有两个主要阶段:检查点恢复和WAL恢复(重做日志)。 - -MOT检查点恢复在封装的恢复发生之前执行。仅在冷启动事件(PG进程的启动)中执行此操作。它首先恢复元数据,然后插入当前有效检查点的所有行,这由checkpoint_recovery_workers并行完成,每个行都在不同的表中工作。索引在插入过程中创建。 - -在检查点时,表被分成多个16MB的块,以便多个恢复工作进程可以并行地恢复表。这样做是为了加快检查点恢复速度,它被实现为一个多线程过程,其中每个线程负责恢复不同的段。不同段之间没有依赖关系,因此线程之间没有争用,在更新表或插入新行时也不需要使用锁。 - -WAL记录作为封装的WAL恢复的一部分进行恢复。MogDB封装会迭代XLOG,根据xlog记录类型执行必要的操作。如果是记录类型为MOT的条目,封装将它转发给MOT 恢复管理器进行处理。如果XLOG条目太旧(即XLOG条目的LSN比检查点的LSN旧),MOT恢复将忽略该条目。 - -在主备部署中,备用服务器始终处于Recovery状态,以便自动WAL恢复过程。 - -MOT恢复参数在mot.conf文件中配置,参见[MOT恢复](../../../administrator-guide/mot-engine/2-using-mot/5-mot-administration.md#mot恢复)。 diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-8.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-8.md deleted file mode 100644 index b999fc9e518fb73a21129fcbf18725a371f3380f..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-8.md +++ /dev/null @@ -1,78 +0,0 @@ ---- -title: MOT查询原生编译(JIT) -summary: MOT查询原生编译(JIT) -author: Zhang Cuiping -date: 2021-03-04 ---- - -# MOT查询原生编译(JIT) - -原生编译(JIT)是MOT提供极低时延和高吞吐量性能的关键技术之一。支持两种原生编译(使用PREPARE语句):JIT存储过程(JIT SP)和JIT查询。 -以下各节介绍如何在应用程序中使用这两种机制。 - -## JIT存储过程(JIT SP) - -JIT SP是指通过LLVM运行时代码生成和编译库来生成代码、编译和执行存储过程。JIT SP仅对访问MOT表的存储过程可用,对用户完全透明。加速级别取决于存储过程逻辑。例如,一个真实的客户应用程序为不同的存储过程实现了20%、44%、300%和500%的加速,减少了存储过程延迟。 -在调用存储过程的查询PREPARE阶段或第一次执行存储过程时,JIT模块尝试将存储过程SQL转换为基于C的函数,并在运行时(使用LLVM)编译。如果连续存储过程调用成功,MOT将执行编译函数,从而获得性能增益。如果无法生成编译函数,存储过程将由标准的PL/pgSQL执行。这两种情况对用户完全透明。 - -## JIT查询 - -MOT使您可以在执行之前以原生格式(使用PREPARE语句)准备并分析预编译的完整查询。 - -这种本机格式以后可以更有效地执行(使用EXECUTE命令)。这种类型的执行效率要高得多,因为在执行期间,本机格式绕过了多个数据库处理层。这种分工避免了重复的解析分析操作。Lite Executor模块负责执行预准备查询,其执行路径比封装执行的常规通用计划要快得多。这是通过LLVM使用实时(JIT)编译来实现的。此外,以伪LLVM的形式提供具有潜在相似性能的类似解决方案。 - -下面是SQL中的PREPARE语法示例: - -``` -PREPARE name [ ( data_type [, ...] ) ] AS statement -``` - -下面是一个如何在Java应用程序中调用PREPARE和EXECUTE语句的示例: - -```java -conn = DriverManager.getConnection(connectionUrl, connectionUser, connectionPassword); - -// Example 1: PREPARE without bind settings -String query = "SELECT * FROM getusers"; -PreparedStatement prepStmt1 = conn.prepareStatement(query); -ResultSet rs1 = pstatement.executeQuery()) -while (rs1.next()) {…} - -// Example 2: PREPARE with bind settings -String sqlStmt = "SELECT * FROM employees where first_name=? and last_name like ?"; -PreparedStatement prepStmt2 = conn.prepareStatement(sqlStmt); -prepStmt2.setString(1, "Mark"); // first name “Mark” -prepStmt2.setString(2, "%n%"); // last name contains a letter “n” -ResultSet rs2 = prepStmt2.executeQuery()) -while (rs2.next()) {…} -``` - -
- -## Prepare - -**Prepare**创建一个预处理语句。预处理语句是服务器端对象,可用于优化性能。执行PREPARE语句时,将解析、分析和重写指定的语句。 - -如果查询语句中提到的表是MOT表,则MOT编译负责对象准备,并基于LLVM将查询编译成IR字节码进行特殊优化。 - -每当需要新的查询编译时,都会分析查询,并使用实用程序GsCodeGen对象和标准LLVM JIT API (IRBuilder)为查询生成合适的IR字节代码。完成字节代码生成后,代码将被JIT编译到单独的LLVM模块中。编译的代码生成一个C函数指针,以后可以调用该指针直接执行。请注意,这个C函数可以被许多线程并发调用,只要每个线程提供不同的执行上下文(详细信息如下)。每个这样的执行上下文称为“JIT上下文”。 - -为了进一步提高性能,MOT JIT对其LLVM代码结果应用缓存策略,使它们能够被在不同会话中的相同查询重用。 - -
- -## 执行 - -当发出EXECUTE命令时,会计划并执行预准备语句(上文所述)。这种分工避免了重复的解析分析工作,同时使执行计划依赖于提供的特定设置值。 - -当生成的执行查询命令到达数据库时,它使用相应的IR字节代码,在MOT引擎中直接执行该代码,并且执行效率更高。这称为“轻量级执行”。 - -此外,为了可用性,Lite Executor维护了一个预先分配的JIT源池。每个会话预分配自己的会话本地JIT上下文对象池(用于重复执行预编译查询)。 - -您可以参考MOT SQL覆盖和限制中的[不支持的JIT功能](../../../administrator-guide/mot-engine/2-using-mot/4-mot-usage.md#不支持的JIT功能(原生编译和执行))。 - -
- -## JIT存储过程 - -JIT存储过程(JIT SP)由MogDB MOT引擎(从5.0版本开始)支持,其目标是提供更高的性能和更低的延迟。 diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-9.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-9.md deleted file mode 100644 index f0166ce69be96b23e3edc3609d389b05e65a51f2..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/3-9.md +++ /dev/null @@ -1,33 +0,0 @@ ---- -title: 对比:磁盘与MOT -summary: 对比:磁盘与MOT -author: Zhang Cuiping -date: 2021-03-04 ---- - -# 对比:磁盘与MOT - -下表简要对比了基于MogDB磁盘的存储引擎和MOT存储引擎的各种特性。 - -对比:基于磁盘与MOT - -| 特性 | MogDB 磁盘存储 | MogDB MOT引擎 | -| :--------------------- | :---------------------------- | :---------------------------------- | -| 英特尔x86+鲲鹏ARM | 是 | 是 | -| SQL和功能集覆盖率 | 100% | 98% | -| 纵向扩容(多核,NUMA) | 低效 | 高效 | -| 吞吐量 | 高 | 极高 | -| 时延 | 低 | 极低 | -| 分布式 | 是 | 是 | -| 隔离级别 | - RC+SI
- RR | - RC
- RR | -| 并发控制策略 | 悲观+MVCC | 乐观+MVCC | -| 数据容量(数据+索引) | 不受限制 | 受限于DRAM | -| 本地编译 | 否 | 是
- Query (by PREPARE command)
- 存储过程(使用PREPARE命令) | -| 复制、恢复 | 是 | 是 | -| 复制选项 | 2(同步,异步) | 3(同步、异步、组提交) | - -**其中,** - -- RR=可重复读取 -- RC=读已提交 -- SI=快照隔离 diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/concepts-of-mot.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/concepts-of-mot.md deleted file mode 100644 index d53ef50acddbba7e0b8bf37dee80050ea3738a89..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/3-concepts-of-mot/concepts-of-mot.md +++ /dev/null @@ -1,20 +0,0 @@ ---- -title: MOT的概念 -summary: MOT的概念 -author: Guo Huan -date: 2023-05-22 ---- - -# MOT的概念 - -本章介绍MogDB MOT的设计和工作原理,阐明其高级特性、功能及使用方法,旨在让读者了解MOT操作上的技术细节、重要特性细节和创新点。本章内容有助于决策MOT是否适合于特定的应用需求,以及进行最有效的使用和管理。 - -+ **[MOT纵向扩容架构](3-1.md)** -+ **[MOT并发控制机制](3-2.md)** -+ **[扩展FDW与其他MogDB特性](3-3.md)** -+ **[NUMA-aware分配和亲和性](3-4.md)** -+ **[MOT索引](3-5.md)** -+ **[MOT持久性概念](3-6.md)** -+ **[MOT恢复概念](3-7.md)** -+ **[MOT查询原生编译(JIT)](3-8.md)** -+ **[对比:磁盘与MOT](3-9.md)** \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/1-references.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/1-references.md deleted file mode 100644 index b4a892659e16b82aac26639a388b8c0bb119d1d1..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/1-references.md +++ /dev/null @@ -1,36 +0,0 @@ ---- -title: 参考文献 -summary: 参考文献 -author: Zhang Cuiping -date: 2021-05-18 ---- - -# 参考文献 - -[1] Y. Mao, E. Kohler, and R. T. Morris. Cache craftiness for fast multicore key-value storage. In Proc. 7th ACM European Conference on Computer Systems (EuroSys), Apr. 2012. - -[2] K. Ren, T. Diamond, D. J. Abadi, and A. Thomson. Low-overhead asynchronous checkpointing in main-memory database systems. In Proceedings of the 2016 ACM SIGMOD International Conference on Management of Data, 2016. - -[5] Tu, S., Zheng, W., Kohler, E., Liskov, B., and Madden, S. Speedy transactions in multicore in-memory databases. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles (New York, NY, USA, 2013), SOSP ’13, ACM, pp. 18-32. - -[6] H. Avni at al. Industrial-Strength OLTP Using Main Memory and Many-cores, VLDB 2020. - -[7] Bernstein, P. A., and Goodman, N. Concurrency control in distributed database systems. ACM Comput. Surv. 13, 2 (1981), 185-221. - -[8] Felber, P., Fetzer, C., and Riegel, T. Dynamic performance tuning of word-based software transactional memory. In Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2008, Salt Lake City, UT, USA, February 20-23, 2008 (2008), - -pp. 237-246. - -[9] Appuswamy, R., Anadiotis, A., Porobic, D., Iman, M., and Ailamaki, A. Analyzing the impact of system architecture on the scalability of OLTP engines for high-contention workloads. PVLDB 11, 2 (2017), - -121-134. - -[10] R. Sherkat, C. Florendo, M. Andrei, R. Blanco, A. Dragusanu, A. Pathak, P. Khadilkar, N. Kulkarni, C. Lemke, S. Seifert, S. Iyer, S. Gottapu, R. Schulze, C. Gottipati, N. Basak, Y. Wang, V. Kandiyanallur, S. Pendap, D. Gala, R. Almeida, and P. Ghosh. Native store extension for SAP HANA. PVLDB, 12(12): - -2047-2058, 2019. - -[11] X. Yu, A. Pavlo, D. Sanchez, and S. Devadas. Tictoc: Time traveling optimistic concurrency control. 
In Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 26 - July 01, 2016, pages 1629-1642, 2016. - -[12] V. Leis, A. Kemper, and T. Neumann. The adaptive radix tree: Artful indexing for main-memory databases. In C. S. Jensen, C. M. Jermaine, and X. Zhou, editors, 29th IEEE International Conference on Data Engineering, ICDE 2013, Brisbane, Australia, April 8-12, 2013, pages 38-49. IEEE Computer Society, 2013. - -[13] S. K. Cha, S. Hwang, K. Kim, and K. Kwon. Cache-conscious concurrency control of main-memory indexes on shared-memory multiprocessor systems. In P. M. G. Apers, P. Atzeni, S. Ceri, S. Paraboschi, K. Ramamohanarao, and R. T. Snodgrass, editors, VLDB 2001, Proceedings of 27th International Conference on Very Large Data Bases, September 11-14, 2001, Roma, Italy, pages 181-190. Morga Kaufmann, 2001. diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/2-glossary.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/2-glossary.md deleted file mode 100644 index ff47d4ffa2343ea3d49ce65e70591b5082124865..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/2-glossary.md +++ /dev/null @@ -1,59 +0,0 @@ ---- -title: 术语表 -summary: 术语表 -author: Zhang Cuiping -date: 2021-05-18 ---- - -# 术语表 - -| 缩略语 | 定义描述 | -| :----- | :----------------------------------------------------------- | -| 2PL | 2阶段锁(2-Phase Locking) | -| ACID | 原子性(Atomicity),一致性(Consistency),隔离性(Isolation),持久性(Durability) | -| AP | 分析处理(Analytical Processing) | -| Arm | 高级RISC机器(Advanced RISC Machine),x86的替代硬件架构。 | -| CC | 并发控制(Concurrency Control) | -| CPU | 中央处理器(Central Processing Unit) | -| DB | 数据库(Database) | -| DBA | 数据库管理员(Database Administrator) | -| DBMS | 数据库管理系统(DataBase Management System) | -| DDL | 数据定义语言(Data Definition Language)数据库模式管理语言 | -| DML | 数据修改语言(Data Modification Language) | -| ETL | 提取、转换、加载或遇时锁定(Extract, Transform, Load or Encounter Time Locking) | -| FDW | 外部数据封装(Foreign Data Wrapper) | -| GC | 垃圾收集器(Garbage Collector) | -| HA | 高可用性(High Availability) | -| HTAP | 事务分析混合处理(Hybrid Transactional-Analytical Processing) | -| IoT | 物联网(Internet of Things) | -| IM | 内储存(In-Memory) | -| IMDB | 内储存数据库(In-Memory Database) | -| IR | 源代码的中间表示(Intermediate Representation),用于编译和优化 | -| JIT | 准时(Just In Time) | -| JSON | JavaScript对象表示法(JavaScript Object Notation) | -| KV | 键值(Key Value) | -| LLVM | 低级虚拟机(Low-Level Virtual Machine),指编译代码或IR查询 | -| M2M | 机对机(Machine-to-Machine) | -| ML | 机器学习(Machine Learning) | -| MM | 主内存(Main Memory) | -| MO | 内存优化(Memory Optimized) | -| MOT | 内存优化表存储引擎(SE),读作/em/ /oh/ /tee/ | -| MVCC | 多版本并发控制(Multi-Version Concurrency Control) | -| NUMA | 非一致性内存访问(Non-Uniform Memory Access) | -| OCC | 乐观并发控制(Optimistic Concurrency Control) | -| OLTP | 在线事务处理(On-Line Transaction Processing),多用户在线交易类业务 | -| PG | PostgreSQL | -| RAW | 写后读校验(Reads-After-Writes) | -| RC | 返回码(Return Code) | -| RTO | 目标恢复时间(Recovery Time Objective) | -| SE | 存储引擎(Storage Engine) | -| SQL | 结构化查询语言(Structured Query Language) | -| TCO | 总体拥有成本(Total Cost of Ownership) | -| TP | 事务处理(Transactional Processing) | -| TPC-C | 一种联机事务处理基准 | -| Tpm-C | 每分钟事务数-C. 
TPC-C基准的性能指标,用于统计新订单事务。 | -| TVM | 微小虚拟机(Tiny Virtual Machine) | -| TSO | 分时选项(Time Sharing Option) | -| UDT | 自定义类型 | -| WAL | 预写日志(Write Ahead Log) | -| XLOG | 事务日志的PostgreSQL实现(WAL,如上文所述) | diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/mot-appendix.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/mot-appendix.md deleted file mode 100644 index e21b91e77194e7129a24a1319d6121662ea0cf60..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/4-appendix/mot-appendix.md +++ /dev/null @@ -1,11 +0,0 @@ ---- -title: 附录 -summary: 附录 -author: Guo Huan -date: 2023-05-22 ---- - -# 附录 - -+ **[参考文献](1-references.md)** -+ **[术语表](2-glossary.md)** \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/mot-engine.md b/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/mot-engine.md deleted file mode 100644 index a24839108e517a6e669445b51c826aca37552215..0000000000000000000000000000000000000000 --- a/product/zh/docs-mogdb/v5.0/administrator-guide/mot-engine/mot-engine.md +++ /dev/null @@ -1,13 +0,0 @@ ---- -title: MOT内存表管理 -summary: MOT内存表管理 -author: Guo Huan -date: 2023-05-22 ---- - -# MOT内存表管理 - -- **[MOT介绍](1-introducing-mot/introducing-mot.md)** -- **[使用MOT](2-using-mot/using-mot.md)** -- **[MOT的概念](3-concepts-of-mot/concepts-of-mot.md)** -- **[附录](4-appendix/mot-appendix.md)** \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/characteristic-description/application-development-interfaces/ECPG.md b/product/zh/docs-mogdb/v5.0/characteristic-description/application-development-interfaces/ECPG.md index c545c092e368bd860615deb581902c5f88581213..b863e57954c082f56d8aa50dc158cf913554fbd7 100644 --- a/product/zh/docs-mogdb/v5.0/characteristic-description/application-development-interfaces/ECPG.md +++ b/product/zh/docs-mogdb/v5.0/characteristic-description/application-development-interfaces/ECPG.md @@ -1,7 +1,7 @@ --- title: 支持嵌入式SQL预处理器(ECPG) summary: 支持嵌入式SQL预处理器(ECPG) -author: Guo Huan +author: Guo Huan 陈泽 date: 2023-04-04 --- @@ -29,11 +29,32 @@ date: 2023-04-04 ## 特性增强 -无。 +为兼容Oracle Pro\*C,平滑使用ECPG替换Pro\*C实现业务逻辑,MogDB 5.0.8实现了如下功能: + +1. 支持EXEC SQL FOR FETCH取出多行结果到SQLDA结构体 +2. 支持EXEC SQL EXECUTE IMMEDIATE {:host_string} +3. 支持动态SQL PREPARE主机变量 +4. 宿主变量声明与共享 +5. 数组形式指示符变量 +6. 兼容了SQLDA的DESCRIPTOR +7. 兼容了DESCRIBE SELECT LIST FOR和DESCRIBE BIND VARIABLES FOR +8. 兼容Pro*C格式的建立连接方式 +9. 数据类型转换与兼容 +10. 结构体数组处理多行数据 +11. for子句限制处理行数 +12. 事务处理语法commit relase rollback release +13. 匿名块语法支持 +14. 兼容了EXECUTE IMMEDIATE string_literal +15. 兼容了PREPARE FROM [SelectStmt|UpdateStmt|InsertStmt|DeleteStmt|MergeStmt] ## 特性约束 -ECPG支持大部分的MogDB SQL语法,但由于目前ECPG的语法和词法不支持对匿名块和Package语句的处理,因此匿名块和创建Package语句无法作为嵌入式SQL使用。 +1. 当前已经实现了部分OCI类型,但MogOCI库不成熟禁止使用 +2. 在sqlda中绑定这些数据类型的变量,其类型代码与ORACLE保持一致 +3. 与PRO*C一致,用户需要在预编译代码中自行实现绑定变量、查询时输出列类型处理 +4. 对于EXECUTE IMMEDIATE,仅支持hoststring,不支持string_literal +5. PREPARE FROM仅支持主机变量,不支持SELECT语法 +6. 
使用SQLDA接收数据时,仅当列给出长度限制,如:char[10],才能正确获取变长字符串 ## 依赖关系 @@ -99,6 +120,287 @@ int main(int argc, char **argv) } ``` +```c +//基础的增删改 + +//查询 +EXEC SQL SELECT ename,job,sal +2000 into :emp_name , :job_title,:salary from emp where empno = :emp_number; + +// 插入 +EXEC SQL INSERT INTO emp (empno,ename,sal,deptno) VALUES (:emp_number,:emp_name,:salary,:dept_number); + +// 更新 +EXEC SQL UPDATE emp SET sal = :salary , comm = :commission WHERE empno =:emp_number; + +// 删除 +EXEC SQL DELETE FROM emo WHERE deptno = :dept_number; +``` + +```c +//对于动态 SQL 其中一种方法为使用 SQLDA 储存数据结构,SQLDA 结构定义于 sqlda.h 头文件中。在此将介绍如何使用 SQLDA +#include +SQLDA *bind_dp; +SQLDA *select_dp; + +//该 SQLDA 结构体初始化可以使用如下方式 +bind_dp = SQLSQLDAAlloc(runtime_context,size,name_lenght,ind_name_length); + +//在 ORACLE 早期版本中使用 sqlald()函数来分配描述符 +EXEC SQL FETCH ... USING DESCRIPTOR ... +EXEC SQL OPEN ... USING DESCRIPTOR ... +``` + +构建的SQLDA结构体含有若干成员,用户需要了解其各成员语义,并自行构建填充该SQLDA描述符。具体如下: + +- N变量:可以Describe的select-list或占位符的最大值 + +- V变量:指向储存select-list或绑定变量值的数据缓冲区地址数组的指针 + + 在使用select-list或绑定变量前需要分配V对应空间,并声明 + +- L变量:指向数据缓冲区的select-list或绑定变量值长度数组 +- T变量:指向数据缓冲区的select-list或绑定变量值数据类型代码数组 +- I变量:指向指示符变量的数据缓冲区 +- F变量:DESCRIBE实际找到的select-list或占位符数量 +- S变量:指向数据缓冲区的select-list或占位符的名称 +- M变量:select-list或占位符名称的长度 +- C变量:当前长度的select-list或占位符名称数组 +- X变量:储存指示符变量名称的数组 +- Y变量:指示符变量名称的最大长度数组 +- Z变量:指示变量名称的当前长度 + +用户需要了解如上SQLDA的部分实现细节,因为用户需要在其预编译的C代码中自行处理如何使用SQLDA。具体流程如下: + +1. Declare Section中声明一个主机字符串以保存查询文本 +2. 声明select和bind的SQLDA +3. 为select和bind描述符分配存储空间 +4. 设置可以Describe的选择列表和占位符的最大数量 +5. 将查询文本放在主机字符串中 +6. 从主机字符串准备查询 +7. 为查询声明游标 +8. 将待绑定的变量DESCRIBE到绑定描述符中 +9. 将占位符的数量重置为DESCRIBE实际找到的数量 +10. 获取值并为DESCRIBE找到的绑定变量分配空间 +11. 使用bind描述符来打开游标 +12. 将select-list DESCRIBE到描述符中 +13. 将select-list的数量重置为DESCRIBE实际找到的数量 +14. 重置每个select-list列的长度和数据类型以进行显示 +15. 将数据库中的一行FETCH到select描述符指向的已分配数据缓冲区中 +16. 处理FETCH返回的select-list值 +17. 释放用于选择列表项目、占位符、指示符变量和描述符的存储空间 +18. 
关闭游标 + +```c +#include +#include +#include + +//定义列和绑定变量的最大个数 +#define MAX_ITEMS 40 +//定义列名的最大值 +#define MAX_VNAME_LEN 30 +#define MAX_INAME_LEN 30 + +int alloc_descriptor(int size,int max_vname_len,int max_iname_len); +void set_bind_v(); +void set_select_v(); +void free_da(); +void sql_error(char *msg); + +EXEC SQL INCLUDE sqlca; +EXEC SQL INCLUDE sqlda; +EXEC SQL INCLUDE sqlcpr; + +//宿主变量定义: +EXEC SQL BEGIN DECLARE SECTION; +float f1 = 12.34; +VARCHAR f2[64]; +char sql_statement[256]= "select * from test_ora"; +char type_statement[256]="select f1,f2 from test_ora where f1"); + + //给sqlda类型 分配数据 + alloc_descriptor(MAX_ITEMS,MAX_VNAME_LEN,MAX_INAME_LEN); + + //建表语句 + EXEC SQL DROP TABLE IF EXISTS TEST_ORA; + EXEC SQL CREATE TABLE TEST_ORA(f1 float, f2 text); + EXEC SQL INSERT INTO TEST_ORA VALUES(12.34,'abcd123'); + EXEC SQL INSERT INTO TEST_ORA VALUES(12,'e234d'); + EXEC SQL INSERT INTO TEST_ORA VALUES(12.34,'abcd123'); + EXEC SQL INSERT INTO TEST_ORA VALUES(333.33,'abcd'); + EXEC SQL commit; + //prepare语句 + EXEC SQL PREPARE S from :type_statement; + EXEC SQL DECLARE C1 CURSOR FOR S; + set_bind_v(); + + strcpy(f2.arr,"abcd123"); + f2.len = strlen(f2.arr); + f2.arr[f2.len] = '\0'; + + bind_p->L[0] = sizeof(float); + bind_p->V[0] = (char*)malloc(bind_p->L[0]); + memcpy(bind_p->V[0], &f1, sizeof(float)); + bind_p->T[0] = 4; /* EXTERNAL_PROC_FLOAT */ + bind_p->L[1] = sizeof(char) * 64; + bind_p->V[1] = (char*)malloc(bind_p->L[1] + 1); + memcpy(bind_p->V[1], &f2, sizeof(char) * 64); + bind_p->T[1] = 1; /* EXTERNAL_PROC_VARCHAR2 */ + + EXEC SQL OPEN C1 USING DESCRIPTOR bind_p; + EXEC SQL DESCRIBE SELECT LIST for S INTO select_p; + + set_select_v(); + printf("f1\t\tf2\n"); + printf("----------------------------------------------------------\n"); + for(;;) + { + EXEC SQL WHENEVER NOT FOUND DO break; + EXEC SQL FETCH C1 USING DESCRIPTOR select_p; + + for(i = 0;iF;i++){ + printf("%s ",select_p->V[i]); + } + printf("\n"); + } + free_da(); + EXEC SQL CLOSE C1; + printf("\n-----------------------------------------------------\n"); + alloc_descriptor(MAX_ITEMS,MAX_VNAME_LEN,MAX_INAME_LEN); + EXEC SQL PREPARE S from :sql_statement; + EXEC SQL DECLARE C CURSOR FOR S; + set_bind_v(); + EXEC SQL OPEN C USING DESCRIPTOR bind_p; + EXEC SQL DESCRIBE SELECT LIST for S INTO select_p; + set_select_v(); + EXEC SQL WHENEVER NOT FOUND DO break; + for (;;) { + EXEC SQL FETCH C USING DESCRIPTOR select_p; + for(i = 0;iF;i++){ + printf("%s ",select_p->V[i]); + } + printf("\n"); + } + free_da(); + EXEC SQL CLOSE C; + EXEC SQL DROP TABLE TEST_ORA; + EXEC SQL COMMIT WORK RELEASE; + exit(0); +} +//分配描述符空间: +int alloc_descriptor(int size,int max_vname_len,int max_iname_len) +{ + if((bind_p=sqlald(size,max_vname_len,max_iname_len))==(SQLDA*)0) + { + printf("can't allocate memory for bind_p."); + return -1; + } + + if((select_p=sqlald(size,max_vname_len,max_iname_len))==(SQLDA*)0) + { + printf("can't allocate memory for select_p."); + return -1; + } + + return 0; +} +//绑定变量的设置: +void set_bind_v() +{ + unsigned int i; + EXEC SQL WHENEVER SQLERROR DO sql_error(""); + bind_p ->N = MAX_ITEMS; + EXEC SQL DESCRIBE BIND VARIABLES FOR S INTO bind_p; + + if(bind_p->F<0) + { + printf("Too Many bind varibles"); + return; + } + bind_p->N = bind_p->F; + for(i=0;iN;i++) + { + bind_p->T[i] = 1; + } +} + +//选择列处理 +void set_select_v() +{ + unsigned int i; + int null_ok,precision,scale; + EXEC SQL DESCRIBE SELECT LIST for S INTO select_p; + + if(select_p->F<0) + { + printf("Too Many column varibles"); + return; + } + select_p->N = select_p->F; 
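+    //以下循环对应前文流程中的第14步:按DESCRIBE返回的Oracle类型代码
+    //(1=VARCHAR2、2=NUMBER、8=LONG、11=ROWID、12=DATE、23=RAW、24=LONG RAW)
+    //重置各选择列的显示长度,并将T[i]统一置为1,使所有列按字符串形式取回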
+ //对格式作处理 + for(i = 0;iN;i++) + { + sqlnul((short unsigned int*)&(select_p->T[i]), (short unsigned int*)&(select_p->T[i]), &null_ok);//检查类型是否为空 + switch (select_p->T[i]) + { + case 1://VARCHAR2 + break; + case 2://NUMBER + sqlprc(&(select_p->L[i]), &precision, &scale); + if (precision == 0) + precision = 40; + select_p->L[i] = precision + 2; + break; + case 8://LONG + select_p->L[i] = 240; + break; + case 11://ROWID + select_p->L[i] = 18; + break; + case 12://DATE + select_p->L[i] = 9; + break; + case 23://RAW + break; + case 24://LONGRAW + select_p->L[i] = 240; + break; + } + select_p->V[i] = (char *)realloc(select_p->V[i], select_p->L[i]+1); + select_p->V[i][select_p->L[i]] ='\0';//加上终止符 + select_p->T[i] = 1;//把所有类型转换为字符型 + } +} +//释放内存SQLDA的函数: +void free_da() +{ + sqlclu(bind_p); + sqlclu(select_p); +} + +//错误处理 +void sql_error(char *msg) +{ + printf("\n%s %s\n", msg,(char *)sqlca.sqlerrm.sqlerrmc); + EXEC SQL WHENEVER SQLERROR CONTINUE; + EXEC SQL ROLLBACK RELEASE; + exit(0); +} +``` + 1. 创建数据库用户 ```sql diff --git a/product/zh/docs-mogdb/v5.0/characteristic-description/characteristic-description-overview.md b/product/zh/docs-mogdb/v5.0/characteristic-description/characteristic-description-overview.md index b5a478b484188a05f3471424e31860701dea33f8..d4a9bbe5f096803b768b254e94a279dc3fa40df2 100644 --- a/product/zh/docs-mogdb/v5.0/characteristic-description/characteristic-description-overview.md +++ b/product/zh/docs-mogdb/v5.0/characteristic-description/characteristic-description-overview.md @@ -30,6 +30,9 @@ MogDB 5.0版本具有以下特性: + [排序算子优化](./high-performance/ordering-operator-optimization.md) + [OCK加速数据传输](./high-performance/ock-accelerated-data-transmission.md) + [OCK SCRLock加速分布式锁](./high-performance/ock-scrlock-accelerate-distributed-lock.md) + + [极致刷脏](./high-performance/enhancement-of-dirty-pages-flushing-performance.md) + + [顺序扫描预读](./high-performance/seqscan-prefetch.md) + + [Ustore SMP并行扫描](./high-performance/ustore-smp.md) + 高可用 + [主备机](./high-availability/1-primary-standby.md) @@ -101,6 +104,7 @@ MogDB 5.0版本具有以下特性: + [支持在建表后修改表日志属性](./compatibility/modify-table-log-property.md) + [INSERT支持ON CONFLICT子句](./compatibility/insert-on-conflict.md) + [支持AUTHID CURRENT_USER](./compatibility/authid-current-user.md) + + [PBE模式支持存储过程out出参](./compatibility/stored-procedure-out-parameters-in-pbe-mode.md) + 数据库安全 + [访问控制模型](./database-security/1-access-control-model.md) @@ -149,6 +153,7 @@ MogDB 5.0版本具有以下特性: + [支持裁剪子查询投影列](./enterprise-level-features/support-for-pruning-subquery-projection-columns.md) + [排序列裁剪](./enterprise-level-features/pruning-order-by-in-subqueries.md) + [自动创建支持模糊匹配的索引](./enterprise-level-features/index-support-fuzzy-matching.md) + + [支持指定导入导出五类基本对象](./enterprise-level-features/import-export-specific-objects.md) + 应用开发接口 + [支持标准SQL](./application-development-interfaces/1-standard-sql.md) diff --git a/product/zh/docs-mogdb/v5.0/characteristic-description/compatibility/compatibility.md b/product/zh/docs-mogdb/v5.0/characteristic-description/compatibility/compatibility.md index 0abdfd48e5176653fce3881d3a2ef19f984d9e48..e7c9a3b8cc7ee11ee126927d4e4397550c985194 100644 --- a/product/zh/docs-mogdb/v5.0/characteristic-description/compatibility/compatibility.md +++ b/product/zh/docs-mogdb/v5.0/characteristic-description/compatibility/compatibility.md @@ -36,4 +36,5 @@ date: 2023-06-20 - **[ORDER BY/GROUP BY场景兼容](order-by-group-by-scenario-expansion.md)** - **[支持在建表后修改表日志属性](modify-table-log-property.md)** - **[INSERT支持ON CONFLICT子句](insert-on-conflict.md)** -- **[支持AUTHID 
CURRENT_USER](authid-current-user.md)** \ No newline at end of file +- **[支持AUTHID CURRENT_USER](authid-current-user.md)** +- **[PBE模式支持存储过程out出参](stored-procedure-out-parameters-in-pbe-mode.md)** \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/characteristic-description/compatibility/stored-procedure-out-parameters-in-pbe-mode.md b/product/zh/docs-mogdb/v5.0/characteristic-description/compatibility/stored-procedure-out-parameters-in-pbe-mode.md new file mode 100644 index 0000000000000000000000000000000000000000..6786df306d69488898ead36dab7f3ee15aad0353 --- /dev/null +++ b/product/zh/docs-mogdb/v5.0/characteristic-description/compatibility/stored-procedure-out-parameters-in-pbe-mode.md @@ -0,0 +1,180 @@ +--- +title: PBE模式支持存储过程out出参 +summary: 在使用如JDBC等PBE连接方式的情况下,支持匿名块中的左值与OUT/INOUT类型的数据在存储过程中返回值到对应驱动端。该功能的使用需要开启enable_outparams_override参数。 +author: 任建成 韩旭 +date: 2024-06-24 +--- + +# PBE模式支持存储过程out出参 + +## 可获得性 + +本特性自MogDB 5.0.8版本开始引入。 + +## 特性简介 + +在使用JDBC等PBE连接方式的情况下,支持匿名块中的左值在存储过程中的返回值返回到对应驱动端。该功能的使用需要设置behavior_compat_options参数中proc_outparam_override选项。方法如下: + +```sql +set behavior_compat_options = 'proc_outparam_override'; +``` + +或在postgres.conf文件中设置此参数并重启数据库。 + +## 客户价值 + +提高兼容性,兼容Oracle的对应用法。提高可用性,便于编写JDBC等程序连接数据库操作。 + +## 特性描述 + +在使用JDBC等PBE连接方式的情况下,支持匿名块中的左值(包括存储过程参数列表中OUT/INOUT类型的数据)在存储过程中的返回值返回到对应驱动端。以JDBC举例,用户在编写匿名块程序执行时,希望将表达式的左值出参返回到java程序端并在java程序端处理这些数据。可以将匿名块中对应位置的变量替换为“?”,并在后续程序中对数据类型进行合法的注册,即可使用对应的get方法获取到。 + +支持的数据范围包括JDBC支持映射的基本数据类型,tableof、object、array、refcursor、composite等。 + +Jdbc默认数据类型映射关系参考: + +get:[java.sql.CallableStatement](../../developer-guide/dev/2-development-based-on-jdbc/15-JDBC/2-java-sql-CallableStatement.md) + +set:[java.sql.PreparedStatement](../../developer-guide/dev/2-development-based-on-jdbc/15-JDBC/5-java-sql-PreparedStatement.md) + +支持的场景包括匿名块直接执行、匿名块调用函数、匿名块调用存储过程、匿名块调用package,匿名块内部调用immutable execute、select into、bulk into、execute immediate、fetch into等场景。 + +## 特性约束 + +1. 该功能仅在数据库兼容模式为Oracle时能够使用(即创建DB时不指定,或DBCOMPATIBILITY='A'),在其他数据库兼容模式下不能使用该特性。 + +2. JDBC版本应该大于等于5.0.0.4,JDBC端应正确连接到数据库。 + +3. 仅针对表达式左值(包括存储过程参数列表中OUT/INOUT类型的数据)。 + +4. 用户执行的匿名块中,变量不应该以“$”开头作为命名,如“$1”,否则可能会造成数据不准确的问题。 + +5. immutable execute场景不支持直接使用匿名块出参,因为字符串里面的”?”并不会被计算为需要替换的变量,但可以配合using使用。 + +## 示例 + +```java +public static void test_case_0001_output_mutil(Connection conn) throws Exception { + String baseSQLStrings = "set behavior_compat_options='proc_outparam_override';"; + String baseSQLString = "DECLARE" + + "baselen integer:= 199;" + + "BEGIN" + + "? := baselen;" + + "? 
:= baselen*2;" + + "END;"; + try { + CallableStatement pstmt = conn.prepareCall(baseSQLStrings); + pstmt.execute(); + pstmt.close(); + + pstmt = conn.prepareCall(baseSQLString); + System.out.println("Prepare param out SQL succeed!"); + + pstmt.registerOutParameter(1, Types.INTEGER); + System.out.println("Register succeed!"); + + pstmt.registerOutParameter(2, Types.INTEGER); + System.out.println("Register succeed!"); + + pstmt.execute(); + System.out.println("Execute succeed!"); + + + if (199 == pstmt.getInt(1)) { + System.out.println("answer true"); + } else { + System.out.println("answer false"); + } + + if (398 == pstmt.getInt(2)) { + System.out.println("answer true"); + } else { + System.out.println("answer false"); + } + + System.out.println("Get succeed!"); + + pstmt.close(); + System.out.println("Run succeed!"); + } + catch (Exception e) { + String exceptionStr = e.toString(); + System.out.println(exceptionStr); + } +} +``` + +以上用例中,我们在baseSQLString中使用了匿名块出参的功能,其中涉及了两次表达式左值的返回。并且可以在java端获取到执行的结果并进行处理。 + +```java +public static void t02_base_test(Connection conn) throws Exception { + String createPackageHead = + "CREATE OR REPLACE PACKAGE testuser.pck2 AS" + + " PROCEDURE get_IN_OUT(output1 OUT varchar(26), output2 OUT bool, output3 OUT TINYINT, output4 OUT smallint, ret1 IN OUT DOUBLE PRECISION);" + + "END pck2;"; + + String createPackageBody = + "CREATE OR REPLACE PACKAGE BODY testuser.pck2 AS" + + " PROCEDURE get_IN_OUT(output1 OUT varchar(26), output2 OUT bool, output3 OUT TINYINT, output4 OUT smallint, ret1 IN OUT DOUBLE PRECISION) IS" + + " BEGIN" + + " output1 := 'abcdefghigklmnopqrstuvwxyz';" + + " output2 := false;" + + " output3 := 2;" + + " output4 := 12;" + + " ret1 := ret1 + 10;" + + " END get_IN_OUT;" + + "END pck2;"; + + String baseSQLString = + "BEGIN" + + " testuser.pck2.get_IN_OUT(?, ?, ?, ?, ?);" + + "END;"; + try { + CallableStatement pstmt = conn.prepareCall(createPackageHead); + pstmt.execute(); + pstmt.close(); + System.out.println("HEAD Prepare succeed!"); + + pstmt = conn.prepareCall(createPackageBody); + pstmt.execute(); + pstmt.close(); + System.out.println("BODY Prepare succeed!"); + + pstmt = conn.prepareCall(baseSQLString); + + pstmt.setDouble(5, 99.99999999); + + pstmt.registerOutParameter(1, Types.VARCHAR); + pstmt.registerOutParameter(2, Types.BOOLEAN); + pstmt.registerOutParameter(3, Types.TINYINT); + pstmt.registerOutParameter(4, Types.SMALLINT); + pstmt.registerOutParameter(5, Types.DOUBLE); + System.out.println("Register succeed!"); + + pstmt.execute(); + System.out.println("Execute succeed!"); + + System.out.println(pstmt.getString(1)); + System.out.println(pstmt.getBoolean(2)); + System.out.println(pstmt.getByte(3)); + System.out.println(pstmt.getShort(4)); + System.out.println(pstmt.getDouble(5)); + + System.out.println("Get succeed!"); + + pstmt.close(); + System.out.println("Run succeed!"); + } + catch (Exception e) { + String exceptionStr = e.toString(); + System.out.println(exceptionStr); + } +} +``` + +以上用例中,我们在baseSQLString中使用了匿名块出参的功能,其中涉及了五种基本类型的返回与package的调用,以及OUT与INOUT类型的使用。并且可以在java端获取到执行的结果并进行处理。 + +## 相关页面 + +[behavior_compat_options](../../reference-guide/guc-parameters/version-and-platform-compatibility/platform-and-client-compatibility.md#behavior_compat_options)、 +[基于JDBC开发](../../developer-guide/dev/2-development-based-on-jdbc/development-based-on-jdbc.md) \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/characteristic-description/enterprise-level-features/15-in-place-update-storage-engine.md 
b/product/zh/docs-mogdb/v5.0/characteristic-description/enterprise-level-features/15-in-place-update-storage-engine.md index 5793e046cbcf8b5214e0688ae9e4ffad6a8e9bb4..d363e4748ed424337906de58d22919bdc917b00f 100644 --- a/product/zh/docs-mogdb/v5.0/characteristic-description/enterprise-level-features/15-in-place-update-storage-engine.md +++ b/product/zh/docs-mogdb/v5.0/characteristic-description/enterprise-level-features/15-in-place-update-storage-engine.md @@ -25,7 +25,8 @@ In-place Update存储引擎可有效的降低多次更新元组后占用存储 ## 特性增强 -无。 +- MogDB 5.0.6:Ustore存储引擎商用 +- MogDB 5.0.8:支持SMP并行查询能力、顺序扫描预读 ## 特性约束 @@ -37,4 +38,4 @@ In-place Update存储引擎可有效的降低多次更新元组后占用存储 ## 相关页面 -[配置Ustore](../../performance-tuning/system-tuning/configuring-ustore.md) \ No newline at end of file +[In-place Update存储引擎Ustore](../../performance-tuning/system-tuning/configuring-ustore.md) \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/characteristic-description/enterprise-level-features/enterprise-level-features.md b/product/zh/docs-mogdb/v5.0/characteristic-description/enterprise-level-features/enterprise-level-features.md index c4dafc535f055c8fc0c5c3efe0c28b4f05180db3..9a51d8da5c7791c06602b4639339c7664ab67701 100644 --- a/product/zh/docs-mogdb/v5.0/characteristic-description/enterprise-level-features/enterprise-level-features.md +++ b/product/zh/docs-mogdb/v5.0/characteristic-description/enterprise-level-features/enterprise-level-features.md @@ -36,4 +36,5 @@ date: 2023-05-22 + **[游标支持倒序检索](scroll-cursor.md)** + **[支持裁剪子查询投影列](support-for-pruning-subquery-projection-columns.md)** + **[排序列裁剪](pruning-order-by-in-subqueries.md)** -+ **[自动创建支持模糊匹配的索引](index-support-fuzzy-matching.md)** \ No newline at end of file ++ **[自动创建支持模糊匹配的索引](index-support-fuzzy-matching.md)** ++ **[支持指定导入导出五类基本对象](./import-export-specific-objects.md)** \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/characteristic-description/enterprise-level-features/import-export-specific-objects.md b/product/zh/docs-mogdb/v5.0/characteristic-description/enterprise-level-features/import-export-specific-objects.md new file mode 100644 index 0000000000000000000000000000000000000000..6b09714cd23af19117c6b666b7bf6bd2c5a421ca --- /dev/null +++ b/product/zh/docs-mogdb/v5.0/characteristic-description/enterprise-level-features/import-export-specific-objects.md @@ -0,0 +1,167 @@ +--- +title: 支持指定导入导出五类基本对象 +summary: 支持指定导入导出五类基本对象 +author: Guo Huan 周帅康 +date: 2024-07-03 +--- + +# 支持指定导入导出五类基本对象 + +## 可获得性 + +本特性自MogDB 5.0.8版本开始引入。 + +## 特性简介 + +本特性支持逻辑备份工具(gs_dump)和恢复工具(gs_restore)指定导入导出package、function、procedure、trigger和type五类基本对象。 + +## 客户价值 + +增强逻辑备份和恢复工具的功能,提升MogDB的易用性。 + +## 特性描述 + +### 逻辑备份工具gs_dump支持导出指定基本对象 + +命令行新增指定备份导出指定对象的参数,实现指定package、function、procedure、trigger和type五类基本对象的导出功能。 + +- 可以指定这五种基本对象的一个或多个; +- 同一个基本对象可以指定多个参数名(如:--trigger name1 --trigger name2); +- 只导出指定的对象,不考虑对象的依赖关系; +- 对于trigger除了要导出定义外还要导出触发动作的function; +- 导出的备份文件可以用gs_restore工具导入; +- 不影响原有参数的设置。 + +**使用说明** + +```shell +-- 指定导出trigger +gs_dump -f backup_dir/filename -F p --trigger trigger_name + +-- 指定导出function +gs_dump -f backup_dir/filename -F p --function function_name(args) + +-- 指定导出type +gs_dump -f backup_dir/filename -F p --type type_name + +-- 指定导出package +gs_dump -f backup_dir/filename -F p --package package_name + +-- 指定导出procedure +gs_dump -f backup_dir/filename -F p --procedure procedure_name(args) +``` + +- --trigger trigger_name + + 指定导出trigger + +- --function function_name(args) + + 指定导出function + +- --type type_name + + 指定导出type + +- --package 
package_name + + 指定导出package + +- --procedure procedure_name(args) + + 指定导出procedure + +### 逻辑恢复工具gs_restore支持导入指定基本对象 + +命令行新增指定备份导入指定对象的参数,实现指定package、function、procedure、trigger和type五类基本对象的导入功能。 + +- 支持导入自定义归档格式、目录归档格式和tar归档格式; +- 只导入指定的对象,不考虑对象的依赖关系; +- 支持从全量备份里导入指定类型的对象; +- 支持按照通过gs_dump指定类型导出的归档文件导入指定对象; +- 可以指定这五种基本对象的一个或多个; +- 同一个基本对象可以指定多个参数名(如:--trigger name1 --trigger name2); +- 不影响原有参数的设置。 + +**使用说明** + +```shell +-- 指定导入trigger +gs_restore backup/MPPDB_backup.tar -p 8000 -d backupdb -e -T trigger_name +-- 或 +gs_restore backup/MPPDB_backup.tar -p 8000 -d backupdb -e --trigger trigger_name + +-- 指定导入function +gs_restore backup/MPPDB_backup.tar -p 8000 -d backupdb -e -P function_name(args) +-- 或 +gs_restore backup/MPPDB_backup.tar -p 8000 -d backupdb -e --function function_name(args) + +-- 指定导入type +gs_restore backup/MPPDB_backup.tar -p 8000 -d backupdb -e --type type_name + +-- 指定导入package +gs_restore backup/MPPDB_backup.tar -p 8000 -d backupdb -e --package package_name + +-- 指定导入procedure +gs_restore backup/MPPDB_backup.tar -p 8000 -d backupdb -e --procedure procedure_name(args) +``` + +- -T, --trigger trigger_name + + 指定导入trigger + +- -P, --function function_name(args) + + 指定导入function + +- --type type_name + + 指定导入type + +- --package package_name + + 指定导入package + +- --procedure procedure_name(args) + + 指定导入procedure + +## 注意事项 + +对于**函数**和**存储过程**这种带参数的名称要求标明参数类型。 + +例如,定义一个函数func(a INTEGER, b INTEGER),函数名为:“func(integer, integer)” + +为了兼容其他SQL语法,数据库可能会将某些参数类型转换成另一个类型,例如VARCHAR2会转成character varying,func(a INTEGER, table_name IN VARCHAR2) 会转换成:“func(integer, character varying)”。为了确保参数类型输入正确,可以采用如下SQL语句查询数据库中的函数参数类型: + + ```sql + SELECT p.proname AS function_name, + p.proargtypes AS parameter_types, + pg_catalog.pg_get_function_identity_arguments(p.oid) AS funcargs + FROM PG_PROC AS p + WHERE p.proname = 'func_gs_dump_0001'; + ``` + + 查询结果: + + ```sql + function name | parameter types | funcargs + -------------------+---------------------+--------------------------------- + func gs dump 0001 | 1043 | table name character varing + ``` + +通过SQL语句可以查到func_gs_dump_0001的参数类型是character varying,所以正确的对象名是"func_gs_dump_0001(character varying)"。 + +## 示例 + +```shell +-- 导出名为update_time的触发器 +gs_dump -f backup_dir/db.sql -F p --trigger update_time + +-- 导入名为update_time的触发器 +gs_restore backup/MPPDB_backup.tar -p 8000 -d backupdb -e --trigger update_time +``` + +## 相关页面 + +[gs_dump](../../reference-guide/tool-reference/server-tools/gs_dump.md)、[gs_restore](../../reference-guide/tool-reference/server-tools/gs_restore.md) \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/characteristic-description/high-availability/2-logical-replication.md b/product/zh/docs-mogdb/v5.0/characteristic-description/high-availability/2-logical-replication.md index 818cba7319d4e07fd63e4d82c728229b26c87fc2..46c789905de5ac61c02ae4d092965ffd772b57e2 100644 --- a/product/zh/docs-mogdb/v5.0/characteristic-description/high-availability/2-logical-replication.md +++ b/product/zh/docs-mogdb/v5.0/characteristic-description/high-availability/2-logical-replication.md @@ -27,10 +27,11 @@ DN通过物理日志反解析为逻辑日志,DRS等逻辑复制工具从DN抽 - MogDB 1.1.0逻辑解码新增全量+增量抽取日志的方案。 - MogDB 1.1.0逻辑解码新增备机支持逻辑解码。 +- MogDB 5.0.8逻辑解码功能新增对于DDL操作的支持。 ## 特性约束 -不支持列存复制,不支持DDL复制。 +不支持列存复制。 ## 依赖关系 @@ -38,4 +39,4 @@ DN通过物理日志反解析为逻辑日志,DRS等逻辑复制工具从DN抽 ## 相关页面 -[逻辑复制](../../developer-guide/logical-replication/logical-replication.md) \ No newline at end of file 
+[逻辑复制](../../developer-guide/logical-replication/logical-replication.md)、[逻辑解码支持DDL操作](../../developer-guide/logical-replication/logical-decoding/logical-decoding-support-for-DDL.md) \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/characteristic-description/high-availability/enhanced-efficiency-of-logical-backup-and-restore.md b/product/zh/docs-mogdb/v5.0/characteristic-description/high-availability/enhanced-efficiency-of-logical-backup-and-restore.md index 4553c015ad0d89bf7283dcb5525b17ee76c4ee50..04824fc99e1fbee0e6e6eff7ea4bed4d7a67db24 100644 --- a/product/zh/docs-mogdb/v5.0/characteristic-description/high-availability/enhanced-efficiency-of-logical-backup-and-restore.md +++ b/product/zh/docs-mogdb/v5.0/characteristic-description/high-availability/enhanced-efficiency-of-logical-backup-and-restore.md @@ -25,7 +25,7 @@ gs_dump工具新增`-j, --jobs=NUM`参数,支持在导出文件格式为目录 gs_restore工具支持对导出文件格式为目录和自定义归档格式(.dmp)的文件执行并行导入,实现备份数据导入效率提升。 -此外,本特性还支持通过对单表数据分片,并行执行每个分片的数据导入/导出,实现备份效率提升。 +此外,本特性支持通过对单表数据分片,并行执行每个分片的数据导入/导出;MogDB 5.0.8开始支持通过对分区表的每个分区进行分组,并行执行每个分组中的每个分区数据导入/导出,实现备份效率提升。 > 说明: > @@ -35,9 +35,7 @@ gs_restore工具支持对导出文件格式为目录和自定义归档格式(. ## 特性约束 -- 单表分片并行导出仅适用于1G以上的大表。 - -- 不支持分区表和二级分区表的单表分片并行导出。 +- 单表分片并行导出和分区表分组并行导出仅适用于1G以上的大表。 - 只有并行导出的单表可以并行导入(gs_dump和gs_restore的-j参数需要配合使用,参数值必须大于1)。例如: @@ -64,7 +62,59 @@ gs_restore backupdir/dir_bdat -d postgres -j 4 gs_restore backupdir/dir_bdat -d postgres --jobs=4 ``` -## 性能测试结果 +## 性能测试 + +性能测试共7组,分别是: + +1. 标准TPCC数据集并行导出导入 +2. 标准TPCH数据集并行导出导入 +3. 1000个小表并行导出导入 +4. 大单表并行导出导入 +5. 17GB分区大表并行导出导入 +6. 51GB分区大表并行导出导入 +7. 103GB分区大表并行导出导入 + +**1. 标准TPCC数据集并行导出导入** + +导出: + +![1](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/concurrency1.png) + +导入: + +![2](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/concurrency2.png) + +**2. 标准TPCH数据集并行导出导入** + +导出: + +![3](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/concurrency3.png) + +导入: + +![4](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/concurrency4.png) + +**3. 1000个小表并行导出导入** + +导出: + +![5](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/concurrency5.png) + +导入: + +![6](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/concurrency6.png) + +**4. 大单表并行并行导出导入** + +导出: + +![7](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/concurrency7.png) + +导入: + +![8](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/concurrency8.png) + +**1-4组结果分析** **gs_dump** @@ -84,6 +134,42 @@ gs_restore backupdir/dir_bdat -d postgres --jobs=4 - 并行度为10~20之间表现最优,继续提高并行度不会进一步增加导入效率,导出时MogDB的CPU使用率与并发数成正比 +**5. 17GB分区大表并行导出导入** + +导出: + +![9](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/concurrency9.png) + +导入: + +![10](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/concurrency10.png) + +**6. 51GB分区大表并行导出导入** + +导出: + +![11](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/concurrency11.png) + +导入: + +![12](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/concurrency12.png) + +**7. 
103GB分区大表并行导出导入** + +导出: + +![13](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/concurrency13.png) + +导入: + +![14](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/concurrency14.png) + +**5-7组结果分析** + +针对103GB的分区大表,相比于串行导入导出,并行度设置为2、4、8时的性能(导入导出时间)分别提升了1倍、3倍、7倍。 + +由此可见,随着并行度的增加,对分区大表进行并行导出和导入的性能提高达到预期,17GB的分区表、51GB的分区表和103GB的分区表并发线性扩展基本一致。 + ## 相关页面 [gs_dump](../../reference-guide/tool-reference/server-tools/gs_dump.md)、[gs_restore](../../reference-guide/tool-reference/server-tools/gs_restore.md) diff --git a/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/9-smp-for-parallel-execution.md b/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/9-smp-for-parallel-execution.md index 9c866e69f2d73abd318b988a1767fcbbfbc50e00..1438c15b849bd2d76df529743a2ab078aec1c419 100644 --- a/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/9-smp-for-parallel-execution.md +++ b/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/9-smp-for-parallel-execution.md @@ -25,11 +25,10 @@ SMP并行技术充分利用了系统多核的能力,来提高重查询的性 ## 特性增强 -无。 +- MogDB 5.0.8:新增对于Ustore存储引擎并行能力的支持,涵盖并行顺序扫描、并行索引扫描、并行仅索引扫描、并行位图扫描。 ## 特性约束 -- 索引扫描不支持并行执行。 - 窗口函数中带有 order by不支持并行执行。(例如:SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary DESC)FROM empsalary;) - cursor(游标)不支持并行执行。 - 存储过程和函数内的查询不支持并行执行。 @@ -44,4 +43,4 @@ SMP并行技术充分利用了系统多核的能力,来提高重查询的性 ## 相关页面 -[配置SMP](../../performance-tuning/system-tuning/configuring-smp.md) \ No newline at end of file +[配置SMP](../../performance-tuning/system-tuning/configuring-smp.md)、[并行索引扫描](./parallel-index-scan.md)、[Ustore SMP并行扫描](./ustore-smp.md) \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/enhancement-of-dirty-pages-flushing-performance.md b/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/enhancement-of-dirty-pages-flushing-performance.md new file mode 100644 index 0000000000000000000000000000000000000000..5c95cd457f439cc2c95f6ad9fb3e62efbfdcdeb1 --- /dev/null +++ b/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/enhancement-of-dirty-pages-flushing-performance.md @@ -0,0 +1,102 @@ +--- +title: 极致刷脏 +summary: 极致刷脏 +author: 郭欢 王云龙 +date: 2024-06-26 +--- + +# 极致刷脏 + +## 可获得性 + +本特性自MogDB 5.0.8版本开始引入。 + +## 特性简介 + +MogDB在增量checkpoint模式下(也是默认刷脏模式),当数据库面临大压力写场景时,会产生大量脏页堆积,导致的后果有: + +1. Checkpoint耗时长; +2. Switchover耗时长; +3. 
停库耗时长等; + +MogDB 5.0.8支持极致刷脏功能,通过设置参数extreme_flush_dirty_page = on开启。若当前系统的上述操作耗时比较高,可打开此参数,以提升大压力场景下的刷脏速度,使上层操作可以快速响应,降低执行checkpoint、switchover、重启RTO等操作的耗时。 + +## 新增GUC参数 + +### extreme_flush_dirty_page + +参数说明:是否开启极致刷脏模式(开启虽可以更快刷脏,但写放大增大) + +该参数属于POSTMASTER类型参数。 + +取值范围:布尔型 + +默认值:off + +注意:请确认当前系统刷脏慢的瓶颈不在系统IO能力之后,再打开此参数。可通过iostat、Node-exporter等监测工具确认磁盘IO不存在瓶颈。对于共享存储服务,还应确认共享存储服务的IO能力极限。 + +### checkpoint_target_time + +参数说明:期望执行checkpoint的最大耗时(值越小,刷脏越快,执行checkpoint实际耗时越小,但写放大增大,在IO成为瓶颈时,值很低可能影响业务);对应的上游操作有:停库(stop)、switchover(主备切换)、手动执行checkpoint语句。 + +该参数属于POSTMASTER类型参数。 + +取值范围:5 - 60s + +默认值:30s + +## 新增函数 + +local_pagewriter_flush_detail() + +描述:展示刷脏流程的详细信息,包括刷脏相关的GUC参数、刷脏流程中的变量信息等,在系统刷脏慢时,调用此函数可分析问题瓶颈所在。 + +权限:任何用户均可调用。 + +返回值: + +| 列名 | 描述 | +| ---------------------------- | ------------------------------------------------------------ | +| node_name | 节点名称 | +| pagewriter_sleep(ms) | 刷脏单个周期 | +| max_io_capacity(M) | 最大 io 能力 | +| dirty_page_percent_max | 脏页最大占比 | +| candidate_buf_percent_target | 候选 buffer 目标值占比 | +| max_redo_log_size(M) | 最大日志回放量 | +| main_pagewriter_detail | main_pagewriter 详细信息:开始时间、等待耗时、刷脏耗时 | +| sub_pagewriter_detail | id:sub_pagewriter 编号;wait_cost:上个刷脏周期等待耗时;flush_cost:上个刷脏周期实际刷脏耗时 | +| theoritical_max_io | 理论最大值=(「扫描 buffer 到候选队列」刷脏理论最大值 + 从脏页队列刷脏理论最大值) | +| lsn_percent | lsn 占比 | +| actual_max_io | 实际最大值=(「扫描 buffer 到候选队列」刷脏实际最大值 + 从脏页队列刷脏实际最大值) | +| actual_flush_num | 实际刷脏值=(「扫描 buffer 到候选队列」刷脏实际值 + 从脏页队列刷脏实际值) | +| remain_actual_dirty_page_num | 剩余实际脏页数量 | +| list_flush_detail | 扫描 buffer 到候选队列部分明细:当前候选 buffer 数、总 buffer 数 | +| queue_flush_detail | 从脏页队列刷脏部分明细:dirty_percent | +| forecast | 预测:当前速度、当前执行 checkpoint 预计耗时 | + +## 特性约束 + +- 开启极致刷脏模式,意味着写放大会增大。若IO本身已成为瓶颈,开启后优化效果不明显,同时可能导致tpmc下降。所以开启极致刷脏模式的前提是机器IO不是当前系统的瓶颈。 + +## 性能提升 + +开启刷脏优化后,SwitchOver时的CheckPoint时间、切换时间提升在47%以上,且TPMC平均值损失不大。 + +- SwitchOver RTO平均值降幅在47%到67.5% + + 不开启时平均值为41.55秒,checkpoint_target_time=5时降低到13.5秒,checkpoint_target_time = 30时降低到22秒 + +- SwitchOver切换时CheckPoint平均耗时降幅在49%到73% + + 不开启时平均值为38.68秒,checkpoint_target_time=5时降低到10.42秒,checkpoint_target_time = 30时降低到19.67秒 + +- 开启刷脏优化后的TPMC平均值与不开启刷脏优化的TPMC平均值接近持平。 + +TPCC和硬件配置情况: + +1. TPCC: 3000 warehouses 500/600 terminals 10 minutes Run +2. 
硬件配置 : arm 48 CPU 200G Mem 3T Disk(RAID 0, 2 nvme SSD) + +## 相关页面 + +[extreme_flush_dirty_page](../../reference-guide/guc-parameters/resource-consumption/background-writer.md#extreme_flush_dirty_page)、[checkpoint_target_time](../../reference-guide/guc-parameters/resource-consumption/background-writer.md#checkpoint_target_time)、[local_pagewriter_flush_detail()](../../reference-guide/functions-and-operators/system-management-functions/other-functions.md) \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/high-performance.md b/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/high-performance.md index 240f9f1a8d65386049e9451e0f9128c232fdfe5f..82faeb034a19f43b59eab6efb33b1b6cd275594f 100644 --- a/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/high-performance.md +++ b/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/high-performance.md @@ -27,4 +27,7 @@ date: 2023-05-22 + **[排序算子优化](ordering-operator-optimization.md)** + **[OCK加速数据传输](ock-accelerated-data-transmission.md)** + **[OCK SCRLock加速分布式锁](ock-scrlock-accelerate-distributed-lock.md)** -+ **[日志回放性能增强](enhancement-of-wal-redo-performance.md)** \ No newline at end of file ++ **[日志回放性能增强](enhancement-of-wal-redo-performance.md)** ++ **[极致刷脏](enhancement-of-dirty-pages-flushing-performance.md)** ++ **[顺序扫描预读](seqscan-prefetch.md)** ++ **[Ustore SMP并行扫描](ustore-smp.md)** \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/parallel-index-scan.md b/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/parallel-index-scan.md index e06df2b28c54e1f325d0b865ada74f35b934a894..2bd94c69c9388acf1fbdeb7de0ea8ef8304f86d4 100644 --- a/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/parallel-index-scan.md +++ b/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/parallel-index-scan.md @@ -31,7 +31,7 @@ date: 2022-11-14 ## 特性增强 -无 +- MogDB 5.0.8:新增对于Ustore存储引擎并行索引扫描的支持。 ## 特性约束 @@ -39,7 +39,7 @@ date: 2022-11-14 - 并行索引扫描仅支持BTree索引。 -- 支持astore存储类型,不支持ustore、cstore存储类型。 +- 支持astore、ustore存储类型,不支持cstore存储类型。 ## 示例 diff --git a/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/seqscan-prefetch.md b/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/seqscan-prefetch.md new file mode 100644 index 0000000000000000000000000000000000000000..81a5c424d3f7285d426ea2f5e1f4da68e72e6125 --- /dev/null +++ b/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/seqscan-prefetch.md @@ -0,0 +1,158 @@ +--- +title: 顺序扫描预读 +summary: 顺序扫描预读 +author: Guo Huan 郑小进 赵金 陈先 +date: 2023-12-20 +--- + +# 顺序扫描预读 + +## 可获得性 + +本特性自MogDB 5.0.8版本开始引入。 + +## 特性简介 + +MogDB的顺序扫描预读针对较大数据量下的纯数据表顺序扫描场景(全表扫描场景)进行优化,提升扫描性能。本特性支持Astore和Ustore两种存储引擎,并且支持并行扫描下预读。 + +## 客户价值 + +并行化顺序扫描过程中的CPU处理和I/O操作,减少I/O对CPU的阻塞,提高CPU利用率,以提升顺序扫描性能。 + +## 特性描述 + +数据库中的数据是按照一个个页面进行组织管理的,CPU以页面为单位对数据进行处理,这就使得CPU处理和I/O之间形成了串行交替执行的现象。在该处理模型中,由于一个页面的I/O时延明显大于CPU处理一个页面的时间,导致CPU处理过程会被I/O操作频繁打断,使CPU利用率低下,这是导致如全表扫描等场景性能差的主要原因。 + +顺序扫描预读机制改变了该处理模型,将顺序扫描的CPU处理过程与I/O操作并行化,尽量避免CPU因为等待I/O而阻塞。理想状态是,当CPU将要处理下一个数据页时,刚好I/O服务例程已经将该数据页准备好放在内存中。这种模型我们定义为数据页预读机制(data prefetch)。 + +本特性在全表扫描类查询(如TPCH场景)中,SeqScan算子性能提升20%-60%,端到端性能提升10%-20%。 + +注: + +并非所有SQL在任何测试场景下,都有上述性能提升。预读性能提升主要和查询语句的复杂度(CPU计算和I/O耗时)及磁盘带宽有关,其他影响因素包括是否为全缓存场景、是否为混合查询负载。 + +- 算子性能提升明显的SQL特征:CPU计算耗时重,I/O带宽未达到磁盘最大带宽。 +- 端到端性能提升明显的SQL特征:CPU计算和I/O耗时各占50%左右,I/O带宽未达到磁盘最大带宽。 + 
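+
+例如,可以先通过执行计划粗略判断某条查询是否属于上述受益场景(以下仅为示意,表名 lineitem 为假设的业务大表,请替换为实际表名):
+
+```sql
+-- 打开 I/O 计时,便于在执行计划中观察 I/O Timings: read(详见下文“运维监控能力”)
+SET track_io_timing = on;
+-- 若计划以 Seq Scan 为主、I/O 读耗时占比较高,则该查询较可能从顺序扫描预读中获益
+EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM lineitem;
+```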
+本特性默认关闭,设置GUC参数`enable_ios = on`,`enable_heap_async_prefetch = on`启用Astore顺序扫描预读。设置GUC参数 `enable_ios = on`,`enable_uheap_async_prefetch = on`启用Ustore顺序扫描预读。 + +## 性能对比 + +本特性在不同并行度下主节点执行TPCH测试的性能提升的结果、以及在混合负载(tpcc+tpch)的情况下性能提升的结果和对TPMC的影响如下。 + +> 注:下图中纵轴为算子或者SQL的执行时间(单位:秒),横轴为执行的SQL + +- Astore性能数据 + + - dop=1:TPCH顺序扫描算子提升为52%,端到端的提升为27%: + + ![img1](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/seqscan-prefetch-1.png) + + ![img2](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/seqscan-prefetch-2.png) + + - dop=8:TPCH顺序扫描算子提升为28%,端到端的提升为13%: + + ![img3](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/seqscan-prefetch-3.png) + + ![img4](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/seqscan-prefetch-4.png) + + - 混合负载(tpcc+tpch)的情况下性能提升的结果和对TPMC的影响: + + TPCH顺序扫描算子提升为19%,端到端的提升为10%,并且tpmc不受预读影响。 + + > 注:不开启预读的 tpmc 为:410204 , 开启预读的 tpmc 为:414793 + + ![img5](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/seqscan-prefetch-5.png) + + ![img6](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/seqscan-prefetch-6.png) + +- Ustore性能数据 + + 不同并行度下主节点tpch测试的结果: + + - dop=1:总体算子提升为41%,端到端的提升为19% + + ![img7](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/seqscan-prefetch-7.png) + + - dop=4:总体算子提升为43%,端到端的提升为21% + + ![img8](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/seqscan-prefetch-8.png) + + - dop=8:总体算子提升为45%,端到端的提升为23% + + ![img9](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/seqscan-prefetch-9.png) + + - dop=16:总体算子提升为37%,端到端的提升为13% + + ![img10](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/seqscan-prefetch-10.png) + + 混合负载(tpcc+tpch)的情况下性能提升的结果和对TPMC的影响: + + - dop=1:总体算子提升为32%,端到端的提升为19%,tpmc效果提升3%,tpmc不受预读影响 + + ![img11](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/seqscan-prefetch-11.png) + + - dop=4:总体算子提升为38%,端到端的提升为22%,tpmc效果提升2%,tpmc不受预读影响 + + ![img12](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/seqscan-prefetch-12.png) + +## 特性约束 + +- 支持串行扫描和并行扫描场景预读。 +- 当前版本支持Astore引擎和Ustore引擎,不支持Cstore以及段页式引擎。 +- 建议NVMe SSD开预读,磁盘不开预读。 + +## 使用指导 + +### 使用限制 + +1. 如果使用普通的机械硬盘,磁盘IO带宽可能是系统瓶颈,所以不能体现出预读的优势。 +2. 
顺序预读机制主要适用于数据量很大的表(至少为GB级别),对于数据量很小的表,不建议开启预读,目前默认1G触发预读,可以设置的触发预读的最小表大小为512MB,用户可以由GUC参数min_table_block_num_enable_ios和min_uheap_table_block_num_enable_ios调整触发预读的表大小。 + +### 配置步骤 + +- 配置Astore预读 + + ```sql + enable_ios = true // 系统级别,重启数据库生效,默认为false + enable_heap_async_prefetch=true // 会话级别,支持在线配置,默认为false + ``` + +- 配置Ustore预读 + + ```sql + enable_ios = true // 系统级别,重启数据库生效,默认为false + enable_uheap_async_prefetch=true // 会话级别,支持在线配置,默认为false + ``` + +### GUC参数 + +注意:除了enable_ios和ios_worker_num需要重启数据库生效,其他GUC参数都支持在线配置。 + +| 序号 | 参数描述 | +| ---- | ------------------------------------------------------------ | +| 1 | [enable_ios](../../reference-guide/guc-parameters/thread-pool.md#enable_ios):控制是否启动IOS服务。 | +| 2 | [enable_heap_async_prefetch](../../reference-guide/guc-parameters/thread-pool.md#enable_heap_async_prefetch):控制是否对Astore全表扫描类场景启用预读功能。 | +| 3 | [enable_uheap_async_prefetch](../../reference-guide/guc-parameters/thread-pool.md#enable_uheap_async_prefetch):控制是否对Ustore全表扫描类场景启用预读功能。 | +| 4 | [ios_worker_num](../../reference-guide/guc-parameters/thread-pool.md#ios_worker_num):IOS线程池里面的ios_worker个数。 | +| 5 | [parallel_scan_gap](../../reference-guide/guc-parameters/thread-pool.md#parallel_scan_gap):开启并行扫描时(query_dop > 1),每个工作线程单次处理的页面数量。 | +| 6 | [ios_batch_read_size](../../reference-guide/guc-parameters/thread-pool.md#ios_batch_read_size):ios_worker每个批次下发给盘的预读页面个数。 | +| 7 | [max_requests_per_worker](../../reference-guide/guc-parameters/thread-pool.md#max_requests_per_worker):每个ios_worker最大队列深度。 | +| 8 | [min_table_block_num_enable_ios](../../reference-guide/guc-parameters/thread-pool.md#min_table_block_num_enable_ios):触发预读的Astore表大小阈值。只有当表的数据页总数大于等于该阈值时,才有可能触发预读。目前数据页大小为8kB。 | +| 9 | [min_uheap_table_block_num_enable_ios](../../reference-guide/guc-parameters/thread-pool.md#min_table_block_num_enable_ios):触发预读的Ustore表大小阈值。只有当表的数据页总数大于等于该阈值时,才有可能触发预读。目前数据页大小为8kB。 | +| 10 | [prefetch_protect_time](../../reference-guide/guc-parameters/thread-pool.md#prefetch_protect_time):预读buffer最大保护时间。 | +| 11 | [ios_status_update_gap](../../reference-guide/guc-parameters/thread-pool.md#ios_status_update_gap):更新IOS性能状态的时间间隔。 | + +### 运维监控能力 + +1. 用户可以通过执行计划里的shared buff hit命中指标,直观感受开启预读的效果,可以明显看到buffer命中率极高,并且配合打开GUC参数:[track_io_timing](../../reference-guide/guc-parameters/statistics-during-the-database-running/query-and-index-statistics-collector.md#track_io_timing) = on,观测I/O Timings: read ,即IO读时延极低。 + +2. 
相关性能视图:[IOS_STATUS](../../reference-guide/system-catalogs-and-system-views/system-views/IOS_STATUS.md) + + 使用方法:`select * from ios_status();` + + 用于查看最近一段时间负责预读的IO线程池的性能状态,包含IOSCtl派发请求,IO时延/带宽,队列积压等指标。当主查询线程IO时延很高或者缓存命中率低等问题出现的时候,用户或者研发人员可以通过直观查看预读线程池的性能来帮助定位。 + +## 相关页面 + +[In-place Update存储引擎Ustore](../../performance-tuning/system-tuning/configuring-ustore.md) \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/ustore-smp.md b/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/ustore-smp.md new file mode 100644 index 0000000000000000000000000000000000000000..f3d5d0cd038a06cd24d64a61b24cb516d717d6e4 --- /dev/null +++ b/product/zh/docs-mogdb/v5.0/characteristic-description/high-performance/ustore-smp.md @@ -0,0 +1,96 @@ +--- +title: Ustore SMP并行扫描 +summary: Ustore SMP并行扫描 +author: Guo Huan 赵森 刘静怡 +date: 2024-07-02 +--- + +# Ustore SMP并行扫描 + +## 可获得性 + +本特性自MogDB 5.0.8版本开始引入。 + +## 特性简介 + +MogDB的SMP并行技术是一种利用计算机多核CPU架构来实现多线程并行计算,以充分利用CPU资源来提高查询性能的技术。之前的SMP并行技术仅支持Astore存储引擎,MogDB 5.0.8新增了对于Ustore存储引擎的并行能力支持。 + +## 特性描述 + +在复杂查询场景中,单个查询的执行较长,系统并发度低,通过SMP并行执行技术实现算子级的并行,能够有效减少查询执行时间,提升查询性能及资源利用率。 + +本特性中,Ustore 存储引擎的 SMP 并行能力涵盖以下几种场景: + +1. 并行顺序扫描(Parallel Seq Scan) + +2. 并行索引扫描(Parallel Index Scan) + +3. 并行仅索引扫描(Parallel Index Only Scan) + +4. 并行位图扫描 (Parallel Bitmap Scan) + +## 是否开启并行下的性能对比 + +- Seq Scan + + ![1](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore10.png) + + 开启并行后,随着并行度的增加,Seq scan顺序查询的性能随着并行度的增加而提升,其中在Agg场景下表现最佳,并行度16的情况下,查询性能相比串行快了12~13倍。 + +- Index Scan + + ![2](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore11.png) + + 开启并行后,在合适的场景下,Index Scan查询性能随着并行度的增加而提升,其中在Agg场景下表现最佳, 并行度16的情况下,并行比串行快了11-15倍。 + +- Index Only Scan + + ![3](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore12.png) + + 开启并行后,在合适的场景下,Index Only Scan查询性能随着并行度的增加而提升,其中在Agg场景下表现最佳, 并行度16的情况下,并行比串行快了5-13倍;且ustore的Index Only Scan算子在串行与并行场景下均优于astore。 + +- Bitmap Scan + + 位图读取针对随机读,且数据量较大的场景会有较好的表现,并行度16的情况下,BitmapScan算子性能有70%-100%的性能提升,但端到端性能提升不明显,尤其在数据量较小的情况下。 + +## 特性约束 + +- cursor(游标)不支持并行执行。 +- 存储过程和函数内的查询不支持并行执行。 +- 不支持子查询subplan和initplan的并行,以及包含子查询的算子的并行。 +- 查询语句中带有median操作的查询不支持并行执行。 +- 带全局临时表的查询不支持并行执行。 + +## 使用指导 + +### 使用限制 + +想要利用SMP提升查询性能需要考虑以下条件: + +- 资源对SMP性能的影响 + + 系统的CPU、内存、I/O和网络带宽等资源需要充足。SMP架构是一种利用富余资源来换取时间的方案,计划并行之后必定会引起资源消耗的增加,当上述资源成为瓶颈的情况下,SMP无法提升性能,反而可能导致性能的劣化。在出现资源瓶颈的情况下,建议关闭SMP。 + +- 其他因素对SMP性能的影响 + + 当数据中存在严重数据倾斜时,并行效果较差。例如某表join列上某个值的数据量远大于其他值,开启并行后,根据join列的值对该表数据做hash重分布,使得某个并行线程的数据量远多于其他线程,造成长尾问题,导致并行后效果差。 + + SMP特性会增加资源的使用,而在高并发场景下资源剩余较少。所以,如果在高并发场景下,开启SMP并行,尤其需要处理的数据量较小时,会导致各查询之间严重的资源竞争问题。一旦出现了资源竞争的现象,无论是CPU、I/O、内存,都会导致整体性能的下降。因此在高并发场景下,开启SMP往往不能达到性能提升的效果,甚至可能引起性能劣化。 + +### 配置步骤 + +1. 观察当前系统负载情况,如果系统资源充足(资源利用率小于50%),执行步骤2;否则退出。 + +2. 配置:通过`set query_dop = ${thread_num};`开启,默认为1。 + +3. 
执行查询语句后关闭query_dop,如下图所示。 + + ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/characteristic-description/ustore-smp.png) + + 在这个计划中,实现了Seq Scan算子的并行,并新增了Local Gather数据交换算子。其中Local Gather算子标有的“dop: 1/2”表明该算子的发送端线程的并行度为2 ,而接受端线程的并行度为1 ,即下层的Seq Scan算子按照2并行度执行,Streaming算子实现了实例内并行线程的数据汇总。 + + 资源许可的情况下,并行度越高,性能提升效果越好。并不是并行度越高越好,多到一定程度后性能提升可能不明显。 + +## 相关页面 + +[In-place Update存储引擎Ustore](../../performance-tuning/system-tuning/configuring-ustore.md)、[query_dop](../../reference-guide/guc-parameters/query-planning/other-optimizer-options.md#query_dop) \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/developer-guide/logical-replication/logical-decoding/1-logical-decoding.md b/product/zh/docs-mogdb/v5.0/developer-guide/logical-replication/logical-decoding/1-logical-decoding.md index d8d2e0458d3cfa6f5969773f02aad33be1fae9fc..af72d208bddf7f62b3341d829d07ff25273bd0bc 100644 --- a/product/zh/docs-mogdb/v5.0/developer-guide/logical-replication/logical-decoding/1-logical-decoding.md +++ b/product/zh/docs-mogdb/v5.0/developer-guide/logical-replication/logical-decoding/1-logical-decoding.md @@ -29,10 +29,9 @@ MogDB提供了逻辑解码功能,通过反解xlog的方式生成逻辑日志 ## 注意事项 -- 不支持DDL语句解码,在执行特定的DDL语句(例如普通表truncate或分区表exchange)时,可能造成解码数据丢失。 +- MogDB 5.0.8开始[逻辑解码支持部分DDL操作](./logical-decoding-support-for-DDL.md)。 - 不支持列存、数据页复制的解码。 - 不支持级联备机进行逻辑解码。 -- 当执行DDL语句(如alter table)后,该DDL语句前尚未解码的物理日志可能会丢失。 - 单条元组大小不超过1GB,考虑解码结果可能大于插入数据,因此建议单条元组大小不超过500MB。 - MogDB支持解码的数据类型为:INTEGER、BIGINT、SMALLINT、TINYINT、SERIAL、SMALLSERIAL、BIGSERIAL、FLOAT、DOUBLE PRECISION、DATE、TIME[WITHOUT TIME ZONE]、TIMESTAMP[WITHOUT TIME ZONE]、CHAR(n)、VARCHAR(n)、TEXT。 - 如果需要ssl连接需要保证前置设置GUC参数ssl=on。 @@ -51,7 +50,6 @@ MogDB提供了逻辑解码功能,通过反解xlog的方式生成逻辑日志 - 请确保在创建逻辑复制槽过程中长事务未启动,启动长事务会阻塞逻辑复制槽的创建。 - 不支持interval partition表复制。 - 不支持全局临时表。 -- 在事务中执行DDL语句后,该DDL语句与之后的语句不会被解码。 - 如需进行备机解码,需在对应主机上设置guc参数enable_slot_log = on。 - 禁止在使用逻辑复制槽时在其他节点对该复制槽进行操作,删除复制槽进行操作的操作需在该复制槽停止解码后执行。 - 在开启逻辑复制的场景下,如需创建包含系统列的主键索引,必须将该表的REPLICA IDENTITY属性设置为FULL或是使用USING INDEX指定不包含系统列的、唯一的、非局部的、不可延迟的、仅包括标记为NOT NULL的列的索引。 diff --git a/product/zh/docs-mogdb/v5.0/developer-guide/logical-replication/logical-decoding/logical-decoding-support-for-DDL.md b/product/zh/docs-mogdb/v5.0/developer-guide/logical-replication/logical-decoding/logical-decoding-support-for-DDL.md new file mode 100644 index 0000000000000000000000000000000000000000..e4da2f53a91745b0cf733a28868e942542134d89 --- /dev/null +++ b/product/zh/docs-mogdb/v5.0/developer-guide/logical-replication/logical-decoding/logical-decoding-support-for-DDL.md @@ -0,0 +1,186 @@ +--- +title: 逻辑解码支持DDL操作 +summary: 逻辑解码支持DDL操作 +author: 郭欢 何文健 +date: 2024-01-29 +--- + +# 逻辑解码支持DDL操作 + +自MogDB 5.0.8版本开始,逻辑解码功能新增对于DDL操作的支持,减少用户在逻辑复制过程中对表的手动维护,避免由于表结构发生变动导致逻辑复制同步过程出现异常。内核逻辑解码添加序列支持,wal2json、logical_decoding、mppdb_decoding三个解码插件完成序列解码对接。 + +## 功能描述 + +MogDB在逻辑解码过程中支持如下DDL(Data Definition Language,数据库模式定义语言)操作: + +- CREATE/DROP TABLE|TABLE PARTITION +- CREATE/DROP INDEX +- TRUNCATE TABLE +- ALTER TABLE ADD COLUMN [CONSTRAINT] +- ALTER TABLE DROP COLUMN +- ALTER TABLE ALTER COLUMN [TYPE|SET NOT NULL|DROP NOT NULL|SET DEFAULT|DROP DEFAULT] +- ALTER TABLE [DROP|ADD|TRUNCATE] PARTITION +- ALTER TABLE MODIFY COLUMN data_type [ON UPDATE] +- ALTER TABLE MODIFY COLUMN [NOT] NULL +- ALTER TABLE ADD COLUMN [AFTER|FIRST] + +逻辑解码支持DDL操作所需配套插件支持: + +- wal2json +- mppdb_decoding +- test_decoding + +下面的插件中新增对逻辑解码DDL类型日志xl_logical_ddl_message的解析: + +- pg_xlogdump +- mog_xlogdump + +## 注意事项 + +- 只支持行存表的DDL操作。 +- 不支持列存、ustore存储引擎。 +- 不支持临时表。 +- 不支持非日志表。 +- 不支持逻辑订阅。 +- 
当一条语句中存在多个对象,且多个对象属于不同schema,默认按对象出现的顺序,输出其所属的schema。 +- 部分DDL操作语句由于内核实现原因,会产生一些无需关注的DML语句解析结果。 +- GUC参数[wal_level](../../../reference-guide/guc-parameters/write-ahead-log/settings.md#wal_level)需要>=logical,且开启[enable_ddl_logical_record](../../../reference-guide/guc-parameters/ha-replication/sending-server.md#enable_ddl_logical_record)。 +- wal2json 只支持`format-version==1`,不支持`format-version==2`。 + +## 示例 + +1. 逻辑解码功能(以wal2json为plugin) + + - 设置enable_ddl_logical_record=true,设置wal_level=logical。 + + 输入sql语句如下: + + ```sql + DROP TABLE IF EXISTS range_sales ; + SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot', 'wal2json'); + + CREATE TABLE logical_tb2(col1 boolean[],col2 boolean); + drop table logical_tb2; + CREATE TABLE range_sales + ( + product_id INT4 NOT NULL, + customer_id INT4 PRIMARY KEY, + time_id DATE, + channel_id CHAR(1), + type_id INT4, + quantity_sold NUMERIC(3), + amount_sold NUMERIC(10,2) + ) + PARTITION BY RANGE (time_id) + ( + PARTITION time_2008 VALUES LESS THAN ('2009-01-01'), + PARTITION time_2009 VALUES LESS THAN ('2010-01-01'), + PARTITION time_2010 VALUES LESS THAN ('2011-01-01'), + PARTITION time_2011 VALUES LESS THAN ('2012-01-01') + ); + CREATE INDEX range_sales_idx1 ON range_sales(product_id) LOCAL; + CREATE INDEX range_sales_idx2 ON range_sales(time_id) GLOBAL; + + drop INDEX range_sales_idx1 ; + drop INDEX range_sales_idx2 ; + + drop TABLE range_sales; + SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'format-version', '1'); + SELECT 'stop' FROM pg_drop_replication_slot('regression_slot'); + ``` + + - wal2json解码结果: + + ```json + data + ------------------------------------------------------------------------------------------------- + {"change":[ + + DDL message: role: hewenjian; search_path: "$user",public; sz: 54; schemaname: public; + + original DDL query:CREATE TABLE logical_tb2(col1 boolean[],col2 boolean); + + ]} + {"change":[ + + DDL message: role: hewenjian; search_path: "$user",public; sz: 23; schemaname: public; + + original DDL query:drop table logical_tb2; + + ]} + {"change":[ + + DDL message: role: hewenjian; search_path: "$user",public; sz: 502; schemaname: public;+ + original DDL query:CREATE TABLE range_sales + + ( + + product_id INT4 NOT NULL, + + customer_id INT4 PRIMARY KEY, + + time_id DATE, + + channel_id CHAR(1), + + type_id INT4, + + quantity_sold NUMERIC(3), + + amount_sold NUMERIC(10,2) + + ) + + PARTITION BY RANGE (time_id) + + ( + + PARTITION time_2008 VALUES LESS THAN ('2009-01-01'), + + PARTITION time_2009 VALUES LESS THAN ('2010-01-01'), + + PARTITION time_2010 VALUES LESS THAN ('2011-01-01'), + + PARTITION time_2011 VALUES LESS THAN ('2012-01-01') + + ); + + ]} + {"change":[ + + DDL message: role: hewenjian; search_path: "$user",public; sz: 63; schemaname: public; + + original DDL query:CREATE INDEX range_sales_idx1 ON range_sales(product_id) LOCAL; + + ]} + {"change":[ + + DDL message: role: hewenjian; search_path: "$user",public; sz: 61; schemaname: public; + + original DDL query:CREATE INDEX range_sales_idx2 ON range_sales(time_id) GLOBAL; + + ]} + {"change":[ + + DDL message: role: hewenjian; search_path: "$user",public; sz: 30; schemaname: public; + + original DDL query:drop INDEX range_sales_idx1 ; + + ]} + {"change":[ + + DDL message: role: hewenjian; search_path: "$user",public; sz: 30; schemaname: public; + + original DDL query:drop INDEX range_sales_idx2 ; + + ]} + {"change":[ + + DDL message: role: hewenjian; search_path: "$user",public; sz: 23; schemaname: public; + + 
original DDL query:drop TABLE range_sales; + + ]} + (8 rows) + ``` + +2. pg_xlogdump + + 产生的wal对应的wal2json输出: + + ```json + data + -------------------------------------------------------------------------------------------------------------------------------------------- + {"change":[ + + DDL message: role: hewenjian; search_path: public, new_schema1, new_schema2; sz: 53; schemaname: public, new_schema1, new_schema2;+ + original DDL query:TRUNCATE TABLE range_sales,range_sales1,range_sales2; + + ]} + (1 row) + ``` + + 对应pg_xlogdump输出: + + ```json + REDO @ 0/55599A0; LSN 0/5559A80: prev 0/5559918; xid 15966; term 1; len 189; total 223; crc 4229648830; desc: LogicalDDLMessage - prefix "DDL"; role "hewenjian"; search_path "public, new_schema1, new_schema2"; schemaname "public, new_schema1, new_schema2"; payload (53 bytes): 54 52 55 4E 43 41 54 45 20 54 41 42 4C 45 20 72 61 6E 67 65 5F 73 61 6C 65 73 2C 72 61 6E 67 65 5F 73 61 6C 65 73 31 2C 72 61 6E 67 65 5F 73 61 6C 65 73 32 3B + ``` + +3. mog_xlogdump + + 产生的wal对应的wal2json输出: + + ```json + data + -------------------------------------------------------------------------------------------------------------------------------------------- + {"change":[ + + DDL message: role: hewenjian; search_path: public, new_schema1, new_schema2; sz: 53; schemaname: public, new_schema1, new_schema2;+ + original DDL query:TRUNCATE TABLE range_sales,range_sales1,range_sales2; + + ]} + (1 row) + ``` + + 对应mog_xlogdump输出: + + ```json + REDO @ 0/55599A0; LSN 0/5559A80: prev 0/5559918; xid 15966; term 1; len 189; total 223; crc 4229648830; desc: LogicalDDLMessage - prefix "DDL"; role "hewenjian"; search_path "public, new_schema1, new_schema2"; schemaname "public, new_schema1, new_schema2"; payload (53 bytes): 54 52 55 4E 43 41 54 45 20 54 41 42 4C 45 20 72 61 6E 67 65 5F 73 61 6C 65 73 2C 72 61 6E 67 65 5F 73 61 6C 65 73 31 2C 72 61 6E 67 65 5F 73 61 6C 65 73 32 3B + ``` \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/developer-guide/logical-replication/logical-decoding/logical-decoding.md b/product/zh/docs-mogdb/v5.0/developer-guide/logical-replication/logical-decoding/logical-decoding.md index 7866f9c395fd590f9c2b5f63beb7c4272556fb36..3a04cac0135fb1ff7b153a933c242f37b40807a1 100644 --- a/product/zh/docs-mogdb/v5.0/developer-guide/logical-replication/logical-decoding/logical-decoding.md +++ b/product/zh/docs-mogdb/v5.0/developer-guide/logical-replication/logical-decoding/logical-decoding.md @@ -9,3 +9,4 @@ date: 2023-05-19 + **[逻辑解码概述](1-logical-decoding.md)** + **[使用SQL函数接口进行逻辑解码](2-logical-decoding-by-sql-function-interfaces.md)** ++ **[逻辑解码支持DDL操作](logical-decoding-support-for-DDL.md)** diff --git a/product/zh/docs-mogdb/v5.0/performance-tuning/system-tuning/configuring-smp.md b/product/zh/docs-mogdb/v5.0/performance-tuning/system-tuning/configuring-smp.md index c31ef5f095053d75ebf6e0fff7bc141542611467..619a654a854e6d9d363dfc60cfae7d574a4a05bf 100644 --- a/product/zh/docs-mogdb/v5.0/performance-tuning/system-tuning/configuring-smp.md +++ b/product/zh/docs-mogdb/v5.0/performance-tuning/system-tuning/configuring-smp.md @@ -42,15 +42,13 @@ SMP特性通过算子并行来提升性能,同时会占用更多的系统资 ### 非适用场景 -1. 索引扫描不支持并行执行。 -2. MergeJoin不支持并行执行。 -3. WindowAgg order by不支持并行执行。 -4. cursor不支持并行执行。 -5. 存储过程和函数内的查询不支持并行执行。 -6. 不支持子查询subplan和initplan的并行,以及包含子查询的算子的并行。 -7. 查询语句中带有median操作的查询不支持并行执行。 -8. 带全局临时表的查询不支持并行执行。 -9. 物化视图的更新不支持并行执行。 +1. WindowAgg order by不支持并行执行。 +2. cursor不支持并行执行。 +3. 存储过程和函数内的查询不支持并行执行。 +4. 不支持子查询subplan和initplan的并行,以及包含子查询的算子的并行。 +5. 
查询语句中带有median操作的查询不支持并行执行。 +6. 带全局临时表的查询不支持并行执行。 +7. 物化视图的更新不支持并行执行。 ## 资源对SMP性能的影响 diff --git a/product/zh/docs-mogdb/v5.0/performance-tuning/system-tuning/configuring-ustore.md b/product/zh/docs-mogdb/v5.0/performance-tuning/system-tuning/configuring-ustore.md index 60e8d9432d0f50753fd6be4c7ba9f0d461c56af0..edf77961a5065990f11d309b7e07e3d6625a4a39 100644 --- a/product/zh/docs-mogdb/v5.0/performance-tuning/system-tuning/configuring-ustore.md +++ b/product/zh/docs-mogdb/v5.0/performance-tuning/system-tuning/configuring-ustore.md @@ -1,11 +1,11 @@ --- -title: 配置Ustore -summary: 配置Ustore -author: zhang cuiping +title: In-place Update存储引擎 Ustore +summary: In-place Update存储引擎 Ustore +author: zhang cuiping 赵森 date: 2023-04-14 --- -# 配置Ustore +# In-place Update存储引擎Ustore Ustore存储引擎,又名In-place Update存储引擎(原地更新),是MogDB新增的一种存储模式。此前的版本使用的行存储引擎是Append Update(追加更新)模式。追加更新对于业务中的增、删以及HOT(HeapOnly Tuple)Update(即同一页面内更新)有很好的表现,但对于跨数据页面的非HOT UPDATE场景,垃圾回收不够高效。因此,Ustore存储引擎应运而生。 @@ -13,7 +13,8 @@ Ustore存储引擎,又名In-place Update存储引擎(原地更新),是Mo Ustore存储引擎将最新版本的“有效数据”和历史版本的“垃圾数据”分离存储。将最新版本的“有效数据”存储在数据页面上,并单独开辟一段UNDO空间,用于统一管理历史版本的“垃圾数据”,因此数据空间不会由于频繁更新而膨胀,“垃圾数据”集中回收效率更高。 -Ustore存储引擎采用NUMA-aware的UNDO子系统设计,使得UNDO子系统可以在多核平台上有效扩展;同时采用多版本索引技术,解决索引清理问题,有效提升了存储空间的回收复用效率。 +Ustore存储引擎采用NUMA-aware的UNDO子系统设计,使得UNDO子系统可以在多核平台上有效扩展;采用多版本索引技术,在Index Only Scan场景下的查询性能相比Astore得到了显著提升(2~5倍),另外还解决了索引清理问题,有效提升了存 +储空间的回收复用效率。 Ustore存储引擎结合UNDO空间,可以实现更高效、更全面的闪回查询和回收站机制,能快速回退人为“误操作”,为MogDB提供了更丰富的企业级功能。 @@ -31,14 +32,38 @@ Ustore的核心优势场景为频繁更新场景,相较于Astore空间利用 - 非主键的范围更新 + ![img1](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore1.png) + + ***图 1** 不同并发数下非主键范围更新,Astore 和 Ustore 的tps对比* + + ![img2](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore2.png) + + ***图 2** 30并发数下非主键范围更新,Astore 和 Ustore 的数据库大小对比* + 在非主键的范围更新下,Ustore展示出相较于Astore更大的优势,不同并发数下的平均性能提升在**40%**以上,且随着不断地更新操作,Astore出现了表空间膨胀,而Ustore的空间趋于平稳。 - 点查询+点更新(index scan + 非主键更新) + ![img3](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore3.png) + + ***图 3** 100并发数下点查询+点更新,2:8读写混合测试tps* + 在2:8读写混合情景下,100并发下Ustore低于Astore 5.36%,200并发下Ustore高出Astore **12.37%**;其余情况下二者有好有坏,相差不超过5%。总体来说, Ustore在高并发、点更新更多的情况下表现更好。 - 点查询+范围更新(index scan + 非主键更新) + ![img4](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore4.png) + + ***图 4** 不同并发数下点查询+范围更新,1:1读写混合测试tps* + + ![img5](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore5.png) + + ***图 5** 不同并发数下点查询+范围更新,2:8读写混合测试tps* + + ![img6](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore6.png) + + ***图 6** 不同并发数下点查询+范围更新,8:2读写混合测试tps* + 所有读写混合场景下,Ustore引擎都优于Astore引擎,尤其是在200并发、范围更新更多的情况下Ustore表现更好,具体如下: - 1:1读写混合情景下,100并发下Ustore高出37.8%,200并发下高出103.5%,300并发下高出78.4%; @@ -47,12 +72,90 @@ Ustore的核心优势场景为频繁更新场景,相较于Astore空间利用 - 范围查询+范围更新(index scan + 非主键更新) + ![img7](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore7.png) + + ***图 7** 不同并发数下范围更新+范围查询,1:1读写混合测试tps* + + ![img8](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore8.png) + + ***图 8** 不同并发数下范围更新+范围查询,2:8读写混合测试tps* + + ![img9](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore9.png) + + ***图 9** 不同并发数下范围更新+范围查询,8:2读写混合测试tps* + 所有读写混合场景下,Ustore引擎都优于Astore引擎,其中在200并发、范围更新更多的情况下Ustore表现更好。具体如下: - - - 1:1读写混合情景下,100并发下Ustore高出42.6%,200并发下高出106.1%,300并发下高出94.2%; - - 2:8读写混合情景下,100并发下Ustore高出48.4%,200并发下高出100.2%,300并发下高出95.9%; + + - 1:1读写混合情景下,100并发下Ustore高出42.6%,200并发下高出106.1%,300并发下高出94.2%; + - 
2:8读写混合情景下,100并发下Ustore高出48.4%,200并发下高出100.2%,300并发下高出95.9%; - 8:2读写混合情景下,100并发下Ustore高出31.3%,200并发下高出69.7%,300并发下高出69.1%。 +## 特性增强 + +### SMP并行查询能力 + +Ustore存储引擎的SMP并行能力涵盖以下几种场景: + +- 并行顺序扫描(Parallel Seq Scan) + + ![img10](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore10.png) + + ***图10** Ustore seq scan并行性能* + + 开启并行后,随着并行度的增加,Seq scan顺序查询的性能随着并行度的增加而提升,其中在Agg场景下表现最佳,并行度16的情况下,查询性能相比串行快了12~13倍。 + +- 并行索引扫描(Parallel Index Scan) + + ![img11](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore11.png) + + ***图11** Ustore index scan并行性能* + + 开启并行后,在合适的场景下,Index Scan查询性能随着并行度的增加而提升,其中在Agg场景下表现最佳, 并行度16的情况下,并行比串行快了11-15倍。 + +- 并行仅索引扫描(Parallel Index Only Scan) + + ![img12](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore12.png) + + ***图12** Ustore index only scan并行性能* + + 开启并行后,在合适的场景下,Index Only Scan查询性能随着并行度的增加而提升,其中在Agg场景下表现最佳,并行度16的情况下,并行比串行快了5-13倍;且Ustore的Index Only Scan算子在串行与并行场景下均优于Astore。 + +- 并行位图扫描 (Parallel Bitmap Scan) + + 位图读取针对随机读,且数据量较大的场景会有较好的表现,并行度16的情况下,BitmapScan算子性能有70%-100%的性能提升,但端到端性能提升不明显,尤其在数据量较小的情况下。 + +### 顺序扫描预读 + +顺序扫描预读机制将顺序扫描的CPU处理过程与I/O操作并行化,尽量避免CPU因为等待I/O而阻塞。不同并行度下针对Ustore表的预读性能结果如下: + +- 主节点tpch测试 + + dop=1:总体算子提升为41%,端到端的提升为19% + + ![img13](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore13.png) + + dop=4:总体算子提升为43%,端到端的提升为21% + + ![img14](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore14.png) + + dop=8:总体算子提升为45%,端到端的提升为23% + + ![img15](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore15.png) + + dop=16:总体算子提升为37%,端到端的提升为13% + + ![img16](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore16.png) + +- 混合负载(tpcc+tpch)的情况下性能提升的结果和对TPMC的影响 + + dop=1 : 总体算子提升为32%,端到端的提升为19%,tpmc效果提升3%,tpmc不受预读影响 + + ![img17](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore17.png) + + dop=4 : 总体算子提升为38%,端到端的提升为22%,tpmc效果提升2%,tpmc不受预读影响 + + ![img18](https://cdn-mogdb.enmotech.com/docs-media/mogdb/performance-tuning/ustore18.png) + ## 特性规格 Ustore存储引擎当前已支持的特性如下所示,其他特性支持情况概览见表1。 @@ -65,9 +168,13 @@ Ustore存储引擎当前已支持的特性如下所示,其他特性支持情 4. 自适应空间管理 -5. Bitmap heap scan +5. SMP并行查询 + +6. 顺序扫描预读 + +7. Bitmap heap scan -6. Bitmap index scan +8. 
Bitmap index scan **表1** Ustore详细特性规格 @@ -98,7 +205,8 @@ Ustore存储引擎当前已支持的特性如下所示,其他特性支持情 | 闪回(闪回查询、闪回drop、闪回truncate) | 不支持 | | bloom filter(布隆过滤器) | 支持 | | 分区表 | 支持 | -| 并行查询(SMP) | 不支持 | +| 并行查询(SMP) | 支持 | +| 顺序扫描预读 | 支持 | | 行级压缩 | 不支持 | | PageInspect | 不支持 | | Wal2json | 不支持 | @@ -191,4 +299,8 @@ USTORE与原有的ASTORE(Append Update)存储引擎并存。USTORE存储引擎 "ubt_idx" ubtree (age) WITH (storage_type=USTORE) TBALESPACE pg_default Has OIDs: no Options: orientation=row, storage_type=ustore, compression=no - ``` \ No newline at end of file + ``` + +## 相关特性 + +[Ustore SMP并行扫描](../../characteristic-description/high-performance/ustore-smp.md)、[顺序扫描预读](../../characteristic-description/high-performance/seqscan-prefetch.md) \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/performance-tuning/system-tuning/system-tuning.md b/product/zh/docs-mogdb/v5.0/performance-tuning/system-tuning/system-tuning.md index 22c84ed72ecb6d934694f6ea4d9f2dae9fe46d63..a729d1153f23508b672a36df21e649e958c50197 100644 --- a/product/zh/docs-mogdb/v5.0/performance-tuning/system-tuning/system-tuning.md +++ b/product/zh/docs-mogdb/v5.0/performance-tuning/system-tuning/system-tuning.md @@ -11,5 +11,5 @@ date: 2023-04-23 - **[配置向量化执行引擎](configuring-vector-engine.md)** - **[配置SMP](configuring-smp.md)** - **[配置LLVM](configuring-llvm.md)** -- **[配置Ustore](configuring-ustore.md)** +- **[In-place Update存储引擎Ustore](configuring-ustore.md)** - **[资源负载管理](resource-load-management/resource-load-management.md)** \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/functions-and-operators/system-management-functions/other-functions.md b/product/zh/docs-mogdb/v5.0/reference-guide/functions-and-operators/system-management-functions/other-functions.md index f4d63df29541f1fe498b8e86b7106d1c89babc9f..904bb9029ca5bb6bfdd8efdb80f2ff8fd222fc79 100644 --- a/product/zh/docs-mogdb/v5.0/reference-guide/functions-and-operators/system-management-functions/other-functions.md +++ b/product/zh/docs-mogdb/v5.0/reference-guide/functions-and-operators/system-management-functions/other-functions.md @@ -761,4 +761,31 @@ date: 2021-04-20 Slots keep | 00000001000000000000000B | base on physical or logical slots Archive Keep | 00000001000000000000000B | base on wal archive (4 rows) - ``` \ No newline at end of file + ``` + +- local_pagewriter_flush_detail() + + 描述:展示刷脏流程的详细信息,包括刷脏相关的GUC参数、刷脏流程中的变量信息等,在系统刷脏慢时,调用此函数可分析问题瓶颈所在。(MogDB 5.0.8引入) + + 权限:任何用户均可调用。 + + 返回值: + + | 列名 | 描述 | + | ---------------------------- | ------------------------------------------------------------ | + | node_name | 节点名称 | + | pagewriter_sleep(ms) | 刷脏单个周期 | + | max_io_capacity(M) | 最大 io 能力 | + | dirty_page_percent_max | 脏页最大占比 | + | candidate_buf_percent_target | 候选 buffer 目标值占比 | + | max_redo_log_size(M) | 最大日志回放量 | + | main_pagewriter_detail | main_pagewriter 详细信息:开始时间、等待耗时、刷脏耗时 | + | sub_pagewriter_detail | id:sub_pagewriter 编号;wait_cost:上个刷脏周期等待耗时;flush_cost:上个刷脏周期实际刷脏耗时 | + | theoritical_max_io | 理论最大值=(「扫描 buffer 到候选队列」刷脏理论最大值 + 从脏页队列刷脏理论最大值) | + | lsn_percent | lsn 占比 | + | actual_max_io | 实际最大值=(「扫描 buffer 到候选队列」刷脏实际最大值 + 从脏页队列刷脏实际最大值) | + | actual_flush_num | 实际刷脏值=(「扫描 buffer 到候选队列」刷脏实际值 + 从脏页队列刷脏实际值) | + | remain_actual_dirty_page_num | 剩余实际脏页数量 | + | list_flush_detail | 扫描 buffer 到候选队列部分明细:当前候选 buffer 数、总 buffer 数 | + | queue_flush_detail | 从脏页队列刷脏部分明细:dirty_percent | + | forecast | 预测:当前速度、当前执行 checkpoint 预计耗时 | \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/auditing/audit-switch.md 
b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/auditing/audit-switch.md index c8797dd54490686f1c28db37142b1b60545a2bf5..d17ceb4dbefab935a7d1c15b6ad535770d1ad0a4 100644 --- a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/auditing/audit-switch.md +++ b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/auditing/audit-switch.md @@ -30,6 +30,8 @@ date: 2021-04-20 **默认值**:pg_audit。如果使用om工具部署MogDB,则审计日志路径为“$GAUSSLOG/pg_audit/实例名称”。 +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **须知:** > > - 不同的DN实例需要设置不同的审计文件存储目录,否则会导致审计日志异常。 diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/connection-and-authentication/connection-settings.md b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/connection-and-authentication/connection-settings.md index a14cb86384d0b0d02f16f88d6d9fb6970da742c2..75e961722ceb0814988948bdb2ba2f5b1e4b270e 100644 --- a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/connection-and-authentication/connection-settings.md +++ b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/connection-and-authentication/connection-settings.md @@ -36,7 +36,7 @@ date: 2021-04-20 - 星号`*`或`0.0.0.0`表示侦听所有IP地址。配置侦听所有IP地址存在安全风险,不推荐用户使用。必须与有效地址结合使用(比如本地IP等),否则,可能造成Build失败的问题。同时,主备环境下配置为`*`或`0.0.0.0`时,主节点数据库路径下postgresql.conf文件中的localport端口号不能为数据库dataPortBase+1,否则会导致数据库无法启动。 - 若存在非法IP时,进程启动阶段会报错退出。 -**默认值**: 空字符串(实际值由安装时配置文件指定) +**默认值**: 空字符串(实际值由安装时配置文件指定,PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc))。 > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: > @@ -54,7 +54,7 @@ date: 2021-04-20 该参数属于POSTMASTER类型参数,请参考表[GUC参数分类](../../../reference-guide/guc-parameters/appendix.md)中对应设置方法进行设置。 -**默认值**: 空字符串(实际值由安装时配置文件指定) +**默认值**: 空字符串(实际值由安装时配置文件指定,PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc))。 ## port @@ -62,7 +62,7 @@ date: 2021-04-20 该参数属于POSTMASTER类型参数,请参考表[GUC参数分类](../../../reference-guide/guc-parameters/appendix.md)中对应设置方法进行设置。 -> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 该参数由安装时的配置文件指定,请勿轻易修改,否则修改后会影响数据库正常通信。 +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 该参数由安装时的配置文件指定,请勿轻易修改,否则修改后会影响数据库正常通信。PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 > **取值范围**: 整型,1~65535 @@ -91,7 +91,7 @@ date: 2021-04-20 **设置建议**: -数据库主节点中此参数建议保持默认值。 +数据库主节点中此参数建议保持默认值。PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 **配置不当时影响**: @@ -139,7 +139,7 @@ date: 2021-04-20 **取值范围**: 字符串 -**默认值**: 空字符串(实际值由安装时配置文件指定) +**默认值**: 空字符串(实际值由安装时配置文件指定,PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc))。 ## unix_socket_group @@ -149,7 +149,7 @@ date: 2021-04-20 **取值范围**: 字符串,其中空字符串表示当前用户的缺省组。 -**默认值**: 空字符串 +**默认值**: 空字符串。PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 ## unix_socket_permissions @@ -199,7 +199,7 @@ Unix域套接字使用普通的Unix文件系统权限集。这个参数的值应 **取值范围**: 字符串。 -**默认值**: 空字符串(连接到后端的应用名,以实际安装为准) +**默认值**: 空字符串(连接到后端的应用名,以实际安装为准)。PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 ## connection_info diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/connection-and-authentication/security-and-authentication.md 
b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/connection-and-authentication/security-and-authentication.md index b787a5e235c1159db2f35ed945bf2e6651da744a..fec61ce531b2b06197ae494839c932b58fedcb44 100644 --- a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/connection-and-authentication/security-and-authentication.md +++ b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/connection-and-authentication/security-and-authentication.md @@ -42,6 +42,8 @@ date: 2021-04-20 **默认值**: 10min +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + ## idle_in_transaction_session_timeout **参数说明**:表明与服务器建立链接后,如果当前会话处于事务中,不进行任何操作的最长时间。 @@ -70,7 +72,7 @@ date: 2021-04-20 > > 开启此参数需要同时配置ssl_cert_file、ssl_key_file和ssl_ca_file等参数及对应文件,不正确的配置可能会导致MogDB无法正常启动。 -**默认值**: on +**默认值**: on。PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 ## require_ssl @@ -118,6 +120,8 @@ date: 2021-04-20 **默认值**: server.crt +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + ## ssl_cert_notify_time **参数说明**:SSL服务器证书到期前提醒的天数。 @@ -138,6 +142,8 @@ date: 2021-04-20 **默认值**: server.key +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + ## ssl_ca_file **参数说明**: 指定包含CA信息的文件的名称。相对路径是相对于数据目录的。 @@ -148,6 +154,8 @@ date: 2021-04-20 **默认值**: cacert.pem +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + ## ssl_crl_file **参数说明**: 证书吊销列表,如果客户端证书在该列表中,则当前客户端证书被视为无效证书。必须使用相对路径,相对路径是相对于数据目录的。 @@ -206,6 +214,8 @@ date: 2021-04-20 **默认值**: off +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + ## password_policy **参数说明**: 在使用CREATE ROLE/USER或者ALTER ROLE/USER命令创建或者修改MogDB帐户时,该参数决定是否进行密码复杂度检查。关于密码复杂度检查策略请参见[设置密码安全策略](../../../security-guide/security/2-managing-users-and-their-permissions.md#设置密码安全策略)。 @@ -243,6 +253,8 @@ date: 2021-04-20 **默认值**: 60 +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + ## password_reuse_max **参数说明**: 在使用ALTER USER或者ALTER ROLE修改用户密码时,该参数指定是否对新密码进行可重用次数检查。关于密码可重用策略请参见[设置密码安全策略](../../../security-guide/security/2-managing-users-and-their-permissions.md#设置密码安全策略)。 @@ -280,6 +292,8 @@ date: 2021-04-20 **默认值**: 1d +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + ## failed_login_attempts **参数说明**: 在任意时在任意时候,如果输入密码错误的次数达到failed_login_attempts参数设定的值,则当前帐户会被锁定。password_lock_time参数设定的天数过后,帐户自动解锁。例如,登录时输入密码失败,ALTER USER时修改密码失败等。关于帐户自动锁定策略请参见[设置密码安全策略](../../../security-guide/security/2-managing-users-and-their-permissions.md#设置密码安全策略)。 @@ -314,6 +328,8 @@ date: 2021-04-20 **默认值**: 2 +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + ## password_min_length **参数说明**: 该字段决定帐户密码的最小长度。 @@ -399,6 +415,8 @@ date: 2021-04-20 **默认值**: 90 +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + ## password_notify_time **参数说明**: 该字段决定帐户密码到期前提醒的天数。 diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/developer-options.md b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/developer-options.md index 94c376bd44864631d0e83d7edb7d70164eaa2874..966b81e3b790d2043be6024489a7cd266d203388 100644 --- a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/developer-options.md +++ 
b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/developer-options.md @@ -283,6 +283,8 @@ date: 2021-04-20 **默认值**: 0 +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + ## enable_beta_opfusion **参数说明**: 在enable_opfusion参数打开的状态下,如果开启该参数,可以支持TPCC中出现的聚集函数,排序两类SQL语句的加速执行,提升SQL执行性能。 diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/error-reporting-and-logging/logging-destination.md b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/error-reporting-and-logging/logging-destination.md index f66e7ae07cd8ea553c6ec2f65f921c61fd1c559b..e9106a07ba89fbcd161ecb09683be784147cc9f4 100644 --- a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/error-reporting-and-logging/logging-destination.md +++ b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/error-reporting-and-logging/logging-destination.md @@ -67,6 +67,8 @@ date: 2021-04-20 **默认值**: 安装时指定。 +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + ## log_filename **参数说明**: logging_collector设置为on时,log_filename决定服务器运行日志文件的名称。通常日志文件名是按照strftime模式生成,因此可以用系统时间定义日志文件名,用%转义字符实现。 @@ -93,7 +95,7 @@ date: 2021-04-20 > - 使用此选项前请设置log_directory,将日志存储到数据目录之外的地方。 > - 因日志文件可能含有敏感数据,故不能将其设为对外可读。 -**取值范围**: 整型,0000~0777(8进制计数,转化为十进制 0 ~ 511)。 +**取值范围**: 整型,0000~0777(8进制计数,转化为十进制 0 ~ 511)。PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: > diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/fault-tolerance.md b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/fault-tolerance.md index 58658ae005ef1e1a4e20e4e68c201717718d78b3..b2af47575e7a91181b06f77db4e137b58223cbd4 100644 --- a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/fault-tolerance.md +++ b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/fault-tolerance.md @@ -108,6 +108,8 @@ date: 2021-04-20 **默认值**: authentication +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + ## data_sync_failed_ignore **参数说明**:控制pagewriter执行fsync失败后,是否丢弃待sync的项。 diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/guc-parameter-list.md b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/guc-parameter-list.md index d4597e2dbd9cbd3ba662537e2522adcc217676f5..7f5467f6b243309045ca10dfd074fcd61b779e36 100644 --- a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/guc-parameter-list.md +++ b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/guc-parameter-list.md @@ -94,7 +94,7 @@ date: 2023-04-07 | [bbox_dump_count](load-management.md#bbox_dump_count) | | | [bbox_dump_path](load-management.md#bbox_dump_path) | | | [behavior_compat_options](version-and-platform-compatibility/platform-and-client-compatibility.md#behavior_compat_options) | | -| [best_agg_plan](./query-planning/other-optimizer-options.md#best_agg_plan) | | +| [best_agg_plan](./query-planning/other-optimizer-options.md#best_agg_plan) | | | [bgwriter_delay](resource-consumption/background-writer.md#bgwriter_delay) | | | [bgwriter_flush_after](resource-consumption/asynchronous-io-operations.md#bgwriter_flush_after) | | | [bgwriter_lru_maxpages](resource-consumption/background-writer.md#bgwriter_lru_maxpages) | | @@ -115,12 +115,13 @@ date: 2023-04-07 | [checkpoint_completion_target](write-ahead-log/checkpoints.md#checkpoint_completion_target) | | | 
[checkpoint_flush_after](resource-consumption/asynchronous-io-operations.md#checkpoint_flush_after) | | | [checkpoint_segments](write-ahead-log/checkpoints.md#checkpoint_segments) | | +| [checkpoint_target_time](./resource-consumption/background-writer.md#checkpoint_target_time) | 5.0.8 - [极致刷脏](../../characteristic-description/high-performance/enhancement-of-dirty-pages-flushing-performance.md) | | [checkpoint_timeout](write-ahead-log/checkpoints.md#checkpoint_timeout) | | | [checkpoint_wait_timeout](write-ahead-log/checkpoints.md#checkpoint_wait_timeout) | | | [checkpoint_warning](write-ahead-log/checkpoints.md#checkpoint_warning) | | | [client_encoding](default-settings-of-client-connection/zone-and-formatting.md#client_encoding) | | | [client_min_messages](error-reporting-and-logging/logging-time.md#client_min_messages) | | -| [cluster_run_mode](./miscellaneous-parameters.md#cluster_run_mode) | | +| [cluster_run_mode](./miscellaneous-parameters.md#cluster_run_mode) | | | [cn_send_buffer_size](fault-tolerance.md#cn_send_buffer_size) | | | [codegen_cost_threshold](query-planning/other-optimizer-options.md#codegen_cost_threshold) | | | [codegen_strategy](query-planning/other-optimizer-options.md#codegen_strategy) | | @@ -225,7 +226,7 @@ date: 2023-04-07 | [elastic_search_ip_addr](security-configuration.md#elastic_search_ip_addr) | | | [emit_illegal_bind_chars](miscellaneous-parameters.md#emit_illegal_bind_chars) | 5.0.2 - [写入不合法字符报错](../../characteristic-description/maintainability/error-when-writing-illegal-characters.md) | | [enable_absolute_tablespace](query-planning/optimizer-method-configuration.md#enable_absolute_tablespace) | | -| [enable_accept_empty_str](./query-planning/other-optimizer-options.md#enable_accept_empty_str) | | +| [enable_accept_empty_str](./query-planning/other-optimizer-options.md#enable_accept_empty_str) | | | [enable_access_server_directory](auditing/operation-audit.md#enable_access_server_directory) | | | [enable_adaptive_hashagg](./query-planning/optimizer-method-configuration.md#enable_adaptive_hashagg) | | | [enable_adio_debug](resource-consumption/asynchronous-io-operations.md#enable_adio_debug) | | @@ -237,7 +238,7 @@ date: 2023-04-07 | [enable_auto_clean_unique_sql](query.md#enable_auto_clean_unique_sql) | | | [enable_auto_explain](query-planning/other-optimizer-options.md#enable_auto_explain) | | | [enable_availablezone](./ha-replication/sending-server.md#enable_availablezone) | | -| [enable_backend_compress](./backend-compression.md#enable_backend_compress) | | +| [enable_backend_compress](./backend-compression.md#enable_backend_compress) | | | [enable_batch_dispatch](write-ahead-log/log-replay.md#enable_batch_dispatch) | | | [enable_bbox_dump](load-management.md#enable_bbox_dump) | | | [enable_beta_features](version-and-platform-compatibility/compatibility-with-earlier-versions.md#enable_beta_features) | | @@ -252,7 +253,7 @@ date: 2023-04-07 | [enable_codegen_print](query-planning/other-optimizer-options.md#enable_codegen_print) | | | [enable_compress_hll](./HyperLogLog.md#enable_compress_hll) | | | [enable_compress_spill](developer-options.md#enable_compress_spill) | | -| [enable_compression_check](./backend-compression.md#enable_compression_check) | | +| [enable_compression_check](./backend-compression.md#enable_compression_check) | | | [enable_consider_usecount](resource-consumption/background-writer.md#enable_consider_usecount) | | | [enable_constraint_optimization](reserved-parameters.md) | | | 
[enable_copy_server_files](./data-import-export.md#enable_copy_server_files) | | @@ -263,15 +264,16 @@ date: 2023-04-07 | [enable_dcf](./DCF-parameters-settings.md#enable_dcf) | | | [enable_debug_vacuum](error-reporting-and-logging/logging-content.md#enable_debug_vacuum) | | | [enable_default_cfunc_libpath](file-location.md#enable_default_cfunc_libpath) | | -| [enable_default_compression_table](./backend-compression.md#enable_default_compression_table) | | -| [enable_default_index_compression](./backend-compression.md#enable_default_index_compression) | | +| [enable_default_compression_table](./backend-compression.md#enable_default_compression_table) | | +| [enable_default_index_compression](./backend-compression.md#enable_default_index_compression) | | | [enable_default_ustore_table](miscellaneous-parameters.md#enable_default_ustore_table) | | | [enable_defer_calculate_snapshot](MogDB-transaction.md#enable_defer_calculate_snapshot) | | | [enable_delta_store](./data-import-export.md#enable_delta_store) | | +| [enable_ddl_logical_record](./ha-replication/sending-server.md#enable_ddl_logical_record) | 5.0.8 - [逻辑解码支持DDL操作](../../developer-guide/logical-replication/logical-decoding/logical-decoding-support-for-DDL.md) | | [enable_dolphin_proto](./connection-and-authentication/connection-settings.md#enable_dolphin_proto) | | | [enable_double_write](write-ahead-log/checkpoints.md#enable_double_write) | | | [enable_early_free](resource-consumption/memory.md#enable_early_free) | | -| [enable_event_trigger_a_mode](./miscellaneous-parameters.md#enable_event_trigger_a_mode) | | +| [enable_event_trigger_a_mode](./miscellaneous-parameters.md#enable_event_trigger_a_mode) | | | [enable_expr_fusion](./query-planning/optimizer-method-configuration.md#enable_expr_fusion) | | | [enable_extrapolation_stats](query-planning/other-optimizer-options.md#enable_extrapolation_stats) | | | [enable_fast_allocate](resource-consumption/asynchronous-io-operations.md#enable_fast_allocate) | | @@ -287,6 +289,7 @@ date: 2023-04-07 | [enable_hashagg](query-planning/optimizer-method-configuration.md#enable_hashagg) | | | [enable_hashjoin](query-planning/optimizer-method-configuration.md#enable_hashjoin) | | | [enable_hdfs_predicate_pushdown](reserved-parameters.md) | | +| [enable_heap_async_prefetch](thread-pool.md#enable_heap_async_prefetch) | 5.0.8 - [顺序扫描预读](../../characteristic-description/high-performance/seqscan-prefetch.md) | | [enable_hypo_index](query-planning/other-optimizer-options.md#enable_hypo_index) | | | [enable_incremental_catchup](ha-replication/primary-server.md#enable_incremental_catchup) | | | [enable_incremental_checkpoint](write-ahead-log/checkpoints.md#enable_incremental_checkpoint) | | @@ -300,6 +303,7 @@ date: 2023-04-07 | [enable_instr_cpu_timer](query.md#enable_instr_cpu_timer) | | | [enable_instr_rt_percentile](query.md#enable_instr_rt_percentile) | | | [enable_instr_track_wait](wait-events.md#enable_instr_track_wait) | | +| [enable_ios](thread-pool.md#enable_ios) | 5.0.8 - [顺序扫描预读](../../characteristic-description/high-performance/seqscan-prefetch.md) | | [enable_kill_query](query-planning/optimizer-method-configuration.md#enable_kill_query) | | | [enable_logical_io_statistics](load-management.md#enable_logical_io_statistics) | | | [enable_material](query-planning/optimizer-method-configuration.md#enable_material) | | @@ -313,7 +317,7 @@ date: 2023-04-07 | [enable_online_ddl_waitlock](lock-management.md#enable_online_ddl_waitlock) | | | 
[enable_opfusion](query-planning/other-optimizer-options.md#enable_opfusion) | | | [enable_orc_cache](reserved-parameters.md) | | -| [enable_page_compression](./backend-compression.md#enable_page_compression) | | +| [enable_page_compression](./backend-compression.md#enable_page_compression) | | | [enable_page_lsn_check](write-ahead-log/log-replay.md#enable_page_lsn_check) | | | [enable_partition_opfusion](query-planning/other-optimizer-options.md#enable_partition_opfusion) | | | [enable_partitionwise](query-planning/other-optimizer-options.md#enable_partitionwise) | | @@ -337,22 +341,23 @@ date: 2023-04-07 | [enable_sonic_hashjoin](query-planning/other-optimizer-options.md#enable_sonic_hashjoin) | | | [enable_sonic_optspill](query-planning/other-optimizer-options.md#enable_sonic_optspill) | | | [enable_sort](query-planning/optimizer-method-configuration.md#enable_sort) | | -| [enable_sse42](./query-planning/other-optimizer-options.md#enable_sse42) | | +| [enable_sse42](./query-planning/other-optimizer-options.md#enable_sse42) | | | [enable_startwith_debug](query-planning/other-optimizer-options.md#enable_startwith_debug) | | | [enable_stmt_track](query.md#enable_stmt_track) | | | [enable_stream_replication](ha-replication/primary-server.md#enable_stream_replication) | | | [enable_tde](security-configuration.md#enable_tde) | | | [enable_thread_pool](thread-pool.md#enable_thread_pool) | | -| [enable_tidrangescan](./miscellaneous-parameters.md#enable_tidrangescan) | | +| [enable_tidrangescan](./miscellaneous-parameters.md#enable_tidrangescan) | | | [enable_tidscan](query-planning/optimizer-method-configuration.md#enable_tidscan) | | | [enable_time_report](write-ahead-log/log-replay.md#enable_time_report) | | +| [enable_uheap_async_prefetch](thread-pool.md#enable_uheap_async_prefetch) | 5.0.8 - [顺序扫描预读](../../characteristic-description/high-performance/seqscan-prefetch.md) | | [enable_upgrade_merge_lock_mode](miscellaneous-parameters.md#enable_upgrade_merge_lock_mode) | | | [enable_user_metric_persisten](load-management.md#enable_user_metric_persistent) | | | [enable_ustore](miscellaneous-parameters.md#enable_ustore) | | | [enable_valuepartition_pruning](query-planning/optimizer-method-configuration.md#enable_valuepartition_prunin) | | | [enable_vector_engine](query-planning/optimizer-method-configuration.md#enable_vector_engine) | | | [enable_wal_shipping_compression](ha-replication/sending-server.md#enable_wal_shipping_compression) | | -| [enable_walrcv_reply_dueto_commit](./write-ahead-log/log-replay.md#enable_walrcv_reply_dueto_commit) | | +| [enable_walrcv_reply_dueto_commit](./write-ahead-log/log-replay.md#enable_walrcv_reply_dueto_commit) | | | [enable_wdr_snapshot](system-performance-snapshot.md#enable_wdr_snapshot) | | | [enable_xlog_prune](write-ahead-log/checkpoints.md#enable_xlog_prune) | | | [enableSeparationOfDuty](auditing/operation-audit.md#enableseparationofduty) | | @@ -364,13 +369,14 @@ date: 2023-04-07 | [explain_perf_mode](query-planning/other-optimizer-options.md#explain_perf_mode) | | | [external_pid_file](file-location.md#external_pid_file) | | | [extra_float_digits](default-settings-of-client-connection/zone-and-formatting.md#extra_float_digits) | | +| [extreme_flush_dirty_page](./resource-consumption/background-writer.md#extreme_flush_dirty_page) | 5.0.8 - [极致刷脏](../../characteristic-description/high-performance/enhancement-of-dirty-pages-flushing-performance.md) | | 
[failed_login_attempts](connection-and-authentication/security-and-authentication.md#failed_login_attempts) | | | [fast_extend_file_size](resource-consumption/asynchronous-io-operations.md#fast_extend_file_size) | | | [fault_mon_timeout](lock-management.md#fault_mon_timeout) | | | [FencedUDFMemoryLimit](guc-user-defined-functions.md#fencedudfmemorylimit) | | | [force_bitmapand](query-planning/optimizer-method-configuration.md#force_bitmapand) | | | [force_promote](write-ahead-log/settings.md#force_promote) | | -| [force_tidrangescan](./miscellaneous-parameters.md#force_tidrangescan) | | +| [force_tidrangescan](./miscellaneous-parameters.md#force_tidrangescan) | | | [from_collapse_limit](query-planning/other-optimizer-options.md#from_collapse_limit) | | | [fsync](write-ahead-log/settings.md#fsync) | | | [full_audit_users](./auditing/user-and-permission-audit.md#full_audit_users) | | @@ -394,8 +400,8 @@ date: 2023-04-07 | [hadr_recovery_time_target](./ha-replication/primary-server.md#hadr_recovery_time_target) | | | [hadr_super_user_record_path](./ha-replication/primary-server.md#hadr_super_user_record_path) | | | [handle_toast_in_autovac](./automatic-vacuuming.md#handle_toast_in_autovac) | | -| [hash_agg_total_cost_ratio](./query-planning/optimizer-cost-constants.md#hash_agg_total_cost_ratio) | | -| [hash_join_total_cost_ratio](./query-planning/optimizer-cost-constants.md#hash_join_total_cost_ratio) | | +| [hash_agg_total_cost_ratio](./query-planning/optimizer-cost-constants.md#hash_agg_total_cost_ratio) | | +| [hash_join_total_cost_ratio](./query-planning/optimizer-cost-constants.md#hash_join_total_cost_ratio) | | | [hashagg_table_size](query-planning/other-optimizer-options.md#hashagg_table_size) | | | [hba_file](file-location.md#hba_file) | | | [hll_default_expthresh](./HyperLogLog.md#hll_default_expthresh) | | @@ -410,7 +416,7 @@ date: 2023-04-07 | [hot_standby_feedback](ha-replication/standby-server.md#hot_standby_feedback) | | | [ident_file](file-location.md#ident_file) | | | [idle_in_transaction_session_timeout](connection-and-authentication/security-and-authentication.md#idle_in_transaction_session_timeout) | | -| [ifnull_all_return_text](./developer-options.md#ifnull_all_return_text) | | +| [ifnull_all_return_text](./developer-options.md#ifnull_all_return_text) | | | [ignore_checksum_failure](developer-options.md#ignore_checksum_failure) | | | [ignore_system_indexes](developer-options.md#ignore_system_indexes) | | | [incremental_checkpoint_timeout](write-ahead-log/checkpoints.md#incremental_checkpoint_timeout) | | @@ -423,6 +429,9 @@ date: 2023-04-07 | [io_control_unit](load-management.md#io_control_unit) | | | [io_limits](load-management.md#io_limits) | | | [io_priority](load-management.md#io_priority) | | +| [ios_batch_read_size](thread-pool.md#ios_batch_read_size) | 5.0.8 - [顺序扫描预读](../../characteristic-description/high-performance/seqscan-prefetch.md) | +| [ios_status_update_gap](thread-pool.md#ios_status_update_gap) | 5.0.8 - [顺序扫描预读](../../characteristic-description/high-performance/seqscan-prefetch.md) | +| [ios_worker_num](thread-pool.md#ios_worker_num) | 5.0.8 - [顺序扫描预读](../../characteristic-description/high-performance/seqscan-prefetch.md) | | [job_queue_processes](scheduled-task.md#job_queue_processes) | | | [join_collapse_limit](query-planning/other-optimizer-options.md#join_collapse_limit) | | | [keep_sync_window](ha-replication/primary-server.md#keep_sync_window) | | @@ -499,6 +508,7 @@ date: 2023-04-07 | 
[max_recursive_times](query-planning/optimizer-method-configuration.md#max_recursive_times) | | | [max_redo_log_size](write-ahead-log/checkpoints.md#max_redo_log_size) | | | [max_replication_slots](ha-replication/sending-server.md#max_replication_slots) | | +| [max_requests_per_worker](thread-pool.md#max_requests_per_worker) | 5.0.8 - [顺序扫描预读](../../characteristic-description/high-performance/seqscan-prefetch.md) | | [max_resource_package](./miscellaneous-parameters.md#max_resource_package) | | | [max_size_for_xlog_prune](write-ahead-log/checkpoints.md#max_size_for_xlog_prune) | | | [max_stack_depth](resource-consumption/memory.md#max_stack_depth) | | @@ -514,11 +524,13 @@ date: 2023-04-07 | [memory_tracking_mode](load-management.md#memory_tracking_mode) | | | [memorypool_enable](resource-consumption/memory.md#memorypool_enable) | | | [memorypool_size](resource-consumption/memory.md#memorypool_size) | | -| [merge_join_total_cost_ratio](./query-planning/optimizer-cost-constants.md#merge_join_total_cost_ratio) | | +| [merge_join_total_cost_ratio](./query-planning/optimizer-cost-constants.md#merge_join_total_cost_ratio) | | +| [min_table_block_num_enable_ios](thread-pool.md#min_table_block_num_enable_ios) | 5.0.8 - [顺序扫描预读](../../characteristic-description/high-performance/seqscan-prefetch.md) | +| [min_uheap_table_block_num_enable_ios](thread-pool.md#min_uheap_table_block_num_enable_ios) | 5.0.8 - [顺序扫描预读](../../characteristic-description/high-performance/seqscan-prefetch.md) | | [modify_initial_password](connection-and-authentication/security-and-authentication.md#modify_initial_password) | | | [most_available_sync](ha-replication/primary-server.md#most_available_sync) | | | [multi_stats_type](./AI-features.md#multi_stats_type) | | -| [nestloop_total_cost_ratio](./query-planning/optimizer-cost-constants.md#nestloop_total_cost_ratio) | | +| [nestloop_total_cost_ratio](./query-planning/optimizer-cost-constants.md#nestloop_total_cost_ratio) | | | [ngram_gram_size](query-planning/other-optimizer-options.md#ngram_gram_size) | | | [ngram_grapsymbol_ignore](query-planning/other-optimizer-options.md#ngram_grapsymbol_ignore) | | | [ngram_punctuation_ignore](query-planning/other-optimizer-options.md#ngram_punctuation_ignore) | | @@ -531,12 +543,13 @@ date: 2023-04-07 | [omit_encoding_error](fault-tolerance.md#omit_encoding_error) | | | [operation_mode](backup-and-restoration-parameter.md#operation_mode) | | | [opfusion_debug_mode](error-reporting-and-logging/logging-content.md#opfusion_debug_mode) | | -| [ora_dblink_col_case_sensitive](./version-and-platform-compatibility/platform-and-client-compatibility.md#ora_dblink_col_case_sensitive) | | +| [ora_dblink_col_case_sensitive](./version-and-platform-compatibility/platform-and-client-compatibility.md#ora_dblink_col_case_sensitive) | | | [pagewriter_sleep](resource-consumption/background-writer.md#pagewriter_sleep) | | | [pagewriter_thread_num](resource-consumption/background-writer.md#pagewriter_thread_num) | | | [parallel_recovery_batch](write-ahead-log/log-replay.md#parallel_recovery_batch) | | | [parallel_recovery_dispatch_algorithm](write-ahead-log/log-replay.md#parallel_recovery_dispatch_algorithm) | | | [parallel_recovery_timeout](write-ahead-log/log-replay.md#parallel_recovery_timeout) | | +| [parallel_scan_gap](thread-pool.md#parallel_scan_gap) | 5.0.8 - [顺序扫描预读](../../characteristic-description/high-performance/seqscan-prefetch.md) | | [partition_lock_upgrade_timeout](lock-management.md#partition_lock_upgrade_timeout) | | | 
[partition_max_cache_size](./data-import-export.md#partition_max_cache_size) | | | [partition_mem_batch](./data-import-export.md#partition_mem_batch) | | @@ -559,7 +572,7 @@ date: 2023-04-07 | [perf_directory](./query.md#perf_directory) | | | [pgxc_node_name](MogDB-transaction.md#pgxc_node_name) | | | [plan_cache_mode](query-planning/other-optimizer-options.md#plan_cache_mode) | | -| [plan_cache_type_validation](./miscellaneous-parameters.md#plan_cache_type_validation) | | +| [plan_cache_type_validation](./miscellaneous-parameters.md#plan_cache_type_validation) | | | [plan_mode_seed](query-planning/other-optimizer-options.md#plan_mode_seed) | | | [pldebugger_timeout](developer-options.md#pldebugger_timeout) | | | [pljava_vmoptions](guc-user-defined-functions.md#pljava_vmoptions) | | @@ -569,6 +582,7 @@ date: 2023-04-07 | [port](connection-and-authentication/connection-settings.md#port) | | | [post_auth_delay](developer-options.md#post_auth_delay) | | | [pre_auth_delay](developer-options.md#pre_auth_delay) | | +| [prefetch_protect_time](thread-pool.md#prefetch_protect_time) | 5.0.8 - [顺序扫描预读](../../characteristic-description/high-performance/seqscan-prefetch.md) | | [prefetch_quantity](resource-consumption/asynchronous-io-operations.md#prefetch_quantity) | | | [primary_slotname](ha-replication/standby-server.md#primary_slotname) | | | [proc_inparam_immutable](./version-and-platform-compatibility/platform-and-client-compatibility.md#proc_inparam_immutable) | 5.0.0 - [支持包内常量作为函数或者过程入参的默认值](../../characteristic-description/compatibility/support-for-constants-in-package-as-default-values.md) | @@ -630,7 +644,7 @@ date: 2023-04-07 | [show_fdw_remote_plan](./query-planning/other-optimizer-options.md#show_fdw_remote_plan) | | | [skew_option](query-planning/optimizer-method-configuration.md#skew_option) | | | [smp_thread_cost](./query-planning/optimizer-cost-constants.md#smp_thread_cost) | | -| [sort_agg_total_cost_ratio](./query-planning/optimizer-cost-constants.md#sort_agg_total_cost_ratio) | | +| [sort_agg_total_cost_ratio](./query-planning/optimizer-cost-constants.md#sort_agg_total_cost_ratio) | | | [sort_key_pruning_level](query-planning/other-optimizer-options.md#sort_key_pruning_level) | | | [sql_beta_feature](query-planning/other-optimizer-options.md#sql_beta_feature) | | | [sql_compatibility](version-and-platform-compatibility/platform-and-client-compatibility.md#sql_compatibility) | | @@ -696,10 +710,10 @@ date: 2023-04-07 | [temp_file_limit](resource-consumption/disk-space.md#temp_file_limit) | | | [temp_tablespaces](default-settings-of-client-connection/statement-behavior.md#temp_tablespaces) | | | [thread_pool_attr](thread-pool.md#thread_pool_attr) | | -| [thread_pool_committer_max_retry_count](./thread-pool.md#thread_pool_committer_max_retry_count) | | -| [thread_pool_committerctl_max_retry_count](./thread-pool.md#thread_pool_stream_attr) | | +| [thread_pool_committer_max_retry_count](./thread-pool.md#thread_pool_committer_max_retry_count) | | +| [thread_pool_committerctl_max_retry_count](./thread-pool.md#thread_pool_stream_attr) | | | [thread_pool_stream_attr](./thread-pool.md#thread_pool_stream_attr) | | -| [thread_pool_worker_num_per_committer](./thread-pool.md#thread_pool_worker_num_per_committer) | | +| [thread_pool_worker_num_per_committer](./thread-pool.md#thread_pool_worker_num_per_committer) | | | [time_to_target_rpo](./write-ahead-log/archiving.md#time_to_target_rpo) | | | [TimeZone](default-settings-of-client-connection/zone-and-formatting.md#timezone) | | | 
[timezone_abbreviations](default-settings-of-client-connection/zone-and-formatting.md#timezone_abbreviations) | | diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/ha-replication/sending-server.md b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/ha-replication/sending-server.md index c573ca9fdae066ddc500f5a8000fec2956bd8d92..b994dfa136c5d371c3f46ed7deac15ea4e58f8ba 100644 --- a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/ha-replication/sending-server.md +++ b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/ha-replication/sending-server.md @@ -22,6 +22,8 @@ date: 2021-04-20 **默认值**: 16 +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + ## wal_keep_segments **参数说明**: Xlog日志文件段数量。设置"pg_xlog"目录下保留事务日志文件的最小数目,备机通过获取主机的日志进行流复制。 @@ -216,7 +218,7 @@ date: 2021-04-20 **取值范围**: 字符串。其中空字符串表示没有配置节点信息。 -**默认值**: 空字符串 +**默认值**: 空字符串。PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 ## enable_availablezone @@ -255,3 +257,17 @@ date: 2021-04-20 **取值范围**: 字符串 **默认值**: 当前节点名称 + +## enable_ddl_logical_record + +**参数说明**:控制是否开启数据库逻辑解码对DDL的支持。该参数将在新增写入wal日志函数中控制是否写入DDL相关的wal日志。(MogDB 5.0.8引入) + +该参数属于SIGHUP类型参数,请参考表[GUC参数分类](../appendix.md)中对应设置方法进行设置。 + +**取值范围**:布尔型 + +- on:当有DDL命令时,DDL命令执行成功则向wal日志中写入xl_logical_ddl_message类型的wal日志,以备逻辑解码。 + +- off:逻辑解码不支持DDL操作。设置为off时,无论如何配置wal2json等插件的输出都将是空的change。 + +**默认值**:off \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/lock-management.md b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/lock-management.md index 86048229a7d45d9877076c6e0f7453f949c6fecc..9ef88d29e158c68a32049628f229ed07132938ca 100644 --- a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/lock-management.md +++ b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/lock-management.md @@ -135,6 +135,8 @@ date: 2021-04-20 **默认值**: 8 +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + ## num_internal_lock_partitions **参数说明**: 控制内部轻量级锁分区的个数。主要用于各类场景的性能调优。内容以关键字和数字的KV方式组织,各个不同类型锁之间以逗号隔开。先后顺序对设置结果不影响,例如“CLOG_PART=256,CSNLOG_PART=512”等同于“CSNLOG_PART=512,CLOG_PART=256”。重复设置同一关键字时,以最后一次设置为准,例如“CLOG_PART=256,CLOG_PART=2”,设置的结果为CLOG_PART=2。当没有设置关键字时,则为默认值,各类锁的使用描述和最大、最小、默认值如下。 diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/query-planning/other-optimizer-options.md b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/query-planning/other-optimizer-options.md index 6aa5fd9cdc5d8b6a7dcda9c9450d1d3d6d28ecf9..c6121f1071693fe4e41a3c01d621491a7870aa0d 100644 --- a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/query-planning/other-optimizer-options.md +++ b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/query-planning/other-optimizer-options.md @@ -173,7 +173,7 @@ set rewrite_rule = none; --关闭所有可选查询重写规则 - intargetlist:使用In Target List查询重写规则(提升目标列中的子查询)。 - predpushnormal:使用Predicate Push查询重写规则(下推谓词条件到子查询中)。 - predpushforce:使用Predicate Push查询重写规则(下推谓词条件到子查询中,尽可能的利用索引加速)。 -- predpush:在predpushnormal和predpushforce中根据代价选择最优计划。 +- predpush:在predpushnormal和predpushforce中根据代价选择最优计划。**注**:predpush类的重写规则在极个别场景会导致无法生成合法的计划,在启用参数前建议进行充分测试。 - reduce_orderby:使用reduce orderby查询重写规则(删除子查询中不必要的排序)。 - column_pruner:使用column_pruner查询重写规则(列裁剪消除子查询中冗余的投影列,支持A/PG兼容模式)。 @@ -300,7 +300,7 @@ set rewrite_rule = none; --关闭所有可选查询重写规则 - on表示使用。 - off表示不使用。 -**默认值**: off +**默认值**: on (使用PTK安装MogDB会对此参数进行优化,优化后默认值为off) ## 
enable_partition_opfusion diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/resource-consumption/background-writer.md b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/resource-consumption/background-writer.md index 4800167c4e490a915956fd13898c5b9235d43d56..00aa5ced5d3f9ef10d52df4577f78bf25661b875 100644 --- a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/resource-consumption/background-writer.md +++ b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/resource-consumption/background-writer.md @@ -137,3 +137,25 @@ date: 2021-04-20 **取值范围**: 整型,32~256 **默认值**: 256 + +## extreme_flush_dirty_page + +**参数说明**:是否开启极致刷脏模式。开启虽可以更快刷脏,但写放大增大。(MogDB 5.0.8引入) + +该参数属于POSTMASTER类型参数,请参考[GUC参数分类](../../../reference-guide/guc-parameters/appendix.md)中对应设置方法进行设置。 + +**取值范围**:布尔型 + +**默认值**:off + +**注意**:请确认当前系统刷脏慢的瓶颈不在系统IO能力之后,再打开此参数。可通过iostat、Node-exporter等监测工具确认磁盘IO不存在瓶颈。对于共享存储服务,还应确认共享存储服务的IO能力极限。 + +## checkpoint_target_time + +**参数说明**:期望执行checkpoint的最大耗时。值越小,刷脏越快,执行checkpoint实际耗时越小,但写放大增大,在IO成为瓶颈时,值很低可能影响业务;对应的上游操作有:停库(stop)、switchover(主备切换)、手动执行checkpoint语句。(MogDB 5.0.8引入) + +该参数属于POSTMASTER类型参数,请参考[GUC参数分类](../../../reference-guide/guc-parameters/appendix.md)中对应设置方法进行设置。 + +**取值范围**:5~60s + +**默认值**:30s \ No newline at end of file diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/resource-consumption/memory.md b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/resource-consumption/memory.md index d668bf317f28a8d31019ea66830b2211ff65a2d0..8b32cc0d89336dfe99f758a7f7a2b7cfed39dc37 100644 --- a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/resource-consumption/memory.md +++ b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/resource-consumption/memory.md @@ -64,7 +64,7 @@ date: 2021-04-20 **设置建议**: -数据库节点上该数值需要根据系统物理内存及单节点部署主数据库节点个数决定。建议计算公式如下:`(物理内存大小 - vm.min_free_kbytes) * 0.7 / (1 + 主节点个数)`。该系数的目的是尽可能保证系统的可靠性,不会因数据库内存膨胀导致节点OOM。这个公式中提到vm.min_free_kbytes,其含义是预留操作系统内存供内核使用,通常用作操作系统内核中通信收发内存分配,至少为5%内存。即,max_process_memory = `物理内存 * 0.665 / (1 + 主节点个数)`。 +数据库节点上该数值需要根据系统物理内存及单节点部署主数据库节点个数决定。建议计算公式如下:`(物理内存大小 - vm.min_free_kbytes) * 0.7 / (1 + 主节点个数)`。该系数的目的是尽可能保证系统的可靠性,不会因数据库内存膨胀导致节点OOM。这个公式中提到vm.min_free_kbytes,其含义是预留操作系统内存供内核使用,通常用作操作系统内核中通信收发内存分配,至少为5%内存。即,max_process_memory = `物理内存 * 0.665 / (1 + 主节点个数)`。PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-caution.gif) **注意:** 当该值设置不合理,即大于服务器物理内存,可能导致操作系统OOM问题。 @@ -107,9 +107,10 @@ shared_buffers需要设置为BLCKSZ的整数倍,BLCKSZ目前设置为8kB,即 **设置建议**: -1. 建议设置shared_buffers值为内存的40%以内。行存列存分开对待。行存设大,列存设小。列存:(单服务器内存/单服务器数据库节点个数)\*0.4\*0.25。 -2. 如果设置较大的shared_buffers需要同时增加checkpoint_segments的值,因为写入大量新增、修改数据需要消耗更多的时间周期。 -3. 如果调整shared_buffers参数之后,导致进程重启失败,请参考启动失败的报错信息,采用以下解决方案之一: +1. PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 +2. 建议设置shared_buffers值为内存的40%以内。行存列存分开对待。行存设大,列存设小。列存:(单服务器内存/单服务器数据库节点个数)\*0.4\*0.25。 +3. 如果设置较大的shared_buffers需要同时增加checkpoint_segments的值,因为写入大量新增、修改数据需要消耗更多的时间周期。 +4. 
如果调整shared_buffers参数之后,导致进程重启失败,请参考启动失败的报错信息,采用以下解决方案之一: - 对应调整操作系统kernel.shmall、kernel.shmmax、kernel.shmmin参数,调整方式请参考《安装指南》的配置操作系统其他参数小节。 - 执行free -g观察操作系统可用内存和swap空间是否足够,如果内存明显不足,请手动停止其他比较占用内存的用户程序。 - 避免设置明显不合理(过大或过小)的shared_buffers值。 @@ -178,6 +179,8 @@ segment_buffers 用来缓存段页式段头的内容,属于关键元数据信 **默认值**: 10 +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: 一般不需要对事务显式进行PREPARE操作,如果业务对事务进行显示PREPARE操作,为避免在准备步骤失败,需要调大该值,大于需要进行PREPARE业务的并发数。 ## work_mem @@ -190,10 +193,12 @@ segment_buffers 用来缓存段页式段头的内容,属于关键元数据信 **取值范围**: 整型,64~2147483647,单位为kB。 -**默认值**: 64MB +**默认值**: 64MB。 **设置建议**: +> PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 +> > 依据查询特点和并发来确定,一旦work_mem限定的物理内存不够,算子运算数据将写入临时表空间,带来5-10倍的性能下降,查询响应时间从秒级下降到分钟级。 > > - 对于串行无并发的复杂查询场景,平均每个查询有5-10关联操作,建议work_mem=50%内存/10。 @@ -243,6 +248,7 @@ segment_buffers 用来缓存段页式段头的内容,属于关键元数据信 **设置建议**: +- PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 - 建议设置此参数的值大于work_mem,可以改进清理和恢复数据库转储的速度。因为在一个数据库会话里,任意时刻只有一个维护性操作可以执行,并且在执行维护性操作时不会有太多的会话。 - 当[自动清理](../../../reference-guide/guc-parameters/automatic-vacuuming.md)线程运行时,autovacuum_max_workers倍数的内存将会被分配,所以此时设置maintenance_work_mem的值应该不小于work_mem。 - 如果进行大数据量的cluster等,可以在session中调大该值。 diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/thread-pool.md b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/thread-pool.md index f33607e47e21f7a5bd09c5c1d7f76fd002b58e7f..7862a164bd45c0d3f1425b6d6ff1e82a7c4f9951 100644 --- a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/thread-pool.md +++ b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/thread-pool.md @@ -81,6 +81,125 @@ resilience_threadpool_reject_cond = '100,200' > - 已经堆积的会话数可以通过查询pg_stat_activity视图有多少条数据获得,需要过滤少量后台线程;线程池设置的初试线程池线程数目可以通过查询thread_pool_attr参数获得。 > - 该参数如果设置的百分比过小,则会频繁触发线程池过载逃生流程,会使正在执行的会话被强制退出,新连接短时间接入失败,需要根据实际线程池使用情况慎重设置。 +## enable_ios + +**参数说明**:控制是否启动IOS服务。 + +该参数属于POSTMASTER类型参数,请参考表[GUC参数分类](appendix.md)中对应设置方法进行设置。 + +**取值范围**:布尔型 + +- on表示启用。 +- off表示不启用。 + +**默认值**:off + +## enable_heap_async_prefetch + +**参数说明**:控制是否对Astore全表扫描类场景启用预读功能。 + +该参数属于USERSET类型参数,请参考表[GUC参数分类](appendix.md)中对应设置方法进行设置。 + +**取值范围**:布尔型 + +- on表示启用。 +- off表示不启用。 + +**默认值**:off (当enable_ios = off时,该参数无效。) + +## enable_uheap_async_prefetch + +**参数说明**:控制是否对Ustore全表扫描类场景启用预读功能。 + +该参数属于USERSET类型参数,请参考表[GUC参数分类](appendix.md)中对应设置方法进行设置。 + +**取值范围**:布尔型 + +- on表示启用。 +- off表示不启用。 + +**默认值**:off (当enable_ios = off时,该参数无效。) + +## ios_worker_num + +**参数说明**:IOS线程池里面的ios_worker个数。 + +该参数属于POSTMASTER类型参数,请参考表[GUC参数分类](appendix.md)中对应设置方法进行设置。 + +**取值范围**:整型。最小值为1,最大值为100。 + +**默认值**:4 + +## parallel_scan_gap + +**参数说明**:开启并行扫描时(query_dop > 1),每个工作线程单次处理的页面数量。 + +该参数属于SIGHUP类型参数,请参考表[GUC参数分类](appendix.md)中对应设置方法进行设置。 + +**取值范围**:整型。最小值为64,最大值为4096。 + +**默认值**:128 + +## ios_batch_read_size + +**参数说明**:ios_worker每个批次下发给磁盘的预读页面个数。 + +该参数属于SIGHUP类型参数,请参考表[GUC参数分类](appendix.md)中对应设置方法进行设置。 + +**取值范围**:整型。最小值为64,最大值为1024。 + +**默认值**:64 + +## max_requests_per_worker + +**参数说明**:每个ios_worker的最大队列深度。当超过这一数量时,ios_worker线程无法接受新的请求,直到ios_worker中有一个被处理完毕,并移出任务队列。 + +该参数属于POSTMASTER类型参数,请参考表[GUC参数分类](appendix.md)中对应设置方法进行设置。 + +**取值范围**:整型。最小值为1,最大值为10。 + +**默认值**:2 + +## min_table_block_num_enable_ios + +**参数说明**:触发预读的Astore表大小阈值。只有当表的数据页总数大于等于该阈值时,才有可能触发预读。目前数据页大小为8kB。 + 
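+
+例如,可以参考如下方式粗略判断某张表是否达到该阈值(以下仅为示意,其中表名t1为假设的示例表,页面数取自统计信息,需先执行ANALYZE使其尽量准确):
+
+```sql
+-- 更新统计信息,使pg_class.relpages尽量接近实际数据页数
+ANALYZE t1;
+
+-- 查看表t1当前的数据页数量(每个数据页8kB)
+SELECT relname, relpages FROM pg_class WHERE relname = 't1';
+
+-- 查看当前的预读阈值,relpages大于等于该值时顺序扫描才有可能触发预读
+SHOW min_table_block_num_enable_ios;
+```
+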
+该参数属于SIGHUP类型参数,请参考表[GUC参数分类](appendix.md)中对应设置方法进行设置。 + +**取值范围**:整型。最小值为65536(即512MB),最大值为6553600(即512GB)。 + +**默认值**:131072(1GB) + +## min_uheap_table_block_num_enable_ios + +**参数说明**:触发预读的Ustore表大小阈值。只有当表的数据页总数大于等于该阈值时,才有可能触发预读。目前数据页大小为8kB。 + +该参数属于SIGHUP类型参数,请参考表[GUC参数分类](appendix.md)中对应设置方法进行设置。 + +**取值范围**:整型。最小值为65536(即512MB),最大值为6553600(即512GB)。 + +**默认值**:131072(1GB) + +## prefetch_protect_time + +**参数说明**:预读buffer最大保护时间。 + +该参数属于SIGHUP类型参数,请参考表[GUC参数分类](appendix.md)中对应设置方法进行设置。 + +**取值范围**:整型。单位为毫秒。最小值为100,最大值为10000。 + +**默认值**:500 + +## ios_status_update_gap + +**参数说明**:更新IOS性能状态的时间间隔。 + +该参数属于SIGHUP类型参数,请参考表[GUC参数分类](appendix.md)中对应设置方法进行设置。 + +**取值范围**:整型。单位为秒。最小值为1,最大值为100。 + +**默认值**:1 + ## thread_pool_committer_max_retry_count **参数说明**: 设置线程池committer睡眠前的最大重试次数。 diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/write-ahead-log/settings.md b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/write-ahead-log/settings.md index 312c4b2e2300e43b5a249f874eb455761bfa58a0..d87536400e603302deef51d42d72ff51470f1399 100644 --- a/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/write-ahead-log/settings.md +++ b/product/zh/docs-mogdb/v5.0/reference-guide/guc-parameters/write-ahead-log/settings.md @@ -42,6 +42,8 @@ date: 2021-04-20 **默认值**: minimal +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + ## fsync **参数说明**: 设置MogDB服务器是否使用fsync()系统函数(请参见[wal_sync_method](#wal_sync_method))确保数据的更新及时写入物理磁盘中。 @@ -143,6 +145,8 @@ date: 2021-04-20 **默认值**: on +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + ## wal_log_hints **参数说明**: 设置在检查点之后对页面的第一次修改为页面上元组hint bits的修改时,是否将整个页面的全部内容写到WAL日志中。不推荐用户修改此设置。 @@ -156,6 +160,8 @@ date: 2021-04-20 **默认值**: on +PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 + ## wal_buffers **参数说明**: 设置用于存放WAL数据的共享内存空间的XLOG_BLCKSZ数,XLOG_BLCKSZ的大小默认为8kB。 @@ -169,7 +175,7 @@ date: 2021-04-20 **默认值**: 2048,即16MB -**设置建议**:每次事务提交时,WAL缓冲区的内容都写入到磁盘中,因此设置为很大的值不会带来明显的性能提升。如果将它设置成几百兆,就可以在有很多即时事务提交的服务器上提高写入磁盘的性能。根据经验来说,默认值可以满足大多数的情况。 +**设置建议**:每次事务提交时,WAL缓冲区的内容都写入到磁盘中,因此设置为很大的值不会带来明显的性能提升。如果将它设置成几百兆,就可以在有很多即时事务提交的服务器上提高写入磁盘的性能。根据经验来说,默认值可以满足大多数的情况。PTK会在安装数据库时根据服务器配置优化此参数值,详细信息请参考[数据库推荐参数](https://docs.mogdb.io/zh/ptk/v2.0/ref-recommend-guc)。 ## wal_writer_delay diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/system-catalogs-and-system-views/system-views/IOS_STATUS.md b/product/zh/docs-mogdb/v5.0/reference-guide/system-catalogs-and-system-views/system-views/IOS_STATUS.md new file mode 100644 index 0000000000000000000000000000000000000000..73e7a47da01b437c45ef797c12d78229167079af --- /dev/null +++ b/product/zh/docs-mogdb/v5.0/reference-guide/system-catalogs-and-system-views/system-views/IOS_STATUS.md @@ -0,0 +1,27 @@ +--- +title: IOS_STATUS +summary: IOS_STATUS +author: Guo Huan +date: 2023-12-20 +--- + +# IOS_STATUS + +IOS_STATUS视图用于查看最近一段时间负责预读的I/O线程池的性能状态,包含IOSCtl派发请求,I/O时延/带宽,队列积压等指标。当主查询线程I/O时延很高或者缓存命中率低等问题出现的时候,用户或者研发人员可以通过直观查看预读线程池的性能来帮助定位。 + +**表 1** IOS_STATUS字段 + +| 名称 | 类型 | 描述 | +| :----------------------------- | :--- | :----------------------------------------------------------- | +| ios_worker_num | Int4 | 当前I/O线程池ios_worker数量 | +| io_requests | Int8 | 线程池收到的总预读请求次数 | +| io_dispatched | Int8 | 线程池派发给ios_worker总预读请求次数 | +| avg_io_size_blks | Int4 | 平均每个预读请求包含的8K页面数量 | +| avg_io_request_latency_history | Int4 | 历史上所有预读请求的平均总时延(开始放到IOSCtl的队列到ios_worker处理完成的总时间),单位:微秒 | +| 
diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/system-catalogs-and-system-views/system-views/system-views.md b/product/zh/docs-mogdb/v5.0/reference-guide/system-catalogs-and-system-views/system-views/system-views.md
index c436d47c6e25e4f606d37ed6ab6c9a7f49b1209e..820a7a13bf06e9bc2f188801d0fb489938f0f66d 100644
--- a/product/zh/docs-mogdb/v5.0/reference-guide/system-catalogs-and-system-views/system-views/system-views.md
+++ b/product/zh/docs-mogdb/v5.0/reference-guide/system-catalogs-and-system-views/system-views/system-views.md
@@ -48,6 +48,7 @@ date: 2023-04-07
- **[GS_WLM_SESSION_INFO_ALL](GS_WLM_SESSION_INFO_ALL.md)**
- **[GS_WLM_SESSION_STATISTICS](GS_WLM_SESSION_STATISTICS.md)**
- **[GS_WLM_USER_INFO](GS_WLM_USER_INFO.md)**
+- **[IOS_STATUS](IOS_STATUS.md)**
- **[MPP_TABLES](MPP_TABLES.md)**
- **[PG_AVAILABLE_EXTENSION_VERSIONS](PG_AVAILABLE_EXTENSION_VERSIONS.md)**
- **[PG_AVAILABLE_EXTENSION](PG_AVAILABLE_EXTENSIONS.md)**
diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/tool-reference/server-tools/gs_dump.md b/product/zh/docs-mogdb/v5.0/reference-guide/tool-reference/server-tools/gs_dump.md
index 988556a8e362e1ff0d21e2a658d13e6e390da218..0b1b73f540abbb85cc93a2f8bb51ddd16fac01cb 100644
--- a/product/zh/docs-mogdb/v5.0/reference-guide/tool-reference/server-tools/gs_dump.md
+++ b/product/zh/docs-mogdb/v5.0/reference-guide/tool-reference/server-tools/gs_dump.md
@@ -45,10 +45,12 @@ gs_dump可以创建四种不同的导出文件格式,通过“-F”或者“

## 注意事项

-- gs_dump仅用于主库(Primary),不支持导出备库(Standby)和级联备(Cascade Standby)的数据。
- 禁止修改-F c/d/t 格式导出的文件和内容,否则可能无法恢复成功。对于-F p 格式导出的文件,如有需要,可根据需要谨慎编辑导出文件。
- 为了保证数据一致性和完整性,gs_dump会对需要转储的表设置共享锁。如果表在别的事务中设置了共享锁,gs_dump会等待锁释放后锁定表。如果无法在指定时间内锁定某个表,转储会失败。用户可以通过指定--lock-wait-timeout选项,自定义等待锁超时时间。
- 不支持加密导出存储过程和函数。
+- 自MogDB 5.0.8版本开始,gs_dump和gs_dumpall支持导出备库(Standby)和级联备(Cascade Standby)的数据,只要将-p参数指定为备机端口号即可。在备机导出数据的过程中,假如主机发生了DDL/VACUUM FULL操作,可能会导致备机导出失败。为了保证成功导出,可以从以下两点着手。
+  - 尽量选择主机不发生DDL/VACUUM FULL操作的时间段,在备机上进行导出。
+  - 适当增大GUC参数[max_standby_streaming_delay](../../../reference-guide/guc-parameters/ha-replication/standby-server.md#max_standby_streaming_delay)的值,可以使用`目标库的大小/磁盘的读盘速率*2`作为该值的估算值。注意由于只是估算值,不能保证在该时间长度内一定可以成功完成导出,只是提升导出成功的概率。

## 语法

@@ -273,7 +275,7 @@ gs_dump -p port_number -f dump1.sql

- --exclude-with

-  导出的表定义,末尾不添加WITH(orientation=row,compression=on)这样的描述。
+  导出的表定义,末尾不添加WITH(orientation=row,compression=on)这样的描述。

- --binary-upgrade

@@ -414,15 +416,58 @@ gs_dump -p port_number -f dump1.sql
gs_dump -p port_number postgres -f backup.sql -F plain --dont-overwrite-file
```

-![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**:
-
-- -s/--schema-only和-a/--data-only不能同时使用。
-- -c/--clean和-a/--data-only不能同时使用。
-- --inserts/--column-inserts和-o/--oids不能同时使用,因为INSERT命令不能设置OIDS。
-- --role和--rolepassword必须一起使用。
-- --binary-upgrade-usermap和--binary-upgrade必须一起使用。
-- --include-depend-objs/--exclude-self需要同-t/--include-table-file参数关联使用才会生效。
-- --exclude-self必须同--include-depend-objs一起使用。
+- --trigger trigger_name
+
+  指定导出trigger。
+
+- --function function_name(args)
+
+  指定导出function。
+
+- --type type_name
+
+  指定导出type。
+
+- --package package_name
+
+  指定导出package。
+
+- --procedure procedure_name(args)
+
+  指定导出procedure。
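+
+以下示例演示如何单独导出上述对象(仅为示意:其中端口号与导出文件名均为假设值;带参数对象的参数类型写法参见下方说明):
+
+```bash
+# 仅导出指定函数的定义(函数名需带参数类型)
+gs_dump -p port_number postgres --function "func_gs_dump_0001(character varying)" -f func_backup.sql -F plain
+# 仅导出指定触发器的定义
+gs_dump -p port_number postgres --trigger trigger_name -f trigger_backup.sql -F plain
+```
+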
+> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**:
+>
+> - -s/--schema-only和-a/--data-only不能同时使用。
+> - -c/--clean和-a/--data-only不能同时使用。
+> - --inserts/--column-inserts和-o/--oids不能同时使用,因为INSERT命令不能设置OIDS。
+> - --role和--rolepassword必须一起使用。
+> - --binary-upgrade-usermap和--binary-upgrade必须一起使用。
+> - --include-depend-objs/--exclude-self需要同-t/--include-table-file参数关联使用才会生效。
+> - --exclude-self必须同--include-depend-objs一起使用。
+> - 对于**函数**和**存储过程**这种带参数的名称要求标明参数类型。
+>
+> 例如,定义一个函数func(a INTEGER, b INTEGER),函数名为:“func(integer, integer)”
+>
+> 为了兼容其他SQL语法,数据库可能会将某些参数类型转换成另一个类型,例如VARCHAR2会转成character varying,func(a INTEGER, table_name IN VARCHAR2) 会转换成:“func(integer, character varying)”。为了确保参数类型输入正确,可以采用如下SQL语句查询数据库中的函数参数类型:
+>
+> ```sql
+> SELECT p.proname AS function_name,
+> p.proargtypes AS parameter_types,
+> pg_catalog.pg_get_function_identity_arguments(p.oid) AS funcargs
+> FROM PG_PROC AS p
+> WHERE p.proname = 'func_gs_dump_0001';
+> ```
+>
+> 查询结果:
+>
+> ```sql
+> function_name | parameter_types | funcargs
+> -------------------+---------------------+---------------------------------
+> func_gs_dump_0001 | 1043 | table_name character varying
+> ```
+>
+> 通过SQL语句可以查到func_gs_dump_0001的参数类型是character varying,所以正确的对象名是"func_gs_dump_0001(character varying)"。

连接参数:

diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/tool-reference/server-tools/gs_dumpall.md b/product/zh/docs-mogdb/v5.0/reference-guide/tool-reference/server-tools/gs_dumpall.md
index 5e39c6f89b471d04dbb34f183355f1f5ae0d79a8..8ac4a00b9d9b5faf7544693a2016c878c250a84c 100644
--- a/product/zh/docs-mogdb/v5.0/reference-guide/tool-reference/server-tools/gs_dumpall.md
+++ b/product/zh/docs-mogdb/v5.0/reference-guide/tool-reference/server-tools/gs_dumpall.md
@@ -30,10 +30,12 @@ gs_dumpall在导出MogDB所有数据库时分为两部分:

## 注意事项

-- gs_dumpall仅用于主库(Primary),不支持导出备库(Standby)和级联备(Cascade Standby)的数据。
- 禁止修改导出的文件和内容,否则可能无法恢复成功。
- 为了保证数据一致性和完整性,gs_dumpall会对需要转储的表设置共享锁。如果某张表在别的事务中设置了共享锁,gs_dumpall会等待此表的锁释放后锁定此表。如果无法在指定时间内锁定某张表,转储会失败。用户可以通过指定--lock-wait-timeout选项,自定义等待锁超时时间。
- 由于gs_dumpall读取所有数据库中的表,因此必须以MogDB管理员身份进行连接,才能导出完整文件。在使用gsql执行脚本文件导入时,同样需要管理员权限,以便添加用户和组,以及创建数据库。
+- 自MogDB 5.0.8版本开始,gs_dump和gs_dumpall支持导出备库(Standby)和级联备(Cascade Standby)的数据,只要将-p参数指定为备机端口号即可。在备机导出数据的过程中,假如主机发生了DDL/VACUUM FULL操作,可能会导致备机导出失败。为了保证成功导出,可以从以下两点着手。
+  - 尽量选择主机不发生DDL/VACUUM FULL操作的时间段,在备机上进行导出。
+  - 适当增大GUC参数[max_standby_streaming_delay](../../../reference-guide/guc-parameters/ha-replication/standby-server.md#max_standby_streaming_delay)的值,可以使用`目标库的大小/磁盘的读盘速率*2`作为该值的估算值。注意由于只是估算值,不能保证在该时间长度内一定可以成功完成导出,只是提升导出成功的概率。

## 语法

diff --git a/product/zh/docs-mogdb/v5.0/reference-guide/tool-reference/server-tools/gs_restore.md b/product/zh/docs-mogdb/v5.0/reference-guide/tool-reference/server-tools/gs_restore.md
index db7c27f0acb49f340136fa3eb24da5ed6de21f70..8e2c635e82c0ddb3065176c13effb236b5fc5c3b 100644
--- a/product/zh/docs-mogdb/v5.0/reference-guide/tool-reference/server-tools/gs_restore.md
+++ b/product/zh/docs-mogdb/v5.0/reference-guide/tool-reference/server-tools/gs_restore.md
@@ -172,35 +172,47 @@ gs_restore [OPTION]... 
FILE - -t, --table=NAME -只导入已列举的表定义、数据或定义和数据。该选项与-n选项同时使用时,用来指定某个模式下的表对象。-n参数不输入时,默认为PUBLIC模式。多次输入-n -t 可以导入指定模式下的多个表。 + 只导入已列举的表定义、数据或定义和数据。该选项与-n选项同时使用时,用来指定某个模式下的表对象。-n参数不输入时,默认为PUBLIC模式。多次输入-n -t 可以导入指定模式下的多个表。 -例如: - -导入PUBLIC模式下的table1 + 例如: -``` -gs_restore -h host_name -p port_number -d postgres -t table1 backup/MPPDB_backup.tar -``` + 导入PUBLIC模式下的table1 -导入test1模式下的test1和test2模式下test2 + ```bash + gs_restore -h host_name -p port_number -d postgres -t table1 backup/MPPDB_backup.tar + ``` -```bash -gs_restore -h host_name -p port_number -d postgres -n test1 -t test1 -n test2 -t test2 backup/MPPDB_backup.tar -``` + 导入test1模式下的test1和test2模式下test2 -导入PUBLIC模式下的table1和test1 模式下table1 + ```bash + gs_restore -h host_name -p port_number -d postgres -n test1 -t test1 -n test2 -t test2 backup/MPPDB_backup.tar + ``` -```bash -gs_restore -h host_name -p port_number -d postgres -n PUBLIC -t table1 -n test1 -t table1 backup/MPPDB_backup.tar -``` + 导入PUBLIC模式下的table1和test1 模式下table1 -![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: + ```bash + gs_restore -h host_name -p port_number -d postgres -n PUBLIC -t table1 -n test1 -t table1 backup/MPPDB_backup.tar + ``` --t不支持schema_name.table_name的输入格式,指定此格式不会报错,但不会生效。 + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **说明**: + > + > -t不支持schema_name.table_name的输入格式,指定此格式不会报错,但不会生效。 - -T, --trigger=NAME - 该参数为扩展预留接口。 + 指定导入trigger。 + +- --type type_name + + 指定导入type。 + +- --package package_name + + 指定导入package。 + +- --procedure procedure_name(args) + + 指定导入procedure。 - -x, --no-privileges/--no-acl @@ -269,6 +281,29 @@ gs_restore -h host_name -p port_number -d postgres -n PUBLIC -t table1 -n test1 3. -c/--clean 和 -a/--data-only不能同时使用。 4. 使用--single-transaction时,-j/--jobs必须为单任务。 5. --role 和--rolepassword必须一起使用。 +6. 
对于**函数**和**存储过程**这种带参数的名称要求标明参数类型。
+
+   例如,定义一个函数func(a INTEGER, b INTEGER),函数名为:“func(integer, integer)”
+
+   为了兼容其他SQL语法,数据库可能会将某些参数类型转换成另一个类型,例如VARCHAR2会转成character varying,func(a INTEGER, table_name IN VARCHAR2) 会转换成:“func(integer, character varying)”。为了确保参数类型输入正确,可以采用如下SQL语句查询数据库中的函数参数类型:
+
+   ```sql
+   SELECT p.proname AS function_name,
+   p.proargtypes AS parameter_types,
+   pg_catalog.pg_get_function_identity_arguments(p.oid) AS funcargs
+   FROM PG_PROC AS p
+   WHERE p.proname = 'func_gs_dump_0001';
+   ```
+
+   查询结果:
+
+   ```sql
+   function_name | parameter_types | funcargs
+   -------------------+---------------------+---------------------------------
+   func_gs_dump_0001 | 1043 | table_name character varying
+   ```
+
+   通过SQL语句可以查到func_gs_dump_0001的参数类型是character varying,所以正确的对象名是"func_gs_dump_0001(character varying)"。

### 连接参数

diff --git a/product/zh/docs-mogdb/v5.0/toc.md b/product/zh/docs-mogdb/v5.0/toc.md
index 667c73e0eb2142b41d1282017774a6f190556f2f..5bd7a78f4a71f1df86f6ce803e1089e5cfeb139c 100644
--- a/product/zh/docs-mogdb/v5.0/toc.md
+++ b/product/zh/docs-mogdb/v5.0/toc.md
@@ -8,6 +8,7 @@
+ [MogDB简介](/overview.md)
+ [MogDB与openGauss](/about-mogdb/MogDB-compared-to-openGauss.md)
+ [MogDB发布说明](/about-mogdb/mogdb-new-feature/release-note.md)
+ + [MogDB 5.0.8](/about-mogdb/mogdb-new-feature/5.0.8.md)
+ [MogDB 5.0.7](/about-mogdb/mogdb-new-feature/5.0.7.md)
+ [MogDB 5.0.6](/about-mogdb/mogdb-new-feature/5.0.6.md)
+ [MogDB 5.0.5](/about-mogdb/mogdb-new-feature/5.0.5.md)
@@ -73,6 +74,9 @@
+ [OCK加速数据传输](/characteristic-description/high-performance/ock-accelerated-data-transmission.md)
+ [OCK SCRLock加速分布式锁](/characteristic-description/high-performance/ock-scrlock-accelerate-distributed-lock.md)
+ [日志回放性能增强](/characteristic-description/high-performance/enhancement-of-wal-redo-performance.md)
+ + [极致刷脏](/characteristic-description/high-performance/enhancement-of-dirty-pages-flushing-performance.md)
+ + [顺序扫描预读](/characteristic-description/high-performance/seqscan-prefetch.md)
+ + [Ustore SMP并行扫描](/characteristic-description/high-performance/ustore-smp.md)
+ [高可用](/characteristic-description/high-availability/high-availability.md)
+ [主备机](/characteristic-description/high-availability/1-primary-standby.md)
+ [逻辑复制](/characteristic-description/high-availability/2-logical-replication.md)
@@ -141,6 +145,7 @@
+ [支持在建表后修改表日志属性](/characteristic-description/compatibility/modify-table-log-property.md)
+ [INSERT支持ON CONFLICT子句](/characteristic-description/compatibility/insert-on-conflict.md)
+ [支持AUTHID CURRENT_USER](/characteristic-description/compatibility/authid-current-user.md)
+ + [PBE模式支持存储过程out出参](/characteristic-description/compatibility/stored-procedure-out-parameters-in-pbe-mode.md)
+ [数据库安全](/characteristic-description/database-security/database-security.md)
+ [访问控制模型](/characteristic-description/database-security/1-access-control-model.md)
+ [控制权和访问权分离](/characteristic-description/database-security/2-separation-of-control-and-access-permissions.md)
@@ -187,6 +192,7 @@
+ [支持裁剪子查询投影列](/characteristic-description/enterprise-level-features/support-for-pruning-subquery-projection-columns.md)
+ [排序列裁剪](/characteristic-description/enterprise-level-features/pruning-order-by-in-subqueries.md)
+ [自动创建支持模糊匹配的索引](/characteristic-description/enterprise-level-features/index-support-fuzzy-matching.md)
+ + [支持指定导入导出五类基本对象](/characteristic-description/enterprise-level-features/import-export-specific-objects.md)
+ 
[应用开发接口](/characteristic-description/application-development-interfaces/application-development-interfaces.md) + [支持标准SQL](/characteristic-description/application-development-interfaces/1-standard-sql.md) + [支持标准开发接口](/characteristic-description/application-development-interfaces/2-standard-development-interfaces.md) @@ -246,33 +252,6 @@ + [慢SQL诊断](/administrator-guide/routine-maintenance/slow-sql-diagnosis.md) + [日志参考](/administrator-guide/routine-maintenance/11-log-reference.md) + [主备管理](/administrator-guide/primary-and-standby-management.md) - + [MOT内存表管理](/administrator-guide/mot-engine/mot-engine.md) - + [MOT介绍](/administrator-guide/mot-engine/1-introducing-mot/introducing-mot.md) - + [MOT简介](/administrator-guide/mot-engine/1-introducing-mot/1-mot-introduction.md) - + [MOT特性及价值](/administrator-guide/mot-engine/1-introducing-mot/2-mot-features-and-benefits.md) - + [MOT关键技术](/administrator-guide/mot-engine/1-introducing-mot/3-mot-key-technologies.md) - + [MOT应用场景](/administrator-guide/mot-engine/1-introducing-mot/4-mot-usage-scenarios.md) - + [MOT性能基准](/administrator-guide/mot-engine/1-introducing-mot/5-mot-performance-benchmarks.md) - + [使用MOT](/administrator-guide/mot-engine/2-using-mot/using-mot.md) - + [MOT使用概述](/administrator-guide/mot-engine/2-using-mot/1-using-mot-overview.md) - + [MOT准备](/administrator-guide/mot-engine/2-using-mot/2-mot-preparation.md) - + [MOT部署](/administrator-guide/mot-engine/2-using-mot/3-mot-deployment.md) - + [MOT使用](/administrator-guide/mot-engine/2-using-mot/4-mot-usage.md) - + [MOT管理](/administrator-guide/mot-engine/2-using-mot/5-mot-administration.md) - + [MOT样例TPC-C基准](/administrator-guide/mot-engine/2-using-mot/6-mot-sample-tpcc-benchmark.md) - + [MOT的概念](/administrator-guide/mot-engine/3-concepts-of-mot/concepts-of-mot.md) - + [MOT纵向扩容架构](/administrator-guide/mot-engine/3-concepts-of-mot/3-1.md) - + [MOT并发控制机制](/administrator-guide/mot-engine/3-concepts-of-mot/3-2.md) - + [扩展FDW与其他MogDB特性](/administrator-guide/mot-engine/3-concepts-of-mot/3-3.md) - + [NUMA-aware分配和亲和性](/administrator-guide/mot-engine/3-concepts-of-mot/3-4.md) - + [MOT索引](/administrator-guide/mot-engine/3-concepts-of-mot/3-5.md) - + [MOT持久性概念](/administrator-guide/mot-engine/3-concepts-of-mot/3-6.md) - + [MOT恢复概念](/administrator-guide/mot-engine/3-concepts-of-mot/3-7.md) - + [MOT查询原生编译(JIT)](/administrator-guide/mot-engine/3-concepts-of-mot/3-8.md) - + [对比:磁盘与MOT](/administrator-guide/mot-engine/3-concepts-of-mot/3-9.md) - + [附录](/administrator-guide/mot-engine/4-appendix/mot-appendix.md) - + [参考文献](/administrator-guide/mot-engine/4-appendix/1-references.md) - + [术语表](/administrator-guide/mot-engine/4-appendix/2-glossary.md) + [列存表管理](/administrator-guide/column-store-tables-management.md) + [备份与恢复](/administrator-guide/backup-and-restoration/backup-and-restoration.md) + [概述](/administrator-guide/backup-and-restoration/backup-and-restoration-overview.md) @@ -571,6 +550,7 @@ + [逻辑解码](/developer-guide/logical-replication/logical-decoding/logical-decoding.md) + [逻辑解码概述](/developer-guide/logical-replication/logical-decoding/1-logical-decoding.md) + [使用SQL函数接口进行逻辑解码](/developer-guide/logical-replication/logical-decoding/2-logical-decoding-by-sql-function-interfaces.md) + + [逻辑解码支持DDL操作](/developer-guide/logical-replication/logical-decoding/logical-decoding-support-for-DDL.md) + [发布订阅](/developer-guide/logical-replication/publication-subscription/publication-subscription.md) + [发布](/developer-guide/logical-replication/publication-subscription/publications.md) + 
[订阅](/developer-guide/logical-replication/publication-subscription/subscriptions.md) @@ -745,7 +725,7 @@ + [配置向量化执行引擎](/performance-tuning/system-tuning/configuring-vector-engine.md) + [配置SMP](/performance-tuning/system-tuning/configuring-smp.md) + [配置LLVM](/performance-tuning/system-tuning/configuring-llvm.md) - + [配置Ustore](/performance-tuning/system-tuning/configuring-ustore.md) + + [In-place Update存储引擎Ustore](/performance-tuning/system-tuning/configuring-ustore.md) + [资源负载管理](/performance-tuning/system-tuning/resource-load-management/resource-load-management.md) + [资源负载管理概述](/performance-tuning/system-tuning/resource-load-management/resource-load-management-overview.md) + [资源管理准备](/performance-tuning/system-tuning/resource-load-management/resource-management-preparations/resource-management-preparations.md) @@ -926,6 +906,7 @@ + [GS_WLM_SESSION_INFO_ALL](./reference-guide/system-catalogs-and-system-views/system-views/GS_WLM_SESSION_INFO_ALL.md) + [GS_WLM_SESSION_STATISTICS](./reference-guide/system-catalogs-and-system-views/system-views/GS_WLM_SESSION_STATISTICS.md) + [GS_WLM_USER_INFO](./reference-guide/system-catalogs-and-system-views/system-views/GS_WLM_USER_INFO.md) + + [IOS_STATUS](./reference-guide/system-catalogs-and-system-views/system-views/IOS_STATUS.md) + [MPP_TABLES](./reference-guide/system-catalogs-and-system-views/system-views/MPP_TABLES.md) + [PG_AVAILABLE_EXTENSION_VERSIONS](./reference-guide/system-catalogs-and-system-views/system-views/PG_AVAILABLE_EXTENSION_VERSIONS.md) + [PG_AVAILABLE_EXTENSIONS](./reference-guide/system-catalogs-and-system-views/system-views/PG_AVAILABLE_EXTENSIONS.md) diff --git a/product/zh/docs-mogdb/v5.0/toc_about.md b/product/zh/docs-mogdb/v5.0/toc_about.md index 02f1b8ce955e8bba17d559441cf84db7aa5aaeb4..cfaa891e66ea266d01f08aacecdfebd97eb3b565 100644 --- a/product/zh/docs-mogdb/v5.0/toc_about.md +++ b/product/zh/docs-mogdb/v5.0/toc_about.md @@ -8,6 +8,7 @@ + [MogDB简介](/overview.md) + [MogDB与openGauss](/about-mogdb/MogDB-compared-to-openGauss.md) + [MogDB发布说明](/about-mogdb/mogdb-new-feature/release-note.md) + + [MogDB 5.0.8](/about-mogdb/mogdb-new-feature/5.0.8.md) + [MogDB 5.0.7](/about-mogdb/mogdb-new-feature/5.0.7.md) + [MogDB 5.0.6](/about-mogdb/mogdb-new-feature/5.0.6.md) + [MogDB 5.0.5](/about-mogdb/mogdb-new-feature/5.0.5.md) diff --git a/product/zh/docs-mogdb/v5.0/toc_characteristic_description.md b/product/zh/docs-mogdb/v5.0/toc_characteristic_description.md index b16c1210d99c8e89dbaecbb2702dd047192df3ab..9ac9a9cee05b3fd9d7cacadf1ebd099caf153a9e 100644 --- a/product/zh/docs-mogdb/v5.0/toc_characteristic_description.md +++ b/product/zh/docs-mogdb/v5.0/toc_characteristic_description.md @@ -29,6 +29,9 @@ + [OCK加速数据传输](/characteristic-description/high-performance/ock-accelerated-data-transmission.md) + [OCK SCRLock加速分布式锁](/characteristic-description/high-performance/ock-scrlock-accelerate-distributed-lock.md) + [日志回放性能增强](/characteristic-description/high-performance/enhancement-of-wal-redo-performance.md) + + [极致刷脏](/characteristic-description/high-performance/enhancement-of-dirty-pages-flushing-performance.md) + + [顺序扫描预读](/characteristic-description/high-performance/seqscan-prefetch.md) + + [Ustore SMP并行扫描](/characteristic-description/high-performance/ustore-smp.md) + [高可用](/characteristic-description/high-availability/high-availability.md) + [主备机](/characteristic-description/high-availability/1-primary-standby.md) + [逻辑复制](/characteristic-description/high-availability/2-logical-replication.md) @@ -97,6 +100,7 @@ + 
[支持在建表后修改表日志属性](/characteristic-description/compatibility/modify-table-log-property.md) + [INSERT支持ON CONFLICT子句](/characteristic-description/compatibility/insert-on-conflict.md) + [支持AUTHID CURRENT_USER](/characteristic-description/compatibility/authid-current-user.md) + + [PBE模式支持存储过程out出参](/characteristic-description/compatibility/stored-procedure-out-parameters-in-pbe-mode.md) + [数据库安全](/characteristic-description/database-security/database-security.md) + [访问控制模型](/characteristic-description/database-security/1-access-control-model.md) + [控制权和访问权分离](/characteristic-description/database-security/2-separation-of-control-and-access-permissions.md) @@ -143,6 +147,7 @@ + [支持裁剪子查询投影列](/characteristic-description/enterprise-level-features/support-for-pruning-subquery-projection-columns.md) + [排序列裁剪](/characteristic-description/enterprise-level-features/pruning-order-by-in-subqueries.md) + [自动创建支持模糊匹配的索引](/characteristic-description/enterprise-level-features/index-support-fuzzy-matching.md) + + [支持指定导入导出五类基本对象](/characteristic-description/enterprise-level-features/import-export-specific-objects.md) + [应用开发接口](/characteristic-description/application-development-interfaces/application-development-interfaces.md) + [支持标准SQL](/characteristic-description/application-development-interfaces/1-standard-sql.md) + [支持标准开发接口](/characteristic-description/application-development-interfaces/2-standard-development-interfaces.md) diff --git a/product/zh/docs-mogdb/v5.0/toc_dev.md b/product/zh/docs-mogdb/v5.0/toc_dev.md index 26e286adac6b0287c96e9bf094f12ec1fc59fb87..84b5fbe8e6c7852a544e8a83bb044bbfd7772c5d 100644 --- a/product/zh/docs-mogdb/v5.0/toc_dev.md +++ b/product/zh/docs-mogdb/v5.0/toc_dev.md @@ -169,6 +169,7 @@ + [逻辑解码](/developer-guide/logical-replication/logical-decoding/logical-decoding.md) + [逻辑解码概述](/developer-guide/logical-replication/logical-decoding/1-logical-decoding.md) + [使用SQL函数接口进行逻辑解码](/developer-guide/logical-replication/logical-decoding/2-logical-decoding-by-sql-function-interfaces.md) + + [逻辑解码支持DDL操作](/developer-guide/logical-replication/logical-decoding/logical-decoding-support-for-DDL.md) + [发布订阅](/developer-guide/logical-replication/publication-subscription/publication-subscription.md) + [发布](/developer-guide/logical-replication/publication-subscription/publications.md) + [订阅](/developer-guide/logical-replication/publication-subscription/subscriptions.md) diff --git a/product/zh/docs-mogdb/v5.0/toc_manage.md b/product/zh/docs-mogdb/v5.0/toc_manage.md index 8ab7dfc54e33d62bcb5f5744b40dc5055c6ccabe..b9afd82e09f51e3611e766c3edba570f2f3b4d3c 100644 --- a/product/zh/docs-mogdb/v5.0/toc_manage.md +++ b/product/zh/docs-mogdb/v5.0/toc_manage.md @@ -25,33 +25,6 @@ + [慢SQL诊断](/administrator-guide/routine-maintenance/slow-sql-diagnosis.md) + [日志参考](/administrator-guide/routine-maintenance/11-log-reference.md) + [主备管理](/administrator-guide/primary-and-standby-management.md) -+ [MOT内存表管理](/administrator-guide/mot-engine/mot-engine.md) - + [MOT介绍](/administrator-guide/mot-engine/1-introducing-mot/introducing-mot.md) - + [MOT简介](/administrator-guide/mot-engine/1-introducing-mot/1-mot-introduction.md) - + [MOT特性及价值](/administrator-guide/mot-engine/1-introducing-mot/2-mot-features-and-benefits.md) - + [MOT关键技术](/administrator-guide/mot-engine/1-introducing-mot/3-mot-key-technologies.md) - + [MOT应用场景](/administrator-guide/mot-engine/1-introducing-mot/4-mot-usage-scenarios.md) - + [MOT性能基准](/administrator-guide/mot-engine/1-introducing-mot/5-mot-performance-benchmarks.md) - + 
[使用MOT](/administrator-guide/mot-engine/2-using-mot/using-mot.md) - + [MOT使用概述](/administrator-guide/mot-engine/2-using-mot/1-using-mot-overview.md) - + [MOT准备](/administrator-guide/mot-engine/2-using-mot/2-mot-preparation.md) - + [MOT部署](/administrator-guide/mot-engine/2-using-mot/3-mot-deployment.md) - + [MOT使用](/administrator-guide/mot-engine/2-using-mot/4-mot-usage.md) - + [MOT管理](/administrator-guide/mot-engine/2-using-mot/5-mot-administration.md) - + [MOT样例TPC-C基准](/administrator-guide/mot-engine/2-using-mot/6-mot-sample-tpcc-benchmark.md) - + [MOT的概念](/administrator-guide/mot-engine/3-concepts-of-mot/concepts-of-mot.md) - + [MOT纵向扩容架构](/administrator-guide/mot-engine/3-concepts-of-mot/3-1.md) - + [MOT并发控制机制](/administrator-guide/mot-engine/3-concepts-of-mot/3-2.md) - + [扩展FDW与其他MogDB特性](/administrator-guide/mot-engine/3-concepts-of-mot/3-3.md) - + [NUMA-aware分配和亲和性](/administrator-guide/mot-engine/3-concepts-of-mot/3-4.md) - + [MOT索引](/administrator-guide/mot-engine/3-concepts-of-mot/3-5.md) - + [MOT持久性概念](/administrator-guide/mot-engine/3-concepts-of-mot/3-6.md) - + [MOT恢复概念](/administrator-guide/mot-engine/3-concepts-of-mot/3-7.md) - + [MOT查询原生编译(JIT)](/administrator-guide/mot-engine/3-concepts-of-mot/3-8.md) - + [对比:磁盘与MOT](/administrator-guide/mot-engine/3-concepts-of-mot/3-9.md) - + [附录](/administrator-guide/mot-engine/4-appendix/mot-appendix.md) - + [参考文献](/administrator-guide/mot-engine/4-appendix/1-references.md) - + [术语表](/administrator-guide/mot-engine/4-appendix/2-glossary.md) + [列存表管理](/administrator-guide/column-store-tables-management.md) + [备份与恢复](/administrator-guide/backup-and-restoration/backup-and-restoration.md) + [概述](/administrator-guide/backup-and-restoration/backup-and-restoration-overview.md) diff --git a/product/zh/docs-mogdb/v5.0/toc_performance.md b/product/zh/docs-mogdb/v5.0/toc_performance.md index e18293a5905fef92ac3b5f049195a5dd4b99d7e7..8ac52080309549ee75d26a34cb57e2923c2e857f 100644 --- a/product/zh/docs-mogdb/v5.0/toc_performance.md +++ b/product/zh/docs-mogdb/v5.0/toc_performance.md @@ -9,7 +9,7 @@ + [配置向量化执行引擎](/performance-tuning/system-tuning/configuring-vector-engine.md) + [配置SMP](/performance-tuning/system-tuning/configuring-smp.md) + [配置LLVM](/performance-tuning/system-tuning/configuring-llvm.md) - + [配置Ustore](/performance-tuning/system-tuning/configuring-ustore.md) + + [In-place Update存储引擎Ustore](/performance-tuning/system-tuning/configuring-ustore.md) + [资源负载管理](/performance-tuning/system-tuning/resource-load-management/resource-load-management.md) + [资源负载管理概述](/performance-tuning/system-tuning/resource-load-management/resource-load-management-overview.md) + [资源管理准备](/performance-tuning/system-tuning/resource-load-management/resource-management-preparations/resource-management-preparations.md) diff --git a/product/zh/docs-mogdb/v5.0/toc_system-catalogs-and-functions.md b/product/zh/docs-mogdb/v5.0/toc_system-catalogs-and-functions.md index 8fa91b31dd50346670253d5075c3f90eb4c683e4..0db02071f6172bf7f7797d36bea2e0341e663de6 100644 --- a/product/zh/docs-mogdb/v5.0/toc_system-catalogs-and-functions.md +++ b/product/zh/docs-mogdb/v5.0/toc_system-catalogs-and-functions.md @@ -162,6 +162,7 @@ + [GS_WLM_SESSION_INFO_ALL](./reference-guide/system-catalogs-and-system-views/system-views/GS_WLM_SESSION_INFO_ALL.md) + [GS_WLM_SESSION_STATISTICS](./reference-guide/system-catalogs-and-system-views/system-views/GS_WLM_SESSION_STATISTICS.md) + [GS_WLM_USER_INFO](./reference-guide/system-catalogs-and-system-views/system-views/GS_WLM_USER_INFO.md) + + 
[IOS_STATUS](./reference-guide/system-catalogs-and-system-views/system-views/IOS_STATUS.md) + [MPP_TABLES](./reference-guide/system-catalogs-and-system-views/system-views/MPP_TABLES.md) + [PG_AVAILABLE_EXTENSION_VERSIONS](./reference-guide/system-catalogs-and-system-views/system-views/PG_AVAILABLE_EXTENSION_VERSIONS.md) + [PG_AVAILABLE_EXTENSIONS](./reference-guide/system-catalogs-and-system-views/system-views/PG_AVAILABLE_EXTENSIONS.md) diff --git a/product/zh/docs-mogdb/v5.2/reference-guide/guc-parameters/query-planning/other-optimizer-options.md b/product/zh/docs-mogdb/v5.2/reference-guide/guc-parameters/query-planning/other-optimizer-options.md index 6aa5fd9cdc5d8b6a7dcda9c9450d1d3d6d28ecf9..914ca0bf8349e1e03173d000e1382ecda11f4c0e 100644 --- a/product/zh/docs-mogdb/v5.2/reference-guide/guc-parameters/query-planning/other-optimizer-options.md +++ b/product/zh/docs-mogdb/v5.2/reference-guide/guc-parameters/query-planning/other-optimizer-options.md @@ -300,7 +300,7 @@ set rewrite_rule = none; --关闭所有可选查询重写规则 - on表示使用。 - off表示不使用。 -**默认值**: off +**默认值**: on (使用PTK安装MogDB会对此参数进行优化,优化后默认值为off) ## enable_partition_opfusion diff --git a/product/zh/docs-mogdb/v6.0/reference-guide/guc-parameters/query-planning/other-optimizer-options.md b/product/zh/docs-mogdb/v6.0/reference-guide/guc-parameters/query-planning/other-optimizer-options.md index 4fcad306cd3d5eca17effd4c3829c1ab40c9eca1..ff0df877cf98f3409dced6c324c1b6ffcf8c787c 100644 --- a/product/zh/docs-mogdb/v6.0/reference-guide/guc-parameters/query-planning/other-optimizer-options.md +++ b/product/zh/docs-mogdb/v6.0/reference-guide/guc-parameters/query-planning/other-optimizer-options.md @@ -187,7 +187,7 @@ set rewrite_rule = none; --关闭所有可选查询重写规则 - intargetlist:使用In Target List查询重写规则(提升目标列中的子查询)。 - predpushnormal:使用Predicate Push查询重写规则(下推谓词条件到子查询中)。 - predpushforce:使用Predicate Push查询重写规则(下推谓词条件到子查询中,尽可能的利用索引加速)。 -- predpush:在predpushnormal和predpushforce中根据代价选择最优计划。 +- predpush:在predpushnormal和predpushforce中根据代价选择最优计划。**注**:predpush类的重写规则在极个别场景会导致无法生成合法的计划,在启用参数前建议进行充分测试。 - reduce_orderby:使用reduce orderby查询重写规则(删除子查询中不必要的排序)。 **默认值**: magicset, reduce_orderby @@ -315,7 +315,7 @@ set rewrite_rule = none; --关闭所有可选查询重写规则 - on表示使用。 - off表示不使用。 -**默认值**: off +**默认值**: on (使用PTK安装MogDB会对此参数进行优化,优化后默认值为off) ## enable_partition_opfusion