diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/MogDB-compared-to-openGauss.md b/product/en/docs-mogdb/v3.0/about-mogdb/MogDB-compared-to-openGauss.md new file mode 100644 index 0000000000000000000000000000000000000000..845973ceef44c8e84eafb45619cc7c370542f09d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/MogDB-compared-to-openGauss.md @@ -0,0 +1,124 @@ +--- +title: Comparison Between MogDB and openGauss +summary: Comparison Between MogDB and openGauss +author: Guo Huan +date: 2021-04-02 +--- + +# Comparison Between MogDB and openGauss + +## Relationship Between MogDB and openGauss + +MogDB is an enterprise database that is packaged and improved on the basis of openGauss open source kernel and it is more friendly to enterprise applications. On the basis of openGauss kernel, MogDB adds the MogHA component for automatic management and high availability under the primary-standby architecture, which is crucial for enterprise applications. At the same time, MogDB Manager includes backup and recovery, monitoring, automatic installation and other components for enterprise-level ease of use requirements. + +MogDB is a commercial product that is sold according to an established license pricing system and is supported by Enmo's professional services. + +## Introduction to openGauss + +openGauss is an open source relational database management system. The kernel of openGauss is derived from PostgreSQL and is distributed under Mulan PSL v2. openGauss kernel is open source, anyone and any organization can download the source code for compilation and installation without any cost; The openGauss community also regularly releases compiled binary installation files, and the current release strategy is to release one stable frequently supported version per year (end of March each year) and one radical version with new features (end of September each year). + +openGauss is a standalone database. It has the basic features of relational databases as well as enhanced features. + +For more details, please visit openGauss official website: + +### Basic Features + +- Standard SQLs + + Supports SQL92, SQL99, SQL2003, and SQL2011 standards, GBK and UTF-8 character sets, SQL standard functions and analytic functions, and SQL Procedural Language. + +- Database storage management + + Supports tablespaces where different tables can be stored in different locations. + +- Primary/standby deployment + + Supports the ACID properties, single-node fault recoveries, primary/standby data synchronization, and primary/standby switchover. + +- APIs + + Supports standard JDBC 4.0 and ODBC 3.5. + +- Management tools + + Provides installation and deployment tools, instance start and stop tools, and backup and restoration tools. + +- Security management + + Supports SSL network connections, user permission management, password management, security auditing, and other functions, to ensure data security at the management, application, system, and network layers. + +### Enhanced Features + +- Data Partitioning + + Data partitioning is a general function for most database products. In openGauss, data is partitioned horizontally with a user-specified policy. This operation splits a table into multiple partitions that are not overlapped. + +- Vectorized Executor and Hybrid Row-Column Storage Engine + + In a wide table containing a huge amount of data, a query usually only involves certain columns. In this case, the query performance of the row-store engine is poor. 
For example, a single table containing the data of a meteorological agency has 200 to 800 columns. Among these columns, only ten of them are frequently accessed. In this case, a vectorized executor and column-store engine can significantly improve performance by saving storage space. + +- High Availability (HA) Transaction Processing + + openGauss manages transactions and guarantees the ACID properties. openGauss provides a primary/standby HA mechanism to reduce the service interruption time when the primary node is faulty. It protects key user programs and continuously provides external services, minimizing the impact of hardware, software, and human faults on services to ensure service continuity. + +- High Concurrency and High Performance + + openGauss supports 10,000 concurrent connections through server thread pools. It supports thread nucleophilicity and millions of tpmC using the NUMA-based kernel data structure, manages TB-level large memory buffers through efficient hot and cold data elimination, achieves multi-version access without read/write blocks using CSN-based snapshots, and avoids performance fluctuation caused by full-page writes using incremental checkpoints. + +- SQL Self-Diagnosis + + To locate performance issues of a query, you can use **EXPLAIN PERFORMANCE** to query its execution plan. However, this method produces many logs, requires to modify service logic, and depends on expertise to locate problems. SQL self-diagnosis enables users to locate performance issues more efficiently. + +- Equality Query in a Fully-encrypted Database + + The encrypted database allows the client to encrypt sensitive data within the client application. During the query period, the entire service data flow exists in the form of ciphertext during data processing. It has the following advantages: + + - Protects data privacy and security throughout the lifecycle on the cloud. + - Resolves trust issues by making the public cloud, consumer cloud, and development users keep their own keys. + - Enables partners to better comply with personal privacy protection laws and regulations with the help of the full encryption capability. + +- Memory Table + + With memory tables, all data access is lock-free and concurrent, optimizing data processing and meeting real-time requirements. + +- Primary/Standby Deployment + + The primary/standby deployment mode supports synchronous and asynchronous replication. Applications are deployed based on service scenarios. For synchronous replication, one primary node and two standby nodes are deployed. This ensures reliability but affects performance. For asynchronous replication, one primary node and one standby node are deployed. This has little impact on performance, but data may be lost when exceptions occur. openGauss supports automatic recovery of damaged pages. When a page on the primary node is damaged, the damaged page can be automatically recovered on the standby node. Besides, openGauss supports concurrent log recovery on the standby node to minimize the service unavailability time when the primary node is down. + + In addition, in primary/standby deployment mode, if the read function of the standby node is enabled, the standby node supports read operations instead of write operations (such as table creation, data insertion, and data deletion), reducing the pressure on the primary node. 
+ +- AI Capabilities + + - Automatic parameter optimization + - Slow SQL discovery + - Index recommendation + - Time series prediction and exception detection + - DB4AI function + - SQL execution time prediction + - Database monitoring + +- Logical Log Replication + + In logical replication, the primary database is called the source database, and the standby database is called the target database. The source database parses the WAL file based on the specified logical parsing rules and parses the DML operations into certain logical change information (standard SQL statements). The source database sends standard SQL statements to the target database. After receiving the SQL statements, the target database applies them to implement data synchronization. Logical replication involves only DML operations. Logical replication can implement cross-version replication, heterogeneous database replication, dual-write database replication, and table-level replication. + +- Automatic WDR Performance Analysis Report + + Periodically and proactively analyzes run logs and WDR reports (which are automatically generated in the background and can be triggered by key indicator thresholds such as the CPU usage, memory usage, and long SQL statement proportions) and generates reports in HTML and PDF formats. The performance report can be automatically generated. The WDR generates a performance report between two different time points based on the system performance snapshot data at two different time points. The report is used to diagnose database kernel performance faults. + +- Incremental Backup and Restoration (beta) + + Supports full backup and incremental backup of the database, manages backup data, and views the backup status. Supports combination of incremental backups and deletion of expired backups. The database server dynamically tracks page changes, and when a relational page is updated, the page is marked for backup. The incremental backup function requires that the GUC parameter enable_cbm_tracking be enabled to allow the server to track the modification page. + +- Point-In-Time Recovery + + Point-in-time recovery (PITR) uses basic hot backup, write-ahead logs (WALs), and archived WALs for backup and recovery. Replaying a WAL record can be stopped at any point of time, so that there is a consistent snapshot of the database at any point of time. That is, you can restore the database to the state at any time since the backup starts. During recovery, you can specify a recovery stop point with a terminal ID (TID), time, and license serial number (LSN). + +## Advantages of MogDB + +openGauss is a standalone database where data is stored on a single physical node and data access tasks are pushed to service nodes. In this way, high concurrency of servers enables quick data processing. In addition, data can be copied to the standby server through log replication, ensuring high reliability and scalability. + +openGauss is a stand-alone database. To use openGauss in formal commercial projects, you need to build complete tool chain capabilities such as database monitoring and primary/standby switchover. + +At the product level, MogDB adds MogHA enterprise-class high availability components and feature-rich graphical management tool MogDB Manager to the original functions of openGauss, and continuously enhances the openGauss kernel along the established route. MogDB can maximize the high-availability deployment capabilities of multiple machine rooms, and can reach 2.5 million tpmC on 4-CPU server. 
MogDB Manager contains a variety of practical components, such as MTK database migration, MIT performance monitoring, RWT performance pressure testing, PTK automated deployment, etc., which greatly makes up the shortcomings of openGauss open source database and enriches various enterprise-class functions. + +At the service level, Enmo has decades of experience in database operation and maintenance, and can provide complete services to ensure a more stable database, smoother application transformation, and less risk, making up for the disadvantages of openGauss open source database human operation and maintenance shortage, while reducing maintenance costs. diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/mogdb-release-notes.md b/product/en/docs-mogdb/v3.0/about-mogdb/mogdb-release-notes.md new file mode 100644 index 0000000000000000000000000000000000000000..dfcd9a086bb0b1108b178ffb6a9e33cb6b74515e --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/mogdb-release-notes.md @@ -0,0 +1,355 @@ +--- +title: MogDB 2.1 Release Notes +summary: MogDB 2.1 Release Notes +author: Guo Huan +date: 2021-12-06 +--- + +# MogDB 2.1 Release Notes + +## 1. Version Description + +MogDB version 2.1 is further enhanced based on MogDB version 2.0 and incorporates the new features of openGauss 2.1.0. + +> Note: MogDB 2.1 is the Preview version, and the life cycle of this version is half a year. + +## 2. New Features + +### 2.1 Incorporate new features of openGauss 2.1.0 + +- The stored procedure compatibility is enhanced. +- The SQL engine capability is enhanced. +- The Ustore storage engine is supported. +- Segment-page storage is supported. +- High availability is based on the Paxos distributed consistency protocol. +- AI4DB and DB4AI competitiveness is continuously built. +- The log framework and error codes are modified. +- JDBC client load is balanced and read and write are isolated. +- The CMake script compilation is supported. +- The column-store table supports the primary key constraint and unique key constraint. +- The jsonb data type is supported. +- Automatic elimination of unique SQL statements is supported. +- The UCE fault detection is supported. +- The GB18030 character set is supported. +- The standby server catch is optimized. +- The client tool gsql supports automatic supplement of the readline command. +- The dynamic data masking is supported. +- The State Cryptography Administration (SCA) algorithms are supported. +- The tamper-proof ledger database is supported. +- The built-in role and permission management mechanism is supported. +- The transparent encryption is supported. +- The fully-encrypted database is enhanced. +- The dblink is supported. +- The Ubuntu system is supported. +- The hash index is supported. +- UPSERT supports subqueries. +- The MIN/MAX function supports the IP address type. +- The array_remove, array_replace, first, and last functions are added. +- The Data Studio client tool adapts the kernel features. + +### 2.2 Performance Optimization for x86 Architecture + +Optimize the multi-core performance on x86 architecture. The performance of TPC-C under high concurrency is 1.5-5 times that of PostgreSQL 14. The main optimization points are: + +- Support NUMA binding +- Unlocked WAL +- Cache friendly data structure + +### 2.3 Create and Rebuild Indexes Concurrently + +Supports specifying the CONCURRENTLY option when executing create index and reindex index to create and rebuild indexes without blocking the execution of DML statements, improving index maintainability. 
Supports creating and rebuilding of indexes on ordinary tables and global indexes on partitioned tables concurrently. + +Compared with ordinary index creation and rebuilding, creating and rebuilding concurrently may take longer to complete. + +Indexes on column-store tables, local indexes on partitioned tables, and indexes on temporary tables do not support concurrent index creation and rebuilding. + +**Related Topics** + +- [CREATE INDEX](50-CREATE-INDEX#CONCURRENTLY) +- [REINDEX](117-REINDEX#CONCURRENTLY) + +### 2.4 Enhanced Oracle compatibility + +#### 2.4.1 Support for Orafce plugin + +> Note: Users need to download the plugin package and install it manually. + +By integrating the orafce plugin, the following Oracle compatible syntax is supported: + +- SQL Queries + - DUAL table +- SQL Functions + - Mathematical functions + - BITAND + - COSH + - SINH + - TANH + - String functions + - INSTR + - LENGTH + - LENGTHB + - LPAD + - LTRIM + - NLSSORT + - REGEXP_COUNT + - REGEXP_INSTR + - REGEXP_LIKE + - REGEXP_SUBSTR + - REGEXP_REPLACE + - RPAD + - RTRIM + - SUBSTR + - SUBSTRB + - Date/time functions + - ADD_MONTHS + - DBTIMEZONE + - LAST_DAY + - MONTHS_BETWEEN + - NEXT_DAY + - ROUND + - SESSIONTIMEZONE + - SYSDATE + - TRUNC + - Data type formatting functions + - TO_CHAR + - TO_DATE + - TO_MULTI_BYTE + - TO_NUMBER + - TO_SINGLE_BYTE + - Conditional expressions + - DECODE + - LNNVL + - NANVL + - NVL + - NVL2 + - Aggregate functions + - LISTAGG + - MEDIAN + - Functions that return internal information + - DUMP +- SQL Operators + - Datetime operator +- Packages + - DBMS_ALERT + - DBMS_ASSERT + - DBMS_OUTPUT + - DBMS_PIPE + - DBMS_RANDOM + - DBMS_UTILITY + - UTL_FILE + +**Related Topics** + +- [orafce](orafce-user-guide) + +#### 2.4.2 Support CONNECT BY Syntax + +Provide Oracle-compatible **connect by** syntax, implement level data query control, and display levels, loops, starting levels, etc. + +Provides an oracle-compatible level query function, which can display data content, data levels, paths, etc. in a tree-like structure according to the specified connection relationship, starting conditions, etc. + +Specify the root row of the level query through the start with condition, and perform a recursive query based on these rows to obtain all sub-rows, sub-rows of sub-rows, etc. + +The relationship between the parent row and the child row between the levels is specified by the connect by condition to determine all the child rows of each row that meet the condition. + +If there is a connection, whether it is a connection statement, or in the from or where clause, the result set after the connection is obtained first, and then the level query is performed. + +If there is a where filter condition in the statement, execute the level query first and then filter the result set, instead of filtering out unsatisfied rows and all its sub-rows. + +You can view the level of the row through the level pseudo column, **sys_connect_by_path** to view the path from the root row to the row, and **connect_by_root** to view auxiliary functions such as the root row. + +**Related Topics** + +- [CONNECT BY](139-CONNECT-BY) + +#### 2.4.3 Updatable View + +Supports updatable views. Users can perform Insert/Update/Delete operations on the view, and the update operation will directly affect the base table corresponding to the view. + +Not all views can be updated. 
There must be a one-to-one correspondence between the rows in the view and the rows in the base table, that is, the content of the view cannot be created based on aggregates or window functions. + +For a view connected by multiple tables, if the primary key (unique key) of a base table can be used as the primary key (unique key) of the view, the view also supports updating, and the update result applies to the base table from which the primary key is derived. + +**Related Topics** + +- [Updatable-views Supported](overview-of-system-catalogs-and-system-views#updatable-views-supported) + +#### 2.4.4 Alter Columns When Rebuilding View + +When the view is rebuilt, it supports the operations of reducing columns and changing column names. This command is only valid for non-materialized views. + +**Related Topics** + +- [CREATE VIEW](70-CREATE-VIEW#replace) + +#### 2.4.5 Support systimestamp Function + +Returns the current system date and time of the server where the database is located, as well as time zone information. + +**Related Topics** + +- [Date and Time Processing Functions and Operators](8-date-and-time-processing-functions-and-operators#systimestamp) + +#### 2.4.6 Support sys_guid Function + +The system generates and returns a 16-byte globally unique identifier based on the current time and machine code. + +**Related Topics** + +- [System Information Functions](23-system-information-functions#sys_guid) + +### 2.5 Support PostgreSQL Plugins + +> Note: Users need to download the plugin package and install it manually. + +- [pg_repack](pg_repack-user-guide): Through the trigger mechanism, it provides the function of rebuilding the table online, which is mainly used to reduce the size of the free space in the table online. +- [wal2json](wal2json-user-guide): Through the logical replication mechanism, continuous data changes are provided in the form of json, which are mainly used for heterogeneous replication and other situations. +- [pg_trgm](pg_trgm-user-guide): Implement the trgm word segmentation algorithm to achieve better full-text retrieval capabilities. +- [pg_prewarm](pg_prewarm-user-guide): Pre-cache the specified data table in shared memory to speed up data access. +- [pg_bulkload](pg_bulkload-user-guide): The data is directly loaded into the data file without going through the shared memory, which speeds up the batch import of the database. + +### 2.6 Support Read Extensibility + +> Note: Comes with ShardingSphere 5.1.0 and later versions, which need to be downloaded and installed manually by the user. + +MogDB supports read extensibility by integrating with ShardingSphere's Proxy. + +- Read and write transactions are automatically routed to the primary library for execution, and read-only transactions are automatically routed to the backup library for execution; in scenarios with higher read consistency requirements, read-only transactions can also be routed to the primary library for execution through hint control. + +- Support for automatic identification and configuration of read and write nodes, without the need to configure primary and secondary roles, and automatic discovery of the primary and secondary libraries in the configuration list. + +- Support for automatic identification of primary and backup roles after switching, with no additional operation required to automatically identify the new primary and backup roles and route them correctly. 
+ +- Support automatic load balancing of backup nodes: when the backup library is down and recovered or when a new backup library is added, it will be automatically added to the read load balancing after the replication status of the backup library is normal. + +### 2.7 Others + +- The nlssort function supports sorting by pinyin for the GBK character set of rare characters + + **Related Topics**: [SELECT](125-SELECT#nlssort) + +- ALTER SEQUENCE supports modification of increment + + **Related Topics**: [ALTER SEQUENCE](16-ALTER-SEQUENCE#increment) + +- For TIMESTAMP WITH TIME ZONE type, you can use TZH, TZM, TZD, TZR parameters in TO_CHAR to output time zone information + + **Related Topics**: [Type Conversion Functions](9-type-conversion-functions#to_char) + +### 2.8 Preview Features + +> Note: Preview features need to be enabled manually. +> +> ```sql +> alter system set enable_poc_feature = on; +> -- or +> alter system set enable_poc_feature to on; +> -- Or add ‘enable_poc_feature = on’ to the postgresql.conf file in the MogDB data directory +> -- Take effect after restart +> ``` + +#### 2.8.1 Row-store Table Compression + +Supports specifying whether a row-store table (astore) is a compressed table when it is created. For a compressed row-store table, the system compresses the table data automatically to save storage space. When writing data to the compressed table, the system automatically selects the appropriate compression algorithm according to the characteristics of each column, and the user can also specify the compression algorithm used for each column directly. + +There is a strong correlation between the actual compression ratio and the data content, and the compression ratio can reach 50% in the typical scenario, and the performance loss is less than 5% in the typical TPC-C model, the actual performance impact depends on the actual system load. + +For the non-compressed table, you can also use `Alter Table` to change the table to a compressed table, subsequent new write data will be automatically compressed. + +**Related Topics** + +- [CREATE TABLE](60-CREATE-TABLE#COMPRESSION) +- [ALTER TABLE](22-ALTER-TABLE#COMPRESS) + +#### 2.8.2 SubPartition + +Support to create subpartition table, data automatically partition storage according to the partition mode, to improve the storage and query efficiency of large data volumes. The supported subpartition combinations include: + +- List-List +- List-Range +- List-Hash +- Range-List +- Range-Range +- Range-Hash + +Support querying a single Partition and SubPartition; + +Supports partition pruning for Partition Key, SubPartition Key or their combined conditions to further optimize partition query efficiency; + +Supports truncate and vacuum operations on partition tables or first-level partitions; + +During Update operation, data movement across partitions is supported (Partition/SubPartition Key is not supported as List or Hash partition type); + +Backup and restore of subpartition are supported. + +**Related Topics** + +- [CREATE TABLE SUBPARTITION](62.1-CREATE-TABLE-SUBPARTITION) +- [ALTER TABLE SUBPARTITION](23.1-ALTER-TABLE-SUBPARTITION) + +
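To make the subpartition syntax concrete, the sketch below creates a Range-List subpartitioned table and queries a single subpartition. This is only a minimal illustration and assumes the preview switch (`enable_poc_feature`) described at the beginning of this section has been enabled; the table, column, and partition names are hypothetical, and the authoritative syntax is in the CREATE TABLE SUBPARTITION reference linked above.

```sql
-- Minimal Range-List subpartition sketch (all names are hypothetical).
CREATE TABLE sales_range_list
(
    sale_month  VARCHAR2(6)  NOT NULL,
    dept_code   VARCHAR2(30) NOT NULL,
    sales_amt   integer
)
PARTITION BY RANGE (sale_month) SUBPARTITION BY LIST (dept_code)
(
    PARTITION p_202201 VALUES LESS THAN ('202202')
    (
        SUBPARTITION p_202201_a VALUES ('1'),
        SUBPARTITION p_202201_b VALUES ('2')
    ),
    PARTITION p_202202 VALUES LESS THAN ('202203')
    (
        SUBPARTITION p_202202_a VALUES ('1'),
        SUBPARTITION p_202202_b VALUES ('2')
    )
);

-- Query a single subpartition, as described above.
SELECT * FROM sales_range_list SUBPARTITION (p_202201_a);
```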
+ +## 3. Modified Defects + +### 3.1 Incorporate openGauss 2.1.0 Modified Defects + +- [I435UP](https://gitee.com/opengauss/openGauss-server/issues/I435UP) An error is reported when the EXPLAIN statement is executed. +- [I44QS6](https://gitee.com/opengauss/openGauss-server/issues/I44QS6) When the **select get_local_active_session() limit 1 ;** function is executed, the database breaks down. +- [I4566H](https://gitee.com/opengauss/openGauss-server/issues/I4566H) After UPDATE GLOBAL INDEX is performed on a partition of a partitioned table, the query result is inconsistent with the master version. +- [I45822](https://gitee.com/opengauss/openGauss-server/issues/I45822) An error occurs when the GPC global plan cache information is queried in the global temporary table. +- [I442TY](https://gitee.com/opengauss/openGauss-server/issues/I442TY) Failed to recover to the timestamp specified by PITR. +- [I45T7A](https://gitee.com/opengauss/openGauss-server/issues/I45T7A) Remote backup is abnormal when the database is installed in environment variable separation mode. +- [I464G5](https://gitee.com/opengauss/openGauss-server/issues/I464G5) Failed to use **gs_ctl build** to rebuild a specified non-instance directory on a standby node. The error information is inconsistent. +- [I45TTB](https://gitee.com/opengauss/openGauss-server/issues/I45TTB) The foreign table is successfully created for the file type that is not supported by file_fdw, but no error is reported. +- [I491CN](https://gitee.com/opengauss/openGauss-server/issues/I491CN) When the subnet mask of the network address of the cidr type is 32, an error is reported when the MAX function is called. +- [I496VN](https://gitee.com/opengauss/openGauss-server/issues/I496VN) After a large number of Xlogs are stacked on the standby node, the archiving address is corrected. As a result, the archiving fails. +- [I49HRV](https://gitee.com/opengauss/openGauss-server/issues/I49HRV) When the standby node archiving is enabled, the standby node archiving is slow. After the switchover, the new primary node is abnormal. +- [I492W4](https://gitee.com/opengauss/openGauss-server/issues/I492W4) When operations related to the mysql_fdw and oracle_fdw foreign tables are performed on the database installed using the OM, a core dump occurs in the database. +- [I498QT](https://gitee.com/opengauss/openGauss-server/issues/I498QT) In the maximum availability mode, when the synchronous standby parameter is ANY2 and the primary server is under continuous pressure, running the **kill-9** command to stop one synchronous standby server causes transaction congestion on the primary server for 2s. +- [I49L15](https://gitee.com/opengauss/openGauss-server/issues/I49L15) Two standby nodes are enabled for archiving. After one node is scaled in and out, the archiving of the other node is abnormal. +- [I43MTG](https://gitee.com/opengauss/openGauss-server/issues/I43MTG) The developer guide does not contain information related to new functions. +- [I42YW8](https://gitee.com/opengauss/openGauss-server/issues/I42YW8) The UPSERT subquery information is not supplemented. +- [I45WDH](https://gitee.com/opengauss/openGauss-server/issues/I45WDH) file_fdw does not support the fixed format. The related description needs to be deleted from the developer guide. +- [I484J0](https://gitee.com/opengauss/openGauss-server/issues/I484J0) The **gs_initdb -T** parameter is not verified, and the value is incorrect after being set according to the guide. 
+- [I471CS](https://gitee.com/opengauss/openGauss-server/issues/I471CS) When **pgxc_node_name** contains hyphens (-), the database exits abnormally. If residual temporary tables are not cleared, automatic clearance and vacuum cannot be performed. +- [I40QM1](https://gitee.com/opengauss/openGauss-server/issues/I40QM1) When gs_basebackup is executed, an exception occurs on the standby node. As a result, the gs_basebackup process is blocked and cannot exit. +- [I3RTQK](https://gitee.com/opengauss/openGauss-server/issues/I3RTQK) The standby node fails to be backed up using gs_basebackup, and the message "could not fetch mot checkpoint info:, status:7" is displayed. + +### 3.2 MogDB 2.1.0 Modified Defects + +- It prompts **which doesn't support recovery_target_lsn** when **gs_probackup** restores the database + +- **Statement_history** table cannot be cleaned up + +- Abnormal database downtime caused by schema cascade delete operation + +- **\d** in gsql cannot query the field information of the table or view corresponding to the synonym + +- The **lengthb** function does not support large object fields such as blob + +- After enabling sha256 authentication, the original md5 encrypted users can still successfully login through md5 authentication + +- The output of **raise** inside nested stored procedures in MogDB is too detailed + +### MogDB 2.1.1 Modified Defects + +MogDB 2.1.1 is the patch version of MogDB 2.1.0, released on 2022.03.22. Based on MogDB 2.1.0, the following fixes are made: + +- Fixed the defect of coredump caused by parameter overflow in **pg_encoding_to_char()** function + +- Fixed the defect of coredump generated when **connect by** statement is used as query clause + +- Fixed the bug that the order of query data in the connect by statement order by level is inconsistent on the x86 platform + +## 4. Compatibility + +This version supports the following operating system and CPU architecture combinations. + +| OS | CPU Architecture | +| ------------------ | -------------------------------------------------- | +| CentOS 7.x | X86_64 (Intel, AMD, Hygon, ZHAOXIN) | +| Redhat 7.x | X86_64 (Intel, AMD, Hygon, ZHAOXIN) | +| openEuler 20.03LTS | ARM (Kunpeng), X86_64 (Intel, AMD, Hygon, ZHAOXIN) | +| Kylin V10 | ARM (Kunpeng), X86_64 (Intel, AMD, Hygon, ZHAOXIN) | +| UOS V20-D / V20-E | ARM (Kunpeng), X86_64 (Intel, AMD, Hygon, ZHAOXIN) | +| UOS V20-A | X86_64 (Intel, AMD, Hygon, ZHAOXIN) | diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/1-enhanced-opengauss-kernel.md b/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/1-enhanced-opengauss-kernel.md new file mode 100644 index 0000000000000000000000000000000000000000..ac7a28ef8cfbb457901397299607e0261a0968bd --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/1-enhanced-opengauss-kernel.md @@ -0,0 +1,12 @@ +--- +title: Enhanced openGauss Kernel +summary: Enhanced openGauss Kernel +author: Liuxu +date: 2021-06-09 +--- + +# Enhanced openGauss Kernel + +
+ +MogDB is an enterprise-level database which is more friendly to enterprise applications, encapsulated and improved on the basis of openGauss open source kernel. Based on the openGauss kernel, MogDB adds MogHA, TimeSeries (for IoT scenarios), Distributed (realize each data node as an openGauss kernel distributed environment), Oracle view compatibility (DBA and V$ views) and other plug-ins. The MogHA plug-in is used for automated HA management in the primary/standby architecture, which is crucial for enterprise applications. At the same time, MogDB Manager management software is developed, which includes backup and recovery, performance monitoring, health check toolkit, automated deployment, database migration, data synchronization, compatibility analysis, performance pressure test and data recovery from a corrupted database for a variety of enterprise-level usability requirements. diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/2-docker-based-mogdb.md b/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/2-docker-based-mogdb.md new file mode 100644 index 0000000000000000000000000000000000000000..97d398f1d05f9c8608670dd6a97a99867621bc56 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/2-docker-based-mogdb.md @@ -0,0 +1,37 @@ +--- +title: Container-based MogDB +summary: Container-based MogDB +author: Liuxu +date: 2021-06-09 +--- + +# Container-based MogDB + +
## Features

- Images for new MogDB versions are released promptly as MogDB evolves.
- The container image ships with initialization parameters pre-configured according to best practices.
- The container version of the database supports both x86 and ARM CPU architectures.
- The MogDB 2.1 container version supports the latest version of compat-tools and the plugin functions.

**Currently, x86-64 and ARM64 architectures are supported. Please choose the container image that matches the machine architecture of your host (see the quick check after the version lists below).**

Version 2.0 and later

- MogDB for the x86-64 architecture runs on the [Ubuntu 18.04 operating system](https://ubuntu.com/).
- MogDB for the ARM64 architecture runs on the [Debian 10 operating system](https://www.debian.org/).

Version 1.1.0 and earlier

- MogDB for the x86-64 architecture runs on the [CentOS 7.6 operating system](https://www.centos.org/).
- MogDB for the ARM64 architecture runs on the [openEuler 20.03 LTS operating system](https://openeuler.org/en/).

<br/>
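To decide which image to pull, check the host CPU architecture first; `uname` is a generic Linux command and involves nothing MogDB-specific.

```bash
# Print the host CPU architecture:
#   x86_64  -> use the x86-64 image
#   aarch64 -> use the ARM64 image
uname -m
```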
+ +## How to Use an Image? + +For details, visit the following website: + +[Installation Guide - Container-based Installation](docker-installation) diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/5-wal2json-extention-for-mogdb&opengauss.md b/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/5-wal2json-extention-for-mogdb&opengauss.md new file mode 100644 index 0000000000000000000000000000000000000000..88e2d0aa7a814ae9bebad1f11f63c1047674a9bd --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/5-wal2json-extention-for-mogdb&opengauss.md @@ -0,0 +1,125 @@ +--- +title: wal2json Extention for MogDB&openGauss +summary: wal2json Extention for MogDB&openGauss +author: Guo Huan +date: 2021-06-02 +--- + +# wal2json Extention for MogDB&openGauss + +## How to Get the Component + + + +
+ +## Logical Decoding (wal2json) + +With the wal2json plug-in of MogDB and openGauss, you can export logical log files in the JSON format. + +
+ +## Prerequisites + +You have set **wal_level** to **logical**. + +
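As a sketch of how this prerequisite is typically satisfied, the commands below set the parameter with the standard gs_guc tool and then verify it from gsql. The data directory path is a placeholder; because **wal_level** only takes effect after a restart, the instance must be restarted once the value is changed.

```bash
# Set wal_level to logical (the data directory is a placeholder; adjust it).
gs_guc set -D /opt/mogdb/data -c "wal_level = logical"

# Restart the instance so that the new value takes effect.
gs_ctl restart -D /opt/mogdb/data
```

```sql
-- Verify the current value from a gsql session:
SHOW wal_level;
```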
## Background Information

wal2json is a logical decoding plug-in. It parses the WAL and provides access to the tuples produced by INSERT, UPDATE, and DELETE operations.

wal2json generates one JSON object per transaction. The JSON object provides the new and old tuples as well as optional properties such as the transaction timestamp, qualified schema, data types, and transaction ID.

<br/>
+ +## Obtaining JSON Objects Using SQL + +1. Log in to the MogDB database. + +2. Run the following commands to create tables and initialize the plug-in. + + ```sql + --Open a new session and run the following: + pg_recvlogical -d mogdb --slot test_slot --create-slot -P wal2json + pg_recvlogical -d mogdb --slot test_slot --start -o pretty-print=1 -f - + --Perform the following basic DML operations: + CREATE TABLE test_table ( + id char(10) NOT NULL, + code char(10), + PRIMARY KEY (id) + ); + mogdb=# INSERT INTO test_table (id, code) VALUES('id1', 'code1'); + INSERT 0 1 + mogdb=# update test_table set code='code2' where id='id1'; + UPDATE 1 + mogdb=# delete from test_table where id='id1'; + DELETE 1 + ``` + + DML output: + + **INSERT** + + ```json + { + "change": [ + { + "kind": "insert", + "schema": "mdmv2", + "table": "test_table", + "columnnames": ["id", "code"], + "columntypes": ["character(10)", "character(10)"], + "columnvalues": ["id1 ", "code1 "] + } + ] + } + ``` + + **UPDATE** + + ```json + { + "change": [ + { + "kind": "update", + "schema": "mdmv2", + "table": "test_table", + "columnnames": ["id", "code"], + "columntypes": ["character(10)", "character(10)"], + "columnvalues": ["id1 ", "code2 "], + "oldkeys": { + "keynames": ["id"], + "keytypes": ["character(10)"], + "keyvalues": ["id1 "] + } + } + ] + } + ``` + + **DELETE** + + ```json + { + "change": [ + { + "kind": "delete", + "schema": "mdmv2", + "table": "test_table", + "oldkeys": { + "keynames": ["id"], + "keytypes": ["character(10)"], + "keyvalues": ["id1 "] + } + } + ] + } + ``` + + REPLICA IDENTITY can determine the details of the exported logical logs when the UPDATE and DELETE operations are performed on a table. + + - DEFAULT: A logical log includes the original values of the primary key columns when the value is updated or deleted. + - NOTHING: A logical log does not include any update or delete information. + - FULL: A logical log include original information of the whole row to which the value belongs when the value in a table is updated or deleted. + - USING INDEX: includes only original values of all columns in specified indexes. diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/DBMS-RANDOM.md b/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/DBMS-RANDOM.md new file mode 100644 index 0000000000000000000000000000000000000000..9464e7f1128c56530ddf672e97f858240148046b --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/DBMS-RANDOM.md @@ -0,0 +1,531 @@ +--- +title: DBMS_RANDOM - Generating Random Data (Numbers, Strings and Dates) in MogDB with compat-tools +summary: DBMS_RANDOM - Generating Random Data (Numbers, Strings and Dates) in MogDB with compat-tools +author: Zhang Cuiping +date: 2021-08-30 +--- + +# DBMS_RANDOM - Generating Random Data (Numbers, Strings and Dates) in MogDB with compat-tools + +
## Introduction to compat-tools

compat-tools is a set of compatibility tools. It provides the compatible functions and system views needed by systems that are migrated to MogDB from other heterogeneous databases, thereby facilitating follow-up system maintenance and application modification.

<br/>
## compat-tools Download

To install compat-tools, download the latest version of the tool from [https://gitee.com/enmotech/compat-tools](https://gitee.com/enmotech/compat-tools).

<br/>
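For example, the repository can be fetched with git on the database server; downloading a release archive from the same Gitee page works equally well.

```bash
# Clone the compat-tools repository from Gitee.
git clone https://gitee.com/enmotech/compat-tools.git
```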
+ +## Features of compat-tools + +1. runMe.sql: General scheduling script +2. Oracle_Views.sql: Compatible with Oracle database data dictionaries and views +3. Oracle_Functions.sql: Compatible with Oracle database functions +4. Oracle_Packages.sql: Compatible with Oracle database packages +5. MySQL_Views.sql: Compatible with MySQL database data dictionaries and views //TODO +6. MySQL_Functions.sql: Compatible with MySQL database functions //TODO + +
+ +## MogDB Versions Supported By compat-tools + +- MogDB 2.0 +- MogDB 1.1 + +
## Installing and Using compat-tools

1. Download compat-tools from [https://gitee.com/enmotech/compat-tools](https://gitee.com/enmotech/compat-tools).

2. Store the downloaded files in a directory of your choice (**/opt/compat-tools-0902** is used as an example in this article).

    ```bash
    [root@mogdb-kernel-0005 compat-tools-0902]# pwd
    /opt/compat-tools-0902
    [root@mogdb-kernel-0005 compat-tools-0902]# ls -l
    total 228
    -rw-r--r-- 1 root root 9592 Sep 2 14:40 LICENSE
    -rw-r--r-- 1 root root 0 Sep 2 14:40 MySQL_Functions.sql
    -rw-r--r-- 1 root root 0 Sep 2 14:40 MySQL_Views.sql
    -rw-r--r-- 1 root root 41652 Sep 2 14:40 Oracle_Functions.sql
    -rw-r--r-- 1 root root 34852 Sep 2 14:40 Oracle_Packages.sql
    -rw-r--r-- 1 root root 125799 Sep 2 14:40 Oracle_Views.sql
    -rw-r--r-- 1 root root 4708 Sep 2 14:40 README.md
    -rw-r--r-- 1 root root 420 Sep 2 14:40 runMe.sql
    ```

3. Switch to user `omm`.

    ```bash
    su - omm
    ```

4. Run the following command (26000 is the port for connecting to the database).

    ```bash
    gsql -d mogdb -p 26000 -f /opt/compat-tools-0902/runMe.sql
    ```

<br/>
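After runMe.sql completes, a quick sanity check is to list some of the objects it creates from gsql. The command below assumes the DBMS_RANDOM package is installed as functions under a `dbms_random` schema (consistent with the tests in the next section) and that the database name and port match the installation above; adjust them to your environment.

```bash
# List the functions created for the DBMS_RANDOM package (schema name assumed).
gsql -d mogdb -p 26000 -c "\df dbms_random.*"
```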
+ +## Testing DBMS_RANDOM + +### Log In to the mogdb Database + +```sql +[omm@mogdb-kernel-0005 ~]$ gsql -d mogdb -p 26000 +gsql ((MogDB x.x.x build 56189e20) compiled at 2022-01-07 18:47:53 commit 0 last mr ) +Non-SSL connection (SSL connection is recommended when requiring high-security) +Type "help" for help. + +mogdb=# +``` + +
+ +- [SEED](#seed) +- [VALUE](#value) +- [STRING](#string) +- [NORMAL](#normal) +- [RANDOM](#random) +- [Generating Random Dates](#generating-random-dates) +- [Generating Random Data](#generating-random-data) + +
+ +## SEED + +The `SEED` procedure allows you to seed the pseudo-random number generator, making it more random. `SEED` is limited to binary integers or strings up to 2000 characters. If you want to consistently generate the same set of pseudo-random numbers, always use the same seed. + +```sql +declare +BEGIN + DBMS_OUTPUT.put_line('Run 1 : seed=0'); + DBMS_RANDOM.seed (val => 0); + FOR i IN 1 ..5 LOOP + DBMS_OUTPUT.put_line('i=' || i || ' : value=' || DBMS_RANDOM.value(low => 1, high => 10)); + END LOOP; + + DBMS_OUTPUT.put_line('Run 2 : seed=0'); + DBMS_RANDOM.seed (val => 0); + FOR i IN 1 ..5 LOOP + DBMS_OUTPUT.put_line('i=' || i || ' : value=' || DBMS_RANDOM.value(low => 1, high => 10)); + END LOOP; + +END; +/ +NOTICE: Run 1 : seed=0 +CONTEXT: SQL statement "CALL dbms_output.put_line('Run 1 : seed=0')" +PL/pgSQL function inline_code_block line 3 at PERFORM +NOTICE: i=1 : value=2.53745232429355 +CONTEXT: SQL statement "CALL dbms_output.put_line('i=' || i || ' : value=' || DBMS_RANDOM.value(low => 1, high => 10))" +PL/pgSQL function inline_code_block line 6 at PERFORM +NOTICE: i=2 : value=7.749117821455 +CONTEXT: SQL statement "CALL dbms_output.put_line('i=' || i || ' : value=' || DBMS_RANDOM.value(low => 1, high => 10))" +PL/pgSQL function inline_code_block line 6 at PERFORM +NOTICE: i=3 : value=1.86734489817172 +CONTEXT: SQL statement "CALL dbms_output.put_line('i=' || i || ' : value=' || DBMS_RANDOM.value(low => 1, high => 10))" +PL/pgSQL function inline_code_block line 6 at PERFORM +NOTICE: i=4 : value=8.83418704243377 +CONTEXT: SQL statement "CALL dbms_output.put_line('i=' || i || ' : value=' || DBMS_RANDOM.value(low => 1, high => 10))" +PL/pgSQL function inline_code_block line 6 at PERFORM +NOTICE: i=5 : value=6.19573155790567 +CONTEXT: SQL statement "CALL dbms_output.put_line('i=' || i || ' : value=' || DBMS_RANDOM.value(low => 1, high => 10))" +PL/pgSQL function inline_code_block line 6 at PERFORM +NOTICE: Run 2 : seed=0 +CONTEXT: SQL statement "CALL dbms_output.put_line('Run 2 : seed=0')" +PL/pgSQL function inline_code_block line 9 at PERFORM +NOTICE: i=1 : value=2.53745232429355 +CONTEXT: SQL statement "CALL dbms_output.put_line('i=' || i || ' : value=' || DBMS_RANDOM.value(low => 1, high => 10))" +PL/pgSQL function inline_code_block line 12 at PERFORM +NOTICE: i=2 : value=7.749117821455 +CONTEXT: SQL statement "CALL dbms_output.put_line('i=' || i || ' : value=' || DBMS_RANDOM.value(low => 1, high => 10))" +PL/pgSQL function inline_code_block line 12 at PERFORM +NOTICE: i=3 : value=1.86734489817172 +CONTEXT: SQL statement "CALL dbms_output.put_line('i=' || i || ' : value=' || DBMS_RANDOM.value(low => 1, high => 10))" +PL/pgSQL function inline_code_block line 12 at PERFORM +NOTICE: i=4 : value=8.83418704243377 +CONTEXT: SQL statement "CALL dbms_output.put_line('i=' || i || ' : value=' || DBMS_RANDOM.value(low => 1, high => 10))" +PL/pgSQL function inline_code_block line 12 at PERFORM +NOTICE: i=5 : value=6.19573155790567 +CONTEXT: SQL statement "CALL dbms_output.put_line('i=' || i || ' : value=' || DBMS_RANDOM.value(low => 1, high => 10))" +PL/pgSQL function inline_code_block line 12 at PERFORM +ANONYMOUS BLOCK EXECUTE +mogdb=# +``` + +
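The section above notes that SEED also accepts strings of up to 2000 characters. The sketch below seeds the generator with a string instead of an integer; it assumes the compat-tools implementation provides the same string overload as Oracle's DBMS_RANDOM, and the seed text itself is arbitrary.

```sql
DECLARE
BEGIN
  -- Seed with a string (string overload assumed to be available).
  DBMS_RANDOM.seed('my-seed-string');
  FOR i IN 1 .. 3 LOOP
    DBMS_OUTPUT.put_line('i=' || i || ' : value=' || DBMS_RANDOM.value(low => 1, high => 10));
  END LOOP;
END;
/
```

As with the integer seed shown above, reusing the same string should reproduce the same sequence of values.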
+ +## VALUE + +The `VALUE` function is used to produce random numbers with a specified range. When called without parameters it produce a number greater than or equal to 0 and less than 1, with 38 digit precision. + +```sql +DECLARE +BEGIN + FOR cur_rec IN 1 ..5 LOOP + DBMS_OUTPUT.put_line('value= ' || DBMS_RANDOM.value()); + END LOOP; +END; +/ +NOTICE: value= .785799258388579 +CONTEXT: SQL statement "CALL dbms_output.put_line('value= ' || DBMS_RANDOM.value())" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: value= .692194153089076 +CONTEXT: SQL statement "CALL dbms_output.put_line('value= ' || DBMS_RANDOM.value())" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: value= .368766269646585 +CONTEXT: SQL statement "CALL dbms_output.put_line('value= ' || DBMS_RANDOM.value())" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: value= .87390407640487 +CONTEXT: SQL statement "CALL dbms_output.put_line('value= ' || DBMS_RANDOM.value())" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: value= .745095098391175 +CONTEXT: SQL statement "CALL dbms_output.put_line('value= ' || DBMS_RANDOM.value())" +PL/pgSQL function inline_code_block line 4 at PERFORM +ANONYMOUS BLOCK EXECUTE +``` + +If the parameters are used, the resulting number will be greater than or equal to the low value and less than the high value, with the precision restricted by the size of the high value. + +```sql +declare +BEGIN + FOR cur_rec IN 1 ..5 LOOP + DBMS_OUTPUT.put_line('value(1,100)= ' || DBMS_RANDOM.value(1,100)); + END LOOP; +END; +/ + +NOTICE: value(1,100)= 45.158544998616 +CONTEXT: SQL statement "CALL dbms_output.put_line('value(1,100)= ' || DBMS_RANDOM.value(1,100))" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: value(1,100)= 36.0190920610912 +CONTEXT: SQL statement "CALL dbms_output.put_line('value(1,100)= ' || DBMS_RANDOM.value(1,100))" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: value(1,100)= 73.5194435422309 +CONTEXT: SQL statement "CALL dbms_output.put_line('value(1,100)= ' || DBMS_RANDOM.value(1,100))" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: value(1,100)= 26.7619780991226 +CONTEXT: SQL statement "CALL dbms_output.put_line('value(1,100)= ' || DBMS_RANDOM.value(1,100))" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: value(1,100)= 40.035083710216 +CONTEXT: SQL statement "CALL dbms_output.put_line('value(1,100)= ' || DBMS_RANDOM.value(1,100))" +PL/pgSQL function inline_code_block line 4 at PERFORM +ANONYMOUS BLOCK EXECUTE +mogdb=# +``` + +Use `TRUNC` or `ROUND` to alter the precision as required. For example, to produce random integer values between 1 and 10 truncate the output and add 1 to the upper boundary. + +```sql +mogdb=# select TRUNC(DBMS_RANDOM.value(1,11)) ; + + trunc +------- + + 6 + +(1 row) + +mogdb=# +``` + +
+ +## STRING + +The `STRING` function returns a string of random characters of the specified length. The `OPT` parameter determines the type of string produced as follows: + +- 'u', 'U' - uppercase alpha characters +- 'l', 'L' - lowercase alpha characters +- 'a', 'A' - mixed case alpha characters +- 'x', 'X' - uppercase alpha-numeric characters +- 'p', 'P' - any printable characters + +The `LEN` parameter, not surprisingly, specifies the length of the string returned. + +```sql +declare +BEGIN + FOR i IN 1 .. 5 LOOP + DBMS_OUTPUT.put_line('string(''x'',10)= ' || DBMS_RANDOM.string('x',10)); + END LOOP; +END; +/ + +NOTICE: string('x',10)= i5S6XOZxrA +CONTEXT: SQL statement "CALL dbms_output.put_line('string(''x'',10)= ' || DBMS_RANDOM.string('x',10))" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: string('x',10)= HGvRm75w19 +CONTEXT: SQL statement "CALL dbms_output.put_line('string(''x'',10)= ' || DBMS_RANDOM.string('x',10))" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: string('x',10)= N9WsQGJl6l +CONTEXT: SQL statement "CALL dbms_output.put_line('string(''x'',10)= ' || DBMS_RANDOM.string('x',10))" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: string('x',10)= hDlPevVgRb +CONTEXT: SQL statement "CALL dbms_output.put_line('string(''x'',10)= ' || DBMS_RANDOM.string('x',10))" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: string('x',10)= ZdSd8x8RKx +CONTEXT: SQL statement "CALL dbms_output.put_line('string(''x'',10)= ' || DBMS_RANDOM.string('x',10))" +PL/pgSQL function inline_code_block line 4 at PERFORM +ANONYMOUS BLOCK EXECUTE +mogdb=# +``` + +Combine the `STRING` and `VALUE` functions to get variable length strings. + +```sql +declare +BEGIN + FOR i IN 1 .. 5 LOOP + DBMS_OUTPUT.put_line('string(''L'',?)= ' || DBMS_RANDOM.string('L',TRUNC(DBMS_RANDOM.value(10,21)))); + END LOOP; +END; +/ + +NOTICE: string('L',?)= kcyzowdxqbyzu +CONTEXT: SQL statement "CALL dbms_output.put_line('string(''L'',?)= ' || DBMS_RANDOM.string('L',TRUNC(DBMS_RANDOM.value(10,21))))" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: string('L',?)= ohzpljyatsplqtbbus +CONTEXT: SQL statement "CALL dbms_output.put_line('string(''L'',?)= ' || DBMS_RANDOM.string('L',TRUNC(DBMS_RANDOM.value(10,21))))" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: string('L',?)= hbrjsfeevoi +CONTEXT: SQL statement "CALL dbms_output.put_line('string(''L'',?)= ' || DBMS_RANDOM.string('L',TRUNC(DBMS_RANDOM.value(10,21))))" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: string('L',?)= lfsapmytdamvwcw +CONTEXT: SQL statement "CALL dbms_output.put_line('string(''L'',?)= ' || DBMS_RANDOM.string('L',TRUNC(DBMS_RANDOM.value(10,21))))" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: string('L',?)= pcvtxnwzomkqwpfzes +CONTEXT: SQL statement "CALL dbms_output.put_line('string(''L'',?)= ' || DBMS_RANDOM.string('L',TRUNC(DBMS_RANDOM.value(10,21))))" +PL/pgSQL function inline_code_block line 4 at PERFORM +ANONYMOUS BLOCK EXECUTE +mogdb=# +``` + +
+ +## NORMAL + +The `NORMAL` function returns random numbers in a normal distribution. + +```sql +declare +BEGIN + FOR cur_rec IN 1 ..5 LOOP + DBMS_OUTPUT.put_line('normal= ' || DBMS_RANDOM.normal()); + END LOOP; +END; +/ + +NOTICE: normal= .838851847718988 +CONTEXT: SQL statement "CALL dbms_output.put_line('normal= ' || DBMS_RANDOM.normal())" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: normal= -.523612260373397 +CONTEXT: SQL statement "CALL dbms_output.put_line('normal= ' || DBMS_RANDOM.normal())" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: normal= -.241931681458075 +CONTEXT: SQL statement "CALL dbms_output.put_line('normal= ' || DBMS_RANDOM.normal())" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: normal= -.120847761874286 +CONTEXT: SQL statement "CALL dbms_output.put_line('normal= ' || DBMS_RANDOM.normal())" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: normal= .360125112757284 +CONTEXT: SQL statement "CALL dbms_output.put_line('normal= ' || DBMS_RANDOM.normal())" +PL/pgSQL function inline_code_block line 4 at PERFORM +ANONYMOUS BLOCK EXECUTE +mogdb=# +``` + +
+ +## RANDOM + +```sql +declare +BEGIN + FOR i IN 1 .. 5 LOOP + DBMS_OUTPUT.put_line('random= ' || DBMS_RANDOM.random()); + END LOOP; +END; +/ +NOTICE: This function is deprecated with Release 11gR1, although currently supported, it should not be used. +CONTEXT: SQL statement "CALL dbms_output.put_line('random= ' || DBMS_RANDOM.random())" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: random= -1023930867 +CONTEXT: SQL statement "CALL dbms_output.put_line('random= ' || DBMS_RANDOM.random())" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: This function is deprecated with Release 11gR1, although currently supported, it should not be used. +CONTEXT: SQL statement "CALL dbms_output.put_line('random= ' || DBMS_RANDOM.random())" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: random= 1068572119 +CONTEXT: SQL statement "CALL dbms_output.put_line('random= ' || DBMS_RANDOM.random())" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: This function is deprecated with Release 11gR1, although currently supported, it should not be used. +CONTEXT: SQL statement "CALL dbms_output.put_line('random= ' || DBMS_RANDOM.random())" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: random= 95361253 +CONTEXT: SQL statement "CALL dbms_output.put_line('random= ' || DBMS_RANDOM.random())" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: This function is deprecated with Release 11gR1, although currently supported, it should not be used. +CONTEXT: SQL statement "CALL dbms_output.put_line('random= ' || DBMS_RANDOM.random())" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: random= -712638729 +CONTEXT: SQL statement "CALL dbms_output.put_line('random= ' || DBMS_RANDOM.random())" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: This function is deprecated with Release 11gR1, although currently supported, it should not be used. +CONTEXT: SQL statement "CALL dbms_output.put_line('random= ' || DBMS_RANDOM.random())" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: random= -1251059926 +CONTEXT: SQL statement "CALL dbms_output.put_line('random= ' || DBMS_RANDOM.random())" +PL/pgSQL function inline_code_block line 4 at PERFORM +ANONYMOUS BLOCK EXECUTE +mogdb=# +``` + +
+ +## Generating Random Dates + +There are no specific functions for generating random dates currently, but we can add random numbers to an existing date to make it random. The following example generates random dates over the next year. + +```sql +declare +BEGIN + FOR i IN 1 .. 5 LOOP + DBMS_OUTPUT.put_line('date= ' || TRUNC(SYSDATE + DBMS_RANDOM.value(0,366))); + END LOOP; +END; +/ + +NOTICE: date= 2021-10-06 00:00:00 +CONTEXT: SQL statement "CALL dbms_output.put_line('date= ' || TRUNC(SYSDATE + DBMS_RANDOM.value(0,366)))" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: date= 2022-05-09 00:00:00 +CONTEXT: SQL statement "CALL dbms_output.put_line('date= ' || TRUNC(SYSDATE + DBMS_RANDOM.value(0,366)))" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: date= 2022-04-07 00:00:00 +CONTEXT: SQL statement "CALL dbms_output.put_line('date= ' || TRUNC(SYSDATE + DBMS_RANDOM.value(0,366)))" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: date= 2021-11-29 00:00:00 +CONTEXT: SQL statement "CALL dbms_output.put_line('date= ' || TRUNC(SYSDATE + DBMS_RANDOM.value(0,366)))" +PL/pgSQL function inline_code_block line 4 at PERFORM +NOTICE: date= 2022-06-04 00:00:00 +CONTEXT: SQL statement "CALL dbms_output.put_line('date= ' || TRUNC(SYSDATE + DBMS_RANDOM.value(0,366)))" +PL/pgSQL function inline_code_block line 4 at PERFORM +ANONYMOUS BLOCK EXECUTE +mogdb=# +``` + +By doing the correct divisions, we can add random numbers of hours, seconds or minutes to a date. + +```sql +DECLARE + l_hours_in_day NUMBER := 24; + l_mins_in_day NUMBER := 24*60; + l_secs_in_day NUMBER := 24*60*60; +BEGIN + FOR i IN 1 .. 5 LOOP + DBMS_OUTPUT.put_line('hours= ' || (TRUNC(SYSDATE) + (TRUNC(DBMS_RANDOM.value(0,1000))/l_hours_in_day))); + END LOOP; + FOR i IN 1 .. 5 LOOP + DBMS_OUTPUT.put_line('mins = ' || (TRUNC(SYSDATE) + (TRUNC(DBMS_RANDOM.value(0,1000))/l_mins_in_day))); + END LOOP; + FOR i IN 1 .. 
5 LOOP + DBMS_OUTPUT.put_line('secs = ' || (TRUNC(SYSDATE) + (TRUNC(DBMS_RANDOM.value(0,1000))/l_secs_in_day))); + END LOOP; +END; +/ +NOTICE: hours= 2021-10-13 22:00:00 +CONTEXT: SQL statement "CALL dbms_output.put_line('hours= ' || (TRUNC(SYSDATE)+ (TRUNC(DBMS_RANDOM.value(0,1000))/l_hours_in_day)))" +PL/pgSQL function inline_code_block line 6 at PERFORM +NOTICE: hours= 2021-10-10 00:00:00 +CONTEXT: SQL statement "CALL dbms_output.put_line('hours= ' || (TRUNC(SYSDATE)+ (TRUNC(DBMS_RANDOM.value(0,1000))/l_hours_in_day)))" +PL/pgSQL function inline_code_block line 6 at PERFORM +NOTICE: hours= 2021-09-07 02:00:00 +CONTEXT: SQL statement "CALL dbms_output.put_line('hours= ' || (TRUNC(SYSDATE)+ (TRUNC(DBMS_RANDOM.value(0,1000))/l_hours_in_day)))" +PL/pgSQL function inline_code_block line 6 at PERFORM +NOTICE: hours= 2021-09-26 11:00:00 +CONTEXT: SQL statement "CALL dbms_output.put_line('hours= ' || (TRUNC(SYSDATE)+ (TRUNC(DBMS_RANDOM.value(0,1000))/l_hours_in_day)))" +PL/pgSQL function inline_code_block line 6 at PERFORM +NOTICE: hours= 2021-09-19 22:00:00 +CONTEXT: SQL statement "CALL dbms_output.put_line('hours= ' || (TRUNC(SYSDATE)+ (TRUNC(DBMS_RANDOM.value(0,1000))/l_hours_in_day)))" +PL/pgSQL function inline_code_block line 6 at PERFORM +NOTICE: mins = 2021-09-04 00:01:00 +CONTEXT: SQL statement "CALL dbms_output.put_line('mins = ' || (TRUNC(SYSDATE)+ (TRUNC(DBMS_RANDOM.value(0,1000))/l_mins_in_day)))" +PL/pgSQL function inline_code_block line 9 at PERFORM +NOTICE: mins = 2021-09-04 11:56:00 +CONTEXT: SQL statement "CALL dbms_output.put_line('mins = ' || (TRUNC(SYSDATE)+ (TRUNC(DBMS_RANDOM.value(0,1000))/l_mins_in_day)))" +PL/pgSQL function inline_code_block line 9 at PERFORM +NOTICE: mins = 2021-09-04 00:53:00 +CONTEXT: SQL statement "CALL dbms_output.put_line('mins = ' || (TRUNC(SYSDATE)+ (TRUNC(DBMS_RANDOM.value(0,1000))/l_mins_in_day)))" +PL/pgSQL function inline_code_block line 9 at PERFORM +NOTICE: mins = 2021-09-04 00:21:00 +CONTEXT: SQL statement "CALL dbms_output.put_line('mins = ' || (TRUNC(SYSDATE)+ (TRUNC(DBMS_RANDOM.value(0,1000))/l_mins_in_day)))" +PL/pgSQL function inline_code_block line 9 at PERFORM +NOTICE: mins = 2021-09-04 12:38:00 +CONTEXT: SQL statement "CALL dbms_output.put_line('mins = ' || (TRUNC(SYSDATE)+ (TRUNC(DBMS_RANDOM.value(0,1000))/l_mins_in_day)))" +PL/pgSQL function inline_code_block line 9 at PERFORM +NOTICE: secs = 2021-09-04 00:10:28 +CONTEXT: SQL statement "CALL dbms_output.put_line('secs = ' || (TRUNC(SYSDATE)+ (TRUNC(DBMS_RANDOM.value(0,1000))/l_secs_in_day)))" +PL/pgSQL function inline_code_block line 12 at PERFORM +NOTICE: secs = 2021-09-04 00:15:31 +CONTEXT: SQL statement "CALL dbms_output.put_line('secs = ' || (TRUNC(SYSDATE)+ (TRUNC(DBMS_RANDOM.value(0,1000))/l_secs_in_day)))" +PL/pgSQL function inline_code_block line 12 at PERFORM +NOTICE: secs = 2021-09-04 00:09:07 +CONTEXT: SQL statement "CALL dbms_output.put_line('secs = ' || (TRUNC(SYSDATE)+ (TRUNC(DBMS_RANDOM.value(0,1000))/l_secs_in_day)))" +PL/pgSQL function inline_code_block line 12 at PERFORM +NOTICE: secs = 2021-09-04 00:06:54 +CONTEXT: SQL statement "CALL dbms_output.put_line('secs = ' || (TRUNC(SYSDATE)+ (TRUNC(DBMS_RANDOM.value(0,1000))/l_secs_in_day)))" +PL/pgSQL function inline_code_block line 12 at PERFORM +NOTICE: secs = 2021-09-04 00:06:32 +CONTEXT: SQL statement "CALL dbms_output.put_line('secs = ' || (TRUNC(SYSDATE)+ (TRUNC(DBMS_RANDOM.value(0,1000))/l_secs_in_day)))" +PL/pgSQL function inline_code_block line 12 at PERFORM +ANONYMOUS BLOCK EXECUTE +mogdb=# +``` + +
+ +## Generating Random Data + +The `DBMS_RANDOM` package is useful for generating random test data. You can generate large amounts quickly by combining it into a query. + +```sql +mogdb=# CREATE TABLE random_data ( + id NUMBER, + small_number NUMBER(5), + big_number NUMBER, + short_string VARCHAR2(50), + long_string VARCHAR2(400), + created_date DATE, + CONSTRAINT random_data_pk PRIMARY KEY (id) +); +NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "random_data_pk" for table "random_data" +CREATE TABLE +mogdb=# +``` + +```sql +mogdb=# INSERT INTO random_data +SELECT generate_series(1,29999), + TRUNC(DBMS_RANDOM.value(1,5)) AS small_number, + TRUNC(DBMS_RANDOM.value(100,10000)) AS big_number, + DBMS_RANDOM.string('L',TRUNC(DBMS_RANDOM.value(10,50))) AS short_string, + DBMS_RANDOM.string('L',TRUNC(DBMS_RANDOM.value(100,400))) AS long_string, + TRUNC(SYSDATE + DBMS_RANDOM.value(0,366)) AS created_date; +INSERT 0 29999 +mogdb=# +mogdb=# select count(*) from random_data; + count +------- + 29999 +(1 row) + +mogdb=# +``` diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/compat-tools.md b/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/compat-tools.md new file mode 100644 index 0000000000000000000000000000000000000000..cf4fa7b631183ddd2b60d4792e6a59a636664fac --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/compat-tools.md @@ -0,0 +1,18 @@ +--- +title: compat-tools +summary: compat-tools +author: Zhang Cuiping +date: 2021-07-14 +--- + +# compat-tools + +This project is a set of compatibility tools. It aims to provide compatibility for necessary functions and system views created for OSs migrated from other asynchronous databases to MogDB, thereby facilitating the follow-up system maintenance and application modification. + +The script is executed based on the version information. When you execute the script, it will be executed in terms of the following three situations: + +1. If the object to be created does not exist in the target database, it will be directly created. +2. If the version of the object to be created is later than that of the object in the target database, the target database will be upgraded and has the object re-created. +3. If the version of the object to be created is earlier than that of the object in the target database, the creation operation will be skipped. + +Please refer to [compat-tools repository page](https://gitee.com/enmotech/compat-tools) for details on how to obtain and use component. diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/mog_filedump.md b/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/mog_filedump.md new file mode 100644 index 0000000000000000000000000000000000000000..cdc52654a6599f98cd0606ef9537822c4bf51120 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/mog_filedump.md @@ -0,0 +1,106 @@ +--- +title: mog_filedump User Guide +summary: mog_filedump User Guide +author: Guo Huan +date: 2021-11-15 +--- + + + +# mog_filedump User Guide + +## Introduction + +mog_filedump is a tool for parsing data files ported to MogDB based on the improved compatibility of the pg_filedump tool, which is used to convert MogDB heap/index/control files into user-readable format content. This tool can parse part of the fields in the data columns as needed, and can also dump the data content directly in binary format. The tool can automatically determine the type of the file by the data in the blocks of the file. 
The **-c** option must be used when dumping a pg_control file so that it is interpreted and formatted correctly.
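+
+For example, assuming mog_filedump keeps pg_filedump's convention of taking the target file as its last argument, and using the example data directory `db_p` from the Examples section below, the two cases look like this sketch:
+
+```
+# dump an ordinary heap/index data file
+./mog_filedump db_p/base/15098/32904
+
+# dump the cluster control file; -c is mandatory here
+./mog_filedump -c db_p/global/pg_control
+```
+
+<br/>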
+
+## Principle
+
+The implementation consists of three main steps:
+
+1. Read a data block from the data file.
+
+2. Parse the data with the callback function registered for the corresponding data type.
+
+3. Call the output function of the corresponding data type to print the data content.
+
+<br/>
+
+## Enmo's Improvements
+
+1. Ported the tool to MogDB for compatibility.
+
+2. Fixed an upstream bug in parsing the **char** data type.
+
+3. Fixed an upstream bug where, when parsing a data file with multiple fields, the data type **name** caused a data length mismatch.
+
+<br/>
+ +## Installation + +Visit [MogDB official website download page](https://www.mogdb.io/en/downloads/mogdb/) to download the corresponding version of the toolkit, and put the tool in the **bin** directory of the MogDB installation path. As shown below, toolkits-xxxxxx.tar.gz is the toolkit that contains mog_filedump. + +![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/about-mogdb/open-source-components-2.png) + +
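+
+In practice the steps are similar to the following sketch (the archive name, extracted layout, and installation path are placeholders; adjust them to the actual download and to wherever MogDB is installed, for example `$GAUSSHOME`):
+
+```
+tar -xzf toolkits-xxxxxx.tar.gz
+cp <extracted-dir>/mog_filedump $GAUSSHOME/bin/
+chmod +x $GAUSSHOME/bin/mog_filedump
+```
+
+<br/>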
+ +## How to Use + +``` +mog_filedump [-abcdfhikxy] [-R startblock [endblock]] [-D attrlist] [-S blocksize] [-s segsize] [-n segnumber] file +``` + +Valid options for **heap** and **index** files are as follows: + +| options | function | +| ---- | ------------------------------------------------------------ | +| -a | show absolute path | +| -b | output a range of binary block images | +| -d | output file block content | +| -D | The data type of the table.
Currently supported data types are: bigint, bigserial, bool, charN, date, float, float4, float8, int, json, macaddr, name, oid, real, serial, smallint, smallserial, text, time, timestamp, timetz, uuid, varchar, varcharN, xid, xml, ~.
'~' means that all remaining columns are ignored; for example, if a tuple has 10 columns, passing `-D` with the first three column data types followed by `~` parses only the first three columns of the tuple. |
+| -f | Output and parse the content of the data block |
+| -h | Display instructions and help information |
+| -i | Output and parse item details (including XMIN, XMAX, Block Id, linp Index, Attributes, Size, infomask) |
+| -k | Verify the checksum of the data block |
+| -R | Parse and output the data file contents for the specified block range, e.g. **-R startblock [endblock]**. If only **startblock** is given and **endblock** is omitted, only a single data block is output |
+| -s | Set the segment size |
+| -n | Set the segment number |
+| -S | Set the data block size |
+| -x | Parse and output block items in index item format (included by default) |
+| -y | Parse and output block items in heap item format (included by default) |
+
+The options available for the control file are as follows:
+
+| options | function |
+| ------- | ---------------------------------------------- |
+| -c | Interpret the specified file as a control file |
+| -f | Output and parse the content of the data block |
+| -S | Set the block size used when parsing the control file |
+
+You can combine the **-i** and **-f** options to obtain more detailed output for operation and maintenance analysis.
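+For instance, a combined invocation could look like the following sketch (the relfilenode path is the one used in the Examples section below; actual paths will differ):
+
+```
+./mog_filedump -i -f db_p/base/15098/32904
+```
+
+<br/>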
+ +## Examples + +The test table basically covers the data types contained in mog_filedump. + +Here is a use case to show the data parsing function. Please add other parameters according to actual needs. + +```sql +-- Create table test: +create table test(serial serial, smallserial smallserial, bigserial bigserial, bigint bigint, bool bool, char char(3), date date, float float, float4 float4, float8 float8, int int, json json, macaddr macaddr, name name, oid oid, real real, smallint smallint, text text, time time, timestamp timestamp, timetz timetz, uuid uuid, varchar varchar(20), xid xid, xml xml); + +-- Insert data: +insert into test(bigint, bool, char, date, float, float4, float8, int, json, macaddr, name, oid, real, smallint, text, time, timestamp, timetz, uuid, varchar, xid, xml) values(123456789, true, 'abc', '2021-4-02 16:45:00', 3.1415926, 3.1415926, 3.14159269828412, 123456789, '{"a":1, "b":2, "c":3}'::json, '04-6C-59-99-AF-07', 'lvhui', 828243, 3.1415926, 12345, 'text', '2021-04-02 16:48:23', '2021-04-02 16:48:23', '2021-04-02 16:48:23', 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', 'adsfghkjlzc', '9973::xid', 'Book0001'); + +-- The directory where the data files of the query table test are located. The data directory specified by gs_initdb here is db_p. So the table test data file is in db_p/base/15098/32904 +mogdb=# select pg_relation_filepath('test'); +base/15098/32904 (1 row) + +-- Use the mog_filedump tool to parse the data file content: +./mog_filedump -D serial,smallserial,bigserial,bigint,bool,charN,date,float,float4,float8,int,json,macaddr,name,oid,real,smallint,text,time,timestamp,timetz,uuid,varchar,xid,xml db_p/base/15098/32904 +``` + +![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/reference-guide/mog_filedump.png) diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/mog_xlogdump.md b/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/mog_xlogdump.md new file mode 100644 index 0000000000000000000000000000000000000000..e11969cad7d6d758af17a7dbdb7bc6a27e6891c6 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/mog_xlogdump.md @@ -0,0 +1,319 @@ +--- +title: mog_xlogdump User Guide +summary: mog_xlogdump User Guide +author: Guo Huan +date: 2021-11-15 +--- + +# mog_xlogdump User Guide + +## Introduction + +mog_xlogdump is an offline parsing tool for wal logs independently developed by Enmo. It is mainly used in the active-standby cluster scenario, when the database is permanently down and cannot be recovered, reversely analyze the database that cannot be started, and then recover the data that is not synchronized at the end of the wal log in the cluster. + +
+
+## R&D Background
+
+Consider a MogDB high-availability cluster with one primary and multiple standby databases that uses asynchronous logical replication. When the primary shuts down, transactions committed on it have already been written to its wal log, but the primary can no longer send them to the standbys, so the standbys are left with incomplete wal segments. After the primary goes down, therefore, the standbys are no longer logically aligned with it, and the cluster formed from the remaining standbys is missing data that the business has already committed.
+
+While the primary is being recovered, business data continues to be written to the cluster formed from the standbys. If the primary is then brought back immediately, the segment number and the start and end positions of the incomplete segment at the end of its wal log are no longer consistent with those on the standby. This inconsistency also means that the data lost when the primary shut down cannot be restored to the standby.
+
+<br/>
+
+## Scenario
+
+In a MogDB high-availability cluster, the wal buffer is flushed to the wal log when it fills to a certain percentage, or when a checkpoint or commit occurs. When the primary database goes down, the logically synchronized WalSender thread stops sending logs, and the standby database is left with an incomplete wal segment. At this point the flashback tool is needed to read the data blocks in the primary database's wal log and decode the SQL statements corresponding to the data operations, so that the DBA can analyze whether the data is valuable and restore it to the standby database.
+
+<br/>
+
+## Principle
+
+The tool relies on two mechanisms: parsing of the wal log headers and the logical replication mechanism.
+
+The implementation consists of three main steps:
+
+1. Read the wal log file and parse its header.
+
+2. Read the records in turn and decode the data.
+
+3. Call back the output function that matches each record's data type.
+
+<br/>
+ +## Supported Table Types for Parsing + +Partitioned and normal tables are currently supported. + +
+
+## Supported Data Types for Parsing
+
+bool, bytea, char, name, int8, int2, int, text, oid, tid, xid, cid, xid32, clob, float4, float8, money, inet, varchar, numeric, int4
+
+> Note: Since mog_xlogdump is an offline wal parsing tool, it does not currently support large data types (such as clob) that require toast data. The next version will support offline parsing of toast table files.
+
+<br/>
+ +## Installation + +Visit [MogDB official website download page](https://www.mogdb.io/en/downloads/mogdb/) to download the corresponding version of the toolkit, and put the tool in the **bin** directory of the MogDB installation path. As shown below, toolkits-xxxxxx.tar.gz is the toolkit that contains mog_xlogdump. + +![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/about-mogdb/open-source-components-2.png) + +
+
+## Instructions for Use
+
+mog_xlogdump is a tool for parsing and displaying MogDB 2.1 wal logs. It is an auxiliary tool designed to help DBAs analyze and debug database problems.
+
+mog_xlogdump currently does not support column-store tables. (In cstore mode a column-store table creates two companion tables, CUDesc and delta: CUDesc holds the metadata of the column-store table, and delta is its temporary table, which is row-store. Both of these tables are written to the wal log, so the delta table can be parsed from it. However, writes to the delta table are controlled by the table attribute **deltarow_threshold**, which defaults to 100: only batches of fewer than 100 rows go to the delta table, while larger writes go directly to the CU files.)
+
+> Note: To write the delta table of a column-store table, you need to enable the parameter **enable_delta_store = on** in postgres.conf.
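+
+For reference, the delta-table behaviour described above can be reproduced with a column-store table such as the sketch below (it assumes **enable_delta_store = on** is already configured; the table name and threshold value are illustrative only):
+
+```sql
+-- batches smaller than deltarow_threshold are buffered in the row-store
+-- delta table and are therefore visible in the wal log; larger batches
+-- are written directly to the CU files
+CREATE TABLE cstore_demo (id int, val text)
+WITH (orientation = column, deltarow_threshold = 100);
+```
+
+<br/>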
+ +## How to Use + +``` +mog_xlogdump [OPTION]... [STARTSEG [ENDSEG]] +``` + +
+
+## Options
+
+- -b, --bkp-details
+
+  Output details of backup blocks. (By default the block id, rel, fork, blk, and lastlsn are displayed; this option additionally displays the Block Image)
+
+- -B, --bytea_output
+
+  Specify the display format for decoded bytea values: binary or character
+
+- -c, --connectinfo
+
+  Specify a connect string URL, such as postgres://user:password@ip:port/dbname
+
+- -e, --end=RECPTR
+
+  Specify the end position (LSN) for parsing the wal log
+
+- -f, --follow
+
+  After reaching the end of the specified wal log, continue parsing into the next file
+
+- -n, --limit=N
+
+  Specify the number of data records to output
+
+- -o, --oid=OID
+
+  Specify the OID of the table to be decoded
+
+- -p, --path=PATH
+
+  Specify the wal log storage directory
+
+- -R, --Rel=Relation
+
+  Specify the column data types of the table to be decoded
+
+- -r, --rmgr=RMGR
+
+  Show only the contents of records generated by the specified resource manager
+
+- -s, --start=RECPTR
+
+  Specify the starting position (LSN) for parsing the wal log
+
+- -T, --CTimeZone_ENV
+
+  Specify the time zone; the default is UTC
+
+- -t, --timeline=TLI
+
+  Specify the timeline from which to start reading the wal log
+
+- -V, --version
+
+  Show the version number
+
+- -w, --write-FPW
+
+  Display full-page write information; use together with -b
+
+- -x, --xid=XID
+
+  Output only records with the specified transaction ID
+
+- -z, --stats
+
+  Output statistics instead of individual records
+
+- -v, --verbose
+
+  Show verbose output
+
+- -?, --help
+
+  Show help information and exit
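+
+As a quick illustration, several of these options can be combined as in the sketch below (the LSN values are placeholders; the wal file path is the one used in the use cases that follow):
+
+```
+./mog_xlogdump -s 0/4000028 -e 0/4A000100 -n 20 db_p/pg_xlog/000000010000000000000004
+```
+
+<br/>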
+ +## Use Case 1 + +### Scenario + +When the primary database is down and cannot be recovered, the standby database can be connected normally. At this time, the wal log sent by the primary database may contain tens of thousands of table data operations, and the mog_xlogdump tool needs to start and end according to the specified -s, -e (starting and the ending lsn position), parse out all data operations of the table. + +### Instruction + +``` +mog_xlogdump -c -s -e +``` + +### Parse Settings + +Note: The main purpose is to record old data in the wal log, that is, the data tuple before the update operation is modified, and the data deleted by the delete operation. + +1. Set the **wal_level** in the database configuration file postgres.conf to the **logical** level. +2. Alter table: `alter table table_name replica identity full;` + +### Result + +Output the wal log data parsing result in json format. The tuple display format is `'column name':'data'` + +```json +{'table_name':'xxx','schema_name':'yyy','action':'insert','tuple':{'name':'xx','id':'ss'}} +``` + +### Example + +![fe1b12d080accfb9e54f857e79baebc](https://cdn-mogdb.enmotech.com/docs-media/mogdb/reference-guide/mog_xlogdump-1.png) + +The red box is the old data that will be parsed according to the parsing settings. If there is no setting, the old data of update and delete will not be parsed. + +The standby connect URL after -c is `postgres://test:Test123456@172.16.0.44:5003/postgres` + +- postgres:// + + connect string tag header + +- test + + connect username + +- Test123456 + + The password of the connect user + +- 172.16.0.44 + + The IP address of the standby node + +- 5003 + + Standby connect port + +- postgres + + The database name of the connect standby node + +- db_p/pg_xlog/000000010000000000000004 + + The wal log file of primary node + +
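+
+Putting the pieces of this use case together, a complete invocation could look like the sketch below (the LSN values after -s and -e are placeholders; the connect string and wal log file are the ones from the example above):
+
+```
+./mog_xlogdump -c postgres://test:Test123456@172.16.0.44:5003/postgres -s 0/4000028 -e 0/4A000100 db_p/pg_xlog/000000010000000000000004
+```
+
+<br/>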
+ +## Use Case 2 + +### Scenario + +When the primary database is down and cannot be recovered, and the standby database can be connected normally, the user may only pay attention to a few tables (individual tables) in the database. The mog_xlogdump tool can parse the table data of the specified oid according to the parameters -o and -R. For example, -o specifies the oid of the table, and -R specifies the field type of the table. + +### Instruction + +Create a table, write data and modify it, and use the mog_xlogdump tool to parse the Wal log. + +```sql +create table t2(id int, money money,inet inet,bool bool,numeric numeric ,text text); +insert into t2 values(1, 24.241, '192.168.255.132', true, 3.1415926, 'ljfsodfo29892ifj'); +insert into t2 values(2, 928.8271, '10.255.132.101', false, 3.1415926, 'vzvzcxwf2424@'); +update t2 set id=111, money=982.371 where id =2; +delete from t2 where id=1; + +postgres=# select * from t2; + id | money | inet | bool | numeric | text +----+---------+-----------------+------+-----------+------------------ + 1 | $24.24 | 192.168.255.132 | t | 3.1415926 | ljfsodfo29892ifj + 2 | $928.83 | 10.255.132.101 | f | 3.1415926 | vzvzcxwf2424@ +(2 rows) + +postgres=# update t2 set id=111, money=982.371 where id =2; +Postgres=# delete from t2 where id=1; +postgres=# select * from t2; + id | money | inet | bool | numeric | text +-----+-------------+----------------+------+-----------+--------------- + 111 | $982,371.00 | 10.255.132.101 | f | 3.1415926 | vzvzcxwf2424@ + +(1 rows) +``` + +### Parse Settings + +Same as use case 1, set wal_level and alter table. + +### Result + +![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/reference-guide/mog_xlogdump-2.png) + +```json +./mog_xlogdump -o 16394 -R int,money,inet,bool,numeric,text ./db_p/pg_xlog/000000010000000000000004 +'insert','tuple':{'(null)':'1','(null)':'$24.24','(null)':'192.168.255.132','(null)':true,'(null)':'3.1415926','(null)':'ljfsodfo29892ifj'}} +'insert','tuple':{'(null)':'2','(null)':'$928.83','(null)':'10.255.132.101','(null)':false,'(null)':'3.1415926','(null)':'vzvzcxwf2424@'}} +'update','old_tuple':{'(null)':'2','(null)':'$928.83','(null)':'10.255.132.101','(null)':false,'(null)':'3.1415926','(null)':'vzvzcxwf2424@'},'new_tuple':{'(null)':'111','(null)':'$982,371.00','(null)':'10.255.132.101','(null)':false,'(null)':'3.1415926','(null)':'vzvzcxwf2424@'}} +'delete','tuple':{'(null)':'1','(null)':'$24.24','(null)':'192.168.255.132','(null)':true,'(null)':'3.1415926','(null)':'ljfsodfo29892ifj'}} +``` + +> Note: Due to the change of the output format, the table name, schema name and column name are queried on the standby node according to the -c connect string, but because the original -o, -R designation of the table oid and field type is completely offline, Therefore, information such as table name, schema name, and column name cannot be obtained, so use -o and -R to parse offline. The table name and schema name are not displayed, and the column name is displayed as null. + +``` +mog_xlogdump -o -R -s -e Wal log file +``` + +The tool also retains the original functionality of pg_xlogdump. + +
+ +## Use Case 3 + +### Scenario + +If you want to see the header data content of the wal log, or to count some related information of the wal log, please use the mog_xlogdump original function. + +### Instruction + +1. header information + + ``` + ./mog_xlogdump -n 10 + ``` + + -n 10 indicates that only 10 rows of data are displayed. + +2. Statistics + + ``` + ./mog_xlogdump -z + ``` + +### Results + +- Result 1 + +![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/reference-guide/mog_xlogdump-3.png) + +- Result 2 + +![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/reference-guide/mog_xlogdump-4.png) \ No newline at end of file diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/mogdb-monitor.md b/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/mogdb-monitor.md new file mode 100644 index 0000000000000000000000000000000000000000..b5ae7c9fa118df1b1b12a725fc28e81870a3fb00 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/open-source-components/mogdb-monitor.md @@ -0,0 +1,27 @@ +--- +title: mogdb-monitor +summary: mogdb-monitor +author: Guo Huan +date: 2022-04-14 +--- + +# mogdb-monitor + +mogdb-monitor is a MogDB database cluster monitoring and deployment tool, with the help of the current very popular open source monitoring system prometheus framework, combined with the opengauss_exporter developed by Enmo database team, you can achieve a full range of detection of MogDB database. + +The core monitoring component opengauss_exporter has the following features. + +- Support all versions of MogDB/openGauss database +- Support for monitoring database clusters +- Support primary and standby database judgment within a cluster +- Support for automatic database discovery +- Support for custom query +- Supports online loading of configuration files +- Support for configuring the number of concurrent threads +- Support data collection information caching + +In terms of grafana display, Enmo also provide a complete set of dashboard, both an instance-level dashboard showing detailed information of each instance and a display big screen showing summary information of all instances, which, combined with the alertmanager component, can trigger alerts that meet the rules to relevant personnel in the first place. + +
+ +Please refer to [mogdb-monitor repository page](https://gitee.com/enmotech/mogdb-monitor) for details on how to obtain and use component. diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/roadmap.md b/product/en/docs-mogdb/v3.0/about-mogdb/roadmap.md new file mode 100644 index 0000000000000000000000000000000000000000..321b2788e05facbd361337ed9de494797ce383bb --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/roadmap.md @@ -0,0 +1,81 @@ +--- +title: Roadmap +summary: Roadmap +author: Guo Huan +date: 2021-06-15 +--- + +# Roadmap + +This document introduces the roadmap for MogDB 2.0.1. + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
ComponentsUnmerged content
MogHASupport for automatic restart of primary instance
Support for database CPU usage limit setting
Support for customization of HA switching
Support for WEB interface, API and so on, facilitating small-scale deployment (EA phase)
MogDB ServerSupport setting the primary node to global read-only, which can prevent brain-split in the case of network isolation
Support for pg trgm plugin (full-text indexing plugin)
Support for custom operators and custom operator classes
Reduce resource consumption for database metrics collection
Optimize partition creation syntax
Support sorting of CJK (East Asian character set), with better performance for sorting large data volumes
Support for xid-based flashback queries
Support for sub partitions
Support for citus plug-in, enable scale-out to enhance distributed capacity
Support for online reindex without locking tables
MogDB ClientGo client: support for the SHA256 encryption algorithm
Python client: support for the SHA256 encryption algorithm and support for automatic identification of the primary node
MogDB pluginCompatibility support for wal2json plug-in used for the export and heterogeneous replication of the current logical log
Support for pg_dirtyread plug-in used for data recovery and query under special circumstances
Support for walminer plug-in used for online WAL log parsing (no reliance on logical logs)
Support for db_link plug-in used for connecting to PostgreSQL databases from MogDB
diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/terms-of-use.md b/product/en/docs-mogdb/v3.0/about-mogdb/terms-of-use.md new file mode 100644 index 0000000000000000000000000000000000000000..4d6b1b0e8cbf657dd9b9fad24d592c840788a0ce --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/terms-of-use.md @@ -0,0 +1,22 @@ +--- +title: Terms of Use +summary: Terms of Use +author: Guo Huan +date: 2021-06-01 +--- + +# Terms of Use + +**Copyright © 2009-2022 Yunhe Enmo (Beijing) Information Technology Co., Ltd. All rights reserved.** + +Your replication, use, modification, and distribution of this document are governed by the Creative Commons License Attribution-ShareAlike 4.0 International Public License (CC BY-SA 4.0). You can visit to view a human-readable summary of (and not a substitute for) CC BY-SA 4.0. For the complete CC BY-SA 4.0, visit . + +Certain document contents on this website are from the official openGauss website ()。 + +**Trademarks and Permissions** + +MogDB is a trademark of Yunhe Enmo (Beijing) Information Technology Co., Ltd. All other trademarks and registered trademarks mentioned in this document are the property of their respective holders. + +**Disclaimer** + +This document is used only as a guide. Unless otherwise specified by applicable laws or agreed by both parties in written form, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, including but not limited to non-infringement, timeliness, and specific purposes. diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/test-report/ha/1-database-server-exception-testing.md b/product/en/docs-mogdb/v3.0/about-mogdb/test-report/ha/1-database-server-exception-testing.md new file mode 100644 index 0000000000000000000000000000000000000000..d799f1af3b37587385b0cef961d1b20f7e5d49e4 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/test-report/ha/1-database-server-exception-testing.md @@ -0,0 +1,194 @@ +--- +title: MogDB Database Server Exception Testing +summary: MogDB Database Server Exception Testing +author: Liu Xu +date: 2021-03-04 +--- + +# MogDB Database Server Exception Testing + +## Test Objective + +The test aims to test the MogDB availability and stability in the scenarios where only one of the primary, standby, and arbitration MogDB nodes crashes, both primary and standby MogDB nodes crash, both primary and arbitration MogDB nodes crash, or both standby and arbitration MogDB nodes crash. + +## Test Environment + +| Category | Server Configuration | Client Configuration | Quantity | +| ----------- | :--------------------------: | :------------------: | :------: | +| CPU | Kunpeng 920 | Kunpeng 920 | 128 | +| Memory | DDR4,2933MT/s | DDR4,2933MT/s | 2048 GB | +| Hard disk | Nvme 3.5T | Nvme 3T | 4 | +| File system | Xfs | Xfs | 4 | +| OS | openEuler 20.03 (LTS) | Kylin V10 | | +| Database | MogDB 1.1.0 software package | | | +| Test tool | pgbench | | | + +## The Primary Database Node Crashes Abnormally + +Test solution: + +1. Restart the primary database node. +2. Monitor the TPS change. +3. Add a new standby node. +4. Monitor the cluster status and TPS change. + +Test procedure: + +1. Restart the primary database node. + + ```bash + reboot + ``` + +2. Make the original primary database node work as a standby database node in the cluster. + + ```bash + gs_ctl -b full build -D /gaussdata/openGauss/db1 + ``` + +3. Synchronize logs to the new standby database node until the synchronization is complete. 
+ +Test result: + +- After the primary database node is shut down, TPS drops from 9000 to 0, which lasts for 50s or so. After a VIP is added to the new primary database node (node 2), TPS increases from 0 to 13000 or so. +- When the old primary database node is added to the cluster working as a new standby database node and operated for data synchronization, TPS drops from 13000 to 0. As the synchronization is complete, TPS is restored to 9000. + +## The Standby Database Node Crashes Abnormally + +Test solution: + +1. Restart the standby database node. + +2. Monitor the TPS change. + +3. Add the standby database node to the cluster. + +4. Monitor the cluster status and TPS change. + +Test procedure: + +1. Restart the standby database node. + + ``` + reboot + ``` + +2. Add the standby database node to the cluster. + + ``` + gs_ctl start -D /gaussdata/openGauss/db1 -M standby + ``` + +3. Synchronize data to the standby database node until the synchronization is complete. + + ``` + Standby Catchup + ``` + +Test result: + +- After the standby database node is shut down, TPS increases from 9000 or so to 13000 or so. +- After the standby database node is added to the cluster and operated for data synchronization, TPS drops from 13000 to 0. As the synchronization is complete, TPS is restored to 9000. + +## The Arbitration Node Crashes Abnormally + +Test solution: + +1. Add a VIP to the value of the **ping_list** parameter in the **node.conf** file. +2. Simulate the scenario where the arbitration node crashes. + +Test procedure: + +1. Add a VIP to the server IP addresses. + + ``` + ifconfig ens160:1 192.168.122.111 netmask 255.255.255.0 + ``` + +2. Add a VIP to the value of the **ping_list** parameter in the **node.conf** file. Synchronize the configuration file between the primary and standby database nodes and restart the monitor script. + +3. When the monitor script is running normally, disconnect the VIP of the simulated arbitration node and then observe the change. + + ``` + ifconfig ens160:1 down + ``` + +4. Manually connect the VIP of the simulated arbitration node and observe the change. + +Test result: + +- The monitor script of the primary and standby database nodes reports that the VIP of the arbitration node fails the ping operation. no other change occurs. +- TPS does not change. + +## Both Primary and Standby Database Nodes Crash Abnormally + +Test solution: + +1. Restart both the primary and standby database nodes. +2. Start the **node_run** scripts of the primary and standby database nodes. +3. Manually enable the primary and standby databases. + +Test procedure: + +1. Restart both the primary and standby database nodes. + + ``` + reboot + reboot + ``` + +2. Start the **node_run** scripts of the primary and standby database nodes. + +3. Manually enable the primary and standby databases. + +Test result: + +When the database is disabled, TPS drops to 0. After 15s the cluster is started, TPS increases to 9000. + +## Both Primary and Arbitration Nodes Crash Abnormally + +Test solution: + +Shut down the primary database node and disconnect the IP address of the arbitration node. + +Test procedure: + +1. Shut down the primary database node and disconnect the IP address of the arbitration node. + +2. Check the status of the standby database node. + +3. Start the primary database node and its monitor script. + + ``` + gs_ctl start -D /gaussdata/openGauss/db1 -M primary + ``` + +4. Check the cluster status. 
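+
+   Checking the status can be done, for example, with `gs_ctl query` against the same data directory used above (a sketch; the MogHA monitor used in these tests may also provide its own status view):
+
+   ```
+   gs_ctl query -D /gaussdata/openGauss/db1
+   ```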
+ +Test result: + +- After the primary database node and arbitration node are shut down, TPS drops from 9000 to 0. + +- When the primary database node restores, TPS increases to 9000. + +## Both Standby and Arbitration Nodes Crash Abnormally + +Test solution: + +Shut down the standby database node and disconnect the IP address of the arbitration node. + +Test procedure: + +1. Shut down the standby database node and disconnect the IP address of the arbitration node. + +2. Check the status of the primary database node. + + No change occurs. + +3. After the standby database node restores, start the monitor script of the standby database node. + + The error reported by the monitor script of the primary database node disappears. + +4. Start the standby database node. + +5. Check the cluster status. diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/test-report/ha/2-network-exception-testing.md b/product/en/docs-mogdb/v3.0/about-mogdb/test-report/ha/2-network-exception-testing.md new file mode 100644 index 0000000000000000000000000000000000000000..50b695b52400152bfa92fb43f1689d8352ba72f1 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/test-report/ha/2-network-exception-testing.md @@ -0,0 +1,177 @@ +--- +title: MogDB Network Exception Testing +summary: MogDB Network Exception Testing +author: Liu Xu +date: 2021-03-04 +--- + +# MogDB Network Exception Testing (Need to Be Reviewed Later) + +## Test Scope + +- The service NIC of the primary node is abnormal. + +- The heartbeat NIC of the primary node is abnormal. + +- Both the service and heartbeat NICs of the primary node are abnormal. + +- The service NIC of the standby node is abnormal. + +- The heartbeat NIC of the standby node is abnormal. + +- The VIP of the primary node is disconnected. + +## Test Environment + +| Category | Server Configuration | Client Configuration | Quantity | +| ----------- | :-------------------------: | :------------------: | :------: | +| CPU | Kunpeng 920 | Kunpeng 920 | 128 | +| Memory | DDR4,2933MT/s | DDR4,2933MT/s | 2048 GB | +| Hard disk | Nvme 3.5T | Nvme 3T | 4 | +| File system | Xfs | Xfs | 4 | +| OS | openEuler 20.03 (LTS) | Kylin V10 | | +| Database | MogDB1.1.0 software package | | | +| Test tool | pgbench | | | + +## The Service NIC of the Primary Database Node Is Abnormal + +Test solution: + +1. Disable the heartbeat NIC and then enable it. +2. Observe the database status. + +Test procedure: + +1. Record the original cluster status. + +2. Disable the heartbeat NIC and then enable it. + +3. Observe the status of the standby database node. + + The standby database node cannot be connected to the primary database node. + +4. Query the cluster status. + + No primary/standby switchover occurs. + +Test result: + +1. No primary/standby switchover occurs. +2. When the network of the primary database node restores, the status of the standby database node becomes normal. +3. When the service NIC of the primary database node is abnormal, TPS decreases from 9000 to 0. After 15s when the service NIC is enabled, TPS increases from 0 to 9000. + +## The Heartbeat NIC of the Primary Database Node Is Abnormal + +Test solution: + +1. Disable the heartbeat NIC and then enable it. +2. Observe the database status. + +Test procedure: + +1. Observe the database status. + +2. Disable the heartbeat NIC and then enable it. + +3. Observe the database status. + + No primary/standby switchover occurs. + +4. Observe the cluster status. + +Test result: + +1. No primary/standby switchover occurs. + +2. TPS does not change. 
+ +## Both the Service and Heartbeat NICs of the Primary Database Node Are Abnormal + +Test solution: + +1. Disable the heartbeat NIC and then enable it. + +2. Observe the database status. + +Test procedure: + +1. Observe the original database status. + +2. Disable the heartbeat NIC and then enable it. + + The primary database node is normal. + +Test result: + +1. No primary/standby switchover occurs. + +2. When the service NIC is disconnected, TPS drops from 9000 to 0. After the service NIC is restored to normal, TPS increases to 9000. + +## The Service NIC of the Standby Database Node Is Abnormal + +Test solution: + +1. Disable the heartbeat NIC and then enable it. + +2. Observe the database status. + +Test procedure: + +1. Record the original cluster status. + +2. Disable the heartbeat NIC and then enable it. + +3. Observe the status of the primary database node. + +4. Ping the IP address of the service NIC of the standby database node. + + The IP address cannot be pinged. + + No primary/standby switchover occurs. + +Test result: + +1. The monitor script of the standby database node reports the heartbeat error indicating that no primary/standby switchover occurs. + +2. When the service NIC of the standby database node is abnormal, TPS of the primary database node increases to 13000. After the service NIC of the standby database node is restored to normal, TPS of the primary database node is restored to 9000. + +## The Heartbeat NIC of the Standby Database Node Is Abnormal + +Test solution: + +1. Disable the heartbeat NIC 60s and then enable it. + +2. Observe the database status. + +Test procedure: + +1. Observe the original database status. + +2. Disable the heartbeat NIC 60s and then enable it. + +3. After the script is executed, observe the database status. + No primary/standby switchover occurs. The monitor script of the primary database node reports an error that the IP address of the service NIC of the standby database node cannot be pinged. + +Test result: + +No primary/standby switchover occurs. + +## The VIP of the Primary Database Node Is Disconnected + +Test solution: + +Disable the NIC **bond0:1** and then enable it. + +Test procedure: + +1. Disable the NIC **bond0:1** and then enable it. + +2. Observe the database status. + +3. Observe the VIP of the primary database node. + +Test result: + +1. The VIP of the primary database node is automatically connected. + +2. When the VIP is disconnected and then connected, the TPS status is restored to normal. diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/test-report/ha/3-routine-maintenance-testing.md b/product/en/docs-mogdb/v3.0/about-mogdb/test-report/ha/3-routine-maintenance-testing.md new file mode 100644 index 0000000000000000000000000000000000000000..d50a8f807f6607bd89a1c2b875c121fb9c76f3fc --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/test-report/ha/3-routine-maintenance-testing.md @@ -0,0 +1,133 @@ +--- +title: Routine Maintenance Testing +summary: Routine Maintenance Testing +author: Guo Huan +date: 2021-04-25 +--- + +# Routine Maintenance Testing + +## Test Scope + +1. HA tool used for the switchover test +2. gs_ctl used for the switchover test +3. HA tool used for the failover test +4. gs_ctl used for the failover test +5. gs_ctl and HA contradiction test +6. 
Split-brain test for avoiding deploying two primary database nodes + +## Test Environment + +| Category | Server Configuration | Client Configuration | Quantity | +| ----------- | :-------------------: | :------------------: | :------: | +| CPU | Kunpeng 920 | Kunpeng 920 | 128 | +| Memory | DDR4,2933MT/s | DDR4,2933MT/s | 2048G | +| Hard disk | Nvme 3.5T | Nvme 3T | 4 | +| File system | Xfs | Xfs | 4 | +| OS | openEuler 20.03 (LTS) | Kylin V10 | | +| Database | MogDB1.1.0 | | | +| Tool | pgbench | | | + +## HA Tool Used for the Switchover Test + +Test procedure: + +1. Query the status of databases. + +2. Run the switch command on the standby database, and then query the database status. + + ```bash + /home/omm/venv/bin/python3 /home/omm/openGaussHA_standlone/switch.py -config /home/omm/openGaussHA_standlone/node.conf --switchover + ``` + +3. Query the VIP of the new primary database node. + +Test result: + +The primary/standby switchover is normal. + +## gs_ctl Used for the Switchover Test + +Test procedure: + +1. Query the database status. +2. Run the command to perform the switchover operation and query the status when finished. +3. Query the VIP. + +Test result: + +The primary/standby switchover is normal. + +## HA Tool Used for the Failover Test + +Test procedure: + +1. Query the current status of the database. + +2. Run the HA failover command. + +3. Observe the cluster status. + +4. The original standby database node works as a new primary database node. + +Test result: + +1. The original standby database node becomes the new primary database node with a VIP. + +2. The original primary database node is killed. + +3. TPS increases from 6000 to 13000 in about 10s after failover. + +## gs_ctl Used for the Failover Test + +Test procedure: + +1. Query the current status of the database. +2. Run the gs_ctl failover command. +3. Query the status of the new primary database node. +4. Query the status of the original primary database node. +5. Observe script changes. + +Test result: + +1. The new primary database node waits for 10s and then automatically shuts down, the original primary database node is not affected, and the VIP is still attached to the original primary database node. +2. About 5s or 6s after failover of the standby database node, the TPS increases from 9000 to 13000. No other changes occur except fluctuations. + +## gs_ctl and HA Contradiction Test + +Test procedure: + +1. Record the cluster status. + +2. Stop the primary and standby database nodes at the same time. + +3. Use gs_ctl to start the primary and standby database nodes. Designate the original standby database node as the primary database node, vice versa. + +4. Observe the script and cluster status. + +Test result: + +1. HA configuration is adjusted according to the actual situation. +2. TPS drops from 9000 to 0 after the primary and standby database nodes are shut down. At this time, HA detects the cluster configuration change and attaches the VIP to the new primary database node within 10s. After about 5s or 6s, TPS rises from 0 to 13000. + +## Split-Brain Test for Avoiding Deploying Two Primary Database Nodes + +Test procedure: + +1. Query the current status of the cluster. + +2. Use the gs_ctl command to restart the standby database node to make it work as the primary database node. + +Test result: + +1. After timeout for 10s, the primary database node that was restarted by an abnormal operation is killed by the monitoring script. + +2. Repair the standby database node and add it to the cluster. 
+ + ```bash + gs_ctl build -b full -D /gaussdata/openGauss/db1 + ``` + +3. TPS increases from 6000 to 13000 after the standby database is restarted to work as the primary database node. + +4. TPS drops from 13000 to 9000 (same as that in the primary/standby deployment scenario) after the standby database node is added to the cluster. diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/test-report/ha/4-service-exception-testing.md b/product/en/docs-mogdb/v3.0/about-mogdb/test-report/ha/4-service-exception-testing.md new file mode 100644 index 0000000000000000000000000000000000000000..119b5875829cbd534eb533e50542ac8fe74fecaa --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/test-report/ha/4-service-exception-testing.md @@ -0,0 +1,84 @@ +--- +title: MogDB Service Exception Testing +summary: MogDB Service Exception Testing +author: Guo Huan +date: 2021-04-25 +--- + +# MogDB Service Exception Testing + +## Test Scope + +1. Database process exception for the standby node + +2. Monitor script exception for the standby node + +3. File system exception for the primary node (affecting HA process) + +## Test Environment + +| Category | Server Configuration | Client Configuration | Quantity | +| ----------- | :-------------------: | :------------------: | :------: | +| CPU | Kunpeng 920 | Kunpeng 920 | 128 | +| Memory | DDR4,2933MT/s | DDR4,2933MT/s | 2048G | +| Hard Disk | Nvme 3.5T | Nvme 3T | 4 | +| File system | Xfs | Xfs | 4 | +| OS | openEuler 20.03 (LTS) | Kylin V10 | | +| Database | MogDB1.1.0 | | | +| Tool | pgbench | | | + +## Database process exception for the standby node + +Test procedure: + +Kill the database process. + +Test result: + +1. Observe the cluster status. + +2. The script of the standby database node shows the heartbeat exception. + +3. No switchover occurs. + +4. Run the command on the standby database node. + + ```bash + gs_ctl start -D /gaussdata/openGauss/db1 -M standby + ``` + +5. The cluster is restored to normal and no switchover has occurred since then. + +6. When the standby database process is killed, TPS of the primary database node rises from 9000 to 13000. + +7. No primary/standby switchover occurs. + +## Monitor script exception for the standby node + +Test procedure: + +Kill the monitor script of the primary database node. + +Test result: + +1. Observe the cluster status. +2. No switchover occurs. +3. The monitor script of the primary database node reports script exception of the standby database node. +4. Restore the monitor script of the standby database node. +5. TPS maintains at 9000. + +## File system exception for the primary node (affecting HA process) + +Test procedure: + +Modify the permission of the primary database script called by HA, such as gs_ctl. + +Test result: + +1. After modifying the **rwx** permission of gs_ctl, the monitoring script of the primary database node reports heartbeat exception. + +2. The instance status cannot be detected. + +3. Query the current cluster status. + +4. After waiting about two minutes, no switchover occurs. 
diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/test-report/ha/MogDB-ha-test-report.md b/product/en/docs-mogdb/v3.0/about-mogdb/test-report/ha/MogDB-ha-test-report.md new file mode 100644 index 0000000000000000000000000000000000000000..1ad5c7de3fff55835ff2319899739b5bd56d5160 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/test-report/ha/MogDB-ha-test-report.md @@ -0,0 +1,36 @@ +--- +title: HA Test +summary: HA Test +author: Zhang Cuiping +date: 2021-06-07 +--- + +# HA Test + +## Test Objective + +Database HA (high availability) test aims at providing users with high-quality services, avoiding service interruption caused by such faults as server crash. Its HA lies in not only whether a database can provide services consistently but whether a database can ensure data consistency. + +## Test Environment + +| Category | Server Configuration | Client Configuration | Quantity | +| ----------- | :---------------------: | :------------------: | :------: | +| CPU | Kunpeng 920 | Kunpeng 920 | 128 | +| Memory | DDR4, 2933MT/s | DDR4, 2933MT/s | 2048 GB | +| Hard disk | NVME 3.5 TB | NVME 3 TB | 4 | +| File system | Xfs | Xfs | 4 | +| OS | openEuler 20.03 (LTS) | Kylin V10 | | +| Database | MogDB software package | | | +| Test tool | pgbench | | | + +## HA and Scalability Test + +| No. | Test Item | Description | +| ---- | ------------------------------------- | ------------------------------------------------------------ | +| 1 | Cluster with read-write separation | Supports separate routing of read and write requests and separate distribution of read and write tasks. | +| 2 | Capacity expansion/capacity reduction | In load scenarios, adding or reducing a physical device does not make a front-end application service interrupted. | +| 3 | Shared storage cluster | Supports two nodes sharing a storage cluster, automatic failover, and load balancing of concurrent transactions. | +| 4 | Service exception test | Guarantees the availability of an application in scenarios where the process of the standby database, the monitoring script of the standby database, and the file system of the primary database are abnormal. | +| 5 | Routine maintenance test | Supports the switchover feature provided by gs_ctl (database service control tool) and the failover feature provided by MogHA (HA component). | +| 6 | Database server exception test | Maximizes the availability and stability of an application when the primary node, standby node, or arbitration node crashes, or both the primary and standby nodes, or both the primary and arbitration nodes crash. | +| 7 | Network exception test | Maximizes the availability of an application when the service or heartbeat NIC of the primary node is abnormal, both the service and heartbeat NICs of the primary node are abnormal, the service NIC of the standby node is abnormal, or the VIP of the host is abnormal. 
| diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/test-report/performance/1-performance-test-overview.md b/product/en/docs-mogdb/v3.0/about-mogdb/test-report/performance/1-performance-test-overview.md new file mode 100644 index 0000000000000000000000000000000000000000..d6a786378b0381b8075acc952699df1a7924370e --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/test-report/performance/1-performance-test-overview.md @@ -0,0 +1,90 @@ +--- +title: Performance Test Overview +summary: Performance Test Overview +author: Liu Xu +date: 2021-04-27 +--- + +# Performance Test Overview + +## Test Indicators + +### Data Volume + +Each repository contains 70 MB of data. To simulate the actual data environment, 1000 repositories are created during initiation. After related indexes are added, the total size of data reaches 100 GB or so. + +### Number of Concurrent Requests + +This is to simulate the real user behavior and make the number of online user requests from the clients reach a certain number. + +### Average System Workload + +System workload indicates how busy the system CPU is, for example, how many processes are there waiting for being scheduled by the CPU, which reflects the pressure of the whole system. However, average system workload indicates the average number of processes in run queues in a specified time period, generally 1, 5, or 10 minutes. + +### CPU Usage + +CPU usage indicates the quantity of CPU resources occupied by the running programs, that is, the quantity of running programs on a server at a time point. High CPU usage indicates that a server runs a plenty of programs and vice versa. Whether the CPU usage is high or low is directly related to the CPU performance. CPU usage can determine whether a CPU reaches its bottleneck. + +### IOPS + +IOPS (Input/Output Operations Per Second) is a method for measuring the performance of a computer storage device, such as HDDs, SSDs, or SANs. It refers to the number of write/read times per second, which is one of the major indicators for measuring the disk performance. IOPS indicates the number of I/O requests that can be processed by the system in unit time. The I/O requests usually refer to data write/read requests. + +### IO Latency + +IO latency is also called IO response time, referring the time period from the time when the OS kernel sends a write/read command to the time when the OS kernel receives the IO response. Time for operating single IO indicates only the time of processing single IO in a disk. However, IO latency includes even the time of waiting for being processed for an IO in a queue. + +### tpmC + +TPC-C uses three performance and price metrics. TPC-C is used to measure performance with the unit of tpmC (Transactions Per Minute). C refers to the C benchmark program of TPC. tpmC indicates the number of new orders that are processed by the system per minute. + +## Test Program Preparation + +### Test Tools + +BenchmarkSQL is a typical open source database test tool with a built-in TPC-C test script which can be directly used for testing PostgreSQL, MySQL, Oracle, and SQL Server databases. It is used for testing TPC-C of OLTP (Online Transaction Processing) via JDBC (Java Database Connectivity). + +### Test Specifications + +TPC-C offers OLTP system test specifications and tests an OLTP system using a merchandising model. 
Transactions are divided into five categories, the content and characteristics of which are described as follows: + +- **NewOrder - Generation of new orders** + + Transaction content: For any one client, select 5 to 15 commodities from a specified repository and create new orders with 1% of the orders that fails the operation and requires rollback. + + Main characteristics: middleweight, frequent write/read operations, and high response speed + +- **Payment - Payment of order** + + Transaction content: For any one client, select a region and customers there from a specified repository randomly, pay an order with a random amount of money, and record this operation. + + Main characteristics: lightweight, frequent write/read operations, and high response speed。 + +- **OrderStatus - Query of latest orders** + + Transaction content: For any one client, select a region and customers there from a specified repository randomly, read the last order, and display the status of each commodity in the order. + + Main characteristics: middleweight, low read-only frequency, and high response speed + +- **Delivery - Delivery of packages** + + Transaction content: For any one client, select a delivery package randomly, update the account balance of the user whose order is being processed, and delete the order from the new order list. + + Main characteristics: 1 to 10 concurrent orders in a batch, low write/read frequency, and loose response time + +- **StockLevel - Analysis of stock status** + + Transaction content: For any one client, select the last 20 orders from a specified repository of a region, check the stock of all commodities of the order list, and calculate and display the stock of all commodities with its stock level lower than the threshold generated randomly. + + Main characteristics: heavyweight, low read-only frequency, and loose response time + +### Test Rules + +Before the test, TPC-C Benchmark specifies the initial status of a database, that is, the rule of data being generated in the database. The ITEM table includes a fixed number (100,000) of commodities. The number of warehouses can be adjusted. The initialized WAREHOUSE table includes 1000 records in this test. + +- The STOCK table contains 1000 x 100,000 records. (Each warehouse contains 100,000 commodities.) +- The DISTRICT table contains 1000 x 10 records. (Each warehouse provides services for 10 regions.) +- The CUSTOMER table contains 1000 x 10 x 3000 records. (There are 3000 customers in each region.) +- The HISTORY table contains 1000 x 10 x 3000 records. (Each customer has one transaction history.) +- The ORDER table contains 1000 x 10 x 3000 records (there are 3000 orders in each region). The last 900 orders generated are added to the NEW-ORDER table. Each order generates 5 to 15 ORDER-LINE records. + +> TPC-C uses tpmC to measure the maximum qualified throughput. The number of transactions depends on that of the new order transactions, that is, the number of new orders processed per minute. 
diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/test-report/performance/2-mogdb-on-kunpeng-performance-test-report.md b/product/en/docs-mogdb/v3.0/about-mogdb/test-report/performance/2-mogdb-on-kunpeng-performance-test-report.md new file mode 100644 index 0000000000000000000000000000000000000000..2cbe9addd4623fa01565bade74c9e49170186f97 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/test-report/performance/2-mogdb-on-kunpeng-performance-test-report.md @@ -0,0 +1,401 @@ +--- +title: Performance Test for MogDB on a Kunpeng Server +summary: Performance Test for MogDB on a Kunpeng Server +author: Liu Xu +date: 2021-03-04 +--- + +# Performance Test for MogDB on Kunpeng Servers + +## Test Objective + +This document describes the test of MogDB 2.0.0 on Kunpeng servers in scenarios where MogDB is deployed on a single node, one primary node and one standby node, or one primary node and two standby nodes (one synchronous standby and one asynchronous standby). + +## Test Environment + +### Environment Configuration + +| Category | Server Configuration | Client Configuration | Quantity | +| :---------: | :--------------------------: | :------------------: | :------: | +| CPU | Kunpeng 920 | Kunpeng 920 | 128 | +| Memory | DDR4,2933MT/s | DDR4,2933MT/s | 2048 GB | +| Hard disk | Nvme 3.5T | Nvme 3T | 4 | +| File system | Xfs | Xfs | 4 | +| OS | openEuler 20.03 (LTS) | Kylin V10 | | +| Database | MogDB 1.1.0 software package | | | + +### Test Tool + +| Name | Function | +| :--------------: | :----------------------------------------------------------- | +| BenchmarkSQL 5.0 | Open-source BenchmarkSQL developed based on Java is used for TPC-C test of OLTP database. It is used to evaluate the database transaction processing capability. | + +## Test Procedure + +### MogDB Database Operation + +1. Obtain the database installation package. + +2. Install the database. + +3. Create TPCC test user and database. + + ```sql + create user [username] identified by ‘passwd’; + grant [origin user] to [username]; + create database [dbname]; + ``` + +4. Disable the database and modify the **postgresql.conf** database configuration file by adding configuration parameters at the end of the file. 
+ + For example, add the following parameters in the single node test: + + ```bash + max_connections = 4096 + + allow_concurrent_tuple_update = true + + audit_enabled = off + + checkpoint_segments = 1024 + + cstore_buffers =16MB + + enable_alarm = off + + enable_codegen = false + + enable_data_replicate = off + + full_page_writes = off + + max_files_per_process = 100000 + + max_prepared_transactions = 2048 + + shared_buffers = 350GB + + use_workload_manager = off + + wal_buffers = 1GB + + work_mem = 1MB + + log_min_messages = FATAL + + transaction_isolation = 'read committed' + + default_transaction_isolation = 'read committed' + + synchronous_commit = on + + fsync = on + + maintenance_work_mem = 2GB + + vacuum_cost_limit = 2000 + + autovacuum = on + + autovacuum_mode = vacuum + + autovacuum_max_workers = 5 + + autovacuum_naptime = 20s + + autovacuum_vacuum_cost_delay =10 + + xloginsert_locks = 48 + + update_lockwait_timeout =20min + + enable_mergejoin = off + + enable_nestloop = off + + enable_hashjoin = off + + enable_bitmapscan = on + + enable_material = off + + wal_log_hints = off + + log_duration = off + + checkpoint_timeout = 15min + + enable_save_datachanged_timestamp =FALSE + + enable_thread_pool = on + + thread_pool_attr = '812,4,(cpubind:0-27,32-59,64-91,96-123)' + + enable_double_write = on + + enable_incremental_checkpoint = on + + enable_opfusion = on + + advance_xlog_file_num = 10 + + numa_distribute_mode = 'all' + + track_activities = off + + enable_instr_track_wait = off + + enable_instr_rt_percentile = off + + track_counts =on + + track_sql_count = off + + enable_instr_cpu_timer = off + + plog_merge_age = 0 + + session_timeout = 0 + + enable_instance_metric_persistent = off + + enable_logical_io_statistics = off + + enable_user_metric_persistent =off + + enable_xlog_prune = off + + enable_resource_track = off + + instr_unique_sql_count = 0 + + enable_beta_opfusion = on + + enable_beta_nestloop_fusion = on + + autovacuum_vacuum_scale_factor = 0.02 + + autovacuum_analyze_scale_factor = 0.1 + + client_encoding = UTF8 + + lc_messages = en_US.UTF-8 + + lc_monetary = en_US.UTF-8 + + lc_numeric = en_US.UTF-8 + + lc_time = en_US.UTF-8 + + modify_initial_password = off + + ssl = off + + enable_memory_limit = off + + data_replicate_buffer_size = 16384 + + max_wal_senders = 8 + + log_line_prefix = '%m %u %d %h %p %S' + + vacuum_cost_limit = 10000 + + max_process_memory = 12582912 + + recovery_max_workers = 1 + + recovery_parallelism = 1 + + explain_perf_mode = normal + + remote_read_mode = non_authentication + + enable_page_lsn_check = off + + pagewriter_sleep = 100 + ``` + +### BenchmarkSQL Operation + +1. Modify the configuration file. + + Open the BenchmarkSQL installation directory and find the **[config file]** configuration file in the **run** directory. + + ``` + db=postgres + + driver=org.postgresql.Driver + + conn=jdbc:postgresql://[ip:port]/tpcc?prepareThreshold=1&batchMode=on&fetchsize=10 + + user=[user] + + password=[passwd] + + warehouses=1000 + + loadWorkers=80 + + terminals=812 + + //To run specified transactions per terminal- runMins must equal zero + + runTxnsPerTerminal=0 + + //To run for specified minutes- runTxnsPerTerminal must equal zero + + runMins=30 + + //Number of total transactions per minute + + limitTxnsPerMin=0 + + //Set to true to run in 4.x compatible mode. Set to false to use the + + //entire configured database evenly. 
+ + terminalWarehouseFixed=false #true + + //The following five values must add up to 100 + + //The default percentages of 45, 43, 4, 4 & 4 match the TPC-C spec + + newOrderWeight=45 + + paymentWeight=43 + + orderStatusWeight=4 + + deliveryWeight=4 + + stockLevelWeight=4 + + // Directory name to create for collecting detailed result data. + + // Comment this out to suppress. + + //resultDirectory=my_result_%tY-%tm-%td_%tH%tM%tS + + //osCollectorScript=./misc/os_collector_linux.py + + //osCollectorInterval=1 + + //osCollectorSSHAddr=tpcc@127.0.0.1 + + //osCollectorDevices=net_eth0 blk_sda blk_sdg blk_sdh blk_sdi blk_sdj + ``` + +2. Run **runDataBuild.sh** to generate data. + + ``` + ./runDatabaseBuild.sh [config file] + ``` + +3. Run **runBenchmark.sh** to test the database. + + ``` + ./runBenchmark.sh [config file] + ``` + +### OS Configuration + +1. Modify **PAGESIZE** of the OS kernel (required only in EulerOS). + + **Install kernel-4.19.36-1.aarch64.rpm (*).** + + ``` + # rpm -Uvh --force --nodeps kernel-4.19.36-1.aarch64.rpm + + *: This file is based on the kernel package of linux 4.19.36. You can acquire it from the following directory: + + 10.44.133.121 (root/Huawei12#$) + + /data14/xy_packages/kernel-4.19.36-1.aarch64.rpm + ``` + + **Modify the boot options of the root in the OS kernel configuration file.** + + ``` + \# vim /boot/efi/EFI/euleros/grubenv //Back up the grubenv file before modification. + + \# GRUB Environment Block + + saved_entry=EulerOS (4.19.36) 2.0 (SP8) -- Changed to 4.19.36 + ``` + +### File System + +1. Change the value of **blocksize** to **8K** in the XFS file system. + + #Run the following commands to check the attached NVME disks: + + ``` + # df -h | grep nvme + + /dev/nvme0n1 3.7T 2.6T 1.2T 69% /data1 + + /dev/nvme1n1 3.7T 1.9T 1.8T 51% /data2 + + /dev/nvme2n1 3.7T 2.2T 1.6T 59% /data3 + + /dev/nvme3n1 3.7T 1.4T 2.3T 39% /data4 + + Run the xfs_info command to query information about NVME disks. + + \# xfs_info /data1 + ``` + +2. Back up the required data. + +3. Format the disk. 
+ + ``` + Use the /dev/nvme0n1 disk and /data1 load point as an example and run the folowing commands: + + umount /data1 + + mkfs.xfs -b size=8192 /dev/nvme0n1 -f + + mount /dev/nvme0n1 /data1 + ``` + +## Test Items and Conclusions + +### Test Result Summary + +| Test Item | Data Volume | Concurrent Transactions | Average CPU Usage | IOPS | IO Latency | Write Ahead Logs | tpmC | Test Time (Minute) | +| :------------------------------------: | ----------- | ----------------------- | ----------------- | ------ | ---------- | ---------------- | ---------- | :----------------: | +| Single node | 100 GB | 500 | 77.49% | 17.96K | 819.05 us | 13260 | 1567226.12 | 10 | +| One primary node and one standby node | 100 GB | 500 | 57.64% | 5.31K | 842.78 us | 13272 | 1130307.87 | 10 | +| One primary node and two standby nodes | 100 GB | 500 | 60.77% | 5.3K | 821.66 us | 14324 | 1201560.28 | 10 | + +### Single Node + +- tpmC + + ![mogdb-on-kunpeng-1](https://cdn-mogdb.enmotech.com/docs-media/mogdb/about-mogdb/mogdb-on-kunpeng-1.png) + +- System data + + ![mogdb-on-kunpeng-2](https://cdn-mogdb.enmotech.com/docs-media/mogdb/about-mogdb/mogdb-on-kunpeng-2.png) + +### One Primary Node and One Standby Node + +- tpmC + + ![mogdb-on-kunpeng-3](https://cdn-mogdb.enmotech.com/docs-media/mogdb/about-mogdb/mogdb-on-kunpeng-3.png) + +- System data + + ![mogdb-on-kunpeng-4](https://cdn-mogdb.enmotech.com/docs-media/mogdb/about-mogdb/mogdb-on-kunpeng-4.png) + +### One Primary Node and Two Standby Nodes + +- tpmC + + ![mogdb-on-kunpeng-5](https://cdn-mogdb.enmotech.com/docs-media/mogdb/about-mogdb/mogdb-on-kunpeng-5.png) + +- System data + + ![mogdb-on-kunpeng-6](https://cdn-mogdb.enmotech.com/docs-media/mogdb/about-mogdb/mogdb-on-kunpeng-6.png) diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/test-report/performance/3-mogdb-on-x86-performance-test-report.md b/product/en/docs-mogdb/v3.0/about-mogdb/test-report/performance/3-mogdb-on-x86-performance-test-report.md new file mode 100644 index 0000000000000000000000000000000000000000..94644b1bd75005bd4887cf27cdc2002e79821993 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/about-mogdb/test-report/performance/3-mogdb-on-x86-performance-test-report.md @@ -0,0 +1,654 @@ +--- +title: Performance Test for MogDB on x86 Servers +summary: Performance Test for MogDB on x86 Servers +author: Liu Xu +date: 2021-03-04 +--- + +# Performance Test for MogDB on x86 Servers + +## Test Objective + +This document describes the test of MogDB 2.0.0 on x86 servers in scenarios where MogDB is deployed on a single node, one primary node and one standby node, or one primary node and two standby nodes (one synchronous standby and one asynchronous standby). + +## Test Environment + +### Environment Configuration + +| Server Type | Fit Server | NFS5280M5 | +| :---------: | :--------------------------------------: | :--------------------------------------: | +| CPU | 144cIntel® Xeon(R) Gold 6240 CPU@2.60GHz | 64cIntel® Xeon(R) Gold 5218 CPU @2.30GHz | +| Memory | 768G | 128G | +| Hard disk | SAS SSD | SAS SSD | +| NIC | 10GE | 10GE | + +### Test Tool + +| Name | Function | +| :-------------- | :----------------------------------------------------------- | +| Benchmarksql5.0 | Open-source BenchmarkSQL developed based on Java is used for TPC-C test of OLTP database. It is used to evaluate the database transaction processing capability. | + +## Test Procedure + +### MogDB Database Operation + +1. Obtain the database installation package. + +2. Install the database. + +3. 
Create TPCC test user and database. + + ``` + create user [username] identified by ‘passwd’; + grant [origin user] to [username]; + create database [dbname]; + ``` + +4. Disable the database and modify the **postgresql.conf** database configuration file by adding configuration parameters at the end of the file. + + For example, add the following parameters in the single node test: + + ``` + max_connections = 4096 + + allow_concurrent_tuple_update = true + + audit_enabled = off + + checkpoint_segments = 1024 + + cstore_buffers =16MB + + enable_alarm = off + + enable_codegen = false + + enable_data_replicate = off + + full_page_writes = off + + max_files_per_process = 100000 + + max_prepared_transactions = 2048 + + shared_buffers = 350GB + + use_workload_manager = off + + wal_buffers = 1GB + + work_mem = 1MB + + log_min_messages = FATAL + + transaction_isolation = 'read committed' + + default_transaction_isolation = 'read committed' + + synchronous_commit = on + + fsync = on + + maintenance_work_mem = 2GB + + vacuum_cost_limit = 2000 + + autovacuum = on + + autovacuum_mode = vacuum + + autovacuum_max_workers = 5 + + autovacuum_naptime = 20s + + autovacuum_vacuum_cost_delay =10 + + xloginsert_locks = 48 + + update_lockwait_timeout =20min + + enable_mergejoin = off + + enable_nestloop = off + + enable_hashjoin = off + + enable_bitmapscan = on + + enable_material = off + + wal_log_hints = off + + log_duration = off + + checkpoint_timeout = 15min + + enable_save_datachanged_timestamp =FALSE + + enable_thread_pool = on + + thread_pool_attr = '812,4,(cpubind:0-27,32-59,64-91,96-123)' + + enable_double_write = on + + enable_incremental_checkpoint = on + + enable_opfusion = on + + advance_xlog_file_num = 10 + + numa_distribute_mode = 'all' + + track_activities = off + + enable_instr_track_wait = off + + enable_instr_rt_percentile = off + + track_counts =on + + track_sql_count = off + + enable_instr_cpu_timer = off + + plog_merge_age = 0 + + session_timeout = 0 + + enable_instance_metric_persistent = off + + enable_logical_io_statistics = off + + enable_user_metric_persistent =off + + enable_xlog_prune = off + + enable_resource_track = off + + instr_unique_sql_count = 0 + + enable_beta_opfusion = on + + enable_beta_nestloop_fusion = on + + autovacuum_vacuum_scale_factor = 0.02 + + autovacuum_analyze_scale_factor = 0.1 + + client_encoding = UTF8 + + lc_messages = en_US.UTF-8 + + lc_monetary = en_US.UTF-8 + + lc_numeric = en_US.UTF-8 + + lc_time = en_US.UTF-8 + + modify_initial_password = off + + ssl = off + + enable_memory_limit = off + + data_replicate_buffer_size = 16384 + + max_wal_senders = 8 + + log_line_prefix = '%m %u %d %h %p %S' + + vacuum_cost_limit = 10000 + + max_process_memory = 12582912 + + recovery_max_workers = 1 + + recovery_parallelism = 1 + + explain_perf_mode = normal + + remote_read_mode = non_authentication + + enable_page_lsn_check = off + + pagewriter_sleep = 100 + ``` + +### BenchmarkSQL Operation + +1. Modify the configuration file. + + Open the BenchmarkSQL installation directory and find the **[config file]** configuration file in the **run** directory. 
+ + ``` + db=postgres + + driver=org.postgresql.Driver + + conn=jdbc:postgresql://[ip:port]/tpcc?prepareThreshold=1&batchMode=on&fetchsize=10 + + user=[user] + + password=[passwd] + + warehouses=1000 + + loadWorkers=80 + + terminals=812 + + //To run specified transactions per terminal- runMins must equal zero + + runTxnsPerTerminal=0 + + //To run for specified minutes- runTxnsPerTerminal must equal zero + + runMins=30 + + //Number of total transactions per minute + + limitTxnsPerMin=0 + + //Set to true to run in 4.x compatible mode. Set to false to use the + + //entire configured database evenly. + + terminalWarehouseFixed=false #true + + //The following five values must add up to 100 + + //The default percentages of 45, 43, 4, 4 & 4 match the TPC-C spec + + newOrderWeight=45 + + paymentWeight=43 + + orderStatusWeight=4 + + deliveryWeight=4 + + stockLevelWeight=4 + + // Directory name to create for collecting detailed result data. + + // Comment this out to suppress. + + //resultDirectory=my_result_%tY-%tm-%td_%tH%tM%tS + + //osCollectorScript=./misc/os_collector_linux.py + + //osCollectorInterval=1 + + //osCollectorSSHAddr=tpcc@127.0.0.1 + + //osCollectorDevices=net_eth0 blk_sda blk_sdg blk_sdh blk_sdi blk_sdj + ``` + +2. Run **runDataBuild.sh** to generate data. + + ``` + ./runDatabaseBuild.sh [config file] + ``` + +3. Run **runBenchmark.sh** to test the database. + + ``` + ./runBenchmark.sh [config file] + ``` + +### OS Parameters + +``` +vm.dirty_background_ratIO=5 + +vm.dirty_ratIO=10 + +kernel.sysrq=0 + +net.ipv4.ip_forward=0 + +net.ipv4.conf.all.send_redirects=0 + +net.ipv4.conf.default.send_redirects=0 + +net.ipv4.conf.all.accept_source_route=0 + +net.ipv4.conf.default.accept_source_route=0 + +net.ipv4.conf.all.accept_redirects=0 + +net.ipv4.conf.default.accept_redirects=0 + +net.ipv4.conf.all.secure_redirects=0 + +net.ipv4.conf.default.secure_redirects=0 + +net.ipv4.icmp_echo_ignore_broadcasts=1 + +net.ipv4.icmp_ignore_bogus_error_responses=1 + +net.ipv4.conf.all.rp_filter=1 + +net.ipv4.conf.default.rp_filter=1 + +net.ipv4.tcp_syncookies=1 + +kernel.dmesg_restrict=1 + +net.ipv6.conf.all.accept_redirects=0 + +net.ipv6.conf.default.accept_redirects=0 + +net.core.rmem_max = 21299200 + +net.core.rmem_default = 21299200 + +net.core.somaxconn = 65535 + +net.ipv4.tcp_tw_reuse = 1 + +net.sctp.sctp_mem = 94500000 915000000 927000000 + +net.ipv4.tcp_max_tw_buckets = 10000 + +net.ipv4.tcp_rmem = 8192 250000 16777216 + +kernel.sem = 250 6400000 1000 25600 + +net.core.wmem_default = 21299200 + +kernel.shmall = 1152921504606846720 + +net.core.wmem_max = 21299200 + +net.sctp.sctp_rmem = 8192 250000 16777216 + +net.core.netdev_max_backlog = 65535 + +kernel.shmmax = 18446744073709551615 + +net.sctp.sctp_wmem = 8192 250000 16777216 + +net.ipv4.tcp_keepalive_intvl = 30 + +net.ipv4.tcp_keepalive_time = 30 + +net.ipv4.tcp_wmem = 8192 250000 16777216 + +net.ipv4.tcp_max_syn_backlog = 65535 + +vm.oom_panic_on_oom=0 + +kernel.nmi_watchdog=0 + +kernel.nmi_watchdog=0 + +kernel.nmi_watchdog=0 + +kernel.nmi_watchdog=0 + +kernel.nmi_watchdog=0 + +net.ipv4.tcp_timestamps = 1 + +net.ipv4.tcp_tso_win_divisor = 30 + +net.sctp.path_max_retrans = 10 + +net.sctp.max_init_retransmits = 10 + +net.ipv4.tcp_retries1 = 5 + +net.ipv4.tcp_syn_retries = 5 + +net.ipv4.tcp_synack_retries = 5 + +kernel.core_uses_pid=1 + +kernel.core_pattern=/home/core/core-%e-%u-%s-%t-%p + +kernel.nmi_watchdog=0 + +kernel.nmi_watchdog=0 + +kernel.nmi_watchdog=0 + +kernel.nmi_watchdog=0 + +kernel.core_pattern=/home/core/core-%e-%u-%s-%t-%h + 
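+# Note (assumption): these kernel settings are typically appended to /etc/sysctl.conf and applied
+# with sysctl -p; if a key appears more than once, the value applied last takes effect.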
+net.core.netdev_max_backlog = 65535 + +net.core.rmem_default = 21299200 + +net.core.rmem_max = 21299200 + +net.core.somaxconn = 65535 + +net.core.wmem_default = 21299200 + +net.core.wmem_max = 21299200 + +net.ipv4.conf.all.accept_redirects = 0 + +net.ipv4.conf.all.rp_filter = 1 + +net.ipv4.conf.all.secure_redirects = 0 + +net.ipv4.conf.all.send_redirects = 0 + +net.ipv4.conf.default.accept_redirects = 0 + +net.ipv4.conf.default.accept_source_route = 0 + +net.ipv4.conf.default.rp_filter = 1 + +net.ipv4.conf.default.secure_redirects = 0 + +net.ipv4.conf.default.send_redirects = 0 + +net.ipv4.conf.enp135s0.accept_redirects = 0 + +net.ipv4.conf.enp135s0.accept_source_route = 0 + +net.ipv4.conf.enp135s0.forwarding = 1 + +net.ipv4.conf.enp135s0.rp_filter = 1 + +net.ipv4.conf.enp135s0.secure_redirects = 0 + +net.ipv4.conf.enp135s0.send_redirects = 0 + +net.ipv4.tcp_keepalive_intvl = 30 + +net.ipv4.tcp_keepalive_time = 30 + +net.ipv4.tcp_max_syn_backlog = 65535 + +net.ipv4.tcp_max_tw_buckets = 10000 + +net.ipv4.tcp_mem = 362715 483620 725430 + +\#net.ipv4.tcp_mem = 94500000 915000000 927000000 + +net.ipv4.tcp_retries1 = 5 + +net.ipv4.tcp_rmem = 8192 250000 16777216 + +net.ipv4.tcp_syn_retries = 5 + +net.ipv4.tcp_tso_win_divisor = 30 + +net.ipv4.tcp_tw_reuse = 1 + +net.ipv4.tcp_wmem = 8192 250000 16777216 + +net.ipv4.udp_mem = 725430 967240 1450860 + +\#net.ipv4.tcp_max_orphans = 3276800 + +\#net.ipv4.tcp_fin_timeout = 60 + +\#net.ipv4.ip_local_port_range = 26000 65535 + +net.ipv4.tcp_retries2 = 80 + +\#net.ipv4.ip_local_reserved_ports = 20050-30007 + +vm.min_free_kbytes = 40140150 +``` + +### Database Parameters + +| Parameter | MogDB | +| :-------------------------------- | :--------------------------------------------- | +| listen_addresses | Specific IP address | +| port | 26000 | +| max_connectIOns | 4096 | +| wal_level | hot_standby | +| archive_mode | on | +| archive_command | /bin/ture | +| max_wal_senders | 16 | +| wal_keep_segments | 16 | +| max_replicatIOn_slots | 8 | +| hot_standby | on | +| logging_collector | on | +| log_directory | Specify the directory of the installation tool | +| log_filename | PostgreSQL-%Y-%m-%d_%H%M%S.log | +| log_min_duratIOn_statement | 1800000 | +| log_line_prefix | %m%c%d%p%a%x%n%e | +| log_timezone | PRC | +| datestyle | iso,mdy | +| timezone | PRC | +| default_text_search_config | pg_catalog.english | +| applicatIOn_name | dn_6001 | +| max_prepared_transactIOns | 2048 | +| shared_buffers | 350 GB | +| wal_buffers | 1 GB | +| work_mem | 64 MB | +| log_min_messages | FATAL | +| synchronous_commit | on | +| fsync | on | +| maintenance_work_mem | 2 GB | +| autovacuum | on | +| autovacuum_max_workers | 5 | +| autovacuum_naptime | 20s | +| autovacuum_vacuum_cost_delay | 10 | +| enable_mergejoin | off | +| enable_nestloop | off | +| enable_hashjoin | off | +| enable_bitmapscan | on | +| enable_material | off | +| wal_log_hints | off | +| log_duratIOn | off | +| checkpoint_timeout | 15 min | +| track_activities | off | +| track_counts | on | +| autovacuum_vacuum_scale_factor | 0.02 | +| autovacuum_analyze_scale_factor | 0.1 | +| ssl | off | +| local_bind_address | Specific IP address | +| max_inner_tool_connectIOns | 10 | +| password_encryptIOn_type | 0 | +| comm_tcp_mode | on | +| comm_quota_size | 1024 KB | +| max_process_memory | 700 GB | +| bulk_write_ring_size | 2 GB | +| checkpoint_segments | 1024 | +| incremental_checkpoint_timeout | 60s | +| archive_dest | /log/archive | +| enable_slot_log | off | +| data_replicate_buffer_size | 128 MB | +| 
walsender_max_send_size | 8 MB | +| enable_kill_query | off | +| connectIOn_alARM_rate | 0.9 | +| alARM_report_interval | 10 | +| alARM_component | /opt/huawei/snas/bin/snas_cm_cmd | +| lockwait_timeout | 1200s | +| pgxc_node_name | xxx | +| audit_directory | Specify the directory of the installation tool | +| explain_perf_mode | pretty | +| job_queue_processes | 10 | +| default_storage_nodegroup | installatIOn | +| expected_computing_nodegroup | query | +| replicatIOn_type | 1 | +| recovery_max_workers | 4 | +| available_zone | AZ1 | +| allow_concurrent_tuple_update | TRUE | +| audit_enabled | off | +| cstore_buffers | 16 MB | +| enable_alARM | off | +| enable_codegen | FALSE | +| enable_data_replicate | off | +| max_file_per_process | 10000 | +| use_workload_manager | off | +| xloginsert_locks | 48 | +| update_lockwait_timeout | 20 min | +| enable_save_datachanged_timestamp | FALSE | +| enable_thread_pool | off | +| enable_double_write | on | +| enable_incremental_checkpoint | on | +| advance_xlog_file_num | 10 | +| enable_instr_track_wait | off | +| enable_instr_rt_percentile | off | +| track_sql_count | off | +| enable_instr_cpu_timer | off | +| plog_merge_age | 0 | +| sessIOn_timeout | 0 | +| enable_instance_metric_persistent | off | +| enable_logical_IO_statistics | off | +| enable_user_metric_persistent | off | +| enable_xlog_prune | off | +| enable_resource_track | off | +| instr_unique_sql_count | 0 | +| enable_beta_opfusIOn | on | +| enable_bete_netsloop_fusIOn | on | +| remote_read_mode | non_authenticatIOn | +| enable_page_lsn_check | off | +| pagewriter_sleep | 2s | +| enable_opfusIOn | on | +| max_redo_log_size | 100 GB | +| pagewrite_thread_num | 1 | +| bgwrite_thread_num | 1 | +| dirty_page_percent_max | 1 | +| candidate_buf_percent_target | 01 | + +## Test Items and Conclusions + +### Test Result Summary + +| Test Item | Data Volume | Concurrent Transactions | Average CPU Usage | IOPS | IO Latency | Write Ahead Logs | tpmC | Test Time (Minute) | +| -------------------------------------- | ----------- | ----------------------- | ----------------- | ----- | ---------- | ---------------- | -------- | ------------------ | +| Single node | 100 GB | 500 | 29.39% | 6.50K | 1.94 ms | 3974 | 520896.3 | 10 | +| One primary node and one standby node | 100 GB | 500 | 30.4% | 5.31K | 453.2 us | 3944 | 519993.5 | 10 | +| One primary node and two standby nodes | 100 GB | 500 | 26.35% | 7.66K | 531.9 us | 3535 | 480842.2 | 10 | + +### Single Node + +- tpmC + + ![mogdb-on-x86-1](https://cdn-mogdb.enmotech.com/docs-media/mogdb/about-mogdb/mogdb-on-x86-1.png) + + System data + + ![mogdb-on-x86-2](https://cdn-mogdb.enmotech.com/docs-media/mogdb/about-mogdb/mogdb-on-x86-2.png) + +### One Primary Node and One Standby Node + +- tpmC + + ![mogdb-on-x86-3](https://cdn-mogdb.enmotech.com/docs-media/mogdb/about-mogdb/mogdb-on-x86-3.png) + +- System data + + ![mogdb-on-x86-4](https://cdn-mogdb.enmotech.com/docs-media/mogdb/about-mogdb/mogdb-on-x86-4.png) + +### One Primary Node and Two Standby Nodes + +- tpmC + + ![mogdb-on-x86-5](https://cdn-mogdb.enmotech.com/docs-media/mogdb/about-mogdb/mogdb-on-x86-5.png) + +- System data + + ![mogdb-on-x86-6](https://cdn-mogdb.enmotech.com/docs-media/mogdb/about-mogdb/mogdb-on-x86-6.png) diff --git a/product/en/docs-mogdb/v3.0/about-mogdb/usage-limitations.md b/product/en/docs-mogdb/v3.0/about-mogdb/usage-limitations.md new file mode 100644 index 0000000000000000000000000000000000000000..84ea10ed57bb6548e1b19f1771cb49b4a0213799 --- /dev/null +++ 
b/product/en/docs-mogdb/v3.0/about-mogdb/usage-limitations.md
@@ -0,0 +1,27 @@
+---
+title: Usage Limitations
+summary: Usage Limitations
+author: Guo Huan
+date: 2021-06-01
+---
+
+# Usage Limitations
+
+This document describes the common usage limitations of MogDB.
+
+| Item | Upper limit |
+| ----------------------------------------- | ---------------------------------------------------- |
+| Database capacity | Depends on the operating system and hardware |
+| Size of a single table | 32 TB |
+| Size of a single row | 1 GB |
+| Size of a single field in a row | 1 GB |
+| Number of rows in a single table | 281474976710656 (2^48^) |
+| Number of columns in a single table | 250-1600 (varies with the column data types) |
+| Number of indexes in a single table | Unlimited |
+| Number of columns in a compound index | 32 |
+| Number of constraints in a single table | Unlimited |
+| Number of concurrent connections | 10000 |
+| Number of partitions in a partitioned table | 32768 (range partitioning) / 64 (hash/list partitioning) |
+| Size of a single partition | 32 TB |
+| Number of rows in a single partition | 2^55^ |
+| Maximum length of SQL text | 1048576 bytes (1 MB) |
diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/br/1-1-br.md b/product/en/docs-mogdb/v3.0/administrator-guide/br/1-1-br.md
new file mode 100644
index 0000000000000000000000000000000000000000..0ecfdd3338cdd205b6c54c05f6c4c83fb5bc6ef1
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/administrator-guide/br/1-1-br.md
@@ -0,0 +1,93 @@
+---
+title: Overview
+summary: Overview
+author: Guo Huan
+date: 2021-04-27
+---
+
+# Overview
+
+For database security purposes, MogDB provides three backup types, multiple backup and restoration solutions, and data reliability assurance mechanisms.
+
+Backup and restoration can be classified into logical backup and restoration, physical backup and restoration, and flashback.
+
+- Logical backup and restoration: backs up data by logically exporting it. This method dumps the data as it exists at a certain point in time, and data can be restored only to that backup point. A logical backup does not cover data processed between the last backup and the failure, so it suits scenarios where data rarely changes; data damaged by misoperation can be quickly restored from a logical backup. To restore all the data in a database from a logical backup, rebuild the database and import the backup data. Logical backup is not recommended for databases requiring high availability because restoration takes a long time. Because it can be performed on any platform, logical backup is a major approach to migrating and transferring data.
+
+- Physical backup and restoration: copies physical files in the unit of disk blocks from the primary node to the standby node to back up a database. A database can be restored using backup files, such as data files and archive log files. Physical backup is usually used for full backup, quickly backing up and restoring data at a low cost if properly planned.
+
+- Flashback: restores dropped tables from the recycle bin. As in the Windows OS, dropped table information is stored in the recycle bin of the database. The MVCC mechanism is used to restore data to a specified point in time or change sequence number (CSN).
+
+The three data backup and restoration solutions supported by MogDB are as follows. Methods for restoring data in case of an exception differ for different backup and restoration solutions.
+
+**Table 1** Comparison of three backup and restoration types
+
+| Backup Type | Application Scenario | Media | Tool Name | Recovery Time | Advantage and Disadvantage |
+| ----------- | -------------------- | ----- | --------- | ------------- | -------------------------- |
+| Logical backup and restoration | Small volume of data needs to be processed.<br/>You can back up a single table, multiple tables, a single database, or all databases. The backup data needs to be restored using gsql or gs_restore. When the data volume is large, the restoration takes a long time. | - Disk<br/>- SSD | gs_dump | It takes a long time to restore data in plain-text format; restoring from the archive format is faster. | This tool is used to export database information. Users can export a database or its objects (such as schemas, tables, and views). The database can be the default postgres database or a user-specified database. The exported file can be in plain-text format or archive format. Data in plain-text format can be restored only by using gsql, which takes a long time. Data in archive format can be restored only by using gs_restore. The restoration time is shorter than that of the plain-text format. |
+| | | | gs_dumpall | Long data recovery time | This tool is used to export all information of the openGauss database, including the data of the default postgres database, data of user-specified databases, and global objects of all openGauss databases.<br/>Only data in plain-text format can be exported. The exported data can be restored only by using gsql, which takes a long time. |
+| Physical backup and restoration | Huge volume of data needs to be processed. It is mainly used for full backup and restoration as well as the backup of all WAL archive and run logs in the database. | | gs_backup | Small data volume and fast data recovery | The OM tool for exporting database information can be used to export database parameter files and binary files. It helps openGauss back up and restore important data, and display help and version information. During the backup, you can select the type of the backup content. During the restoration, ensure that the backup file exists in the backup directory of each node. During cluster restoration, the cluster information in the static configuration file is used for restoration. It takes a short time to restore only parameter files. |
+| | | | gs_basebackup | During the restoration, you can directly copy and replace the original files, or start the database directly from the backup. The restoration takes a short time. | This tool is used to fully copy the binary files of the server database. Only the database at a certain time point can be backed up. With PITR, you can restore data to a time point after the full backup time point. |
+| | | | gs_probackup | Data can be directly restored to a backup point and the database can be started from the backup. The restoration takes a short time. | gs_probackup is a tool used to manage openGauss database backup and restoration. It periodically backs up openGauss instances. It supports the physical backup of a standalone database or a primary database node in a cluster. It supports the backup of contents in external directories, such as script files, configuration files, log files, and dump files. It supports incremental backup, periodic backup, and remote backup. The time required for incremental backup is shorter than that for full backup: you only need to back up the modified files. Currently, the data directory is backed up by default. If the tablespace is not in the data directory, you need to manually specify the tablespace directory to be backed up. Currently, data can be backed up only on the primary node. |
+| Flashback | Applicable to:<br/>1) A table is deleted by mistake.<br/>2) Data in the tables needs to be restored to a specified time point or CSN. | None | | You can restore a table to the status at a specified time point or before the table structure was deleted within a short period of time. | Flashback can selectively and efficiently undo the impact of a committed transaction and recover from a human error. Before the flashback technology is used, the committed database modification can be retrieved only by means of restoring a backup or PITR, which takes several minutes or even hours. With flashback, it takes only seconds to restore the committed data to the state before the modification, and the restoration time is irrelevant to the database size.<br/>Flashback supports two recovery modes:<br/>- Multi-version data restoration based on MVCC: applicable to the query and restoration of data that is deleted, updated, or inserted by mistake. You can configure the retention period of the old version and run the corresponding query or restoration command to query or restore data to a specified time point or CSN.<br/>- Recovery based on the recycle bin (similar to that on Windows OS): applicable to the recovery of tables that are dropped or truncated by mistake. You can configure the recycle bin switch and run the corresponding restoration command to restore the tables that are dropped or truncated by mistake. |
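+
+As a concrete illustration of the logical path in Table 1, the following sketch backs up and restores a single database with gs_dump and gs_restore. It is an example only: the port, user, target database, and file paths are assumptions and must be adapted to your environment.
+
+```bash
+# Export the postgres database into a custom-format archive (assumed port 26000 and user omm)
+gs_dump -U omm -p 26000 -F c -f /home/omm/backup/postgres.dmp postgres
+
+# Restore the archive into an existing, empty target database
+gs_restore -U omm -p 26000 -d restored_db /home/omm/backup/postgres.dmp
+```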
+ +While backing up and restoring data, take the following aspects into consideration: + +- Whether the impact of data backup on services is acceptable + +- Database restoration efficiency + + To minimize the impact of database faults, try to minimize the restoration duration, achieving the highest restoration efficiency. + +- Data restorability + + Minimize data loss after the database is invalidated. + +- Database restoration cost + + There are many factors that need to be considered while you select a backup policy on the live network, such as backup objects, data volume, and network configuration. Table 2 lists available backup policies and applicable scenarios for each backup policy. + + **Table 2** Backup policies and scenarios + + | Backup Policy | Key Performance Factor | Typical Data Volume | Performance Specifications | + | :----------------------- | :----------------------------------------------------------- | :--------------------------------------------------------- | :----------------------------------------------------------- | + | Database instance backup | - Data amount
- Network configuration | Data volume: PB level
Object quantity: about 1 million | Backup:
- Data transfer rate on each host: 80 Mbit/s (NBU/EISOO+Disk)
- Disk I/O rate (SSD/HDD): about 90% | + | Table backup | - Schema where the table to be backed up resides
- Network configuration (NBU) | Data volume: 10 TB level | Backup: depends on query performance rate and I/O rate
NOTE:
For multi-table backup, the backup time is calculated as follows:
`Total time = Number of tables x Starting time + Total data volume/Data backup speed`
In the preceding information:
- The starting time of a disk is about 5s. The starting time of an NBU is longer than that of a disk (depending on the NBU deployment).
- The data backup speed is about 50 MB/s on a single node. (The speed is evaluated based on the backup of a 1 GB table from a physical host to a local disk.)
The smaller the table is, the lower the backup performance will be. | diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/br/1-2-br.md b/product/en/docs-mogdb/v3.0/administrator-guide/br/1-2-br.md new file mode 100644 index 0000000000000000000000000000000000000000..ed50d65f3e6af5df3fe7d7f467a7d5096aef46e6 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/br/1-2-br.md @@ -0,0 +1,904 @@ +--- +title: Physical Backup and Restoration +summary: Physical Backup and Restoration +author: Guo Huan +date: 2021-04-27 +--- + +# Physical Backup and Restoration + +## gs_basebackup + +### Background + +After MogDB is deployed, problems and exceptions may occur during database running. **gs_basebackup**, provided by MogDB, is used to perform basic physical backup. **gs_basebackup** copies the binary files of the database on the server using a replication protocol. To remotely execute **gs_basebackup**, you need to use the system administrator account. **gs_basebackup** supports hot backup and compressed backup. + +**NOTE:** + +- **gs_basebackup** supports only full backup. +- **gs_basebackup** supports hot backup and compressed backup. +- **gs_basebackup** cannot back up tablespaces containing absolute paths on the same server. This is because the absolute path is unique on the same machine, and brings about conflicts. However, it can back up tablespaces containing absolute paths on different machines. +- If the functions of incremental checkpoint and dual-write are enabled, **gs_basebackup** also backs up dual-write files. +- If the **pg_xlog** directory is a soft link, no soft link is created during backup. Data is directly backed up to the **pg_xlog** directory in the destination path. +- If the backup permission is revoked during the backup, the backup may fail or the backup data may be unavailable. +- MogDB does not support version upgrade. + +### Prerequisites + +- The MogDB database can be connected. +- User permissions are not revoked during the backup. +- In the **pg_hba.conf** file, the replication connection is allowed and the connection is established by a system administrator. +- If the Xlog transmission mode is **stream**, the number of **max_wal_senders** must be configured to at least one. +- If the Xlog transmission mode is **fetch**, the **wal_keep_segments** parameter must be set to a large value so that logs are not removed before the backup ends. +- During the restoration, backup files exist in the backup directory on all the nodes. If backup files are lost on any node, them to it from another node. + +### Syntax + +- Display help information. + + ``` + gs_basebackup -? | --help + ``` + +- Display version information. + + ``` + gs_basebackup -V | --version + ``` + +### Parameter Description + +The **gs_basebackup** tool can use the following types of parameters: + +- -D directory + +Directory for storing backup files. This parameter is mandatory. + +- Common parameters + + - -c, -checkpoint=fast|spread + + Sets the checkpoint mode to **fast** or **spread** (default). + + - -l, -label=LABEL + + Adds tags for the backup. + + - -P, -progress + + Enables the progress report. + + - -v, -verbose + + Enables the verbose mode. + + - -V, -version + + Prints the version and exits. + + - -?, -help + + Displays **gs_basebackup** command parameters. + + - -T, -tablespace-mapping=olddir=newdir + + During the backup, the tablespace in the **olddir** directory is relocated to the **newdir** directory. 
For this to take effect, **olddir** must exactly match the path where the tablespace is located (but it is not an error if the backup does not contain the tablespaces in **olddir**). **olddir** and **newdir** must be absolute paths. If a path happens to contain an equal sign (=), you can escape it with a backslash (). This option can be used multiple times for multiple tablespaces. + + - -F, -format=plain|tar + + Sets the output format to **plain** (default) or **tar**. If this parameter is not set, the default value **-format=plain** is used. The plain format writes the output as a flat file, using the same layout as the current data directory and tablespace. When the cluster has no extra tablespace, the entire database is placed in the target directory. If the cluster contains additional tablespaces, the primary data directory will be placed in the target directory, but all other tablespaces will be placed in the same absolute path on the server. The tar mode writes the output as a tar file in the target directory. The primary data directory is written to a file named **base.tar**, and other tablespaces are named after their OIDs. The generated .tar package must be decompressed using the **gs_tar** command. + + - -X, -xlog-method=fetch|stream + + Sets the Xlog transmission mode. If this parameter is not set, the default value **-xlog-method=stream** is used. The required write-ahead log files (WALs) are included in the backup. This includes all WALs generated during the backup. In fetch mode, WAL files are collected at the end of the backup. Therefore, the **wal_keep_segments** parameter must be set to a large value so that logs are not removed before the backup ends. If it has been rotated when the log is to be transmitted, the backup fails and is unavailable. In stream mode, WALs are streamed when a backup is created. This will open a second connection to the server and start streaming WALs while the backup is running. Therefore, it will use up to two connections configured by the **max_wal_senders** parameter. As long as the client can receive WALs, no additional WALs need to be stored on the host. + + - -x, -xlog + + Equivalent to using **-X** with the fetch method. + + - -Z -compress=level + + Enables gzip compression for the output of the tar file and sets the compression level (0 to 9, where 0 indicates no compression and 9 indicates the best compression). The compression is available only when the tar format is used. The suffix .gz is automatically added to the end of all .tar file names. + + - -z + + Enables gzip compression for tar file output and uses the default compression level. The compression is available only when the tar format is used. The suffix .gz is automatically added to the end of all .tar file names. + + - -t, -rw-timeout + + Sets the checkpoint time limit during backup. The default value is 120s. If the full checkpoint of the database is time-consuming, you can increase the value of **rw-timeout**. + +- Connection parameters + + - -h, -host=HOSTNAME + + Specifies the host name of the machine on which the server is running or the directory for the Unix-domain socket. + + - -p, -port=PORT + + Specifies the port number of the database server. + + You can modify the default port number using this parameter. + + - -U, -username=USERNAME + + Specifies the user that connects to the database. + + - -s, -status-interval=INTERVAL + + Specifies the time for sending status packets to the server, in seconds. + + - -w,-no-password + + Never issues a password prompt. 
+
+  - -W, -password
+
+    Issues a password prompt when the **-U** parameter is used to connect to a local or remote database.
+
+### Example
+
+```bash
+gs_basebackup -D /home/test/trunk/install/data/backup -h 127.0.0.1 -p 21233
+INFO: The starting position of the xlog of the full build is: 0/1B800000. The slot minimum LSN is: 0/1B800000.
+```
+
+### Restoring Data from Backup Files
+
+If a database is faulty, restore it from backup files. **gs_basebackup** backs up the database in binary mode, so you can directly copy the backup files to replace the original files, or start the database directly from the backup.
+
+![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:**
+
+- If the current database instance is running, a port conflict may occur when you start the database from the backup file. In this case, modify the port parameter in the configuration file or specify a port when starting the database.
+- If the backup was taken from a primary/standby deployment, you may need to modify the replication connections between the primary and standby databases, that is, **replconninfo1** and **replconninfo2** in the **postgresql.conf** file.
+
+To restore the original database, perform the following steps:
+
+1. Stop the database server. For details, see *Administrator Guide*.
+2. Copy the original database directory and all tablespaces to another location for future use.
+3. Delete all or part of the files from the original database.
+4. Use the database system user rights to restore the required database files from the backup.
+5. If a link file exists in the database, modify the link file so that it points to the correct file.
+6. Restart the database server and check the database content to ensure that the database is restored to the required status.
+
+![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:**
+
+- Incremental restoration from backup files is not supported.
+- After the restoration, check that the link file in the database is linked to the correct file.
+
+## gs_probackup
+
+### Background
+
+**gs_probackup** is a tool used to manage MogDB database backup and restoration. It periodically backs up MogDB instances so that the server can be restored when the database is faulty.
+
+- It supports the physical backup of a standalone database or a primary database node in a cluster.
+- It supports the backup of contents in external directories, such as script files, configuration files, log files, and dump files.
+- It supports incremental backup, periodic backup, and remote backup.
+- It supports settings on the backup retention policy.
+
+### Prerequisites
+
+- The MogDB database can be connected.
+- To use PTRACK incremental backup, manually add **enable_cbm_tracking = on** to **postgresql.conf**.
+
+### Important Notes
+
+- The backup must be performed by the user who runs the database server.
+- The major version number of the database server to be backed up must be the same as that of the database server to be restored.
+- To back up a database in remote mode using SSH, install the database of the same major version on the local and remote hosts, and run the **ssh-copy-id remote_user@remote_host** command to set up a password-free SSH connection between the local backup user and the remote database user.
+- In remote mode, only the subcommands **add-instance**, **backup**, and **restore** can be executed.
+- Before running the **restore** subcommand, stop the mogdb process.
+- When there is a user-defined tablespace, the **--external-dirs** parameter should be added when backing up, otherwise, the tablespace will not be backed up. +- When the scale of the backup is relatively large, in order to prevent timeout from occurring during the backup process, please adjust the parameters **session_timeout**, **wal_sender_timeout** of the **postgres.conf** file appropriately. And adjust the value of the parameter **--rw-timeout** appropriately in the command line parameters of the backup. +- When restoring, when using the **-T** option to redirect the external directory in the backup to a new directory, please specify the parameter **--external-mapping** at the same time. +- After the incremental backup is restored, the previously created logical replication slot is unavailable and needs to be deleted and rebuilt. + +### Command Description + +- Print the **gs_probackup** version. + + ```bash + gs_probackup -V|--version + gs_probackup version + ``` + +- Display brief information about the **gs_probackup** command. Alternatively, display details about parameters of a specified subcommand of **gs_probackup**. + + ```bash + gs_probackup -?|--help + gs_probackup help [command] + ``` + +- Initialize the backup directory in **backup-path**. The backup directory stores the contents that have been backed up. If the **backup-path** backup path exists, it must be empty. + + ```bash + gs_probackup init -B backup-path [--help] + ``` + +- Initialize a new backup instance in the backup directory of **backup-path** and generate the **pg_probackup.conf** configuration file, which saves the **gs_probackup**settings of the specified data directory **pgdata-path**. + + ```bash + gs_probackup add-instance -B backup-path -D pgdata-path --instance=instance_name + [-E external-directories-paths] + [remote_options] + [--help] + ``` + +- Delete the backup content related to the specified instance from the **backup-path** directory. + + ```bash + gs_probackup del-instance -B backup-path --instance=instance_name + [--help] + ``` + +- Add the specified connection, compression, and log-related settings to the **pg_probackup.conf** configuration file or modify the existing settings. You are not advised to manually edit the **pg_probackup.conf** configuration file. + + ```bash + gs_probackup set-config -B backup-path --instance=instance_name + [-D pgdata-path] [-E external-directories-paths] [--archive-timeout=timeout] + [--retention-redundancy=retention-redundancy] [--retention-window=retention-window] [--wal-depth=wal-depth] + [--compress-algorithm=compress-algorithm] [--compress-level=compress-level] + [-d dbname] [-h hostname] [-p port] [-U username] + [logging_options] [remote_options] + [--help] + ``` + +- Add the backup-related settings to the **backup.control** configuration file or modify the settings. + + ```bash + gs_probackup set-backup -B backup-path --instance=instance_name -i backup-id + [--note=text] [pinning_options] + [--help] + ``` + +- Display the content of the **pg_probackup.conf** configuration file in the backup directory. You can specify **-format=json** to display the information in JSON format. By default, the plain text format is used. + + ```bash + gs_probackup show-config -B backup-path --instance=instance_name + [--format=plain|json] + [--help] + ``` + +- Display the contents of the backup directory. If **instance_name** and **backup_id** are specified, detailed information about the backup is displayed. 
You can specify **-format=json** to display the information in JSON format. By default, the plain text format is used. + + ```bash + gs_probackup show -B backup-path + [--instance=instance_name [-i backup-id]] [--archive] [--format=plain|json] + [--help] + ``` + +- Create a backup for a specified database instance. + + ```bash + gs_probackup backup -B backup-path --instance=instance_name -b backup-mode + [-D pgdata-path] [-C] [-S slot-name] [--temp-slot] [--backup-pg-log] [-j threads_num] [--progress] + [--no-validate] [--skip-block-validation] [-E external-directories-paths] [--no-sync] [--note=text] + [--archive-timeout=timeout] [-t rwtimeout] + [logging_options] [retention_options] [compression_options] + [connection_options] [remote_options] [pinning_options] + [--help] + ``` + +- Restore a specified instance from the backup copy in the **backup-path** directory. If an instance to be restored is specified, **gs_probackup** will look for its latest backup and restore it to the specified recovery target. Otherwise, the latest backup of any instance is used. + + ```bash + gs_probackup restore -B backup-path --instance=instance_name + [-D pgdata-path] [-i backup_id] [-j threads_num] [--progress] [--force] [--no-sync] [--no-validate] [--skip-block-validation] + [--external-mapping=OLDDIR=NEWDIR] [-T OLDDIR=NEWDIR] [--skip-external-dirs] [-I incremental_mode] + [recovery_options] [remote_options] [logging_options] + [--help] + ``` + +- Merge all incremental backups between the specified incremental backup and its parent full backup into the parent full backup. The parent full backup will receive all merged data, while the merged incremental backup will be deleted as redundancy. + + ```bash + gs_probackup merge -B backup-path --instance=instance_name -i backup_id + [-j threads_num] [--progress] [logging_options] + [--help] + ``` + +- Delete a specified backup or delete backups that do not meet the current retention policy. + + ```bash + gs_probackup delete -B backup-path --instance=instance_name + [-i backup-id | --delete-expired | --merge-expired | --status=backup_status] + [--delete-wal] [-j threads_num] [--progress] + [--retention-redundancy=retention-redundancy] [--retention-window=retention-window] + [--wal-depth=wal-depth] [--dry-run] + [logging_options] + [--help] + ``` + +- Verify that all files required for restoring the database exist and are not damaged. If **instance_name**is not specified, **gs_probackup**verifies all available backups in the backup directory. If **instance_name**is specified and no additional options are specified, **gs_probackup**verifies all available backups for this backup instance. If both **instance_name** and **backup-id**or recovery objective-related options are specified, **gs_probackup**checks whether these options can be used to restore the database. + + ```bash + gs_probackup validate -B backup-path + [--instance=instance_name] [-i backup-id] + [-j threads_num] [--progress] [--skip-block-validation] + [--recovery-target-time=time | --recovery-target-xid=xid | --recovery-target-lsn=lsn | --recovery-target-name=target-name] + [--recovery-target-inclusive=boolean] + [logging_options] + [--help] + ``` + +### Parameter Description + +**Common parameters** + +- command + + Specifies subcommands except **version** and **help**: **init**, **add-instance**, **del-instance**, **set-config**, **set-backup**, **show-config**, **show**, **backup**, **restore**, **merge**, **delete**, and **validate**. 
+ +- -?, -help + + Displays help information about the command line parameters of **gs_probackup** and exits. + + Only **-help** can be used in subcommands; **-?** is forbidden. + +- -V, -version + + Prints the **gs_probackup** version and exits. + +- -B *backup-path*, -backup-path=*backup-path* + + Backup path. + + System environment variable: *$BACKUP_PATH* + +- -D *pgdata-path*, -pgdata=*pgdata-path* + + Path of the data directory. + + System environment variable: *$PGDATA* + +- -instance=*instance_name* + + Instance name. + +- -i *backup-id*, -backup-id=*backup-id* + + Unique identifier of a backup. + +- -format=*format* + + Specifies format of the backup information to be displayed. The plain and JSON formats are supported. + + Default value: **plain** + +- -status=*backup_status* + + Deletes all backups in a specified state. The states are as follows: + + - **OK**: Backup is complete and valid. + + - **DONE**: Backup has been completed but not verified. + - **RUNNING**: Backup is in progress. + - **MERGING**: Backups are being merged. + - **DELETING**: Backup is being deleted. + - **CORRUPT**: Some backup files are damaged. + - **ERROR**: Backup fails due to an unexpected error. + - **ORPHAN**: Backup is invalid because one of its parent backups is corrupted or lost. + - -j *threads_num*, -threads=*threads_num* + + Sets the number of concurrent threads for the backup, restoration, and combination processes. + +- -archive + + Displays WAL archiving information. + +- -progress + + Displays progress. + +- -note=*text* + + Adds a note to the backup. + +**Backup-related parameters** + +- -b *backup-mode*, -backup-mode=*backup-mode* + + Specifies the backup mode. The value can be **FULL** or **PTRACK**. + + **FULL**: creates a full backup. The full backup contains all data files. + + **PTRACK**: creates a PTRACK incremental backup. + +- -C, -smooth-checkpoint + + Expands checkpoints within a period of time. By default, **gs_probackup** attempts to complete checkpoints as soon as possible. + +- -S *slot-name*, -slot=*slot-name* + + Specifies the replication slot for WAL stream processing. + +- -temp-slot + + Creates a temporary physical replication slot for WAL stream processing in the backup instance to ensure that all required WAL segments are still available during the backup. + + The default slot name is **pg_probackup_slot**, which can be changed using the **-slot/-S** option. + +- -backup-pg-log + + Includes the log directory in the backup. This directory typically contains log messages. By default, the log directory is included, but log files are not included. If the default log path is changed, you can use the **-E** parameter to back up log files. The following describes how to use the **-E** parameter. + +- -E *external-directories-paths*, -external-dirs=*external-directories-paths* + + Includes the specified directory in the backup. This option is useful for backing up scripts in external data directories, sql dumps, and configuration files. To back up multiple external directories, use colons (:) to separate their paths in Unix. + + Example: -E /tmp/dir1:/tmp/dir2 + +- -skip-block-validation + + Disables block-level verification to speed up backup. + +- -no-validate + + Skips the automatic verification when the backup is complete. + +- -no-sync + + Disables backup file synchronization to the disk. + +- -archive-timeout=*timeout* + + Specifies timeout interval for streaming processing, in seconds. 
+ + Default value: **300** + +- -t rwtimeout + + Specifies timeout interval for a connection, in seconds. + + Default value: **120** + +**Restoration-related parameters** + +- -I, -incremental-mode=none|checksum|lsn + + Reuses the valid pages available in PGDATA if they are not modified. + + Default value: **none** + +- -external-mapping=*OLDDIR=NEWDIR* + + During restoration, the external directory contained in the backup is moved from **OLDDIR** to **NEWDIR**. **OLDDIR** and **NEWDIR** must be absolute paths. If the path contains an equal sign (=), use a backslash () to escape. This option can be specified for multiple directories. + +- -T *OLDDIR=NEWDIR*, -tablespace-mapping=*OLDDIR=NEWDIR* + + Relocates the tablespace from the **OLDDIR** directory to the **NEWDIR** directory during the restoration. **OLDDIR** and **NEWDIR** must be absolute paths. If the path contains an equal sign (=), use a backslash () to escape. This parameter can be specified multiple times for multiple tablespaces. This parameter must be used together with **-external-mapping**. + +- -skip-external-dirs + + Skips the external directories in the backup that are specified using the **-external-dirs** option. The contents of these directories will not be restored. + +- -skip-block-validation + + Skips block-level verification to speed up verification. During the automatic verification before the restoration, only file-level verification is performed. + +- -no-validate + + Skips the backup verification. + +- -force + + Specifies the invalid state that allows ignoring backup. This flag can be used if data needs to be restored from a damaged or invalid backup. Exercise caution when using it. + +**Recovery objective-related parameters (recovery_options)** + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** Currently, continuous WAL archiving PITR cannot be configured. Therefore, parameter usage is restricted as follows: To use continuously archived WAL logs for PITR, perform the following steps: +> +> 1. Replace the target database directory with the physical backup files. +> 2. Delete all files in the database directory **pg_xlog/**. +> 3. Copy the archived WAL log file to the **pg_xlog** file. (Or you can configure **restore_command** in the **recovery.conf** file to skip this step.) +> 4. Create the recovery command file **recovery.conf** in the database directory and specify the database recovery degree. +> 5. Start the database. +> 6. Connect to the database and check whether the database is recovered to the expected status. If the expected status is reached, run the **pg_xlog_replay_resume()** command so that the primary node can provide services externally. + +- -recovery-target-lsn=*lsn* + + Specifies LSN to be restored. Currently, only the backup stop LSN can be specified. + +- -recovery-target-name=*target-name* + + Specifies named savepoint to which data is restored. You can obtain the savepoint by viewing the recovery-name column in the backup. + +- -recovery-target-time=*time* + + Specifies time to which data is restored. Currently, only recovery-time can be specified. + +- -recovery-target-xid=*xid* + + Specifies transaction ID to which data is restored. Currently, only recovery-xid can be specified. + +- -recovery-target-inclusive=*boolean* + + When this parameter is set to **true**, the recovery target will include the specified content. + + When this parameter is set to **false**, the recovery target will not include the specified content. 
+ + This parameter must be used together with **-recovery-target-name**, **-recovery-target-time**, **-recovery-target-lsn**, or **-recovery-target-xid**. + +**Retention-related parameters (retention_options)** + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The following parameters can be used together with the **backup** and **delete** commands. + +- -retention-redundancy=*retention-redundancy* + + Number of full backups retained in the data directory. The value must be a positive integer. The value **0** indicates that the setting is disabled. + + Default value: **0** + +- -retention-window=*retention-window* + + Specifies the retention period. The value must be a positive integer. The value **0** indicates that the setting is disabled. + + Default value: **0** + +- -wal-depth=*wal-depth* + + Latest number of valid backups that must be retained on each timeline to perform the PITR capability. The value must be a positive integer. The value **0** indicates that the setting is disabled. + + Default value: **0** + +- -delete-wal + + Deletes unnecessary WAL files from any existing backup. + +- -delete-expired + + Deletes the backups that do not comply with the retention policy defined in the **pg_probackup.conf** configuration file. + +- -merge-expired + + Merges the oldest incremental backup that meets the retention policy requirements with its expired parent backup. + +- -dry-run + + Displays the status of all available backups. Expired backups will not be deleted or merged. + +**Fixed backup-related parameters (pinning_options)** + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** To exclude certain backups from the established retention policy, you can use the following parameters with the **backup** and **set-backup** commands. + +- -ttl=interval + + Specifies a fixed amount of time to back up data from the restoration time. The value must be a positive integer. The value **0** indicates that the backup is canceled. + + Supported unit: ms, s, min, h, d (default value: **s**) + + For example, **-ttl=30d**. + +- -expire-time=time + + Specifies the timestamp when the backup is invalid. The time stamp must comply with the ISO-8601 standard. + + For example, **-expire-time='2020-01-01 00:00:00+03'**. + +**Log-related parameters (logging_options)** + +Log levels: **verbose**, **log**, **info**, **warning**, **error**, and **off**. + +- -log-level-console=log-level-console + + Sets the level of logs to be sent to the console. Each level contains all the levels following it. A higher level indicates fewer messages sent. If this parameter is set to **off**, the log recording function of the console is disabled. + + Default value: **info** + +- -log-level-file=log-level-file + + Sets the level of logs to be sent to the log file. Each level contains all the levels following it. A higher level indicates fewer messages sent. If this parameter is set to **off**, the log file recording function is disabled. + + Default value: **off** + +- -log-filename=log-filename + + Specifies the name of the log file to be created. The file name can use the strftime mode. Therefore, **%-escapes** can be used to specify the file name that changes with time. + + For example, if the **pg_probackup-%u.log** mode is specified, pg_probackup generates a log file each day of the week, with **%u** replaced by the corresponding decimal number, that is, **pg_probackup-1.log**indicates Monday.**pg_probackup-2.log** indicates Tuesday, and so on. 
+ + This parameter is valid if the **-log-level-file** parameter is specified to enable log file recording. + + Default value: **"pg_probackup.log"** + +- -error-log-filename=error-log-filename + + Specifies the name of the log file that is used only for error logs. The specifying method is the same as that of the **-log-filename** parameter. + + It is used for troubleshooting and monitoring. + +- -log-directory=log-directory + + Specifies the directory where log files are created. The value must be an absolute path. This directory is created when the first log is written. + + Default value: **$BACKUP_PATH/log** + +- -log-rotation-size=log-rotation-size + + Specifies the maximum size of a log file. If the maximum size is reached, the log file will be circulated after the **gs_probackup** command is executed. The **help** and **version** commands will not lead to a log file circulation. The value **0** indicates that the file size-based loop is disabled. + + The unit can be KB, MB, GB, or TB. The default unit is **KB**. + + Default value: **0** + +- -log-rotation-age=log-rotation-age + + Maximum life cycle of a log file. If the maximum size is reached, the log file will be circulated after the **gs_probackup** command is executed. The **help** and **version** commands will not lead to a log file circulation. The **$BACKUP_PATH/log/log_rotation** directory saves the time of the last created log file. The value **0** indicates that the time-based loop is disabled. + + Supported unit: ms, s, min, h, d (default value: **min**) + + Default value: **0** + +**Connection-related parameters (connection_options)** + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The following parameters can be used together with the **backup** command. + +- -d *dbname*, -pgdatabase=dbname + + Specifies the name of the database to connect to. This connection is only used to manage the backup process. Therefore, you can connect to any existing database. If this parameter is not specified in the command line, the *PGDATABASE* environment variable, or the **pg_probackup.conf** configuration file, **gs_probackup** attempts to obtain the value from the *PGUSER* environment variable. If the *PGUSER* variable is not set, the value is obtained from the current user name. + + System environment variable: $PGDATABASE + +- -h *hostname*, -pghost=hostname + + Specifies the host name of the system on which the server is running. If the value begins with a slash (/), it is used as the directory for the UNIX domain socket. + + System environment variable: $PGHOST + + Default value: **local socket** + +- -p *port*, -pgport=_p_*ort* + + Specifies the TCP port or local Unix domain socket file name extension on which the server is listening for connections. + + System environment variable: $PGPORT + + Default value: **5432** + +- -U *username*, -pguser=username + + Specifies the username of the host to be connected. + + System environment variable: $PGUSER + +- -w, -no-password + + Never issues a password prompt. The connection attempt fails if the host requires password verification and the password is not provided in other ways. This parameter is useful in batch jobs and the scripts that require no user password. + +- -W *password*, -password=password + + User password for database connection. If the host uses the trust authentication policy, the administrator does not need to enter the **-W** parameter. 
If the **-W** parameter is not provided and you are not a system administrator, the system will ask you to enter a password. + +**Compression-related parameters (compression_options)** + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif)**NOTE:** The following parameters can be used together with the **backup** command. + +- -compress-algorithm=compress-algorithm + + Specifies the algorithm used to compress data file. + + The value can be **zlib**, **pglz**, or **none**. If **zlib** or **pglz** is set, compression is enabled. By default, the compression function is disabled. + + Default value: **none** + +- -compress-level=compress-level + + Specifies the compression level. Value range: 0-9 + + - **0** indicates no compression. + - **1** indicates that the compression ratio is the lowest and processing speed the fastest. + - **9** indicates that the compression ratio is the highest and processing speed the slowest. + - This parameter can be used together with **-compress-algorithm**. + + Default value: **1** + +- -compress + + Compresses with **-compress-algorithm=zlib** and **-compress-level=1**. + +**Remote mode-related parameters (remote_options)** + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif)**NOTE:** The following are parameters that remotely run **gs_probackup** through SSH, and can be used together with the **add-instance**, **set-config**, **backup**, and **restore** commands. + +- -remote-proto=protocol + + Specifies the protocol used for remote operations. Currently, only the SSH protocol is supported. Valid value: + + **ssh**: enables the remote backup mode through SSH. This is the default. + + **none**: The remote mode is disabled explicitly. + + If **-remote-host** is specified, this parameter can be omitted. + +- -remote-host=destination + + Specifies the IP address or host name of the remote host to be connected. + +- -remote-port=port + + Specifies the port number of the remote host to be connected. + + Default value: **22** + +- -remote-user=*username* + + Specifies the remote host user for SSH connection. If this parameter is not specified, the user who initiates the SSH connection is used. + + Default value: the current user. + +- -remote-path=path + + Specifies the installation directory of **gs_probackup** in the remote system. + + Default value: current path + + - --remote-lib=_libpath_ + + Specifies the lib directory installed by **gs_probackup** in the remote system. + +- -ssh-options=*ssh_options* + + Specifies the character string of the SSH command line parameter. + + Example: -ssh-options='-c cipher_spec -F configfile' + +>**NOTE:** +> +>* If the server does not respond due to a temporary network fault, **gs_probackup**will exit after waiting for **archive-timeout** seconds (300 seconds is set by default). +>* If the LSN of the standby server is different from that of the primary server, the database continuously updates the following log information. In this case, you need to rebuild the standby server. +> +>```bash +>LOG: walsender thread shut down +>LOG: walsender thread started +>LOG: received wal replication command: IDENTIFY_VERSION +>LOG: received wal replication command: IDENTIFY_MODE +>LOG: received wal replication command: IDENTIFY_SYSTEM +>LOG: received wal replication command: IDENTIFY_CONSISTENCE 0/D0002D8 +>LOG: remote request lsn/crc: [xxxxx] local max lsn/crc: [xxxxx] +>``` + +### Backup Process + +1. Initialize the backup directory. 
Create the **backups/** and **wal/** subdirectories in the specified directory to store backup files and WAL files respectively. + + ```bash + gs_probackup init -B backup_dir + ``` + +2. Add a new backup instance. **gs_probackup** can store backups of multiple database instances in the same backup directory. + + ```bash + gs_probackup add-instance -B backup_dir -D data_dir -instance instance_name + ``` + +3. Create a backup for a specified database instance. Before performing an incremental backup, you must create at least one full backup. + + ```bash + gs_probackup backup -B backup_dir -instance instance_name -b backup_mode + ``` + +4. Restore data from the backup of a specified instance. + + ```bash + gs_probackup restore -B backup_dir -instance instance_name -D pgdata-path -i backup_id + ``` + +### Troubleshooting + +| Problem Description | Cause and Solution | +| :----------------------------------------------------------- | :----------------------------------------------------------- | +| ERROR: query failed: ERROR: canceling statement due to conflict with recovery
| Cause: An operation performed on the standby node is reading a stored row while the corresponding row has been modified or deleted on the primary node and the change is being replayed from the Xlog on the standby node. As a result, the operation on the standby node is canceled.
Solution:
1. Increase the values of the following parameters:
max_standby_archive_delay
max_standby_streaming_delay
2. Add the following configuration item:
hot_standby_feedback = on |
+
+## PITR Recovery
+
+### Background
+
+When a database breaks down or needs to be rolled back to a previous state, the point-in-time recovery (PITR) function of MogDB can be used to restore the database to any point in time after the backup and archive data is generated.
+
+**NOTE:**
+
+- PITR can only restore the database to a point in time after the physical backup data is generated.
+- Only the primary node can be restored using PITR. The standby node needs to be fully built to synchronize data with the primary node.
+
+### Prerequisites
+
+- Full data files have been physically backed up.
+- WAL log files have been archived.
+
+### PITR Recovery Process
+
+1. Replace the target database directory with the physical backup files.
+2. Delete all files in the database directory **pg_xlog/**.
+3. Copy the archived WAL log file to the **pg_xlog** file. (Or you can configure **restore_command** in the **recovery.conf** file to skip this step.)
+4. Create the recovery command file **recovery.conf** in the database directory and specify the database recovery degree.
+5. Start the database.
+6. Connect to the database and check whether the database is recovered to the expected status.
+7. If the expected status is reached, run the **pg_xlog_replay_resume()** command so that the primary node can provide services externally.
+
+### Configuring the recovery.conf File
+
+**Archive Recovery Configuration**
+
+- restore_command = string
+
+  The **shell** command used to obtain an archived WAL file from the WAL file series. Any %f in the string is replaced by the name of the file to retrieve from the archive, and any %p is replaced by the path name to copy it to on the server. Any %r is replaced by the name of the file containing the last valid restart point.
+
+  For example:
+
+  ```bash
+  restore_command = 'cp /mnt/server/archivedir/%f %p'
+  ```
+
+- archive_cleanup_command = string
+
+  This optional parameter declares a **shell** command that is executed at each restart point. **archive_cleanup_command** provides a mechanism for deleting unnecessary archived WAL files from the standby database. Any %r is replaced by the name of the file containing the last valid restart point. That is the earliest file that must be kept to allow recovery to be restartable, so all files older than %r can be safely removed.
+
+  For example:
+
+  ```bash
+  archive_cleanup_command = 'pg_archivecleanup /mnt/server/archivedir %r'
+  ```
+
+  If multiple standby servers need to be recovered from the same archive path, ensure that WAL files are not deleted from any standby server before the recovery.
+
+- recovery_end_command = string
+
+  This parameter is optional and is used to declare a **shell** command that is executed only when the recovery is complete. **recovery_end_command** provides a cleanup mechanism for future replication and recovery.
+
+**Recovery Object Configuration**
+
+- recovery_target_name = string
+
+  This parameter declares that data is recovered to the named restore point created using pg_create_restore_point().
+
+  For example:
+
+  ```bash
+  recovery_target_name = 'restore_point_1'
+  ```
+
+- recovery_target_time = timestamp
+
+  This parameter declares that data is recovered to the specified timestamp.
+
+  For example:
+
+  ```bash
+  recovery_target_time = '2020-01-01 12:00:00'
+  ```
+
+- recovery_target_xid = string
+
+  This parameter declares that data is recovered to the specified transaction ID.
+ + For example: + + ```bash + recovery_target_xid = '3000' + ``` + +- recovery_target_lsn = string + + This parameter declares that the name is recovered to the LSN specified by log. + + For example: + + ```bash + recovery_target_lsn = '0/0FFFFFF' + ``` + +- recovery_target_inclusive = boolean + + This parameter declares whether to stop the recovery after the recovery target is specified (**true**) or before the recovery target is specified (**false**). This declaration supports only the recovery targets **recovery_target_time**, **recovery_target_xid**, and **recovery_target_lsn**. + + For example: + + ```bash + recovery_target_inclusive = true + ``` + +**NOTE:** + +- Only one of the four configuration items **recovery_target_name**, **recovery_target_time**, **recovery_target_xid**, and **recovery_target_lsn** can be used at a time. +- If no recovery targets are configured or the configured target does not exist, data is recovered to the latest WAL log point by default. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/br/1-3-br.md b/product/en/docs-mogdb/v3.0/administrator-guide/br/1-3-br.md new file mode 100644 index 0000000000000000000000000000000000000000..6803a980f248053a6f03c6dd4d698cdc98434e30 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/br/1-3-br.md @@ -0,0 +1,1311 @@ +--- +title: Logical Backup and Restoration +summary: Logical Backup and Restoration +author: Guo Huan +date: 2021-04-27 +--- + +# Logical Backup and Restoration + +## gs_dump + +### Background + +gs_dump, provided by MogDB, is used to export database information. You can export a database or its objects (such as schemas, tables, and views). The database can be the default mogdb database or a user-specified database. + +**gs_dump** is executed by OS user **omm**. + +When **gs_dump** is used to export data, other users can still access (read and write) MogDB databases. + +**gs_dump** can export complete, consistent data. For example, if **gs_dump** is started to export database A at T1, data of the database at that time point will be exported, and modifications on the database after that time point will not be exported. + +**gs_dump** can export database information to a plain-text SQL script file or archive file. + +- Plain-text SQL script: It contains the SQL statements required to restore the database. You can use gsql to execute the SQL script. With only a little modification, the SQL script can rebuild a database on other hosts or database products. +- Archive file: It contains data required to restore the database. It can be a tar-, directory-, or custom-format archive. For details, see Table 1. The export result must be used with **gs_restore**to restore the database. The system allows users to select or even to sort the content to be imported. + +### Functions + +**gs_dump** can create export files in four formats, which are specified by **-F** or **-format=**, as listed in Table 1. + +**Table 1** Formats of exported files + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| Format | Value of -F | Description | Suggestion | Corresponding Import Tool |
+| :--- | :--- | :--- | :--- | :--- |
+| Plain-text | p | A plain-text script file containing SQL statements and commands. The commands can be executed on gsql, a command line terminal, to recreate database objects and load table data. | You are advised to use plain-text exported files for small databases. | Before using gsql to restore database objects, you can use a text editor to edit the plain-text export file as required. |
+| Custom | c | A binary file that allows the restoration of all or selected database objects from an exported file. | You are advised to use custom-format archive files for medium or large databases. | You can use gs_restore to import database objects from a custom-format archive. |
+| Directory | d | A directory containing directory files and the data files of tables and BLOB objects. | - | You can use gs_restore to import database objects from a directory archive. |
+| .tar | t | A tar-format archive that allows the restoration of all or selected database objects from an exported file. It cannot be further compressed and has an 8-GB limitation on the size of a single table. | - | You can use gs_restore to import database objects from a .tar archive. |
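+
+For instance, the pairing between an export format and its import tool can be sketched as follows (the port number, database name, and file paths are placeholders reused from the examples later in this section):
+
+```bash
+# Custom-format archive (-F c): exported with gs_dump, imported with gs_restore.
+gs_dump -p 37300 mogdb -F c -f backup/MPPDB_backup.dmp
+gs_restore -p 37300 -d mogdb backup/MPPDB_backup.dmp
+
+# Plain-text script (-F p): exported with gs_dump, executed with gsql to restore.
+gs_dump -p 37300 mogdb -F p -f backup/MPPDB_backup.sql
+gsql -p 37300 mogdb -r -f backup/MPPDB_backup.sql
+```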
+ +![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +To reduce the size of an exported file, you can use **gs_dump** to compress it to a plain-text file or custom-format file. By default, a plain-text file is not compressed when generated. When a custom-format archive is generated, a medium level of compression is applied by default. Archived exported files cannot be compressed using **gs_dump**. When a plain-text file is exported in compressed mode, **gsql** fails to import data objects. + +### Precautions + +Do not modify an exported file or its content. Otherwise, restoration may fail. + +To ensure the data consistency and integrity, **gs_dump** acquires a share lock on a table to be dumped. If another transaction has acquired a share lock on the table, **gs_dump** waits until this lock is released and then locks the table for dumping. If the table cannot be locked within the specified time, the dump fails. You can customize the timeout duration to wait for lock release by specifying the **-lock-wait-timeout** parameter. + +### Syntax + +```bash +gs_dump [OPTION]... [DBNAME] +``` + +![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +**DBNAME** does not follow a short or long option. It specifies the database to be connected. +For example: +Specify **DBNAME** without a **-d** option preceding it. + +```bash +gs_dump -p port_number mogdb -f dump1.sql +``` + +or + +``` +export PGDATABASE=mogdb +gs_dump -p port_number -f dump1.sql +``` + +Environment variable:**PGDATABASE** + +### Parameter Description + +Common parameters + +- -f, -file=FILENAME + + Sends the output to the specified file or directory. If this parameter is omitted, the standard output is generated. If the output format is **(-F c/-F d/-F t)**, the **-f** parameter must be specified. If the value of the **-f** parameter contains a directory, the current user must have the read and write permissions on the directory, and the directory cannot be an existing one. + +- -F, -format=c|d|t|p + + Selects the exported file format. The format can be: + + - **p|plain**: Generates a text SQL script file. This is the default value. + + - **c|custom**: Outputs a custom-format archive as a directory to be used as the input of **gs_restore**. This is the most flexible output format in which users can manually select it and reorder the archived items during restoration. An archive in this format is compressed by default. + + - **d|directory**: Creates a directory containing directory files and the data files of tables and BLOBs. + + - **t|tar**: Outputs a .tar archive as the input of **gs_restore**. The .tar format is compatible with the directory format. Extracting a .tar archive generates a valid directory-format archive. However, the .tar archive cannot be further compressed and has an 8-GB limitation on the size of a single table. The order of table data items cannot be changed during restoration. + + A .tar archive can be used as input of **gsql**. + +- -v, -verbose + + Specifies the verbose mode. If it is specified, **gs_dump** writes detailed object comments and the number of startups/stops to the dump file, and progress messages to standard error. + +- -V, -version + + Prints the **gs_dump** version and exits. + +- -Z, -compress=0-9 + + Specifies the used compression level. + + Value range: 0-9 + + - **0** indicates no compression. + - **1** indicates that the compression ratio is the lowest and processing speed the fastest. 
+ - **9** indicates that the compression ratio is the highest and processing speed the slowest. + + For the custom-format archive, this option specifies the compression level of a single table data segment. By default, data is compressed at a medium level. The plain-text and .tar archive formats do not support compression currently. + +- -lock-wait-timeout=TIMEOUT + + Do not keep waiting to obtain shared table locks since the beginning of the dump. Consider it as failed if you are unable to lock a table within the specified time. The timeout period can be specified in any of the formats accepted by **SET statement_timeout**. + +- -?, -help + + Displays help about **gs_dump** parameters and exits. + +Dump parameters: + +- -a, -data-only + + Generates only the data, not the schema (data definition). Dump the table data, big objects, and sequence values. + +- -b, -blobs + + Specifies a reserved port for function expansion. This parameter is not recommended. + +- -c, -clean + + Before writing the command of creating database objects into the backup file, writes the command of clearing (deleting) database objects to the backup files. (If no objects exist in the target database, **gs_restore** probably displays some error information.) + + This parameter is used only for the plain-text format. For the archive format, you can specify the option when using **gs_restore**. + +- -C, -create + + The backup file content starts with the commands of creating the database and connecting to the created database. (If the command script is executed in this mode, you can specify any database to run the command for creating a database. The data is restored to the created database instead of the specified database.) + + This parameter is used only for the plain-text format. For the archive format, you can specify the option when using **gs_restore**. + +- -E, -encoding=ENCODING + + Creates a dump file in the specified character set encoding. By default, the dump file is created in the database encoding. (Alternatively, you can set the environment variable **PGCLIENTENCODING** to the required dump encoding.) + +- -n, -schema=SCHEMA + + Dumps only schemas matching the schema names. This option contains the schema and all its contained objects. If this option is not specified, all non-system schemas in the target database will be dumped. Multiple schemas can be selected by specifying multiple **-n** options. The schema parameter is interpreted as a pattern according to the same rules used by the **\d** command of **gsql**. Therefore, multiple schemas can also be selected by writing wildcard characters in the pattern. When you use wildcard characters, quote the pattern to prevent the shell from expanding the wildcard characters. + + ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + + - If **-n** is specified, **gs_dump** does not dump any other database objects which the selected schemas might depend upon. Therefore, there is no guarantee that the results of a specific-schema dump can be automatically restored to an empty database. + - If **-n** is specified, the non-schema objects are not dumped. + + Multiple schemas can be dumped. Entering **-n schemaname** multiple times dumps multiple schemas. + + For example: + + ```bash + gs_dump -h host_name -p port_number mogdb -f backup/bkp_shl2.sql -n sch1 -n sch2 + ``` + + In the preceding example, **sch1** and **sch2** are dumped. + +- -N, -exclude-schema=SCHEMA + + Does not dump any schemas matching the schemas pattern. 
The pattern is interpreted according to the same rules as for **-n**. **-N** can be specified multiple times to exclude schemas matching any of the specified patterns. + + When both **-n** and **-N** are specified, the schemas that match at least one **-n** option but no **-N** is dumped. If **-N** is specified and **-n** is not, the schemas matching **-N** are excluded from what is normally dumped. + + Dump allows you to exclude multiple schemas during dumping. + + Specify **-N exclude schema name** to exclude multiple schemas during dumping. + + For example: + + ```bash + gs_dump -h host_name -p port_number mogdb -f backup/bkp_shl2.sql -N sch1 -N sch2 + ``` + + In the preceding example, **sch1** and **sch2** will be excluded during the dumping. + +- -o, -oids + + Dumps object identifiers (OIDs) as parts of the data in each table. Use this option if your application references the OID columns in some way. If the preceding situation does not occur, do not use this parameter. + +- -O, -no-owner + + Do not output commands to set ownership of objects to match the original database. By default, **gs_dump** issues the **ALTER OWNER** or **SET SESSION AUTHORIZATION** statement to set ownership of created database objects. These statements will fail when the script is running unless it is started by a system administrator (or the same user that owns all of the objects in the script). To make a script that can be stored by any user and give the user ownership of all objects, specify **-O**. + + This parameter is used only for the plain-text format. For the archive format, you can specify the option when using **gs_restore**. + +- -s, -schema-only + + Dumps only the object definition (schema) but not data. + +- -S, -sysadmin=NAME + + Specifies a reserved port for function expansion. This parameter is not recommended. + +- -t, -table=TABLE + + Specifies a list of tables, views, sequences, or foreign tables to be dumped. You can use multiple **-t** parameters or wildcard characters to specify tables. + + When you use wildcard characters, quote patterns to prevent the shell from expanding the wildcard characters. + + The **-n** and **-N** options have no effect when **-t** is used, because tables selected by using **-t** will be dumped regardless of those options. + + ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + + - The number of **-t** parameters must be less than or equal to 100. + - If the number of **-t** parameters is greater than 100, you are advised to use the **-include-table-file** parameter to replace some **-t** parameters. + - If **-t** is specified, **gs_dump** does not dump any other database objects which the selected tables might depend upon. Therefore, there is no guarantee that the results of a specific-table dump can be automatically restored to an empty database. + - **-t tablename** only dumps visible tables in the default search path. **-t '\*.tablename'** dumps *tablename* tables in all the schemas of the dumped database. **-t schema.table** dumps tables in a specific schema. + - **-t tablename** does not export trigger information from a table. + + For example: + + ```bash + gs_dump -h host_name -p port_number mogdb -f backup/bkp_shl2.sql -t schema1.table1 -t schema2.table2 + ``` + + In the preceding example, **schema1.table1** and **schema2.table2** are dumped. + +- -include-table-file=FILENAME + + Specifies the table file to be dumped. 
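+
+  For example, a minimal sketch (the list file name and the tables it names are hypothetical; the file uses the schema1.table1 format described for **-exclude-table-file** below):
+
+  ```bash
+  # tables.list (hypothetical contents):
+  #   schema1.table1
+  #   schema2.table2
+  gs_dump -h host_name -p port_number mogdb -f backup/bkp_tables.sql --include-table-file=tables.list
+  ```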
+ +- -T, -exclude-table=TABLE + + Specifies a list of tables, views, sequences, or foreign tables not to be dumped. You can use multiple **-T** parameters or wildcard characters to specify tables. + + When **-t** and **-T** are input, the object will be stored in **-t** list not **-T** table object. + + For example: + + ```bash + gs_dump -h host_name -p port_number mogdb -f backup/bkp_shl2.sql -T table1 -T table2 + ``` + + In the preceding example, **table1** and **table2** are excluded from the dumping. + +- -exclude-table-file=FILENAME + + Specifies the table files that do not need to be dumped. + + ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + Same as **-include-table-file**, the content format of this parameter is as follows: + schema1.table1 schema2.table2 …… + +- -x, -no-privileges|-no-acl + + Prevents the dumping of access permissions (grant/revoke commands). + +- -binary-upgrade + + Specifies a reserved port for function expansion. This parameter is not recommended. + +- -binary-upgrade-usermap="USER1=USER2" + + Specifies a reserved port for function expansion. This parameter is not recommended. + +- -column-inserts|-attribute-inserts + + Exports data by running the **INSERT** command with explicit column names **{INSERT INTO table (column, …) VALUES …}**. This will cause a slow restoration. However, since this option generates an independent command for each row, an error in reloading a row causes only the loss of the row rather than the entire table content. + +- -disable-dollar-quoting + + Disables the use of dollar sign ($) for function bodies, and forces them to be quoted using the SQL standard string syntax. + +- -disable-triggers + + Specifies a reserved port for function expansion. This parameter is not recommended. + +- -exclude-table-data=TABLE + + Does not dump data that matches any of table patterns. The pattern is interpreted according to the same rules as for **-t**. + + **-exclude-table-data** can be entered more than once to exclude tables matching any of several patterns. When you need the specified table definition rather than data in the table, this option is helpful. + + To exclude data of all tables in the database, see -schema-only. + +- -inserts + + Dumps data by the **INSERT** statement (rather than **COPY**). This will cause a slow restoration. + + However, since this option generates an independent command for each row, an error in reloading a row causes only the loss of the row rather than the entire table content. The restoration may fail if you rearrange the column order. The **-column-inserts** option is unaffected against column order changes, though even slower. + +- -no-security-labels + + Specifies a reserved port for function expansion. This parameter is not recommended. + +- -no-tablespaces + + Does not issue commands to select tablespaces. All the objects will be created during restoration, no matter which tablespace is selected when using this option. + + This parameter is used only for the plain-text format. For the archive format, you can specify the option when using **gs_restore**. + +- -no-unlogged-table-data + + Specifies a reserved port for function expansion. This parameter is not recommended. + +- -non-lock-table + + Specifies a reserved port for function expansion. This parameter is not recommended. + +- -include-alter-table + + Dumps deleted columns of tables. This option records deleted columns. + +- -quote-all-identifiers + + Forcibly quotes all identifiers. 
This parameter is useful when you dump a database for migration to a later version, in which additional keywords may be introduced. + +- -section=SECTION + + Specifies dumped name sections (pre-data, data, or post-data). + +- -serializable-deferrable + + Uses a serializable transaction for the dump to ensure that the used snapshot is consistent with later database status. Perform this operation at a time point in the transaction flow, at which everything is normal. This ensures successful transaction and avoids serialization failures of other transactions, which requires serialization again. + + This option has no benefits for disaster recovery. During the upgrade of the original database, loading a database as a report or loading other shared read-only dump is helpful. If the option does not exist, dump reveals a status which is different from the submitted sequence status of any transaction. + + This option will make no difference if there are no active read-write transactions when **gs_dump** is started. If the read-write transactions are in active status, the dump start time will be delayed for an uncertain period. + +- -use-set-session-authorization + + Specifies that the standard SQL **SET SESSION AUTHORIZATION** command rather than **ALTER OWNER** is returned to ensure the object ownership. This makes dumping more standard. However, if a dump file contains objects that have historical problems, restoration may fail. A dump using **SET SESSION AUTHORIZATION** requires the system administrator permissions, whereas **ALTER OWNER** requires lower permissions. + +- -with-encryption=AES128 + + Specifies that dumping data needs to be encrypted using AES128. + +- -with-key=KEY + + Specifies that the key length of AES128 must be 16 bytes. + +**NOTE:** + +When using the gs_dump tool for encrypted export, only plain format export is supported. The data exported through -F plain needs to be imported through the gsql tool, and if it is imported through encryption, the -with-key parameter must be specified when importing through gsql. + +- -include-depend-objs + + Includes information about the objects that depend on the specified object in the backup result. This parameter takes effect only if the **-t** or **-include-table-file** parameter is specified. + +- -exclude-self + + Excludes information about the specified object from the backup result. This parameter takes effect only if the **-t** or **-include-table-file** parameter is specified. + +- -dont-overwrite-file + + The existing files in plain-text, .tar, and custom formats will be overwritten. This option is not used for the directory format. + + For example: + + Assume that the **backup.sql** file exists in the current directory. If you specify **-f backup.sql** in the input command, and the **backup.sql** file is generated in the current directory, the original file will be overwritten. + + If the backup file already exists and **-dont-overwrite-file** is specified, an error will be reported with the message that the dump file exists. + + ```bash + gs_dump -p port_number mogdb -f backup.sql -F plain --dont-overwrite-file + ``` + +![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + +- The **-s/-schema-only** and **-a/-data-only** parameters do not coexist. +- The **-c/-clean** and **-a/-data-only** parameters do not coexist. +- **-inserts/-column-inserts** and **-o/-oids** do not coexist, because **OIDS** cannot be set using the **INSERT** statement. 
+- **-role** must be used in conjunction with **-rolepassword**. +- **-binary-upgrade-usermap** must be used in conjunction with **-binary-upgrade**. +- **-include-depend-objs** or **-exclude-self** takes effect only when **-t** or **-include-table-file** is specified. +- **-exclude-self** must be used in conjunction with **-include-depend-objs**. + +Connection parameters: + +- -h, -host=HOSTNAME + + Specifies the host name. If the value begins with a slash (/), it is used as the directory for the UNIX domain socket. The default value is taken from the **PGHOST** environment variable (if available). Otherwise, a Unix domain socket connection is attempted. + + This parameter is used only for defining names of the hosts outside MogDB. The names of the hosts inside MogDB must be 127.0.0.1. + + Example:**host name** + + Environment variable:**PGHOST** + +- -p, -port=PORT + + Specifies the host port number. If the thread pool function is enabled, you are advised to use **pooler port**, that is, the host port number plus 1. + + Environment variable:**PGPORT** + +- -U, -username=NAME + + Specifies the username of the host to be connected. + + If the username of the host to be connected is not specified, the system administrator is used by default. + + Environment variable:**PGUSER** + +- -w, -no-password + + Never issues a password prompt. The connection attempt fails if the host requires password verification and the password is not provided in other ways. This parameter is useful in batch jobs and scripts in which no user password is required. + +- -W, -password=PASSWORD + + Specifies the user password for connection. If the host uses the trust authentication policy, the administrator does not need to enter the **-W** option. If the **-W** option is not provided and you are not a system administrator, the Dump Restore tool will ask you to enter a password. + +- -role=ROLENAME + + Specifies a role name to be used for creating the dump. If this option is selected, the **SET ROLE** command will be issued after the database is connected to **gs_dump**. It is useful when the authenticated user (specified by **-U**) lacks the permissions required by **gs_dump**. It allows the user to switch to a role with the required permissions. Some installations have a policy against logging in directly as a super administrator. This option allows dumping data without violating the policy. + +- -rolepassword=ROLEPASSWORD + + Specifies the password for a role. + +![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + +If any local additions need to be added to the template1 database in MogDB, restore the output of **gs_dump** into an empty database with caution. Otherwise, you are likely to obtain errors due to duplicate definitions of the added objects. To create an empty database without any local additions, data from template0 rather than template1. Example: + +``` +CREATE DATABASE foo WITH TEMPLATE template0; +``` + +The .tar file size must be smaller than 8 GB. (This is the .tar file format limitations.) The total size of a .tar archive and any of the other output formats are not limited, except possibly by the OS. + +The dump file generated by **gs_dump** does not contain the statistics used by the optimizer to make execution plans. Therefore, you are advised to run **ANALYZE** after restoring from a dump file to ensure optimal performance. The dump file does not contain any **ALTER DATABASE … SET** commands. 
These settings are dumped by **gs_dumpall**, along with database users and other installation settings. + +### Examples + +Use **gs_dump** to dump a database as a SQL text file or a file in other formats. + +In the following examples, **Bigdata@123** indicates the password for the database user. **backup/MPPDB_backup.sql** indicates an exported file where **backup** indicates the relative path of the current directory. **37300** indicates the port number of the database server. **mogdb** indicates the name of the database to be accessed. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** Before exporting files, ensure that the directory exists and you have the read and write permissions on the directory. + +Example 1: Use **gs_dump** to export the full information of the mogdb database. The exported **MPPDB_backup.sql** file is in plain-text format. + +```bash +gs_dump -U omm -W Bigdata@123 -f backup/MPPDB_backup.sql -p 37300 mogdb -F p +gs_dump[port='37300'][mogdb][2018-06-27 09:49:17]: The total objects number is 356. +gs_dump[port='37300'][mogdb][2018-06-27 09:49:17]: [100.00%] 356 objects have been dumped. +gs_dump[port='37300'][mogdb][2018-06-27 09:49:17]: dump database mogdb successfully +gs_dump[port='37300'][mogdb][2018-06-27 09:49:17]: total time: 1274 ms +``` + +Use **gsql** to import data from the exported plain-text file. + +Example 2: Use **gs_dump** to export the full information of the mogdb database. The exported **MPPDB_backup.tar** file is in .tar format. + +```bash +gs_dump -U omm -W Bigdata@123 -f backup/MPPDB_backup.tar -p 37300 mogdb -F t +gs_dump[port='37300'][mogdb][2018-06-27 10:02:24]: The total objects number is 1369. +gs_dump[port='37300'][mogdb][2018-06-27 10:02:53]: [100.00%] 1369 objects have been dumped. +gs_dump[port='37300'][mogdb][2018-06-27 10:02:53]: dump database mogdb successfully +gs_dump[port='37300'][mogdb][2018-06-27 10:02:53]: total time: 50086 ms +``` + +Example 3: Use **gs_dump** to export the full information of the mogdb database. The exported **MPPDB_backup.dmp** file is in custom format. + +```bash +gs_dump -U omm -W Bigdata@123 -f backup/MPPDB_backup.dmp -p 37300 mogdb -F c +gs_dump[port='37300'][mogdb][2018-06-27 10:05:40]: The total objects number is 1369. +gs_dump[port='37300'][mogdb][2018-06-27 10:06:03]: [100.00%] 1369 objects have been dumped. +gs_dump[port='37300'][mogdb][2018-06-27 10:06:03]: dump database mogdb successfully +gs_dump[port='37300'][mogdb][2018-06-27 10:06:03]: total time: 36620 ms +``` + +Example 4: Use **gs_dump** to export the full information of the mogdb database. The exported **MPPDB_backup** file is in directory format. + +```bash +gs_dump -U omm -W Bigdata@123 -f backup/MPPDB_backup -p 37300 mogdb -F d +gs_dump[port='37300'][mogdb][2018-06-27 10:16:04]: The total objects number is 1369. +gs_dump[port='37300'][mogdb][2018-06-27 10:16:23]: [100.00%] 1369 objects have been dumped. +gs_dump[port='37300'][mogdb][2018-06-27 10:16:23]: dump database mogdb successfully +gs_dump[port='37300'][mogdb][2018-06-27 10:16:23]: total time: 33977 ms +``` + +Example 5: Use **gs_dump** to export the information of the mogdb database, excluding the information of the table specified in the **/home/MPPDB_temp.sql** file. The exported **MPPDB_backup.sql** file is in plain-text format. + +```bash +gs_dump -U omm -W Bigdata@123 -p 37300 mogdb --exclude-table-file=/home/MPPDB_temp.sql -f backup/MPPDB_backup.sql +gs_dump[port='37300'][mogdb][2018-06-27 10:37:01]: The total objects number is 1367. 
+gs_dump[port='37300'][mogdb][2018-06-27 10:37:22]: [100.00%] 1367 objects have been dumped. +gs_dump[port='37300'][mogdb][2018-06-27 10:37:22]: dump database mogdb successfully +gs_dump[port='37300'][mogdb][2018-06-27 10:37:22]: total time: 37017 ms +``` + +Example 6: Use **gs_dump** to export only the information about the views that depend on the **testtable** table. Create another **testtable** table, and then restore the views that depend on it. + +- Back up only the views that depend on the **testtable** table. + + ```bash + gs_dump -s -p 37300 mogdb -t PUBLIC.testtable --include-depend-objs --exclude-self -f backup/MPPDB_backup.sql -F p + gs_dump[port='37300'][mogdb][2018-06-15 14:12:54]: The total objects number is 331. + gs_dump[port='37300'][mogdb][2018-06-15 14:12:54]: [100.00%] 331 objects have been dumped. + gs_dump[port='37300'][mogdb][2018-06-15 14:12:54]: dump database mogdb successfully + gs_dump[port='37300'][mogdb][2018-06-15 14:12:54]: total time: 327 ms + ``` + +- Change the name of the **testtable** table. + + ``` + gsql -p 37300 mogdb -r -c "ALTER TABLE PUBLIC.testtable RENAME TO testtable_bak;" + ``` + +- Create another **testtable** table. + + ``` + CREATE TABLE PUBLIC.testtable(a int, b int, c int); + ``` + +- Restore the views for the new **testtable** table. + + ``` + gsql -p 37300 mogdb -r -f backup/MPPDB_backup.sql + ``` + +## gs_dumpall + +### Background + +**gs_dumpall**, provided by MogDB, is used to export all MogDB database information, including data of the default database mogdb, user-defined databases, and common global objects of all MogDB databases. + +**gs_dumpall** is executed by OS user **omm**. + +When **gs_dumpall** is used to export data, other users can still access (read and write) MogDB databases. + +**gs_dumpall** can export complete, consistent data. For example, if **gs_dumpall** is started to export MogDB database at T1, data of the database at that time point will be exported, and modifications on the database after that time point will not be exported. + +**gs_dumpall** exports all MogDB databases in two parts: + +- **gs_dumpall** exports all global objects, including information about database users and groups, tablespaces, and attributes (for example, global access permissions). +- **gs_dumpall** invokes **gs_dump** to export SQL scripts from each MogDB database, which contain all the SQL statements required to restore databases. + +The exported files are both plain-text SQL scripts. Use gsql to execute them to restore MogDB databases. + +### Precautions + +- Do not modify an exported file or its content. Otherwise, restoration may fail. +- To ensure the data consistency and integrity, **gs_dumpall** acquires a share lock on a table to be dumped. If another transaction has acquired a share lock on the table, **gs_dumpall** waits until this lock is released and then locks the table for dumping. If the table cannot be locked within the specified time, the dump fails. You can customize the timeout duration to wait for lock release by specifying the **-lock-wait-timeout** parameter. +- During an export, **gs_dumpall** reads all tables in a database. Therefore, you need to connect to the database as an MogDB administrator to export a complete file. When you use **gsql** to execute SQL scripts, cluster administrator permissions are also required to add users and user groups, and create databases. + +### Syntax + +```bash +gs_dumpall [OPTION]... 
+``` + +### Parameter Description + +Common parameters: + +- -f, -filename=FILENAME + + Sends the output to the specified file. If this parameter is omitted, the standard output is generated. + +- -v, -verbose + + Specifies the verbose mode. If it is specified, **gs_dumpall** writes detailed object comments and number of startups/stops to the dump file, and progress messages to standard error. + +- -V, -version + + Prints the **gs_dumpall** version and exits. + +- -lock-wait-timeout=TIMEOUT + + Do not keep waiting to obtain shared table locks at the beginning of the dump. Consider it as failed if you are unable to lock a table within the specified time. The timeout period can be specified in any of the formats accepted by **SET statement_timeout**. + +- -?, -help + + Displays help about **gs_dumpall** parameters and exits. + +Dump parameters: + +- -a, -data-only + + Dumps only the data, not the schema (data definition). + +- -c, -clean + + Runs SQL statements to delete databases before rebuilding them. Statements for dumping roles and tablespaces are added. + +- -g, -globals-only + + Dumps only global objects (roles and tablespaces) but no databases. + +- -o, -oids + + Dumps object identifiers (OIDs) as parts of the data in each table. Use this parameter if your application references the OID columns in some way. If the preceding situation does not occur, do not use this parameter. + +- -O, -no-owner + + Do not output commands to set ownership of objects to match the original database. By default, **gs_dumpall** issues the **ALTER OWNER** or **SET SESSION AUTHORIZATION** command to set ownership of created schema objects. These statements will fail when the script is running unless it is started by a system administrator (or the same user that owns all of the objects in the script). To make a script that can be stored by any user and give the user ownership of all objects, specify **-O**. + +- -r, -roles-only + + Dumps only roles but not databases or tablespaces. + +- -s, -schema-only + + Dumps only the object definition (schema) but not data. + +- -S, -sysadmin=NAME + + Name of the system administrator during the dump. + +- -t, -tablespaces-only + + Dumps only tablespaces but not databases or roles. + +- -x, -no-privileges + + Prevents the dumping of access permissions (grant/revoke commands). + +- -column-inserts|-attribute-inserts + + Exports data by running the **INSERT** command with explicit column names **{INSERT INTO table (column, …) VALUES …}**. This will cause a slow restoration. However, since this option generates an independent command for each row, an error in reloading a row causes only the loss of the row rather than the entire table content. + +- -disable-dollar-quoting + + Disables the use of dollar sign ($) for function bodies, and forces them to be quoted using the SQL standard string syntax. + +- -disable-triggers + + Specifies a reserved port for function expansion. This parameter is not recommended. + +- -inserts + + Dumps data by the **INSERT** statement (rather than **COPY**). This will cause a slow restoration. The restoration may fail if you rearrange the column order. The **-column-inserts** option is unaffected against column order changes, though even slower. + +- -no-security-labels + + Specifies a reserved port for function expansion. This parameter is not recommended. + +- -no-tablespaces + + Does not generate output statements to create tablespaces or select tablespaces for objects. 
All the objects will be created during the restoration process, no matter which tablespace is selected when using this option. + +- -no-unlogged-table-data + + Specifies a reserved port for function expansion. This parameter is not recommended. + +- -quote-all-identifiers + + Forcibly quotes all identifiers. This parameter is useful when you dump a database for migration to a later version, in which additional keywords may be introduced. + +- -dont-overwrite-file + + Does not overwrite the current file. + +- -use-set-session-authorization + + Specifies that the standard SQL **SET SESSION AUTHORIZATION** command rather than **ALTER OWNER** is returned to ensure the object ownership. This makes dumping more standard. However, if a dump file contains objects that have historical problems, restoration may fail. A dump using **SET SESSION AUTHORIZATION** requires the system administrator rights, whereas **ALTER OWNER** requires lower permissions. + +- -with-encryption=AES128 + + Specifies that dumping data needs to be encrypted using AES128. + +- -with-key=KEY + + Specifies that the key length of AES128 must be 16 bytes. + +- -include-templatedb + + Includes template databases during the dump. + +- -binary-upgrade + + Specifies a reserved port for function expansion. This parameter is not recommended. + +- -binary-upgrade-usermap="USER1=USER2" + + Specifies a reserved port for function expansion. This parameter is not recommended. + +- -tablespaces-postfix + + Specifies a reserved port for function expansion. This parameter is not recommended. + +- -parallel-jobs + + Specifies the number of concurrent backup processes. The value range is 1-1000. + +![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + +- The **-g/-globals-only** and **-r/-roles-only** parameters do not coexist. +- The **-g/-globals-only** and **-t/-tablespaces-only** parameters do not coexist. +- The **-r/-roles-only** and **-t/-tablespaces-only** parameters do not coexist. +- The **-s/-schema-only** and **-a/-data-only** parameters do not coexist. +- The **-r/-roles-only** and **-a/-data-only** parameters do not coexist. +- The **-t/-tablespaces-only** and **-a/-data-only** parameters do not coexist. +- The **-g/-globals-only** and **-a/-data-only** parameters do not coexist. +- **-tablespaces-postfix** must be used in conjunction with **-binary-upgrade**. +- **-binary-upgrade-usermap** must be used in conjunction with **-binary-upgrade**. +- **-parallel-jobs** must be used in conjunction with **-f/-file**. + +Connection parameters: + +- -h, -host + + Specifies the host name. If the value begins with a slash (/), it is used as the directory for the UNIX domain socket. The default value is taken from the PGHOST environment (if variable). Otherwise, a Unix domain socket connection is attempted. + + This parameter is used only for defining names of the hosts outside MogDB. The names of the hosts inside MogDB must be 127.0.0.1. + + Environment Variable:**PGHOST** + +- -l, -database + + Specifies the name of the database connected to dump all objects and discover other databases to be dumped. If this parameter is not specified, the **mogdb** database will be used. If the **mogdb** database does not exist, **template1** will be used. + +- -p, -port + + TCP port or the local Unix-domain socket file extension on which the server is listening for connections. The default value is the *PGPORT* environment variable. 
+ + If the thread pool function is enabled, you are advised to use **pooler port**, that is, the listening port number plus 1. + + Environment variable:**PGPORT** + +- -U, -username + + Specifies the user name to connect to. + + Environment variable:**PGUSER** + +- -w, -no-password + + Never issues a password prompt. The connection attempt fails if the host requires password verification and the password is not provided in other ways. This parameter is useful in batch jobs and scripts in which no user password is required. + +- -W, -password + + Specifies the user password for connection. If the host uses the trust authentication policy, the administrator does not need to enter the **-W** option. If the **-W** option is not provided and you are not a system administrator, the Dump Restore tool will ask you to enter a password. + +- -role + + Specifies a role name to be used for creating the dump. This option causes **gs_dumpall** to issue the **SET ROLE** statement after connecting to the database. It is useful when the authenticated user (specified by **-U**) lacks the permissions required by the **gs_dumpall**. It allows the user to switch to a role with the required permissions. Some installations have a policy against logging in directly as a system administrator. This option allows dumping data without violating the policy. + +- -rolepassword + + Specifies the password of the specific role. + +### Notice + +**gs_dumpall** internally invokes **gs_dump**. For details about the diagnosis information, see gs_dump. + +Once **gs_dumpall** is restored, run ANALYZE on each database so that the optimizer can provide useful statistics. + +**gs_dumpall** requires all needed tablespace directories to exit before the restoration. Otherwise, database creation will fail if the databases are in non-default locations. + +### Examples + +Use **gs_dumpall** to export all MogDB databases at a time. + +![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** **gs_dumpall** supports only plain-text format export. Therefore, only **gsql** can be used to restore a file exported using **gs_dumpall**. + +```bash +gs_dumpall -f backup/bkp2.sql -p 37300 +gs_dump[port='37300'][dbname='mogdb'][2018-06-27 09:55:09]: The total objects number is 2371. +gs_dump[port='37300'][dbname='mogdb'][2018-06-27 09:55:35]: [100.00%] 2371 objects have been dumped. +gs_dump[port='37300'][dbname='mogdb'][2018-06-27 09:55:46]: dump database dbname='mogdb' successfully +gs_dump[port='37300'][dbname='mogdb'][2018-06-27 09:55:46]: total time: 55567 ms +gs_dumpall[port='37300'][2018-06-27 09:55:46]: dumpall operation successful +gs_dumpall[port='37300'][2018-06-27 09:55:46]: total time: 56088 ms +``` + +## gs_restore + +### Background + +**gs_restore**, provided by MogDB, is used to import data that was exported using **gs_dump**. It can also be used to import files exported by **gs_dump**. + +**gs_restore** is executed by OS user **omm**. + +It has the following functions: + +- Importing data to the database + + If a database is specified, data is imported to the database. For parallel import, the password for connecting to the database is required. + +- Importing data to the script file + + If the database storing imported data is not specified, a script containing the SQL statement to recreate the database is created and written to a file or standard output. This script output is equivalent to the plain text output format of **gs_dump**. + +### Command Format + +``` +gs_restore [OPTION]... 
FILE +``` + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> +> - **FILE** does not have a short or long parameter. It is used to specify the location for the archive files. +> - The **dbname** or **-l** parameter is required as prerequisites. Users cannot enter **dbname** and **-l** parameters at the same time. +> - **gs_restore** incrementally imports data by default. To prevent data exceptions caused by multiple import operations, you are advised to use the **-c** parameter during the import. Before recreating database objects, delete the database objects that already exist in the database to be restored. +> - There is no option to control log printing. To hide logs, redirect the logs to the log file. If a large amount of table data needs to be restored, the table data will be restored in batches. Therefore, the log indicating that the table data has been imported is generated for multiple times. + +### Parameter Description + +Common parameters + +- -d, -dbname=NAME + + Connects to the **dbname** database and imports data to the database. + +- -f, -file=FILENAME + + Specifies the output file for the generated script, or uses the output file in the list specified using **-l**. + + The default is the standard output. + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > **-f** cannot be used in conjunction with **-d**. + +- -F, -format=c|d|t + + Specifies the format of the archive. The format does not need to be specified because the **gs_restore** determines the format automatically. + + Value range: + + - **c/custom**: The archive form is the customized format in gs_dump. + - **d/directory**: The archive form is a directory archive format. + - **t/tar**: The archive form is a .tar archive format. + +- -l, -list + + Lists the forms of the archive. The operation output can be used for the input of the **-L** parameter. If filtering parameters, such as **-n** or **-t**, are used together with **-l**, they will restrict the listed items. + +- -v, -verbose + + Specifies the verbose mode. + +- -V, -version + + Prints the **gs_restore** version and exits. + +- -?, -help + + Displays help information about the parameters of **gs_restore** and exits. + +Parameters for importing data + +- -a, -data-only + + Imports only the data, not the schema (data definition). **gs_restore** incrementally imports data. + +- -c, -clean + + Cleans (deletes) existing database objects in the database to be restored before recreating them. + +- -C, -create + + Creates the database before importing data to it. (When this parameter is used, the database specified by **-d** is used to issue the initial **CREATE DATABASE** command. All data is imported to the created database.) + +- -e, -exit-on-error + + Exits if an error occurs when you send the SQL statement to the database. If you do not exit, the commands will still be sent and error information will be displayed when the import ends. + +- -I, -index=NAME + + Imports only the definition of the specified index. Multiple indexes can be imported. Enter **-I** _index_ multiple times to import multiple indexes. + + For example: + + ``` + gs_restore -h host_name -p port_number -d mogdb -I Index1 -I Index2 backup/MPPDB_backup.tar + ``` + + In this example, **Index1** and **Index2** will be imported. + +- -j, -jobs=NUM + + Specifies the number of concurrent, the most time-consuming jobs of **gs_restore** (such as loading data, creating indexes, or creating constraints). 
This parameter can greatly reduce the time to import a large database to a server running on a multiprocessor machine. + + Each job is one process or one thread, depending on the OS; and uses a separate connection to the server. + + The optimal value for this option depends on the server hardware setting, the client, the network, the number of CPU cores, and disk settings. It is recommended that the parameter be set to the number of CPU cores on the server. In addition, a larger value can also lead to faster import in many cases. However, an overly large value will lead to decreased performance because of thrashing. + + This parameter supports custom-format archives only. The input file must be a regular file (not the pipe file). This parameter can be ignored when you select the script method rather than connect to a database server. In addition, multiple jobs cannot be used in conjunction with the **-single-transaction** parameter. + +- -L, -use-list=FILENAME + + Imports only archive elements that are listed in **list-file** and imports them in the order that they appear in the file. If filtering parameters, such as **-n** or **-t**, are used in conjunction with **-L**, they will further limit the items to be imported. + + **list-file** is normally created by editing the output of a previous **-l** parameter. File lines can be moved or removed, and can also be commented out by placing a semicolon (;) at the beginning of the row. + +- -n, -schema=NAME + + Restores only objects that are listed in schemas. + + This parameter can be used in conjunction with the **-t** parameter to import a specific table. + + Entering **-n schemaname** multiple times can import multiple schemas. + + For example: + + ``` + gs_restore -h host_name -p port_number -d mogdb -n sch1 -n sch2 backup/MPPDB_backup.tar + ``` + + In this example, **sch1** and **sch2** will be imported. + +- -O, -no-owner + + Do not output commands to set ownership of objects to match the original database. By default, **gs_restore** issues the **ALTER OWNER** or **SET SESSION AUTHORIZATION** statement to set ownership of created schema elements. Unless the system administrator or the user who has all the objects in the script initially accesses the database. Otherwise, the statement will fail. Any user name can be used for the initial connection using **-O**, and this user will own all the created objects. + +- -P, -function=NAME(args) + + Imports only listed functions. You need to correctly spell the function name and the parameter based on the contents of the dump file in which the function exists. + + Entering **-P** alone means importing all function-name(args) functions in a file. Entering **-P** with **-n** means importing the function-name(args) functions in a specified schema. Entering **-P** multiple times and using **-n** once means that all imported functions are in the **-n** schema by default. + + You can enter **-n schema-name -P 'function-name(args)'** multiple times to import functions in specified schemas. + + For example: + + ``` + gs_restore -h host_name -p port_number -d mogdb -n test1 -P 'Func1(integer)' -n test2 -P 'Func2(integer)' backup/MPPDB_backup.tar + ``` + + In this example, both **Func1 (i integer)** in the **test1** schema and **Func2 (j integer)** in the **test2** schema will be imported. + +- -s, -schema-only + + Imports only schemas (data definitions), instead of data (table content). The current sequence value will not be imported. 
+ +- -S, -sysadmin=NAME + + Specifies a reserved port for function expansion. This parameter is not recommended. + +- -t, -table=NAME + + Imports only listed table definitions or data, or both. This parameter can be used in conjunction with the **-n** parameter to specify a table object in a schema. When **-n** is not entered, the default schema is PUBLIC. Entering **-n schemaname -t tablename** multiple times can import multiple tables in a specified schema. + + For example: + + Import **table1** in the **PUBLIC** schema. + + ``` + gs_restore -h host_name -p port_number -d mogdb -t table1 backup/MPPDB_backup.tar + ``` + + Import **test1** in the **test1** schema and **test2** in the **test2** schema. + + ``` + gs_restore -h host_name -p port_number -d mogdb -n test1 -t test1 -n test2 -t test2 backup/MPPDB_backup.tar + ``` + + Import **table1** in the **PUBLIC** schema and **test1** in the **test1** schema. + + ``` + gs_restore -h host_name -p port_number -d mogdb -n PUBLIC -t table1 -n test1 -t table1 backup/MPPDB_backup.tar + ``` + +- -T, -trigger=NAME + + This parameter is reserved for extension. + +- -x, -no-privileges/-no-acl + + Prevents the import of access permissions (**GRANT**/**REVOKE** commands). + +- -1, -single-transaction + + Executes import as a single transaction (that is, commands are wrapped in **BEGIN**/**COMMIT**). + + This parameter ensures that either all the commands are completed successfully or no application is changed. This parameter means **-exit-on-error**. + +- -disable-triggers + + Specifies a reserved port for function expansion. This parameter is not recommended. + +- -no-data-for-failed-tables + + By default, table data will be imported even if the statement to create a table fails (for example, the table already exists). Data in such table is skipped using this parameter. This operation is useful if the target database already contains the desired table contents. + + This parameter takes effect only when you import data directly into a database, not when you output SQL scripts. + +- -no-security-labels + + Specifies a reserved port for function expansion. This parameter is not recommended. + +- -no-tablespaces + + Tablespaces excluding specified ones All objects will be created during the import process no matter which tablespace is selected when using this option. + +- -section=SECTION + + Imports the listed sections (such as pre-data, data, or post-data). + +- -use-set-session-authorization + + Is used for plain-text backup. + + Outputs the **SET SESSION AUTHORIZATION** statement instead of the **ALTER OWNER** statement to determine object ownership. This parameter makes dump more standards-compatible. If the records of objects in exported files are referenced, import may fail. Only administrators can use the **SET SESSION AUTHORIZATION** statement to dump data, and the administrators must manually change and verify the passwords of exported files by referencing the **SET SESSION AUTHORIZATION** statement before import. The **ALTER OWNER** statement requires lower permissions. + +- -with-key=KEY + + Specifies that the key length of AES128 must be 16 bytes. + + ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** If the dump is encrypted, enter the **-with-key=KEY** parameter in the **gs_restore** command. If it is not entered, you will receive an error message. Enter the same key while entering the dump. 
When the dump format is **c** or **t**, the dumped content has been processed, and therefore the input is not restricted by the encryption. + +![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif)**NOTICE:** + +- If any local additions need to be added to the template1 database during the installation, restore the output of **gs_restore** into an empty database with caution. Otherwise, you are likely to obtain errors due to duplicate definitions of the added objects. To create an empty database without any local additions, data from template0 rather than template1. Example: + + ``` + CREATE DATABASE foo WITH TEMPLATE template0; + ``` + +- **gs_restore** cannot import large objects selectively. For example, it can only import the objects of a specified table. If an archive contains large objects, all large objects will be imported, or none of them will be restored if they are excluded by using **-L**, **-t**, or other parameters. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> +> 1. The **-d/-dbname** and **-f/-file** parameters do not coexist. +> 2. The **-s/-schema-only** and **-a/-data-only** parameters do not coexist. +> 3. The **-c/-clean** and **-a/-data-only** parameters do not coexist. +> 4. When **-single-transaction** is used, **-j/-jobs** must be a single job. +> 5. **-role** must be used in conjunction with **-rolepassword**. + +Connection parameters: + +- -h, -host=HOSTNAME + + Specifies the host name. If the value begins with a slash (/), it is used as the directory for the UNIX domain socket. The default value is taken from the *PGHOST* environment variable. If it is not set, a UNIX domain socket connection is attempted. + + This parameter is used only for defining names of the hosts outside MogDB. The names of the hosts inside MogDB must be 127.0.0.1. + +- -p, -port=PORT + + TCP port or the local Unix-domain socket file extension on which the server is listening for connections. The default value is the *PGPORT* environment variable. + + If the thread pool function is enabled, you are advised to use **pooler port**, that is, the listening port number plus 1. + +- -U, -username=NAME + + Specifies the user name to connect to. + +- -w, -no-password + + Never issues a password prompt. The connection attempt fails if the host requires password verification and the password is not provided in other ways. This parameter is useful in batch jobs and scripts in which no user password is required. + +- -W, -password=PASSWORD + + User password for database connection. If the host uses the trust authentication policy, the administrator does not need to enter the **-W** parameter. If the **-W** parameter is not provided and you are not a system administrator, **gs_restore** will ask you to enter a password. + +- -role=ROLENAME + + Specifies a role name for the import operation. If this parameter is selected, the **SET ROLE** statement will be issued after **gs_restore** connects to the database. It is useful when the authenticated user (specified by **-U**) lacks the permissions required by **gs_restore**. This parameter allows the user to switch to a role with the required permissions. Some installations have a policy against logging in directly as the initial user. This parameter allows data to be imported without violating the policy. + +- -rolepassword=ROLEPASSWORD + + Role password. + +### Example + +Special case: Execute the **gsql** tool. 
Run the following commands to import the **MPPDB_backup.sql** file in the export folder (in plain-text format) generated by **gs_dump**/**gs_dumpall** to the **mogdb** database: + +```bash +gsql -d mogdb -p 5432 -W Bigdata@123 -f /home/omm/test/MPPDB_backup.sql +SET +SET +SET +SET +SET +ALTER TABLE +ALTER TABLE +ALTER TABLE +ALTER TABLE +ALTER TABLE +CREATE INDEX +CREATE INDEX +CREATE INDEX +SET +CREATE INDEX +REVOKE +REVOKE +GRANT +GRANT +total time: 30476 ms +``` + +**gs_restore** is used to import the files exported by **gs_dump**. + +Example 1: Execute the **gs_restore** tool to import the exported **MPPDB_backup.dmp** file (custom format) to the **mogdb** database. + +```bash +gs_restore -W Bigdata@123 backup/MPPDB_backup.dmp -p 5432 -d mogdb +gs_restore: restore operation successful +gs_restore: total time: 13053 ms +``` + +Example 2: Execute the **gs_restore** tool to import the exported **MPPDB_backup.tar** file (.tar format) to the **mogdb** database. + +```bash +gs_restore backup/MPPDB_backup.tar -p 5432 -d mogdb +gs_restore[2017-07-21 19:16:26]: restore operation successful +gs_restore[2017-07-21 19:16:26]: total time: 21203 ms +``` + +Example 3: Execute the **gs_restore** tool to import the exported **MPPDB_backup** file (directory format) to the **mogdb** database. + +```bash +gs_restore backup/MPPDB_backup -p 5432 -d mogdb +gs_restore[2017-07-21 19:16:26]: restore operation successful +gs_restore[2017-07-21 19:16:26]: total time: 21003 ms +``` + +Example 4: Execute the **gs_restore** tool and run the following commands to import the **MPPDB_backup.dmp** file (in custom format). Specifically, import all the object definitions and data in the **PUBLIC** schema. Existing objects are deleted from the target database before the import. If an existing object references to an object in another schema, you need to manually delete the referenced object first. + +```bash +gs_restore backup/MPPDB_backup.dmp -p 5432 -d mogdb -e -c -n PUBLIC +gs_restore: [archiver (db)] Error while PROCESSING TOC: +gs_restore: [archiver (db)] Error from TOC entry 313; 1259 337399 TABLE table1 gaussdba +gs_restore: [archiver (db)] could not execute query: ERROR: cannot drop table table1 because other objects depend on it +DETAIL: view t1.v1 depends on table table1 +HINT: Use DROP ... CASCADE to drop the dependent objects too. + Command was: DROP TABLE public.table1; +``` + +Manually delete the referenced object and create it again after the import is complete. + +```bash +gs_restore backup/MPPDB_backup.dmp -p 5432 -d mogdb -e -c -n PUBLIC +gs_restore[2017-07-21 19:16:26]: restore operation successful +gs_restore[2017-07-21 19:16:26]: total time: 2203 ms +``` + +Example 5: Execute the **gs_restore** tool and run the following commands to import the **MPPDB_backup.dmp** file (in custom format). Specifically, import only the definition of **table1** in the **PUBLIC** schema. + +```bash +gs_restore backup/MPPDB_backup.dmp -p 5432 -d mogdb -e -c -s -n PUBLIC -t table1 +gs_restore[2017-07-21 19:16:26]: restore operation successful +gs_restore[2017-07-21 19:16:26]: total time: 21000 ms +``` + +Example 6: Execute the **gs_restore** tool and run the following commands to import the **MPPDB_backup.dmp** file (in custom format). Specifically, import only the data of **table1** in the **PUBLIC** schema. 
+ +```bash +gs_restore backup/MPPDB_backup.dmp -p 5432 -d mogdb -e -a -n PUBLIC -t table1 +gs_restore[2017-07-21 19:16:26]: restore operation successful +gs_restore[2017-07-21 19:16:26]: total time: 20203 ms +``` + +## gs_backup + +### Background + +After MogDB is deployed, problems and exceptions may occur during database running. **gs_backup**, provided by MogDB, is used to help MogDB backup, restore important data, display help information and version information. + +### Prerequisites + +- The MogDB database can be connected. +- During the restoration, backup files exist in the backup directory on all the nodes. If backup files are lost on any node, copy them to it from another node. For binary files, you need to change the node name in the file name. +- You need to execute **gs_backup** command as OS user **omm**. + +### Syntax + +- Backup database host + + ```bash + gs_backup -t backup --backup-dir=BACKUPDIR [-h HOSTNAME] [--parameter] [--binary] [--all] [-l LOGFILE] + ``` + +- Restore database host + + ```bash + gs_backup -t restore --backup-dir=BACKUPDIR [-h HOSTNAME] [--parameter] [--binary] [--all] [-l LOGFILE] + ``` + +- Display help information + + ```bash + gs_backup -? | --help + ``` + +- Display version information + + ```bash + gs_backup -V | --version + ``` + +### Parameter Description + +The **gs_backup** tool can use the following types of parameters: + +- Backup database host parameters: + + - -h + + Specifies the name of the host where the backup file is stored. + + Range of values: host name. If the host name is not specified, it is distributed to MogDB. + + - -backup-dir=BACKUPDIR + + The path to save the backup file. + + - -parameter + + Back up the parameter file, only the parameter file is backed up by default if the **-parameter**, **-binary** and **-all** parameters are not specified. + + - -binary + + Back up the binary file. + + - -all + + Back up binary and parameter files. + + - -l + + Specify the log file and its storage path. + + Default value: $GAUSSLOG/om/gs_backup-YYYY-MM-DD_hhmmss.log + +- Restore database host parameters: + + - -h + + Specify the name of the host to be recovered. + + Range of values: host name. If no host is specified, MogDB is restored. + + - -backup-dir=BACKUPDIR + + Recover file extraction path. + + - -parameter + + Recover the parameter file, only the parameter file is recovered by default if the **-parameter**, **-binary** and **-all** parameters are not specified. + + - -binary + + Recover the binary file. + + - -all + + Recover binary and parameter files. + + - -l + + Specify the log file and its storage path. + + Default value: $GAUSSLOG/om/gs_backup-YYYY-MM-DD_hhmmss.log + +- Other parameters: + + - -?, -help + + Display help information. + + - -V, -version + + Display version information. + +### Examples + +- Use the **gs_backup** script to backup the database host. + + ```bash + gs_backup -t backup --backup-dir=/opt/software/mogdb/backup_dir -h plat1 --parameter + Backing up MogDB. + Parsing configuration files. + Successfully parsed the configuration file. + Performing remote backup. + Remote backup succeeded. + Successfully backed up MogDB. + ``` + +- Use the **gs_backup** script to restore the database host. + + ```bash + gs_backup -t restore --backup-dir=/opt/software/mogdb/backup_dir -h plat1 --parameter + Restoring MogDB. + Parsing the configuration file. + Successfully parsed configuration files. + Performing remote restoration. + Remote restoration succeeded. + Successfully restored MogDB. 
+ ``` diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/br/1-4-br.md b/product/en/docs-mogdb/v3.0/administrator-guide/br/1-4-br.md new file mode 100644 index 0000000000000000000000000000000000000000..b8f94a31ff72cbb402220878a7dffd47d496cbcb --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/br/1-4-br.md @@ -0,0 +1,203 @@ +--- +title: Flashback Restoration +summary: Flashback Restoration +author: Guo Huan +date: 2021-10-12 +--- + +# Flashback Restoration + +Flashback restoration is a part of the database recovery technology. It can be used to selectively cancel the impact of a committed transaction and restore data from incorrect manual operations. Before the flashback technology is used, the committed database modification can be retrieved only by means of restoring backup and PITR. The restoration takes several minutes or even hours. After the flashback technology is used, it takes only seconds to restore the submitted data before the database is modified. The restoration time is irrelevant to the database size. + +**Flashback supports two restoration modes:** + +- MVCC-based multi-version data restoration (only Ustore is supported): It is suitable for query and restoration of data deleted, updated and inserted by mistake. Users can configure the retention time of the old version and execute the corresponding query or restore command to query or restore data to Specified point in time or CSN point. +- Restoration based on database recycle bin (only Astore is supported): It is suitable for the restoration of tables that are DROP and TRUNCATE by mistake. By configuring the recycle bin switch and executing the corresponding restore command, the user can retrieve the tables that were erroneously DROP and TRUNCATE. + +**Related parameters:** + +- undo_zone_count=16384 + + The number of undo zones that can be allocated in the memory, 0 means to disable the undo and Ustore tables, the recommended value is max_connections*4 + +- enable_default_ustore_table=on + + Enable default support for Ustore storage engine + +- version_retention_age=10000 + + The number of transactions retained by the old version, the old version that exceeds the number of transactions will be recycled and cleaned up + +- enable_recyclebin=on + + enable recycle bin + +- recyclebin_retention_time=15min + + Set the recycle bin object retention time, and the recycle bin objects that exceed this time will be automatically cleaned up + +
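+
+The parameters above are ordinary configuration parameters and can be adjusted with **gs_guc**. The commands below are only a sketch of how this might look on a standard installation (the values are illustrative); check each parameter's scope first, because some of them, such as **undo_zone_count**, take effect only after a restart, in which case **gs_guc set** followed by a restart is required instead of **gs_guc reload**.
+
+```bash
+# Sketch only: turn on the recycle bin and keep dropped objects for 30 minutes.
+gs_guc reload -N all -I all -c "enable_recyclebin=on"
+gs_guc reload -N all -I all -c "recyclebin_retention_time=30min"
+
+# Sketch only: parameters that cannot be reloaded online (for example undo_zone_count)
+# are written to the configuration with gs_guc set and applied by restarting the instance.
+gs_guc set -N all -I all -c "undo_zone_count=16384"
+gs_guc set -N all -I all -c "enable_default_ustore_table=on"
+```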
+ +## Flashback Query + +### Context + +Flashback query enables you to query a snapshot of a table at a certain time point in the past. This feature can be used to view and logically rebuild damaged data that is accidentally deleted or modified. The flashback query is based on the MVCC mechanism. You can retrieve and query the old version to obtain the data of the specified old version. + +### Syntax + +```ebnf+diagram +FlashBack ::= {[ ONLY ] table_name [ * ] [ partition_clause ] [ [ AS ] alias [ ( column_alias [, ...] ) ] ] +[ TABLESAMPLE sampling_method ( argument [, ...] ) [ REPEATABLE ( seed ) ] ] +[TIMECAPSULE { TIMESTAMP | CSN } expression ] +|( select ) [ AS ] alias [ ( column_alias [, ...] ) ] +|with_query_name [ [ AS ] alias [ ( column_alias [, ...] ) ] ] +|function_name ( [ argument [, ...] ] ) [ AS ] alias [ ( column_alias [, ...] | column_definition [, ...] ) ] +|function_name ( [ argument [, ...] ] ) AS ( column_definition [, ...] ) +|from_item [ NATURAL ] join_type from_item [ ON join_condition | USING ( join_column [, ...] ) ]} +``` + +In the syntax tree, **TIMECAPSULE {TIMESTAMP | CSN} expression** is a new expression for the flashback function. **TIMECAPSULE** indicates that the flashback function is used. **TIMESTAMP** and **CSN** indicate that the flashback function uses specific time point information or commit sequence number (CSN) information. + +### Parameter Description + +- TIMESTAMP + - Specifies a history time point of the table data to be queried. +- CSN + - Specifies a logical commit time point of the data in the entire database to be queried. Each CSN in the database represents a consistency point of the entire database. To query the data under a CSN means to query the data related to the consistency point in the database through SQL statements. + +### Examples + +- Example 1: + + ```sql + SELECT * FROM t1 TIMECAPSULE TIMESTAMP to_timestamp ('2020-02-11 10:13:22.724718', 'YYYY-MM-DD HH24:MI:SS.FF'); + ``` + +- Example 2: + + ```sql + SELECT * FROM t1 TIMECAPSULE CSN 9617; + ``` + +- Example 3: + + ```sql + SELECT * FROM t1 AS t TIMECAPSULE TIMESTAMP to_timestamp ('2020-02-11 10:13:22.724718', 'YYYY-MM-DD HH24:MI:SS.FF'); + ``` + +- Example 4: + + ```sql + SELECT * FROM t1 AS t TIMECAPSULE CSN 9617; + ``` + +## Flashback Table + +### Context + +Flashback table enables you to restore a table to a specific point in time. When only one table or a group of tables are logically damaged instead of the entire database, this feature can be used to quickly restore the table data. Based on the MVCC mechanism, the flashback table deletes incremental data at a specified time point and after the specified time point and retrieves the data deleted at the specified time point and the current time point to restore table-level data. + +### Syntax + +```ebnf+diagram +FlashBack ::= TIMECAPSULE TABLE table_name TO { TIMESTAMP | CSN } expression +``` + +### Examples + +```sql +TIMECAPSULE TABLE t1 TO TIMESTAMP to_timestamp ('2020-02-11 10:13:22.724718', 'YYYY-MM-DD HH24:MI:SS.FF'); +TIMECAPSULE TABLE t1 TO CSN 9617; +``` + +## Flashback DROP/TRUNCATE + +### Context + +Flashback drop enables you to restore tables that are dropped by mistake and their auxiliary structures, such as indexes and table constraints, from the recycle bin. Flashback drop is based on the recycle bin mechanism. You can restore physical table files recorded in the recycle bin to restore dropped tables. 
+ +Flashback truncate enables you to restore tables that are truncated by mistake and restore the physical data of the truncated tables and indexes from the recycle bin. Flashback truncate is based on the recycle bin mechanism. You can restore physical table files recorded in the recycle bin to restore truncated tables. + +### Prerequisites + +- The **enable_recyclebin** parameter has been set for enabling the recycle bin. +- The **recyclebin_retention** parameter has been set for specifying the retention period of objects in the recycle bin. The objects will be automatically deleted after the retention period expires. + +### Syntax + +- Drop a table. + + ```ebnf+diagram + DropTable ::= DROP TABLE table_name [PURGE] + ``` + +- Purge objects in the recycle bin. + + ```ebnf+diagram + PurgeRecyclebin ::= PURGE { TABLE { table_name } + | INDEX { index_name } + | RECYCLEBIN + } + ``` + +- Flash back a dropped table. + + ```ebnf+diagram + TimecapsuleTable ::= TIMECAPSULE TABLE { table_name } TO BEFORE DROP [RENAME TO new_tablename] + ``` + +- Truncate a table. + + ```ebnf+diagram + TruncateTable ::= TRUNCATE TABLE { table_name } [ PURGE ] + ``` + +- Flash back a truncated table. + + ```ebnf+diagram + TimecapsuleTable ::= TIMECAPSULE TABLE { table_name } TO BEFORE TRUNCATE + ``` + +### Parameter Description + +- DROP/TRUNCATE TABLE table_name PURGE + - Purges table data in the recycle bin by default. +- PURGE RECYCLEBIN + - Purges objects in the recycle bin. +- **TO BEFORE DROP** + +Retrieves dropped tables and their subobjects from the recycle bin. + +You can specify either the original user-specified name of the table or the system-generated name assigned to the object when it was dropped. + +- System-generated recycle bin object names are unique. Therefore, if you specify the system-generated name, the database retrieves that specified object. To see the contents of your recycle bin, run **select \* from pg_recyclebin;**. +- If you specify the user-specified name and the recycle bin contains more than one object of that name, the database retrieves the object that was moved to the recycle bin most recently. If you want to retrieve an older version of the table, then do one of these things: + - Specify the system-generated recycle bin name of the table you want to retrieve. + - Run **TIMECAPSULE TABLE ... TO BEFORE DROP** statements until you retrieve the table you want. + - When a dropped table is restored, only the base table name is restored, and the names of other subobjects remain the same as those in the recycle bin. You can run the DDL command to manually change the names of subobjects as required. + - The recycle bin does not support write operations such as DML, DCL, and DDL, and does not support DQL query operations (supported in later versions). +- **RENAME TO** + +Specifies a new name for the table retrieved from the recycle bin. + +- **TO BEFORE TRUNCATE** + +Flashes back to the point in time before the TRUNCATE operation. 
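+
+Putting the statements described above together, a minimal recovery sequence might look as follows. This is only a sketch: the table name **t1** and the new name **t1_restored** are illustrative, and it assumes the recycle bin has been enabled as described in the prerequisites. The formal statement forms are listed in the syntax example below.
+
+```sql
+-- Drop the table without PURGE so that it is moved to the recycle bin.
+DROP TABLE t1;
+
+-- Inspect the recycle bin to find the dropped table (or its system-generated "BIN$..." name).
+SELECT * FROM pg_recyclebin;
+
+-- Retrieve the table from the recycle bin under a new name.
+TIMECAPSULE TABLE t1 TO BEFORE DROP RENAME TO t1_restored;
+```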
+ +### Syntax Example + +```sql +DROP TABLE t1 PURGE; + +PURGE TABLE t1; +PURGE TABLE "BIN$04LhcpndanfgMAAAAAANPw==$0"; +PURGE INDEX i1; +PURGE INDEX "BIN$04LhcpndanfgMAAAAAANPw==$0"; +PURGE RECYCLEBIN; + +TIMECAPSULE TABLE t1 TO BEFORE DROP; +TIMECAPSULE TABLE t1 TO BEFORE DROP RENAME TO new_t1; +TIMECAPSULE TABLE "BIN$04LhcpndanfgMAAAAAANPw==$0" TO BEFORE DROP; +TIMECAPSULE TABLE "BIN$04LhcpndanfgMAAAAAANPw==$0" TO BEFORE DROP RENAME TO new_t1; +``` diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/column-store-tables-management.md b/product/en/docs-mogdb/v3.0/administrator-guide/column-store-tables-management.md new file mode 100644 index 0000000000000000000000000000000000000000..8e210216c483c34a7d97c478cc2b4d389ada9701 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/column-store-tables-management.md @@ -0,0 +1,386 @@ +--- +title: Column-store Tables Management +summary: Column-store Tables Management +author: Guo Huan +date: 2021-04-09 +--- + +# Column-store Tables Management + +## What is Column-store + +Row-store stores tables to disk partitions by row, and column-store stores tables to disk partitions by column. By default, a row-store table is created. For details about differences between row storage and column storage, see Figure 1. + +**Figure 1** Differences between row storage and column storage + +![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/column-store-tables-management.png) + +In the preceding figure, the upper left part is a row-store table, and the upper right part shows how the row-store table is stored on a disk; the lower left part is a column-store table, and the lower right part shows how the column-store table is stored on a disk. From the above figure, you can clearly see that the data of a row-store table are put together, but they are kept separately in column-store table. + +## Advantages and Disadvantages of Row-store and Column-store Tables and Their Usage Scenario + +Both storage models have benefits and drawbacks. + +| Storage Model | Benefit | Drawback | +| :------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | +| Row storage | Record data is stored together. Data can be easily inserted and updated. | All the columns of a record are read after the **SELECT** statement is executed even if only certain columns are required. | +| Column storage | Only the columns involved in a query are read. Projections are efficient. Any column can serve as an index. | The selected columns need to be reconstructed after the **SELECT** statement is executed. Data cannot be easily inserted or updated. | + +Generally, if a table contains many columns (called a wide table) and its query involves only a few columns, column storage is recommended. Row storage is recommended if a table contains only a few columns and a query involves most of the fields. + +| Storage Model | Application Scenarios | +| :------------- | :----------------------------------------------------------- | +| Row storage | Point queries (simple index-based queries that only return a few records)Scenarios requiring frequent addition, deletion, and modification | +| Column storage | Statistical analysis queries (requiring a large number of association and grouping operations)Ad hoc queries (using uncertain query conditions and unable to utilize indexes to scan row-store tables) | + +MogDB supports hybrid row storage and column storage. 
Each storage model applies to specific scenarios. Select an appropriate model when creating a table. Generally, MogDB is used for transactional processing databases. By default, row storage is used. Column storage is used only when complex queries in large data volume are performed. + +## Selecting a Storage Model + +- Update frequency + + If data is frequently updated, use a row-store table. + +- Data insertion frequency + + If a small amount of data is frequently inserted each time, use a row-store table. + +- Number of columns + + If a table is to contain many columns, use a column-store table. + +- Number of columns to be queried + + If only a small number of columns (less than 50% of the total) is queried each time, use a column-store table. + +- Compression ratio + + The compression ratio of a column-store table is higher than that of a row-store table. High compression ratio consumes more CPU resources. + +## Constraints of Column-store Table + +- The column-store table does not support arrays. +- The number of column-store tables is recommended to be no more than 1000. +- The table-level constraints of the column-store table only support **PARTIAL CLUSTER KEY**, and do not support table-level constraints such as primary and foreign keys. +- The field constraints of the column-store table only support **NULL**, **NOT NULL** and **DEFAULT** constant values. +- The column-store table does not support the **alter** command to modify field constraints. +- The column-store table supports the delta table, which is controlled by the parameter **enable_delta_store** whether to enable or not, and the threshold value for entering the delta table is controlled by the parameter **deltarow_threshold**. + +## Related Parameters + +- cstore_buffers + + The size of the shared buffer used by the column-store, the default value: 32768KB. + +- partition_mem_batch + + Specify the number of caches. In order to optimize the batch insertion of column-store partition tables, the data will be cached during the batch insertion process and then written to disk in batches. Default value: 256. + +- partition_max_cache_size + + Specify the size of the data buffer area. In order to optimize the batch insertion of column-store partition tables, the data will be cached during the batch insertion process and then written to disk in batches. Default value: 2GB. + +- enable_delta_store + + In order to enhance the performance of single data import in column-store and solve the problem of disk redundancy, whether it is necessary to enable the function of column-store delta table and use it in conjunction with the parameter **DELTAROW_THRESHOLD**. Default value: off. + +## Create Table Commands + +MogDB creates normal tables as uncompressed row-store tables by default. + +``` +mogdb=# \dt +No relations found. 
+mogdb=# create table test_t(id serial primary key ,col1 varchar(8),col2 decimal(6,2),create_time timestamptz not null default now()); +NOTICE: CREATE TABLE will create implicit sequence "test_t_id_seq" for serial column "test_t.id" +NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "test_t_pkey" for table "test_t" +CREATE TABLE +mogdb=# \dt+ + List of relations + Schema | Name | Type | Owner | Size | Storage | Description +--------+--------+-------+-------+---------+----------------------------------+------------- + public | test_t | table | omm | 0 bytes | {orientation=row,compression=no} | +(1 row) + +mogdb=# +``` + +To create a column-store table, you need to specify **orientation=column**, the default compression level is **low**. + +``` +mogdb=# create table column_t(id serial,col1 varchar(8),col2 decimal(6,2),create_time timestamptz not null default now()) with (orientation=column ); +NOTICE: CREATE TABLE will create implicit sequence "column_t_id_seq" for serial column "column_t.id" +CREATE TABLE +mogdb=# \dt+ + List of relations + Schema | Name | Type | Owner | Size | Storage | Description +--------+----------+-------+-------+---------+--------------------------------------+------------- + public | column_t | table | omm | 16 kB | {orientation=column,compression=low} | + public | test_t | table | omm | 0 bytes | {orientation=row,compression=no} | +(2 rows) + +mogdb=# \d+ column_t + Table "public.column_t" + Column | Type | Modifiers | Storage | Stats target | Description +-------------+--------------------------+-------------------------------------------------------+----------+--------------+------------- + id | integer | not null default nextval('column_t_id_seq'::regclass) | plain | | + col1 | character varying(8) | | extended | | + col2 | numeric(6,2) | | main | | + create_time | timestamp with time zone | not null default now() | plain | | +Has OIDs: no +Options: orientation=column, compression=low +``` + +Add partial clustered storage columns to the column-store table. + +``` +mogdb=# \d+ column_t + Table "public.column_t" + Column | Type | Modifiers | Storage | Stats target | Description +-------------+--------------------------+-------------------------------------------------------+----------+--------------+------------- + id | integer | not null default nextval('column_t_id_seq'::regclass) | plain | | + col1 | character varying(8) | | extended | | + col2 | numeric(6,2) | | main | | + create_time | timestamp with time zone | not null default now() | plain | | +Has OIDs: no +Options: orientation=column, compression=low + +mogdb=# alter table column_t add PARTIAL CLUSTER KEY(id); +ALTER TABLE +mogdb=# \d+ column_t + Table "public.column_t" + Column | Type | Modifiers | Storage | Stats target | Description +-------------+--------------------------+-------------------------------------------------------+----------+--------------+------------- + id | integer | not null default nextval('column_t_id_seq'::regclass) | plain | | + col1 | character varying(8) | | extended | | + col2 | numeric(6,2) | | main | | + create_time | timestamp with time zone | not null default now() | plain | | +Partial Cluster : + "column_t_cluster" PARTIAL CLUSTER KEY (id) +Has OIDs: no +Options: orientation=column, compression=low + +mogdb=# +``` + +Create column-store tables with partial clustered storage directly. 
+ +``` +mogdb=# create table column_c(id serial,col1 varchar(8),col2 decimal(6,2),create_time timestamptz not null default now(),PARTIAL CLUSTER KEY(id)) with (orientation=column ); +NOTICE: CREATE TABLE will create implicit sequence "column_c_id_seq" for serial column "column_c.id" +CREATE TABLE +mogdb=# \d+ column_c + Table "public.column_c" + Column | Type | Modifiers | Storage | Stats target | Description +-------------+--------------------------+-------------------------------------------------------+----------+--------------+------------- + id | integer | not null default nextval('column_c_id_seq'::regclass) | plain | | + col1 | character varying(8) | | extended | | + col2 | numeric(6,2) | | main | | + create_time | timestamp with time zone | not null default now() | plain | | +Partial Cluster : + "column_c_cluster" PARTIAL CLUSTER KEY (id) +Has OIDs: no +Options: orientation=column, compression=low + +mogdb=# +``` + +Please refer to **Supported Data Types** > **Data Types Supported by Column-store Tables** under the **Reference Guide** for the data types supported by column-store tables. + +## Column-store versus Row-store + +**Used disk space** + +- The default size of the column-store table is 16K, the compression level is **low**. + +- The default size of the row-store table is 0bytes, the compression level is **no**. + +- Insert 1 million pieces of data into the two tables separately , and compare the occupied disk size. + + ``` + mogdb=# \dt+ + List of relations + Schema | Name | Type | Owner | Size | Storage | Description + --------+-----------+-------+-------+---------+-----------------------------------------+------------- + public | column_t | table | omm | 16 kB | {orientation=column,compression=low} | + public | column_th | table | omm | 16 kB | {orientation=column,compression=high} | + public | column_tm | table | omm | 16 kB | {orientation=column,compression=middle} | + public | row_tc | table | omm | 0 bytes | {orientation=row,compression=yes} | + public | test_t | table | omm | 0 bytes | {orientation=row,compression=no} | + (5 rows) + + mogdb=# insert into column_t select generate_series(1,1000000),left(md5(random()::text),8),random()::numeric(6,2); + INSERT 0 1000000 + Time: 11328.880 ms + mogdb=# insert into column_th select generate_series(1,1000000),left(md5(random()::text),8),random()::numeric(6,2); + INSERT 0 1000000 + Time: 10188.634 ms + mogdb=# insert into column_tm select generate_series(1,1000000),left(md5(random()::text),8),random()::numeric(6,2); + INSERT 0 1000000 + Time: 9802.739 ms + mogdb=# insert into test_t select generate_series(1,1000000),left(md5(random()::text),8),random()::numeric(6,2); + INSERT 0 1000000 + Time: 17404.945 ms + mogdb=# insert into row_tc select generate_series(1,1000000),left(md5(random()::text),8),random()::numeric(6,2); + INSERT 0 1000000 + Time: 12394.866 ms + mogdb=# \dt+ + List of relations + Schema | Name | Type | Owner | Size | Storage | Description + --------+-----------+-------+-------+----------+-----------------------------------------+------------- + public | column_t | table | omm | 12 MB | {orientation=column,compression=low} | + public | column_th | table | omm | 8304 kB | {orientation=column,compression=high} | + public | column_tm | table | omm | 10168 kB | {orientation=column,compression=middle} | + public | row_tc | table | omm | 58 MB | {orientation=row,compression=yes} | + public | test_t | table | omm | 58 MB | {orientation=row,compression=no} | + (5 rows) + + mogdb=# + ``` + +- The higher the compression 
level of the column-store table is, the less the disk space it uses. + +- After the row-store table is compressed, the size of the disk space dose not decrease significantly. + +- Column-store table take up nearly 6 times less disk space than row-store table. + +**DML Comparison** + +Search for a single column: + +``` +--- +---Search by range, column-store is nearly 20 times faster than row-store +--- +mogdb=# select col1 from test_t where id>=100010 and id<100020; + col1 +---------- + 4257a3f3 + 3d397284 + 64343438 + 6eb7bdb7 + d1c9073d + 6aeb037c + 1d424974 + 223235ab + 329de235 + 2f02adc1 +(10 rows) + +Time: 77.341 ms +mogdb=# select col1 from column_t where id>=100010 and id<100020; + col1 +---------- + d4837c30 + 87a46f7a + 2f42a9c9 + 4481c793 + 68800204 + 613b9205 + 9d8f4a0a + 5cc4ff9e + f948cd10 + f2775cee +(10 rows) + +Time: 3.884 ms + +--- +---Search Randomly, column-store is nearly 35 times faster than row-store +--- + +mogdb=# select col1 from test_t limit 10; + col1 +---------- + c2780d93 + 294be14d + 4e53b761 + 2c10f8a2 + ae776743 + 7d683c66 + b3b40054 + 7e56edf9 + a7b7336e + ea3d47d9 +(10 rows) + +Time: 249.887 ms +mogdb=# select col1 from column_t limit 10; + col1 +---------- + a745d77b + 4b6df494 + 76fed9c1 + 70c9664d + 3384de8a + 4158f3bf + 5d1c3b9f + 341876bb + f396f4ed + abfd78bb +(10 rows) + +Time: 7.738 ms +``` + +Search for all the data: + +``` +--- +---Row-store is 30% faster than column-store search +--- +mogdb=# select * from test_t limit 10; + id | col1 | col2 | create_time +----+----------+------+------------------------------- + 1 | c2780d93 | .37 | 2020-10-26 14:27:33.304108+08 + 2 | 294be14d | .57 | 2020-10-26 14:27:33.304108+08 + 3 | 4e53b761 | .98 | 2020-10-26 14:27:33.304108+08 + 4 | 2c10f8a2 | .27 | 2020-10-26 14:27:33.304108+08 + 5 | ae776743 | .97 | 2020-10-26 14:27:33.304108+08 + 6 | 7d683c66 | .58 | 2020-10-26 14:27:33.304108+08 + 7 | b3b40054 | .44 | 2020-10-26 14:27:33.304108+08 + 8 | 7e56edf9 | .43 | 2020-10-26 14:27:33.304108+08 + 9 | a7b7336e | .31 | 2020-10-26 14:27:33.304108+08 + 10 | ea3d47d9 | .42 | 2020-10-26 14:27:33.304108+08 +(10 rows) + +Time: 6.822 ms + +mogdb=# select * from column_t limit 10; + id | col1 | col2 | create_time +----+----------+------+------------------------------- + 1 | a745d77b | .33 | 2020-10-26 14:28:20.633253+08 + 2 | 4b6df494 | .42 | 2020-10-26 14:28:20.633253+08 + 3 | 76fed9c1 | .73 | 2020-10-26 14:28:20.633253+08 + 4 | 70c9664d | .74 | 2020-10-26 14:28:20.633253+08 + 5 | 3384de8a | .48 | 2020-10-26 14:28:20.633253+08 + 6 | 4158f3bf | .59 | 2020-10-26 14:28:20.633253+08 + 7 | 5d1c3b9f | .63 | 2020-10-26 14:28:20.633253+08 + 8 | 341876bb | .97 | 2020-10-26 14:28:20.633253+08 + 9 | f396f4ed | .73 | 2020-10-26 14:28:20.633253+08 + 10 | abfd78bb | .30 | 2020-10-26 14:28:20.633253+08 +(10 rows) + +Time: 9.982 ms +``` + +Update data: + +``` +--- +---Update a field directly, column-store is nearly 7 times faster than row-store +--- +mogdb=# update test_t set col1=col1; +UPDATE 1000000 +Time: 19779.978 ms +mogdb=# update column_t set col1=col1; +UPDATE 1000000 +Time: 2702.339 ms +``` + +## Conclusion + +1. The Column-store table saves nearly 6 times the disk space usage compared to the row-store table. +2. When searching for the specified field, the column-store table is about 20-35 times faster than the row-store table. +3. When searching for all the data, the column-store table is 30% slower than the row-store table. +4. 
When importing data in batches in the default compression mode, and column-store table is 40% faster than the row-store table. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/exporting-data/1-using-gs_dump-and-gs_dumpall-to-export-data-overview.md b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/exporting-data/1-using-gs_dump-and-gs_dumpall-to-export-data-overview.md new file mode 100644 index 0000000000000000000000000000000000000000..3ade33a59d28689d3b013c72729a7e9cb0659501 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/exporting-data/1-using-gs_dump-and-gs_dumpall-to-export-data-overview.md @@ -0,0 +1,91 @@ +--- +title: Using gs_dump and gs_dumpall to Export Data Overview +summary: Using gs_dump and gs_dumpall to Export Data Overview +author: Guo Huan +date: 2021-03-04 +--- + +# Using gs_dump and gs_dumpall to Export Data Overview + +MogDB provides **gs_dump** and **gs_dumpall** to export required database objects and related information. You can use a tool to import the exported data to a target database for database migration. **gs_dump** exports a single database or its objects. **gs_dumpall** exports all databases or global objects in MogDB. For details, see [Table 1](#Scenarios). + +**Table 1** Scenarios + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| Application Scenario | Export Granularity | Export Format | Import Method |
+| :------------------- | :----------------------------------------------------------- | :------------------------------------------ | :----------------------------------------------------------- |
+| Exporting a single database | **Database-level export**<br/>- Export full information of a database.<br/>You can use the exported information to create a database containing the same data as the current one.<br/>- Export all object definitions of a database, including the definitions of the database, functions, schemas, tables, indexes, and stored procedures.<br/>You can use the exported object definitions to quickly create a database that is the same as the current one, except that the new database does not have data.<br/>- Export data of a database.<br/>**Schema-level export**<br/>- Export full information of a schema.<br/>- Export data of a schema.<br/>- Export all object definitions of a schema, including the definitions of tables, stored procedures, and indexes.<br/>**Table-level export**<br/>- Export full information of a table.<br/>- Export data of a table.<br/>- Export the definition of a table. | - Plaintext<br/>- Custom<br/>- Directory<br/>- .tar | - For details about how to import data files in text format, see Using a gsql Meta-Command to Import Data.<br/>- For details about how to import data files in .tar, directory, or custom format, see Using gs_restore to Import Data. |
+| Exporting all databases | **Database-level export**<br/>- Export full information of a cluster.<br/>You can use the exported information to create a host environment containing the same databases, global objects, and data as the current one.<br/>- Export all object definitions of a database, including the definitions of tablespaces, databases, functions, schemas, tables, indexes, and stored procedures.<br/>You can use the exported object definitions to quickly create a host environment that is the same as the current one, containing the same databases and tablespaces but no data.<br/>- Export data only.<br/>**Global object export**<br/>- Export tablespaces.<br/>- Export roles.<br/>- Export tablespaces and roles. | Plaintext | For details about how to import data files, see Using a gsql Meta-Command to Import Data. |
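+
+As a quick orientation, the following sketch shows how the two tools are typically invoked; the paths, port, and database name are placeholders, and complete examples are given in the sections that follow.
+
+```bash
+# Sketch only: export one database (definitions and data) to a .tar archive with gs_dump.
+gs_dump -f /home/omm/backup/postgres_backup.tar -p 8000 postgres -F t
+
+# Sketch only: export all databases and global objects to a plain-text script with gs_dumpall.
+gs_dumpall -f /home/omm/backup/all_databases.sql -p 8000
+```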
+ +**gs_dump** and **gs_dumpall** use **-U** to specify the user that performs the export. If the specified user does not have the required permissions, data cannot be exported. In this case, you can set **-role** in the **gs_dump** or **gs_dumpall** command to the role that has the permissions. Then, **gs_dump** or **gs_dumpall** uses the specified role to export data. See Table 1 for application scenarios and [Data Export By a User Without Required Permissions](4-data-export-by-a-user-without-required-permissions) for operation details. + +**gs_dump** and **gs_dumpall** encrypt the exported data files. These files are decrypted before being imported. In this way, data disclosure is prevented, protecting database security. + +When **gs_dump** or **gs_dumpall** is used to export data from a cluster, other users can still access (read and write) databases in MogDB. + +**gs_dump** and **gs_dumpall** can export complete, consistent data. For example, if **gs_dump** is executed to export database A or **gs_dumpall** is executed to export all databases from MogDB at T1, data of database A or all databases in MogDB at that time point will be exported, and modifications on the databases after that time point will not be exported. + +**Precautions** + +- Do not modify an exported file or its content. Otherwise, restoration may fail. + +- If there are more than 500,000 objects (data tables, views, and indexes) in a database, you are advised to use **gs_guc** to set the following parameters for database nodes. This operation is not required if the parameter values are greater than the recommended ones. + + ```bash + gs_guc set -N all -I all -c 'max_prepared_transactions = 1000' + gs_guc set -N all -I all -c 'max_locks_per_transaction = 512' + ``` + +- For data consistency and integrity, **gs_dump** and **gs_dumpall** set a share lock for a table to dump. If a share lock has been set for the table in other transactions, **gs_dump** and **gs_dumpall** lock the table after it is released. If the table cannot be locked within the specified time, the dump fails. You can customize the timeout duration to wait for lock release by specifying the **-lock-wait-timeout** parameter. + +- During an export, **gs_dumpall** reads all tables in a database. Therefore, you need to connect to the database as a MogDB administrator to export a complete file. When you use **gsql** to execute SQL scripts, cluster administrator permissions are also required to add users and user groups, and create databases. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/exporting-data/2-exporting-a-single-database.md b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/exporting-data/2-exporting-a-single-database.md new file mode 100644 index 0000000000000000000000000000000000000000..39539c96601eae9d7485f4f1a2209c868ce8f20d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/exporting-data/2-exporting-a-single-database.md @@ -0,0 +1,288 @@ +--- +title: Exporting a Single Database +summary: Exporting a Single Database +author: Guo Huan +date: 2021-03-04 +--- + +# Exporting a Single Database + +## Exporting a Database + +You can use **gs_dump** to export data and all object definitions of a database from MogDB. You can specify the information to export as follows: + +- Export full information of a database, including its data and all object definitions. 
+ + You can use the exported information to create a database containing the same data as the current one. + +- Export all object definitions of a database, including the definitions of the database, functions, schemas, tables, indexes, and stored procedures. + + You can use the exported object definitions to quickly create a database that is the same as the current one, except that the new database does not have data. + +- Export data of a database. + +### Procedure + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Use **gs_dump** to export data of the **userdatabase** database. + + ```bash + gs_dump -U jack -f /home/omm/backup/userdatabase_backup.tar -p 8000 postgres -F t + Password: + ``` + + **Table 1** Common parameters + + | Parameter | Description | Example Value | + | :-------- | :----------------------------------------------------------- | :------------------------------------------ | + | -U | Username for database connection.
NOTE:
If the username is not specified, the initial system administrator created during installation is used for connection by default. | -U jack | + | -W | User password for database connection.
- This parameter is not required for database administrators if the trust policy is used for authentication.
- If you connect to the database without specifying this parameter and you are not a database administrator, you will be prompted to enter the password. | -W abcd@123 | + | -f | Folder to store exported files. If this parameter is not specified, the exported files are stored in the standard output. | -f /home/omm/backup/**postgres**_backup.tar | + | -p | TCP port or local Unix-domain socket file extension on which the server is listening for connections. | -p 8000 | + | dbname | Name of the database to export. | postgres | + | -F | Select the format of file to export. The values of **-F** are as follows:
- **p**: plaintext
- **c**: custom
- **d**: directory
- **t**: .tar | -F t | + + For details about other parameters, see "Tool Reference > Server Tools > [gs_dump](5-gs_dump)" in the **Reference Guide**. + +### Examples + +Example 1: Run **gs_dump** to export full information of the **postgres** database. The exported files are in .sql format. + +```bash +gs_dump -f /home/omm/backup/postgres_backup.sql -p 8000 postgres -F p +Password: +gs_dump[port='8000'][postgres][2017-07-21 15:36:13]: dump database postgres successfully +gs_dump[port='8000'][postgres][2017-07-21 15:36:13]: total time: 3793 ms +``` + +Example 2: Run **gs_dump** to export data of the **postgres** database, excluding object definitions. The exported files are in a custom format. + +```bash +gs_dump -f /home/omm/backup/postgres_data_backup.dmp -p 8000 postgres -a -F c +Password: +gs_dump[port='8000'][postgres][2017-07-21 15:36:13]: dump database postgres successfully +gs_dump[port='8000'][postgres][2017-07-21 15:36:13]: total time: 3793 ms +``` + +Example 3: Run **gs_dump** to export object definitions of the **postgres** database. The exported files are in .sql format. + +```bash +gs_dump -f /home/omm/backup/postgres_def_backup.sql -p 8000 postgres -s -F p +Password: +gs_dump[port='8000'][postgres][2017-07-20 15:04:14]: dump database postgres successfully +gs_dump[port='8000'][postgres][2017-07-20 15:04:14]: total time: 472 ms +``` + +Example 4: Run **gs_dump** to export object definitions of the **postgres** database. The exported files are in text format and are encrypted. + +```bash +gs_dump -f /home/omm/backup/postgres_def_backup.sql -p 8000 postgres --with-encryption AES128 --with-key 1234567812345678 -s -F p +Password: +gs_dump[port='8000'][postgres][2018-11-14 11:25:18]: dump database postgres successfully +gs_dump[port='8000'][postgres][2018-11-14 11:25:18]: total time: 1161 ms +``` + +## Exporting a Schema + +You can use **gs_dump** to export data and all object definitions of a schema from MogDB. You can export one or more specified schemas as needed. You can specify the information to export as follows: + +- Export full information of a schema, including its data and object definitions. +- Export data of a schema, excluding its object definitions. +- Export the object definitions of a schema, including the definitions of tables, stored procedures, and indexes. + +### Procedure + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Run **gs_dump** to export the **hr** and **public** schemas. + + ```bash + gs_dump -W Bigdata@123 -U jack -f /home/omm/backup/MPPDB_schema_backup -p 8000 human_resource -n hr -n public -F d + ``` + + **Table 1** Common parameters + + | Parameter | Description | Example Value | + | :-------- | :----------------------------------------------------------- | :----------------------------------------------------------- | + | -U | Username for database connection. | -U jack | + | -W | User password for database connection.
- This parameter is not required for database administrators if the trust policy is used for authentication.
- If you connect to the database without specifying this parameter and you are not a database administrator, you will be prompted to enter the password. | -W Bigdata@123 | + | -f | Folder to store exported files. If this parameter is not specified, the exported files are stored in the standard output. | -f /home/omm/backup/MPPDB*_*schema_backup | + | -p | TCP port or local Unix-domain socket file extension on which the server is listening for connections. | -p 8000 | + | dbname | Name of the database to export. | human_resource | + | -n | Names of schemas to export. Data of the specified schemas will also be exported.
- Single schema: Enter **-n** **schemaname**.
- Multiple schemas: Enter **-n** **schemaname** for each schema. | - Single schemas:**-n hr**
- Multiple schemas:**-n hr -n public** | + | -F | Select the format of file to export. The values of **-F** are as follows:
- **p**: plaintext
- **c**: custom
- **d**: directory
- **t**: .tar | -F d | + + For details about other parameters, see "Tool Reference > Server Tools > [gs_dump](5-gs_dump)" in the **Reference Guide**. + +### Examples + +Example 1: Run **gs_dump** to export full information of the **hr** schema. The exported files stored in text format. + +```bash +gs_dump -W Bigdata@123 -f /home/omm/backup/MPPDB_schema_backup.sql -p 8000 human_resource -n hr -F p +gs_dump[port='8000'][human_resource][2017-07-21 16:05:55]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2017-07-21 16:05:55]: total time: 2425 ms +``` + +Example 2: Run **gs_dump** to export data of the **hr** schema. The exported files are in .tar format. + +```bash +gs_dump -W Bigdata@123 -f /home/omm/backup/MPPDB_schema_data_backup.tar -p 8000 human_resource -n hr -a -F t +gs_dump[port='8000'][human_resource][2018-11-14 15:07:16]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2018-11-14 15:07:16]: total time: 1865 ms +``` + +Example 3: Run **gs_dump** to export the object definitions of the **hr** schema. The exported files are stored in a directory. + +```bash +gs_dump -W Bigdata@123 -f /home/omm/backup/MPPDB_schema_def_backup -p 8000 human_resource -n hr -s -F d +gs_dump[port='8000'][human_resource][2018-11-14 15:11:34]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2018-11-14 15:11:34]: total time: 1652 ms +``` + +Example 4: Run **gs_dump** to export the **human_resource** database excluding the **hr** schema. The exported files are in a custom format. + +```bash +gs_dump -W Bigdata@123 -f /home/omm/backup/MPPDB_schema_backup.dmp -p 8000 human_resource -N hr -F c +gs_dump[port='8000'][human_resource][2017-07-21 16:06:31]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2017-07-21 16:06:31]: total time: 2522 ms +``` + +Example 5: Run **gs_dump** to export the object definitions of the **hr** and **public** schemas. The exported files are in .tar format. + +```bash +gs_dump -W Bigdata@123 -f /home/omm/backup/MPPDB_schema_backup1.tar -p 8000 human_resource -n hr -n public -s -F t +gs_dump[port='8000'][human_resource][2017-07-21 16:07:16]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2017-07-21 16:07:16]: total time: 2132 ms +``` + +Example 6: Run **gs_dump** to export the **human_resource** database excluding the **hr** and **public** schemas. The exported files are in a custom format. + +```bash +gs_dump -W Bigdata@123 -f /home/omm/backup/MPPDB_schema_backup2.dmp -p 8000 human_resource -N hr -N public -F c +gs_dump[port='8000'][human_resource][2017-07-21 16:07:55]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2017-07-21 16:07:55]: total time: 2296 ms +``` + +Example 7: Run **gs_dump** to export all tables (views, sequences, and foreign tables are also included) in the **public** schema and the **staffs** table in the **hr** schema, including data and table definition. The exported files are in a custom format. + +```bash +gs_dump -W Bigdata@123 -f /home/omm/backup/MPPDB_backup3.dmp -p 8000 human_resource -t public.* -t hr.staffs -F c +gs_dump[port='8000'][human_resource][2018-12-13 09:40:24]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2018-12-13 09:40:24]: total time: 896 ms +``` + +## Exporting a Table + +You can use **gs_dump** to export data and definition of a table-level object from MogDB. Views, sequences, and foreign tables are special tables. 
You can export one or more specified tables as needed. You can specify the information to export as follows: + +- Export full information of a table, including its data and definition. +- Export data of a table. +- Export the definition of a table. + +### Procedure + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Run **gs_dump** to export the **hr.staffs** and **hr.employments** tables. + + ```bash + gs_dump -W Bigdata@123 -U jack -f /home/omm/backup/MPPDB_table_backup -p 8000 human_resource -t hr.staffs -t hr.employments -F d + ``` + + **Table 1** Common parameters + + | Parameter | Description | Example Value | + | :-------- | :----------------------------------------------------------- | :----------------------------------------------------------- | + | -U | Username for database connection. | -U jack | + | -W | User password for database connection.
- This parameter is not required for database administrators if the trust policy is used for authentication.
- If you connect to the database without specifying this parameter and you are not a database administrator, you will be prompted to enter the password. | -W Bigdata@123 | + | -f | Folder to store exported files. If this parameter is not specified, the exported files are stored in the standard output. | -f /home/omm/backup/MPPDB_table_backup | + | -p | TCP port or local Unix-domain socket file extension on which the server is listening for connections. | -p 8000 | + | dbname | Name of the database to export. | human_resource | + | -t | Table (or view, sequence, foreign table) to export. You can specify multiple tables by listing them or using wildcard characters. When you use wildcard characters, quote wildcard patterns with single quotation marks (') to prevent the shell from expanding the wildcard characters.
- Single table: Enter **-t** **schema.table**.
- Multiple tables: Enter **-t** **schema.table** for each table. | - Single table: **-t hr.staffs**
- Multiple tables: **-t hr.staffs -t hr.employments** | + | -F | Select the format of file to export. The values of **-F** are as follows:
- **p**: plaintext
- **c**: custom
- **d**: directory
- **t**: .tar | -F d | + + For details about other parameters, see "Tool Reference > Server Tools > [gs_dump](5-gs_dump)" in the **Reference Guide**. + +### Examples + +Example 1: Run **gs_dump** to export full information of the **hr.staffs** table. The exported files are in text format. + +```bash +gs_dump -W Bigdata@123 -f /home/omm/backup/MPPDB_table_backup.sql -p 8000 human_resource -t hr.staffs -F p +gs_dump[port='8000'][human_resource][2017-07-21 17:05:10]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2017-07-21 17:05:10]: total time: 3116 ms +``` + +Example 2: Run **gs_dump** to export data of the **hr.staffs** table. The exported files are in .tar format. + +```bash +gs_dump -W Bigdata@123 -f /home/omm/backup/MPPDB_table_data_backup.tar -p 8000 human_resource -t hr.staffs -a -F t +gs_dump[port='8000'][human_resource][2017-07-21 17:04:26]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2017-07-21 17:04:26]: total time: 2570 ms +``` + +Example 3: Run **gs_dump** to export the definition of the **hr.staffs** table. The exported files are stored in a directory. + +```bash +gs_dump -W Bigdata@123 -f /home/omm/backup/MPPDB_table_def_backup -p 8000 human_resource -t hr.staffs -s -F d +gs_dump[port='8000'][human_resource][2017-07-21 17:03:09]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2017-07-21 17:03:09]: total time: 2297 ms +``` + +Example 4: Run **gs_dump** to export the **human_resource** database excluding the **hr.staffs** table. The exported files are in a custom format. + +```bash +gs_dump -W Bigdata@123 -f /home/omm/backup/MPPDB_table_backup4.dmp -p 8000 human_resource -T hr.staffs -F c +gs_dump[port='8000'][human_resource][2017-07-21 17:14:11]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2017-07-21 17:14:11]: total time: 2450 ms +``` + +Example 5: Run **gs_dump** to export the **hr.staffs** and **hr.employments** tables. The exported files are in text format. + +```bash +gs_dump -W Bigdata@123 -f /home/omm/backup/MPPDB_table_backup1.sql -p 8000 human_resource -t hr.staffs -t hr.employments -F p +gs_dump[port='8000'][human_resource][2017-07-21 17:19:42]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2017-07-21 17:19:42]: total time: 2414 ms +``` + +Example 6: Run **gs_dump** to export the **human_resource** database excluding the **hr.staffs** and **hr.employments** tables. The exported files are in text format. + +```bash +gs_dump -W Bigdata@123 -f /home/omm/backup/MPPDB_table_backup2.sql -p 8000 human_resource -T hr.staffs -T hr.employments -F p +gs_dump[port='8000'][human_resource][2017-07-21 17:21:02]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2017-07-21 17:21:02]: total time: 3165 ms +``` + +Example 7: Run **gs_dump** to export data and definition of the **hr.staffs** table, and the definition of the **hr.employments** table. The exported files are in .tar format. + +```bash +gs_dump -W Bigdata@123 -f /home/omm/backup/MPPDB_table_backup3.tar -p 8000 human_resource -t hr.staffs -t hr.employments --exclude-table-data hr.employments -F t +gs_dump[port='8000'][human_resource][2018-11-14 11:32:02]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2018-11-14 11:32:02]: total time: 1645 ms +``` + +Example 8: Run **gs_dump** to export data and definition of the **hr.staffs** table, encrypt the exported files, and store them in text format. 
+ +```bash +gs_dump -W Bigdata@123 -f /home/omm/backup/MPPDB_table_backup4.sql -p 8000 human_resource -t hr.staffs --with-encryption AES128 --with-key 1212121212121212 -F p +gs_dump[port='8000'][human_resource][2018-11-14 11:35:30]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2018-11-14 11:35:30]: total time: 6708 ms +``` + +Example 9: Run **gs_dump** to export all tables (views, sequences, and foreign tables are also included) in the **public** schema and the **staffs** table in the **hr** schema, including data and table definition. The exported files are in a custom format. + +```bash +gs_dump -W Bigdata@123 -f /home/omm/backup/MPPDB_table_backup5.dmp -p 8000 human_resource -t public.* -t hr.staffs -F c +gs_dump[port='8000'][human_resource][2018-12-13 09:40:24]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2018-12-13 09:40:24]: total time: 896 ms +``` + +Example 10: Run **gs_dump** to export the definition of the view referencing to the **test1** table in the **t1** schema. The exported files are in a custom format. + +```bash +gs_dump -W Bigdata@123 -U jack -f /home/omm/backup/MPPDB_view_backup6 -p 8000 human_resource -t t1.test1 --include-depend-objs --exclude-self -F d +gs_dump[port='8000'][jack][2018-11-14 17:21:18]: dump database human_resource successfully +gs_dump[port='8000'][jack][2018-11-14 17:21:23]: total time: 4239 ms +``` diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/exporting-data/3-exporting-all-databases.md b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/exporting-data/3-exporting-all-databases.md new file mode 100644 index 0000000000000000000000000000000000000000..f730f0f7e678147399fd673f96735af305fd20c6 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/exporting-data/3-exporting-all-databases.md @@ -0,0 +1,121 @@ +--- +title: Exporting All Databases +summary: Exporting All Databases +author: Guo Huan +date: 2021-03-04 +--- + +# Exporting All Databases + +## Exporting All Databases + +You can use **gs_dumpall** to export full information of all databases in MogDB, including information about each database and global objects in MogDB. You can specify the information to export as follows: + +- Export full information of all databases, including information about each database and global objects (such as roles and tablespaces) in MogDB. + + You can use the exported information to create a host environment containing the same databases, global objects, and data as the current one. + +- Export data of all databases, excluding all object definitions and global objects. + +- Export all object definitions of all databases, including the definitions of tablespaces, databases, functions, schemas, tables, indexes, and stored procedures. + + You can use the exported object definitions to quickly create a host environment that is the same as the current one, containing the same databases and tablespaces but no data. + +### Procedure + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Run **gs_dumpall** to export full information of all databases. 
+ + ```bash + gs_dumpall -W Bigdata@123 -U omm -f /home/omm/backup/MPPDB_backup.sql -p 8000 + ``` + + **Table 1** Common parameters + + | Parameter | Description | Example Value | + | :-------- | :----------------------------------------------------------- | :----------------------------------- | + | -U | Username for database connection. The user must be an MogDB administrator. | -U omm | + | -W | User password for database connection.
- This parameter is not required for database administrators if the trust policy is used for authentication.
- If you connect to the database without specifying this parameter and you are not a database administrator, you will be prompted to enter the password. | -W Bigdata@123 | + | -f | Folder to store exported files. If this parameter is not specified, the exported files are stored in the standard output. | -f /home/omm/backup/MPPDB_backup.sql | + | -p | TCP port or local Unix-domain socket file extension on which the server is listening for connections. | -p 8000 | + + For details about other parameters, see "Tool Reference > Server Tools > [gs_dumpall](6-gs_dumpall)" in the **Reference Guide**. + +### Examples + +Example 1: Run **gs_dumpall** as the cluster administrator **omm** to export full information of all databases in a cluster. After the command is executed, a large amount of output information will be displayed. **total time** will be displayed at the end of the information, indicating that the backup is successful. In this example, only relative output information is included. + +```bash +gs_dumpall -W Bigdata@123 -U omm -f /home/omm/backup/MPPDB_backup.sql -p 8000 +gs_dumpall[port='8000'][2017-07-21 15:57:31]: dumpall operation successful +gs_dumpall[port='8000'][2017-07-21 15:57:31]: total time: 9627 ms +``` + +Example 2: Run **gs_dumpall** as the cluster administrator **omm** to export object definitions of all databases in a cluster. The exported files are in text format. After the command is executed, a large amount of output information will be displayed. **total time** will be displayed at the end of the information, indicating that the backup is successful. In this example, only relative output information is included. + +```bash +gs_dumpall -W Bigdata@123 -U omm -f /home/omm/backup/MPPDB_backup.sql -p 8000 -s +gs_dumpall[port='8000'][2018-11-14 11:28:14]: dumpall operation successful +gs_dumpall[port='8000'][2018-11-14 11:28:14]: total time: 4147 ms +``` + +Example 3: Run **gs_dumpall** to export data of all databases in a cluster, encrypt the exported files, and store them in text format. After the command is executed, a large amount of output information will be displayed. **total time** will be displayed at the end of the information, indicating that the backup is successful. In this example, only relative output information is included. + +```bash +gs_dumpall -f /home/omm/backup/MPPDB_backup.sql -p 8000 -a --with-encryption AES128 --with-key 1234567812345678 +gs_dumpall[port='8000'][2018-11-14 11:32:26]: dumpall operation successful +gs_dumpall[port='8000'][2018-11-14 11:23:26]: total time: 4147 ms +``` + +## Exporting Global Objects + +You can use **gs_dumpall** to export global objects, including database users, user groups, tablespaces, and attributes (for example, global access permissions), from MogDB. + +### Procedure + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Run **gs_dumpall** to export global tablespaces. + + ```bash + gs_dumpall -W Bigdata@123 -U omm -f /home/omm/backup/MPPDB_tablespace.sql -p 8000 -t + ``` + + **Table 1** Common parameters + + | Parameter | Description | Example Value | + | :-------- | :----------------------------------------------------------- | :------------------------------------------- | + | -U | Username for database connection. The user must be an MogDB administrator. | -U omm | + | -W | User password for database connection.
- This parameter is not required for database administrators if the trust policy is used for authentication.
- If you connect to the database without specifying this parameter and you are not a database administrator, you will be prompted to enter the password. | -W Bigdata@123 | + | -f | Folder to store exported files. If this parameter is not specified, the exported files are stored in the standard output. | -f /home/omm/backup/**MPPDB_tablespace**.sql | + | -p | TCP port or local Unix-domain socket file extension on which the server is listening for connections. | -p 8000 | + | -t | Dumps only tablespaces. You can also use **-tablespaces-only** alternatively. | - | + + For details about other parameters, see "Tool Reference > Server Tools > [gs_dumpall](6-gs_dumpall)" in the **Reference Guide**. + +### Examples + +Example 1: Run **gs_dumpall** as the cluster administrator **omm** to export global tablespaces and users of all databases. The exported files are in text format. + +```bash +gs_dumpall -W Bigdata@123 -U omm -f /home/omm/backup/MPPDB_globals.sql -p 8000 -g +gs_dumpall[port='8000'][2018-11-14 19:06:24]: dumpall operation successful +gs_dumpall[port='8000'][2018-11-14 19:06:24]: total time: 1150 ms +``` + +Example 2: Run **gs_dumpall** as the cluster administrator **omm** to export global tablespaces of all databases, encrypt the exported files, and store them in text format. + +```bash +gs_dumpall -W Bigdata@123 -U omm -f /home/omm/backup/MPPDB_tablespace.sql -p 8000 -t --with-encryption AES128 --with-key 1212121212121212 +gs_dumpall[port='8000'][2018-11-14 19:00:58]: dumpall operation successful +gs_dumpall[port='8000'][2018-11-14 19:00:58]: total time: 186 ms +``` + +Example 3: Run **gs_dumpall** as the cluster administrator **omm** to export global users of all databases. The exported files are in text format. + +```bash +gs_dumpall -W Bigdata@123 -U omm -f /home/omm/backup/MPPDB_user.sql -p 8000 -r +gs_dumpall[port='8000'][2018-11-14 19:03:18]: dumpall operation successful +gs_dumpall[port='8000'][2018-11-14 19:03:18]: total time: 162 ms +``` diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/exporting-data/4-data-export-by-a-user-without-required-permissions.md b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/exporting-data/4-data-export-by-a-user-without-required-permissions.md new file mode 100644 index 0000000000000000000000000000000000000000..025caa51aa7788f04405068e20e3645c30191d42 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/exporting-data/4-data-export-by-a-user-without-required-permissions.md @@ -0,0 +1,82 @@ +--- +title: Data Export By a User Without Required Permissions +summary: Data Export By a User Without Required Permissions +author: Guo Huan +date: 2021-03-04 +--- + +# Data Export By a User Without Required Permissions + +**gs_dump** and **gs_dumpall** use **-U** to specify the user that performs the export. If the specified user does not have the required permissions, data cannot be exported. In this case, you need to assign the permission to a user who does not have the permission, and then set the **-role** parameter in the export command to specify the role with the permission. Then, **gs_dump** or **gs_dumpall** uses the **-role** parameter to specify a role to export data. + +## Procedure + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Use **gs_dump** to export data of the **human_resource** database. 
+ + User **jack** does not have the permissions to export data of the **human_resource** database and the role **role1** has this permission. To export data of the **human_resource** database, you need to assign the permission of **role1** to **jack** and set **-role** to **role1** in the export command. The exported files are in .tar format. + + ```bash + gs_dump -U jack -f /home/omm/backup/MPPDB_backup.tar -p 8000 human_resource --role role1 --rolepassword abc@1234 -F t + Password: + ``` + + **Table 1** Common parameters + + | Parameter | Description | Example Value | + | :------------ | :----------------------------------------------------------- | :----------------------------------- | + | -U | Username for database connection. | -U jack | + | -W | User password for database connection.
- This parameter is not required for database administrators if the trust policy is used for authentication.
- If you connect to the database without specifying this parameter and you are not a database administrator, you will be prompted to enter the password. | -W Bigdata@123 | + | -f | Folder to store exported files. If this parameter is not specified, the exported files are stored in the standard output. | -f /home/omm/backup/MPPDB_backup.tar | + | -p | TCP port or local Unix-domain socket file extension on which the server is listening for connections. | -p 8000 | + | dbname | Name of the database to export. | human_resource | + | -role | Role name for the export operation. After this parameter is set, the **SET ROLE** command will be issued after **gs_dump** or **gs_dumpall** connects to the database. It is useful when the user specified by **-U** does not have the permissions required by **gs_dump** or **gs_dumpall**. This parameter allows you to switch to a role with the required permissions. | -r role1 | + | -rolepassword | Role password. | -rolepassword abc@1234 | + | -F | Select the format of file to export. The values of **-F** are as follows:
- **p**: plaintext
- **c**: custom
- **d**: directory
- **t**: .tar | -F t | + + For details about other parameters, see "Tool Reference > Server Tools > [gs_dump](5-gs_dump)" or "[gs_dumpall](6-gs_dumpall)" in the **Reference Guide**. + +## Examples + +Example 1: User **jack** does not have the permissions required to export data of the **human_resource** database using **gs_dump** and the role **role1** has the permissions. To export data of the **human_resource** database, you can set **-role** to **role1** in the **gs_dump** command. The exported files are in .tar format. + +```bash +$ human_resource=# CREATE USER jack IDENTIFIED BY "1234@abc"; +CREATE ROLE +human_resource=# GRANT role1 TO jack; +GRANT ROLE + +$ gs_dump -U jack -f /home/omm/backup/MPPDB_backup11.tar -p 8000 human_resource --role role1 --rolepassword abc@1234 -F t +Password: +gs_dump[port='8000'][human_resource][2017-07-21 16:21:10]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2017-07-21 16:21:10]: total time: 4239 ms +``` + +Example 2: User **jack** does not have the permissions required to export the **public** schema using **gs_dump** and the role **role1** has the permissions. To export the **public** schema, you can set **-role** to **role1** in the **gs_dump** command. The exported files are in .tar format. + +```bash +$ human_resource=# CREATE USER jack IDENTIFIED BY "1234@abc"; +CREATE ROLE +human_resource=# GRANT role1 TO jack; +GRANT ROLE + +$ gs_dump -U jack -f /home/omm/backup/MPPDB_backup12.tar -p 8000 human_resource -n public --role role1 --rolepassword abc@1234 -F t +Password: +gs_dump[port='8000'][human_resource][2017-07-21 16:21:10]: dump database human_resource successfully +gs_dump[port='8000'][human_resource][2017-07-21 16:21:10]: total time: 3278 ms +``` + +Example 3: User **jack** does not have the permissions required to export all databases in a cluster using **gs_dumpall** and the role **role1** (cluster administrator) has the permissions. To export all the databases, you can set **-role** to **role1** in the **gs_dumpall** command. The exported files are in text format. + +```bash +$ human_resource=# CREATE USER jack IDENTIFIED BY "1234@abc"; +CREATE ROLE +human_resource=# GRANT role1 TO jack; +GRANT ROLE + +$ gs_dumpall -U jack -f /home/omm/backup/MPPDB_backup.sql -p 8000 --role role1 --rolepassword abc@1234 +Password: +gs_dumpall[port='8000'][human_resource][2018-11-14 17:26:18]: dumpall operation successful +gs_dumpall[port='8000'][human_resource][2018-11-14 17:26:18]: total time: 6437 ms +``` diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/1-import-modes.md b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/1-import-modes.md new file mode 100644 index 0000000000000000000000000000000000000000..db486ed377a2bcd89e66e9230e8246f825bea0f0 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/1-import-modes.md @@ -0,0 +1,18 @@ +--- +title: Import Modes +summary: Import Modes +author: Guo Huan +date: 2021-03-04 +--- + +# Import Modes + +You can use **INSERT**, **COPY**, or **\copy** (a **gsql** meta-command) to import data to the MogDB database. The methods have different characteristics. For details, see Table 1. + +**Table 1** Import modes + +| Mode | Characteristics | +| :--------------------------------- | :----------------------------------------------------------- | +| INSERT | Insert one or more rows of data, or insert data from a specified table. 
| +| COPY | Run the **COPY FROM STDIN** statement to write data into the MogDB database.
Service data does not need to be stored in files when it is written from other databases to the MogDB database through the CopyManager interface driven by JDBC. | +| **\copy**, a **gsql** meta-command | Different from the SQL **COPY** statement, the **\copy** command can read data from or write data into only local files on a **gsql** client.
NOTE:
**\copy** applies only to small-scale data import in good format. It does not preprocess invalid characters or provide error tolerance. Therefore, **\copy** cannot be used in scenarios where abnormal data exists. **COPY** is preferred for data import. | diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/10-managing-concurrent-write-operations.md b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/10-managing-concurrent-write-operations.md new file mode 100644 index 0000000000000000000000000000000000000000..5cd138eb7682d54bc9a971ad82b2a167dcd9eb39 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/10-managing-concurrent-write-operations.md @@ -0,0 +1,177 @@ +--- +title: Managing Concurrent Write Operations +summary: Managing Concurrent Write Operations +author: Guo Huan +date: 2021-03-04 +--- + +# Managing Concurrent Write Operations + +## Transaction Isolation + +MogDB manages transactions based on MVCC and two-phase locks, avoiding conflicts between read and write operations. SELECT is a read-only operation, whereas UPDATE and DELETE are read/write operations. + +- There is no conflict between read/write and read-only operations, or between read/write operations. Each concurrent transaction creates a snapshot when it starts. Concurrent transactions cannot detect updates made by each other. + - At the **READ COMMITTED** level, if transaction T1 is committed, transaction T2 can see changes made by T1. + - At the **REPEATABLE READ** level, if T2 starts before T1 is committed, T2 will not see changes made by T1 even after T1 is committed. The query results in a transaction are consistent and unaffected by other transactions. +- Read/Write operations use row-level locks. Different transactions can concurrently update the same table but not the same row. A row update transaction will start only after the previous one is committed. + - **READ COMMITTED**: At this level, a transaction can access only committed data. This is the default level. + - **REPEATABLE READ**: Only data committed before transaction start is read. Uncommitted data or data committed in other concurrent transactions cannot be read. + +## Write and Read/Write Operations + +Statements for write-only and read/write operations are as follows: + +- **INSERT**, used to insert one or more rows of data into a table +- **UPDATE**, used to modify existing data in a table +- **DELETE**, used to delete existing data from a table +- **COPY**, used to import data + +INSERT and COPY are write-only operations. Only one of them can be performed at a time. If INSERT or COPY of transaction T1 locks a table, INSERT or COPY of transaction T2 needs to wait until T1 unlocks the table. + +UPDATE and DELETE operations are read/write operations. They need to query for the target rows before modifying data. Concurrent transactions cannot see changes made by each other, and UPDATE and DELETE operations read snapshots of data committed before their transactions start. Write operations use row-level locks. If T2 starts after T1 and is to update the same row as T1 does, T2 waits for T1 to finish update. If T1 is not complete within the specified timeout duration, T2 will time out. If T1 and T2 update different rows in a table, they can be concurrently executed. 
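The row-level locking described above can be observed directly with two concurrent gsql sessions. The following sketch is illustrative only; it assumes a table such as the **test** table created in the concurrent write examples later in this section, containing a row with **id** 1.

```sql
-- Session 1 (T1): update a row but do not commit yet, so the row lock is held.
START TRANSACTION;
UPDATE test SET address='addr1' WHERE id=1;

-- Session 2 (T2): the same row is locked by T1, so this statement waits.
-- It proceeds only after T1 commits or rolls back; otherwise it times out.
START TRANSACTION;
UPDATE test SET address='addr2' WHERE id=1;

-- Session 1 (T1): committing releases the row lock and unblocks T2.
COMMIT;

-- Session 2 (T2): the update now completes and can be committed.
COMMIT;
```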
+ +## Potential Deadlocks During Concurrent Write + +Whenever transactions involve updates of more than one table, there is always the possibility that concurrently running transactions become deadlocked when they both try to write to the same set of tables. A transaction releases all of its locks at once when it either commits or rolls back; it does not relinquish locks one at a time. For example, transactions T1 and T2 start at roughly the same time. + +- If T1 starts writing to table A and T2 starts writing to table B, both transactions can proceed without conflict. However, if T1 finishes writing to table A and needs to start writing to the same rows as T2 does in table B, it will not be able to proceed because T2 still holds the lock on B. Conversely, if T2 finishes writing to table B and needs to start writing to the same rows as T1 does in table A, it will not be able to proceed either because T1 still holds the lock on A. In this case, a deadlock occurs. If T1 is committed and releases the lock within the lock timeout duration, subsequent update can proceed. If a lock times out, an error is reported and the corresponding transaction exits. +- If T1 updates rows 1 to 5 and T2 updates rows 6 to 10 in the same table, the two transactions do not conflict. However, if T1 finishes the update and proceeds to update rows 6 to 10, and T2 proceeds to update rows 1 to 5, neither of them can continue. If either of the transactions is committed and releases the lock within the lock timeout duration, subsequent update can proceed. If a lock times out, an error is reported and the corresponding transaction exits. + +## Concurrent Write Examples + +This section uses the **test** table as an example to describe how to perform concurrent **INSERT** and **DELETE** in the same table, concurrent **INSERT** in the same table, concurrent **UPDATE** in the same table, and concurrent import and queries. + +```sql +CREATE TABLE test(id int, name char(50), address varchar(255)); +``` + +### Concurrent INSERT and DELETE in the Same Table + +Transaction T1: + +```sql +START TRANSACTION; +INSERT INTO test VALUES(1,'test1','test123'); +COMMIT; +``` + +Transaction T2: + +```sql +START TRANSACTION; +DELETE test WHERE NAME='test1'; +COMMIT; +``` + +Scenario 1: + +T1 is started but not committed. At this time, T2 is started. After **INSERT** of T1 is complete, **DELETE** of T2 is performed. In this case, **DELETE 0** is displayed, because T1 is not committed and T2 cannot see the data inserted by T1. + +Scenario 2: + +- **READ COMMITTED** level + + T1 is started but not committed. At this time, T2 is started. After **INSERT** of T1 is complete, T1 is committed and **DELETE** of T2 is executed. In this case, **DELETE 1** is displayed, because T2 can see the data inserted by T1. + +- **REPEATABLE READ** level + + T1 is started but not committed. At this time, T2 is started. After **INSERT** of T1 is complete, T1 is committed and **DELETE** of T2 is executed. In this case, **DELETE 0** is displayed, because the data obtained in queries is consistent in a transaction. + +### Concurrent INSERT in the Same table + +Transaction T1: + +```sql +START TRANSACTION; +INSERT INTO test VALUES(2,'test2','test123'); +COMMIT; +``` + +Transaction T2: + +```sql +START TRANSACTION; +INSERT INTO test VALUES(3,'test3','test123'); +COMMIT; +``` + +Scenario 1: + +T1 is started but not committed. At this time, T2 is started. After **INSERT** of T1 is complete, **INSERT** of T2 is executed and succeeds. 
At the **READ COMMITTED** and **REPEATABLE READ** levels, the **SELECT** statement of T1 cannot see data inserted by T2, and a query in T2 cannot see data inserted by T1. + +Scenario 2: + +- **READ COMMITTED** level + + T1 is started but not committed. At this time, T2 is started. After **INSERT** of T1 is complete, T1 is committed. In T2, a query executed after **INSERT** can see the data inserted by T1. + +- **REPEATABLE READ** level + + T1 is started but not committed. At this time, T2 is started. After **INSERT** of T1 is complete, T1 is committed. In T2, a query executed after **INSERT** cannot see the data inserted by T1. + +### Concurrent UPDATE in the Same Table + +Transaction T1: + +```sql +START TRANSACTION; +UPDATE test SET address='test1234' WHERE name='test1'; +COMMIT; +``` + +Transaction T2: + +```sql +START TRANSACTION; +UPDATE test SET address='test1234' WHERE name='test2'; +COMMIT; +``` + +Transaction T3: + +```sql +START TRANSACTION; +UPDATE test SET address='test1234' WHERE name='test1'; +COMMIT; +``` + +Scenario 1: + +T1 is started but not committed. At this time, T2 is started. **UPDATE** of T1 and then T2 starts, and both of them succeed. This is because the **UPDATE** operations use row-level locks and do not conflict when they update different rows. + +Scenario 2: + +T1 is started but not committed. At this time, T3 is started. **UPDATE** of T1 and then T3 starts, and **UPDATE** of T1 succeeds. **UPDATE** of T3 times out. This is because T1 and T3 update the same row and the lock is held by T1 at the time of the update. + +### Concurrent Data Import and Queries + +Transaction T1: + +```sql +START TRANSACTION; +COPY test FROM '...'; +COMMIT; +``` + +Transaction T2: + +```sql +START TRANSACTION; +SELECT * FROM test; +COMMIT; +``` + +Scenario 1: + +T1 is started but not committed. At this time, T2 is started. **COPY** of T1 and then **SELECT** of T2 starts, and both of them succeed. In this case, T2 cannot see the data added by **COPY** of T1. + +Scenario 2: + +- **READ COMMITTED** level + + T1 is started but not committed. At this time, T2 is started. **COPY** of T1 is complete and T1 is committed. In this case, T2 can see the data added by **COPY** of T1. + +- **REPEATABLE READ** level + + T1 is started but not committed. At this time, T2 is started. **COPY** of T1 is complete and T1 is committed. In this case, T2 cannot see the data added by **COPY** of T1. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/2-running-the-INSERT-statement-to-insert-data.md b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/2-running-the-INSERT-statement-to-insert-data.md new file mode 100644 index 0000000000000000000000000000000000000000..9149247c9a26027cae83556801d74e8124cc959f --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/2-running-the-INSERT-statement-to-insert-data.md @@ -0,0 +1,20 @@ +--- +title: Running the INSERT Statement to Insert Data +summary: Running the INSERT Statement to Insert Data +author: Guo Huan +date: 2021-03-04 +--- + +# Running the INSERT Statement to Insert Data + +Run the **INSERT** statement to write data into the MogDB database in either of the following ways: + +- Use the client tool provided by the MogDB database to write data into MogDB. + + For details, see Inserting Data to Tables. 
+ +- Connect to the database using the JDBC or ODBC driver and run the **INSERT** statement to write data into the MogDB database. + + For details, see Connecting to a Database. + +You can add, modify, and delete database transactions for the MogDB database. **INSERT** is the simplest way to write data and applies to scenarios with small data volume and low concurrency. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/3-running-the-COPY-FROM-STDIN-statement-to-import-data.md b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/3-running-the-COPY-FROM-STDIN-statement-to-import-data.md new file mode 100644 index 0000000000000000000000000000000000000000..ba2053c3b65c3091c635a0aa28b0cdba6b3c339d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/3-running-the-COPY-FROM-STDIN-statement-to-import-data.md @@ -0,0 +1,318 @@ +--- +title: Running the COPY FROM STDIN Statement to Import Data +summary: Running the COPY FROM STDIN Statement to Import Data +author: Guo Huan +date: 2021-03-04 +--- + +# Running the COPY FROM STDIN Statement to Import Data + +
+ +## Data Import Using COPY FROM STDIN + +Run the **COPY FROM STDIN** statement to import data to MogDB in either of the following ways: + +- Write data into the MogDB database by typing. For details, see COPY. +- Import data from a file or database to MogDB through the CopyManager interface driven by JDBC. You can use any parameters in the **COPY** syntax. + +
+ +## Introduction to the CopyManager Class + +CopyManager is an API class provided by the JDBC driver in MogDB. It is used to import data to the MogDB database in batches. + +
+ +### Inheritance Relationship of CopyManager + +The CopyManager class is in the **org.postgresql.copy** package and inherits the java.lang.Object class. The declaration of the class is as follows: + +```java +public class CopyManager +extends Object +``` + +
+ +### Construction Method + +```java +public CopyManager(BaseConnection connection) +throws SQLException +``` + +
+ +### Common Methods + +**Table 1** Common methods of CopyManager + +| Return Value | Method | Description | throws | +| :----------- | :--------------------------------------------------- | :----------------------------------------------------------- | :----------------------- | +| CopyIn | copyIn(String sql) | - | SQLException | +| long | copyIn(String sql, InputStream from) | Uses **COPY FROM STDIN** to quickly import data to tables in a database from InputStream. | SQLException,IOException | +| long | copyIn(String sql, InputStream from, int bufferSize) | Uses **COPY FROM STDIN** to quickly import data to tables in a database from InputStream. | SQLException,IOException | +| long | copyIn(String sql, Reader from) | Uses **COPY FROM STDIN** to quickly import data to tables in a database from Reader. | SQLException,IOException | +| long | copyIn(String sql, Reader from, int bufferSize) | Uses **COPY FROM STDIN** to quickly import data to tables in a database from Reader. | SQLException,IOException | +| CopyOut | copyOut(String sql) | - | SQLException | +| long | copyOut(String sql, OutputStream to) | Sends the result set of **COPY TO STDOUT** from the database to the OutputStream class. | SQLException,IOException | +| long | copyOut(String sql, Writer to) | Sends the result set of **COPY TO STDOUT** from the database to the Writer class. | SQLException,IOException | + +
+ +## Handling Import Errors + +### Scenarios + +Handle errors that occurred during data import. + +### Querying Error Information + +Errors that occur when data is imported are divided into data format errors and non-data format errors. + +- Data format errors + + When creating a foreign table, specify **LOG INTO error_table_name**. Data format errors during data import will be written into the specified table. You can run the following SQL statement to query error details: + + ```sql + mogdb=# SELECT * FROM error_table_name; + ``` + + Table 1 lists the columns of the *error_table_name* table. + + **Table 1** Columns in the error information table + + | Column Name | Type | Description | + | :---------- | :----------------------- | :----------------------------------------------------------- | + | nodeid | integer | ID of the node where an error is reported | + | begintime | timestamp with time zone | Time when a data format error was reported | + | filename | character varying | Name of the source data file where a data format error occurs | + | rownum | bigint | Number of the row where a data format error occurs in a source data file | + | rawrecord | text | Raw record of a data format error in the source data file | + | detail | text | Error details | + +- Non-data format errors + + A non-data format error leads to the failure of an entire data import task. You can locate and troubleshoot a non-data format error based on the error message displayed during data import. + +### Handling Data Import Errors + +Troubleshoot data import errors based on obtained error information and descriptions in the following table. + +**Table 2** Handling data import errors + +| Error Message | Cause | Solution | +| :----------------------------------------------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | +| missing data for column "r_reason_desc" | 1. The number of columns in the source data file is less than that in the foreign table.
2. In a TEXT-format source data file, an escape character (for example, \) leads to delimiter or quote mislocation.
**Example:** The target table contains three columns, and the following data is imported. The escape character (\) converts the delimiter (\|) into the value of the second column, causing the value of the third column to be lost.
`BE|Belgium\|1` | 1. If an error is reported due to missing columns, perform the following operations:
- Add the value of the **r_reason_desc** column to the source data file.
- When creating a foreign table, set the parameter **fill_missing_fields** to **on**. In this way, if the last column of a row in the source data file is missing, it will be set to **NULL** and no error will be reported.
2. Check whether the row where an error is reported contains the escape character (\). If the row contains such a character, you are advised to set the parameter **noescaping** to **true** when creating a foreign table, indicating that the escape character (\) and the characters following it are not escaped. | + | extra data after last expected column | The number of columns in the source data file is greater than that in the foreign table. | - Delete extra columns from the source data file.
- When creating a foreign table, set the parameter **ignore_extra_data** to **on**. In this way, if the number of columns in the source data file is greater than that in the foreign table, the extra columns at the end of rows will not be imported. | +| invalid input syntax for type numeric: "a" | The data type is incorrect. | In the source data file, change the data type of the columns to import. If this error information is displayed, change the data type to **numeric**. | +| null value in column "staff_id" violates not-null constraint | The not-null constraint is violated. | In the source data file, add values to the specified columns. If this error information is displayed, add values to the **staff_id** column. | +| duplicate key value violates unique constraint "reg_id_pk" | The unique constraint is violated. | - Delete duplicate rows from the source data file.
- Run the **SELECT** statement with the **DISTINCT** keyword to ensure that all imported rows are unique.
`mogdb=# INSERT INTO reasons SELECT DISTINCT * FROM foreign_tpcds_reasons;` | +| value too long for type character varying(16) | The column length exceeds the upper limit. | In the source data file, change the column length. If this error information is displayed, reduce the column length to no greater than 16 bytes (VARCHAR2). | + +
+ +## Example 1: Importing and Exporting Data Through Local Files + +When the JAVA language is used for secondary development based on MogDB, you can use the CopyManager interface to export data from the database to a local file or import a local file to the database by streaming. The file can be in CSV or TEXT format. + +The sample program is as follows. Load the MogDB JDBC driver before executing it. + +```java +import java.sql.Connection; +import java.sql.DriverManager; +import java.io.IOException; +import java.io.FileInputStream; +import java.io.FileOutputStream; +import java.sql.SQLException; +import org.postgresql.copy.CopyManager; +import org.postgresql.core.BaseConnection; + +public class Copy{ + + public static void main(String[] args) + { + String urls = new String("jdbc:postgresql://localhost:8000/postgres"); // URL of the database + String username = new String("username"); // Username + String password = new String("passwd"); // Password + String tablename = new String("migration_table"); // Table information + String tablename1 = new String("migration_table_1"); // Table information + String driver = "org.postgresql.Driver"; + Connection conn = null; + + try { + Class.forName(driver); + conn = DriverManager.getConnection(urls, username, password); + } catch (ClassNotFoundException e) { + e.printStackTrace(System.out); + } catch (SQLException e) { + e.printStackTrace(System.out); + } + + // Export data from the migration_table table to the d:/data.txt file. + try { + copyToFile(conn, "d:/data.txt", "(SELECT * FROM migration_table)"); + } catch (SQLException e) { + // TODO Auto-generated catch block + e.printStackTrace(); + } catch (IOException e) { + // TODO Auto-generated catch block + e.printStackTrace(); + } + // Import data from the d:/data.txt file to the migration_table_1 table. + try { + copyFromFile(conn, "d:/data.txt", tablename1); + } catch (SQLException e) { + // TODO Auto-generated catch block + e.printStackTrace(); + } catch (IOException e) { + // TODO Auto-generated catch block + e.printStackTrace(); + } + + // Export data from the migration_table_1 table to the d:/data1.txt file. + try { + copyToFile(conn, "d:/data1.txt", tablename1); + } catch (SQLException e) { + // TODO Auto-generated catch block + e.printStackTrace(); + } catch (IOException e) { + // TODO Auto-generated catch block + e.printStackTrace(); + } + } + + public static void copyFromFile(Connection connection, String filePath, String tableName) + throws SQLException, IOException { + + FileInputStream fileInputStream = null; + + try { + CopyManager copyManager = new CopyManager((BaseConnection)connection); + fileInputStream = new FileInputStream(filePath); + copyManager.copyIn("COPY " + tableName + " FROM STDIN with (" + "DELIMITER"+"'"+ delimiter + "'" + "ENCODING " + "'" + encoding + "')", fileInputStream); + } finally { + if (fileInputStream != null) { + try { + fileInputStream.close(); + } catch (IOException e) { + e.printStackTrace(); + } + } + } + } + public static void copyToFile(Connection connection, String filePath, String tableOrQuery) + throws SQLException, IOException { + + FileOutputStream fileOutputStream = null; + + try { + CopyManager copyManager = new CopyManager((BaseConnection)connection); + fileOutputStream = new FileOutputStream(filePath); + copyManager.copyOut("COPY " + tableOrQuery + " TO STDOUT", fileOutputStream); + } finally { + if (fileOutputStream != null) { + try { + fileOutputStream.close(); + } catch (IOException e) { + e.printStackTrace(); + } + } + } + } +} +``` + +
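The **copyFromFile** method in the sample above expects the caller to supply the **delimiter** and **encoding** values used in the **COPY** statement. The following minimal sketch (not part of the original sample; the class name, method name, and example values are illustrative) shows a self-contained variant that passes these options explicitly.

```java
import java.io.FileInputStream;
import java.sql.Connection;
import org.postgresql.copy.CopyManager;
import org.postgresql.core.BaseConnection;

public class CopySketch {
    // Import one delimited text file into a table, passing the COPY options explicitly.
    // Returns the number of rows imported.
    public static long importFile(Connection conn, String tableName, String filePath,
                                  String delimiter, String encoding) throws Exception {
        CopyManager copyManager = new CopyManager((BaseConnection) conn);
        try (FileInputStream in = new FileInputStream(filePath)) {
            // Build the COPY statement in the same form as the sample above.
            String sql = "COPY " + tableName + " FROM STDIN WITH (DELIMITER '" + delimiter
                    + "', ENCODING '" + encoding + "')";
            return copyManager.copyIn(sql, in);
        }
    }
}
```

For example, a caller that already holds a connection could invoke `CopySketch.importFile(conn, "migration_table_1", "d:/data.csv", ",", "UTF8")` for a comma-separated UTF-8 file (the file path and delimiter here are hypothetical).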
+ +## Example 2: Migrating Data from a MySQL Database to the MogDB Database + +The following example shows how to use CopyManager to migrate data from MySQL to the MogDB database. + +```java +import java.io.StringReader; +import java.sql.Connection; +import java.sql.DriverManager; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.sql.Statement; + +import org.postgresql.copy.CopyManager; +import org.postgresql.core.BaseConnection; + +public class Migration{ + + public static void main(String[] args) { + String url = new String("jdbc:postgresql://localhost:8000/postgres"); // URL of the database + String user = new String("username"); // MogDB database user name + String pass = new String("passwd"); // MogDB database password + String tablename = new String("migration_table_1"); // Table information + String delimiter = new String("|"); // Delimiter + String encoding = new String("UTF8"); // Character set + String driver = "org.postgresql.Driver"; + StringBuffer buffer = new StringBuffer(); // Buffer to store formatted data + + try { + // Obtain the query result set of the source database. + ResultSet rs = getDataSet(); + + // Traverse the result set and obtain records row by row. + // The values of columns in each record are separated by the specified delimiter and end with a linefeed, forming strings. + // Add the strings to the buffer. + while (rs.next()) { + buffer.append(rs.getString(1) + delimiter + + rs.getString(2) + delimiter + + rs.getString(3) + delimiter + + rs.getString(4) + + "\n"); + } + rs.close(); + + try { + // Connect to the target database. + Class.forName(driver); + Connection conn = DriverManager.getConnection(url, user, pass); + BaseConnection baseConn = (BaseConnection) conn; + baseConn.setAutoCommit(false); + + // Initialize the table. + String sql = "Copy " + tablename + " from STDIN with (DELIMITER " + "'" + delimiter + "'" +","+ " ENCODING " + "'" + encoding + "'"); + + // Commit data in the buffer. + CopyManager cp = new CopyManager(baseConn); + StringReader reader = new StringReader(buffer.toString()); + cp.copyIn(sql, reader); + baseConn.commit(); + reader.close(); + baseConn.close(); + } catch (ClassNotFoundException e) { + e.printStackTrace(System.out); + } catch (SQLException e) { + e.printStackTrace(System.out); + } + + } catch (Exception e) { + e.printStackTrace(); + } + } + + //******************************** + // Return the query result set from the source database. 
+ //********************************* + private static ResultSet getDataSet() { + ResultSet rs = null; + try { + Class.forName("com.MY.jdbc.Driver").newInstance(); + Connection conn = DriverManager.getConnection("jdbc:MY://10.119.179.227:3306/jack?useSSL=false&allowPublicKeyRetrieval=true", "jack", "Gauss@123"); + Statement stmt = conn.createStatement(); + rs = stmt.executeQuery("select * from migration_table"); + } catch (SQLException e) { + e.printStackTrace(); + } catch (Exception e) { + e.printStackTrace(); + } + return rs; + } +} +``` diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/4-using-a-gsql-meta-command-to-import-data.md b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/4-using-a-gsql-meta-command-to-import-data.md new file mode 100644 index 0000000000000000000000000000000000000000..2215c6858e3499faa37d2eb157d1994e6f1ce453 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/4-using-a-gsql-meta-command-to-import-data.md @@ -0,0 +1,207 @@ +--- +title: Using a gsql Meta-Command to Import Data +summary: Using a gsql Meta-Command to Import Data +author: Guo Huan +date: 2021-03-04 +--- + +# Using a gsql Meta-Command to Import Data + +The **gsql** tool provides the **\copy** meta-command to import data. + +**\copy Command** + +## Syntax + +``` +\copy { table [ ( column_list ) ] | + +( query ) } { from | to } { filename | + +stdin | stdout | pstdin | pstdout } + +[ with ] [ binary ] [ delimiter + +[ as ] 'character' ] [ null [ as ] 'string' ] + +[ csv [ header ] [ quote [ as ] + +'character' ] [ escape [ as ] 'character' ] + +[ force quote column_list | * ] [ force + +not null column_list ] ] +``` + +You can run this command to import or export data after logging in to a database on any gsql client. Different from the **COPY** statement in SQL, this command performs read/write operations on local files rather than files on database servers. The accessibility and permissions of the local files are restricted to local users. + +> **NOTE:** +> +> **\copy** applies only to small-scale data import in good format. It does not preprocess invalid characters or provide error tolerance. Therefore, **\copy** cannot be used in scenarios where abnormal data exists. **GDS** or **COPY** is preferred for data import. + +**Parameter Description** + +- table + + Specifies the name (possibly schema-qualified) of an existing table. + + Value range: an existing table name + +- column_list + + Specifies an optional list of columns to be copied. + + Value range: any field in the table. If no column list is specified, all columns of the table will be copied. + +- query + + Specifies that the results are to be copied. + + Value range: a **SELECT** or **VALUES** command in parentheses + +- filename + + Specifies the absolute path of a file. To run the **COPY** command, the user must have the write permission for this path. + +- stdin + + Specifies that input comes from the standard input. + +- stdout + + Specifies that output goes to the standard output. + +- pstdin + + Specifies that input comes from the gsql client. + +- pstout + +- Specifies that output goes to the gsql client. + +- binary + + Specifies that data is stored and read in binary mode instead of text mode. In binary mode, you cannot declare **DELIMITER**, **NULL**, or **CSV**. After **binary** is specified, CSV, FIXED, and TEXT cannot be specified through **option** or **copy_option**. 
+ +- delimiter [ as ] 'character' + + Specifies the character that separates columns within each row (line) of the file. + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > + > - The value of **delimiter** cannot be **\r** or **\n**. + > - A delimiter cannot be the same as the null value. The delimiter for the CSV format cannot be same as the **quote** value. + > - The delimiter of TEXT data cannot contain any of the following characters: \\.abcdefghijklmnopqrstuvwxyz0123456789. + > - The data length of a single row should be less than 1 GB. A row that has many columns using long delimiters cannot contain much valid data. + > - You are advised to use multi-character delimiters or invisible delimiters. For example, you can use multi-characters (such as $^&) and invisible characters (such as 0x07, 0x08, and 0x1b). + + Value range: a multi-character delimiter within 10 bytes + + Default value: + + - A tab character in TEXT format + - A comma (,) in CSV format + - No delimiter in FIXED format + +- null [ as ] 'string' + + Specifies the string that represents a null value. + + Value range: + + - A null value cannot be **\\r** or **\\n**. The maximum length is 100 characters. + - A null value cannot be the same as the **delimiter** or **quote** value. + + Default value: + + - The default value for the CSV format is an empty string without quotation marks. + - The default value for the TEXT format is **\\N**. + +- header + + Specifies whether a file contains a header with the names of each column in the file. **header** is available only for CSV and FIXED files. + + When data is imported, if **header** is **on**, the first row of the data file will be identified as the header and ignored. If **header** is **off**, the first row will be identified as a data row. + + When data is exported, if header is **on**, **fileheader** must be specified. **fileheader** specifies the content in the header. If **header** is **off**, an exported file does not contain a header. + + Value range:**true/on** and **false/off** + + Default value: false + +- quote [ as ] 'character' + + Specifies a quoted character string for a CSV file. + + Default value: a double quotation mark (") + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > + > - The value of **quote** cannot be the same as that of the **delimiter** or null parameter. + > - The value of **quote** must be a single-byte character. + > - Invisible characters are recommended, such as 0x07, 0x08, and 0x1b. + +- escape [ as ] 'character' + + Specifies an escape character for a CSV file. The value must be a single-byte character. + + Default value: a double quotation mark (") If the value is the same as that of **quote**, it will be replaced by **\0**. + +- force quote column_list | * + + In **CSV COPY TO** mode, forces quotation marks to be used for all non-null values in each specified column. Null values are not quoted. + + Value range: an existing column name + +- force not null column_list + + Assigns a value to a specified column in **CSV COPY FROM** mode. + + Value range: an existing column name + +**Examples** + +1. Create a target table **a**. + + ```sql + mogdb=# CREATE TABLE a(a int); + ``` + +2. Import data. + + Copy data from **stdin** to table **a**. + + ```sql + mogdb=# \copy a from stdin; + ``` + + When the **>>** characters are displayed, enter data. To end your input, enter a backslash and a period (\.). + + ```sql + Enter data to be copied followed by a newline. 
+ End with a backslash and a period on a line by itself. + >> 1 + >> 2 + >> \. + ``` + + Query data imported to table **a**. + + ```sql + mogdb=# SELECT * FROM a; + a + --- + 1 + 2 + (2 rows) + ``` + +3. Copy data from a local file to table **a**. The following assumes that the local file is **/home/omm/2.csv**. + + - Commas (,) are used as delimiters. + + - If the number of columns defined in a source data file is greater than that in a foreign table, extra columns will be ignored during import. + + ```sql + mogdb=# \copy a FROM '/home/omm/2.csv' WITH (delimiter',',IGNORE_EXTRA_DATA 'on'); + ``` diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/5-using-gs_restore-to-import-data.md b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/5-using-gs_restore-to-import-data.md new file mode 100644 index 0000000000000000000000000000000000000000..576040d27756e5093bcc9e5219c04b53d950d6af --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/5-using-gs_restore-to-import-data.md @@ -0,0 +1,263 @@ +--- +title: Using gs_restore to Import Data +summary: Using gs_restore to Import Data +author: Guo Huan +date: 2021-03-04 +--- + +# Using gs_restore to Import Data + +## Scenarios + +**gs_restore** is an import tool provided by the MogDB database. You can use **gs_restore** to import the files exported by **gs_dump** to a database. **gs_restore** can import the files in .tar, custom, or directory format. + +**gs_restore** can: + +- Import data to a database. + + If a database is specified, data is imported to the database. If multiple databases are specified, the password for connecting to each database also needs to be specified. + +- Import data to a script. + + If no database is specified, a script containing the SQL statement to recreate the database is created and written to a file or standard output. This script output is equivalent to the plain text output of **gs_dump**. + +You can specify and sort the data to import. + +## Procedure + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> **gs_restore** incrementally imports data by default. To prevent data exception caused by consecutive imports, use the **-e** and **-c** parameters for each import. **-c** indicates that existing data is deleted from the target database before each import. **-e** indicates that the system ignores the import task with an error (error message is displayed after the import process is complete) and proceeds with the next by default. Therefore, you need to exit the system if an error occurs when you send the SQL statement to the database. + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Use **gs_restore** to import all object definitions from the exported file of the entire **mogdb** database to the **backupdb** database. + + ```bash + $ gs_restore -U jack /home/omm/backup/MPPDB_backup.tar -p 8000 -d backupdb -s -e -c + Password: + ``` + + **Table 1** Common parameters + + | Parameters | Description | Example Value | + | :--------- | :----------------------------------------------------------- | :------------ | + | -U | Username for database connection. | -U jack | + | -W | User password for database connection.
- This parameter is not required for database administrators if the trust policy is used for authentication.
- If you connect to the database without specifying this parameter and you are not a database administrator, you will be prompted to enter the password. | -W abcd@123 | + | -d | Database to which data will be imported. | -d backupdb | + | -p | TCP port or local Unix-domain socket file extension on which the server is listening for connections. | -p 8000 | + | -e | Exits if an error occurs when you send the SQL statement to the database. Error messages are displayed after the import process is complete. | - | + | -c | Cleans existing objects from the target database before the import. | - | + | -s | Imports only object definitions in schemas and does not import data. Sequence values will also not be imported. | - | + + For details about other parameters, see "Tool Reference > Server Tools > [gs_restore](9-gs_restore)" in the **Reference Guide**. + +## Examples + +Example 1: Run **gs_restore** to import data and all object definitions of the **mogdb** database from the **MPPDB_backup.dmp** file (custom format). + +```bash +$ gs_restore backup/MPPDB_backup.dmp -p 8000 -d backupdb +Password: +gs_restore[2017-07-21 19:16:26]: restore operation successful +gs_restore: total time: 13053 ms +``` + +Example 2: Run **gs_restore** to import data and all object definitions of the **mogdb** database from the **MPPDB_backup.tar** file. + +```bash +$ gs_restore backup/MPPDB_backup.tar -p 8000 -d backupdb +gs_restore[2017-07-21 19:21:32]: restore operation successful +gs_restore[2017-07-21 19:21:32]: total time: 21203 ms +``` + +Example 3: Run **gs_restore** to import data and all object definitions of the **mogdb** database from the **MPPDB_backup** directory. + +```bash +$ gs_restore backup/MPPDB_backup -p 8000 -d backupdb +gs_restore[2017-07-21 19:26:46]: restore operation successful +gs_restore[2017-07-21 19:26:46]: total time: 21003 ms +``` + +Example 4: Run **gs_restore** to import all object definitions of the database from the **MPPDB_backup.tar** file to the **backupdb** database. Table data is not imported. + +```bash +$ gs_restore /home/omm/backup/MPPDB_backup.tar -p 8000 -d backupdb -s -e -c +Password: +gs_restore[2017-07-21 19:46:27]: restore operation successful +gs_restore[2017-07-21 19:46:27]: total time: 32993 ms +``` + +Example 5: Run **gs_restore** to import data and all definitions in the **PUBLIC** schema from the **MPPDB_backup.dmp** file. Existing objects are deleted from the target database before the import. If an existing object references to an object in another schema, manually delete the referenced object first. + +```bash +$ gs_restore backup/MPPDB_backup.dmp -p 8000 -d backupdb -e -c -n PUBLIC +gs_restore: [archiver (db)] Error while PROCESSING TOC: +gs_restore: [archiver (db)] Error from TOC entry 313; 1259 337399 TABLE table1 gaussdba +gs_restore: [archiver (db)] could not execute query: ERROR: cannot drop table table1 because other objects depend on it +DETAIL: view t1.v1 depends on table table1 +HINT: Use DROP ... CASCADE to drop the dependent objects too. +Command was: DROP TABLE public.table1; +``` + +Manually delete the referenced object and create it again after the import is complete. + +```bash +$ gs_restore backup/MPPDB_backup.dmp -p 8000 -d backupdb -e -c -n PUBLIC +gs_restore[2017-07-21 19:52:26]: restore operation successful +gs_restore[2017-07-21 19:52:26]: total time: 2203 ms +``` + +Example 6: Run **gs_restore** to import the definition of the **hr.staffs** table in the **hr** schema from the **MPPDB_backup.dmp** file. 
Before the import, the **hr.staffs** table does not exist. + +```bash +$ gs_restore backup/MPPDB_backup.dmp -p 8000 -d backupdb -e -c -s -n hr -t hr.staffs +gs_restore[2017-07-21 19:56:29]: restore operation successful +gs_restore[2017-07-21 19:56:29]: total time: 21000 ms +``` + +Example 7: Run **gs_restore** to import data of the **hr.staffs** table in **hr** schema from the **MPPDB_backup.dmp** file. Before the import, the **hr.staffs** table is empty. + +```bash +$ gs_restore backup/MPPDB_backup.dmp -p 8000 -d backupdb -e -a -n hr -t hr.staffs +gs_restore[2017-07-21 20:12:32]: restore operation successful +gs_restore[2017-07-21 20:12:32]: total time: 20203 ms +``` + +Example 8: Run **gs_restore** to import the definition of the **hr.staffs** table. Before the import, the **hr.staffs** table already exists. + +```sql +human_resource=# select * from hr.staffs; + staff_id | first_name | last_name | email | phone_number | hire_date | employment_id | salary | commission_pct | manager_id | section_id +----------+-------------+-------------+----------+--------------------+---------------------+---------------+----------+----------------+------------+------------ + 200 | Jennifer | Whalen | JWHALEN | 515.123.4444 | 1987-09-17 00:00:00 | AD_ASST | 4400.00 | | 101 | 10 + 201 | Michael | Hartstein | MHARTSTE | 515.123.5555 | 1996-02-17 00:00:00 | MK_MAN | 13000.00 | | 100 | 20 + +$ gsql -d human_resource -p 8000 + +gsql ((MogDB x.x.x build 56189e20) compiled at 2022-01-07 18:47:53 commit 0 last mr ) +Non-SSL connection (SSL connection is recommended when requiring high-security) +Type "help" for help. + +human_resource=# drop table hr.staffs CASCADE; +NOTICE: drop cascades to view hr.staff_details_view +DROP TABLE + +$ gs_restore /home/omm/backup/MPPDB_backup.tar -p 8000 -d human_resource -n hr -t staffs -s -e +restore operation successful +total time: 904 ms + +human_resource=# select * from hr.staffs; + staff_id | first_name | last_name | email | phone_number | hire_date | employment_id | salary | commission_pct | manager_id | section_id +----------+------------+-----------+-------+--------------+-----------+---------------+--------+----------------+------------+------------ +(0 rows) +``` + +Example 9: Run **gs_restore** to import data and definitions of the **staffs** and **areas** tables. Before the import, the **staffs** and **areas** tables do not exist. 
+ +```sql +human_resource=# \d + List of relations + Schema | Name | Type | Owner | Storage +--------+--------------------+-------+----------+---------------------------------- + hr | employment_history | table | omm | {orientation=row,compression=no} + hr | employments | table | omm | {orientation=row,compression=no} + hr | places | table | omm | {orientation=row,compression=no} + hr | sections | table | omm | {orientation=row,compression=no} + hr | states | table | omm | {orientation=row,compression=no} +(5 rows) + +$ gs_restore /home/mogdb/backup/MPPDB_backup.tar -p 8000 -d human_resource -n hr -t staffs -n hr -t areas +restore operation successful +total time: 724 ms + +human_resource=# \d + List of relations + Schema | Name | Type | Owner | Storage +--------+--------------------+-------+----------+---------------------------------- + hr | areas | table | omm | {orientation=row,compression=no} + hr | employment_history | table | omm | {orientation=row,compression=no} + hr | employments | table | omm | {orientation=row,compression=no} + hr | places | table | omm | {orientation=row,compression=no} + hr | sections | table | omm | {orientation=row,compression=no} + hr | staffs | table | omm | {orientation=row,compression=no} + hr | states | table | omm | {orientation=row,compression=no} +(7 rows) + +human_resource=# select * from hr.areas; + area_id | area_name +---------+------------------------ + 4 | Middle East and Africa + 1 | Europe + 2 | Americas + 3 | Asia +(4 rows) +``` + +Example 10: Run **gs_restore** to import data and all object definitions in the **hr** schema. + +```bash +$ gs_restore /home/omm/backup/MPPDB_backup1.dmp 8000 -d backupdb -n hr -e -c +restore operation successful +total time: 702 ms +``` + +Example 11: Run **gs_restore** to import all object definitions in the **hr** and **hr1** schemas to the **backupdb** database. + +```bash +$ gs_restore /home/omm/backup/MPPDB_backup2.dmp -p 8000 -d backupdb -n hr -n hr1 -s +restore operation successful +total time: 665 ms +``` + +Example 12: Run **gs_restore** to decrypt the files exported from the **human_resource** database and import them to the **backupdb** database. + +```sql +mogdb=# create database backupdb; +CREATE DATABASE + +$ gs_restore /home/omm/backup/MPPDB_backup.tar -p 8000 -d backupdb --with-key=1234567812345678 +restore operation successful +total time: 23472 ms + +$ gsql -d backupdb -p 8000 -r + +gsql ((MogDB x.x.x build 56189e20) compiled at 2022-01-07 18:47:53 commit 0 last mr ) +Non-SSL connection (SSL connection is recommended when requiring high-security) +Type "help" for help. + +backupdb=# select * from hr.areas; + area_id | area_name +---------+------------------------ + 4 | Middle East and Africa + 1 | Europe + 2 | Americas + 3 | Asia +(4 rows) +``` + +Example 13: **user 1** does not have the permission to import data from an exported file to the **backupdb** database and **role1** has this permission. To import the exported data to the **backupdb** database, you can set **-role** to **role1** in the **gs_restore** command. 
+ +```sql +human_resource=# CREATE USER user1 IDENTIFIED BY "1234@abc"; +CREATE ROLE role1 with SYSADMIN IDENTIFIED BY "abc@1234"; + +$ gs_restore -U user1 /home/omm/backup/MPPDB_backup.tar -p 8000 -d backupdb --role role1 --rolepassword abc@1234 +Password: +restore operation successful +total time: 554 ms + +$ gsql -d backupdb -p 8000 -r + +gsql ((MogDB x.x.x build 56189e20) compiled at 2022-01-07 18:47:53 commit 0 last mr ) +Non-SSL connection (SSL connection is recommended when requiring high-security) +Type "help" for help. + +backupdb=# select * from hr.areas; + area_id | area_name +---------+------------------------ + 4 | Middle East and Africa + 1 | Europe + 2 | Americas + 3 | Asia +(4 rows) +``` diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/6-updating-data-in-a-table.md b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/6-updating-data-in-a-table.md new file mode 100644 index 0000000000000000000000000000000000000000..d1a64607026b303845f0533fe68e5b040ec2a16a --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/6-updating-data-in-a-table.md @@ -0,0 +1,166 @@ +--- +title: Updating Data in a Table +summary: Updating Data in a Table +author: Guo Huan +date: 2021-03-04 +--- + +# Updating Data in a Table + +## Updating a Table by Using DML Statements + +In MogDB, you can update a table by running DML statements. + +### Procedure + +There is a table named **customer_t** and the table structure is as follows: + +```sql +CREATE TABLE customer_t +( c_customer_sk integer, + c_customer_id char(5), + c_first_name char(6), + c_last_name char(8) +) ; +``` + +You can run the following DML statements to update data in the table. + +- Run the **INSERT** statement to insert data into the table. + + - Insert a row to the **customer_t** table. + + ```sql + INSERT INTO customer_t (c_customer_sk, c_customer_id, c_first_name,c_last_name) VALUES (3769, 5, 'Grace','White'); + ``` + + - Insert multiple rows to the **customer_t** table. + + ```sql + INSERT INTO customer_t (c_customer_sk, c_customer_id, c_first_name,c_last_name) VALUES + (6885, 1, 'Joes', 'Hunter'), + (4321, 2, 'Lily','Carter'), + (9527, 3, 'James', 'Cook'), + (9500, 4, 'Lucy', 'Baker'); + ``` + + For details on how to use **INSERT**, see Inserting Data to Tables. + +- Run the **UPDATE** statement to update data in the table. Change the value of the **c_customer_id** column to **0**. + + ```sql + UPDATE customer_t SET c_customer_id = 0; + ``` + + For details on how to use **UPDATE**, see UPDATE. + +- Run the **DELETE** statement to delete rows from the table. + + You can use the **WHERE** clause to specify the rows whose data is to delete. If you do not specify it, all rows in the table are deleted and only the data structure is retained. + + ```sql + DELETE FROM customer_t WHERE c_last_name = 'Baker'; + ``` + + For details on how to use **DELETE**, see DELETE. + +- Run the **TRUNCATE** statement to delete all rows from the table. + + ```sql + TRUNCATE TABLE customer_t; + ``` + + For details on how to use **TRUNCATE**, see TRUNCATE. + + The **DELETE** statement deletes a row of data each time whereas the **TRUNCATE** statement deletes data by releasing the data page stored in the table. Therefore, data can be deleted more quickly by using **TRUNCATE** than using **DELETE**. + + **DELETE** deletes table data but does not release table storage space. 
**TRUNCATE** deletes table data and releases table storage space. + +## Updating and Inserting Data by Using the MERGE INTO Statement + +To add all or a large amount of data in a table to an existing table, you can run the **MERGE INTO** statement in MogDB to merge the two tables so that data can be quickly added to the existing table. + +The **MERGE INTO** statement matches data in a source table with that in a target table based on a join condition. If data matches, **UPDATE** will be executed on the target table. Otherwise, **INSERT** will be executed. This statement is a convenient way to combine multiple operations and avoids multiple **INSERT** or **UPDATE** statements. + +### Prerequisites + +You have the **INSERT** and **UPDATE** permissions for the target table and the **SELECT** permission for the source table. + +### Procedure + +1. Create a source table named **products** and insert data. + + ```sql + mogdb=# CREATE TABLE products + ( product_id INTEGER, + product_name VARCHAR2(60), + category VARCHAR2(60) + ); + + mogdb=# INSERT INTO products VALUES + (1502, 'olympus camera', 'electrncs'), + (1601, 'lamaze', 'toys'), + (1666, 'harry potter', 'toys'), + (1700, 'wait interface', 'books'); + ``` + +2. Create a target table named **newproducts** and insert data. + + ```sql + mogdb=# CREATE TABLE newproducts + ( product_id INTEGER, + product_name VARCHAR2(60), + category VARCHAR2(60) + ); + + mogdb=# INSERT INTO newproducts VALUES + (1501, 'vivitar 35mm', 'electrncs'), + (1502, 'olympus ', 'electrncs'), + (1600, 'play gym', 'toys'), + (1601, 'lamaze', 'toys'), + (1666, 'harry potter', 'dvd'); + ``` + +3. Run the **MERGE INTO** statement to merge data in the source table **products** into the target table **newproducts**. + + ```sql + MERGE INTO newproducts np + USING products p + ON (np.product_id = p.product_id ) + WHEN MATCHED THEN + UPDATE SET np.product_name = p.product_name, np.category = p.category + WHEN NOT MATCHED THEN + INSERT VALUES (p.product_id, p.product_name, p.category) ; + ``` + + For details on parameters in the statement, see [Table 1](#Parameters in the MERGE INTO statement). For more information, see MERGE INTO. + + **Table 1** Parameters in the MERGE INTO statement + + | Parameter | Description | Example Value | + | :-------------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | + | **INTO** clause | Specifies a target table that is to be updated or has data to be inserted.
A table alias is supported. | Value: newproducts np
The table name is **newproducts** and the alias is **np**. | + | **USING** clause | Specifies a source table. A table alias is supported.
If the target table is a replication table, the source table must also be a replication table. | Value: products p
The table name is **products** and the alias is **p**. | + | **ON** clause | Specifies a join condition between a target table and a source table.
Columns in the join condition cannot be updated. | Value: np.product_id = p.product_id
The join condition is that the **product_id** column in the target table **newproducts** matches the values of the **product_id** column in the source table **products**. |
| **WHEN MATCHED** clause | Performs **UPDATE** if data in the source table matches that in the target table based on the condition.
- Only one **WHEN MATCHED** clause can be specified.
- The **WHEN MATCHED** clause can be omitted. If it is omitted, no operation will be performed on the rows that meet the condition in the **ON** clause.
- Columns involved in the distribution key of the target table cannot be updated. | Value: WHEN MATCHED THEN UPDATE SET np.product_name = p.product_name, np.category = p.category
When the condition in the **ON** clause is met, the values of the **product_name** and **category** columns in the target table **newproducts** are replaced with the values in the corresponding columns in the source table **products**. | + | **WHEN NOT MATCHED** clause | Performs **INSERT** if data in the source table does not match that in the target table based on the condition.
- Only one **WHEN NOT MATCHED** clause can be specified.
- The **WHEN NOT MATCHED** clause can be omitted.
- An **INSERT** clause can contain only one **VALUES** clause.
- The **WHEN MATCHED** and **WHEN NOT MATCHED** clauses can be specified in either order. Either one can be omitted, but they cannot both be omitted. | Value: WHEN NOT MATCHED THEN INSERT VALUES (p.product_id, p.product_name, p.category)
Insert rows in the source table **products** that do not meet the condition in the **ON** clause into the target table **newproducts**. | + +4. Query the target table **newproducts** after the merge. + + ```sql + SELECT * FROM newproducts; + ``` + + The command output is as follows: + + ```sql + product_id | product_name | category + ------------+----------------+----------- + 1501 | vivitar 35mm | electrncs + 1502 | olympus camera | electrncs + 1666 | harry potter | toys + 1600 | play gym | toys + 1601 | lamaze | toys + 1700 | wait interface | books + (6 rows) + ``` diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/7-deep-copy.md b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/7-deep-copy.md new file mode 100644 index 0000000000000000000000000000000000000000..15c0a82b53c13c648683e7e9b918441c7aa8f608 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/7-deep-copy.md @@ -0,0 +1,116 @@ +--- +title: Deep Copy +summary: Deep Copy +author: Guo Huan +date: 2021-03-04 +--- + +# Deep Copy + +After data is imported, you can perform a deep copy to modify a partition key, change a row-store table to a column-store table, or add a partial cluster key. A deep copy re-creates a table and batch inserts data into the table. + +MogDB provides three deep copy methods. + +## Performing a Deep Copy by Using the CREATE TABLE Statement + +Run the **CREATE TABLE** statement to create a copy of the original table, batch insert data of the original table into the copy, and rename the copy to the name of the original table. + +When creating the copy, you can specify table and column attributes, such as the primary key. + +**Procedure** + +Perform the following operations to carry out a deep copy for the **customer_t** table: + +1. Run the **CREATE TABLE** statement to create the copy **customer_t_copy** of the **customer_t** table. + + ```sql + CREATE TABLE customer_t_copy + ( c_customer_sk integer, + c_customer_id char(5), + c_first_name char(6), + c_last_name char(8) + ) ; + ``` + +2. Run the **INSERT INTO…SELECT** statement to batch insert data of the original table into the copy. + + ```sql + INSERT INTO customer_t_copy (SELECT * FROM customer_t); + ``` + +3. Delete the original table. + + ```sql + DROP TABLE customer_t; + ``` + +4. Run the **ALTER TABLE** statement to rename the copy to the name of the original table. + + ```sql + ALTER TABLE customer_t_copy RENAME TO customer_t; + ``` + +## Performing a Deep Copy by Using the CREATE TABLE LIKE Statement + +Run the **CREATE TABLE LIKE** statement to create a copy of the original table, batch insert data of the original table into the copy, and rename the copy to the name of the original table. This method does not inherit the primary key attributes of the original table. You can use the **ALTER TABLE** statement to add them. + +**Procedure** + +1. Run the **CREATE TABLE LIKE** statement to create the copy **customer_t_copy** of the **customer_t** table. + + ```sql + CREATE TABLE customer_t_copy (LIKE customer_t); + ``` + +2. Run the **INSERT INTO…SELECT** statement to batch insert data of the original table into the copy. + + ```sql + INSERT INTO customer_t_copy (SELECT * FROM customer_t); + ``` + +3. Delete the original table. + + ```sql + DROP TABLE customer_t; + ``` + +4. Run the **ALTER TABLE** statement to rename the copy to the name of the original table. 
+ + ```sql + ALTER TABLE customer_t_copy RENAME TO customer_t; + ``` + +## Performing a Deep Copy by Creating a Temporary Table and Truncating the Original Table + +Run the **CREATE TABLE ….** **AS** statement to create a temporary table for the original table, truncate the original table, and batch insert data of the temporary data into the original table. + +When creating the temporary table, retain the primary key attributes of the original table. This method is recommended if the original table has dependency items. + +**Procedure** + +1. Run the **CREATE TABLE AS** statement to create a temporary table **customer_t_temp** for the **customer_t** table. + + ```sql + CREATE TEMP TABLE customer_t_temp AS SELECT * FROM customer_t; + ``` + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > Compared with the use of permanent tables, the use of temporary tables can improve performance but may incur data loss. A temporary table is automatically deleted at the end of the session where it is located. If data loss is unacceptable, use a permanent table. + +2. Truncate the original table **customer_t**. + + ```sql + TRUNCATE customer_t; + ``` + +3. Run the **INSERT INTO…SELECT** statement to batch insert data of the temporary table into the original table. + + ```sql + INSERT INTO customer_t (SELECT * FROM customer_t_temp); + ``` + +4. Delete the temporary table **customer_t_temp**. + + ```sql + DROP TABLE customer_t_temp; + ``` diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/8-ANALYZE-table.md b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/8-ANALYZE-table.md new file mode 100644 index 0000000000000000000000000000000000000000..ba0ef2e9b65b6f8fef54cb416354c81fc746ed3d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/8-ANALYZE-table.md @@ -0,0 +1,46 @@ +--- +title: ANALYZE Table +summary: ANALYZE Table +author: Guo Huan +date: 2021-03-04 +--- + +# ANALYZE Table + +The execution plan generator needs to use table statistics to generate the most effective query execution plan to improve query performance. After data is imported, you are advised to run the **ANALYZE** statement to update table statistics. The statistics are stored in the system catalog **PG_STATISTIC**. + +## ANALYZE Table + +**ANALYZE** supports row-store and column-store tables. **ANALYZE** can also collect statistics about specified columns of a local table. For details on **ANALYZE**, see [ANALYZE | ANALYSE](32-ANALYZE-ANALYSE). + +Update table statistics. + +Do **ANALYZE** to the **product_info** table. + +```sql +ANALYZE product_info; +``` + +```sql +ANALYZE +``` + +## autoanalyze + +MogDB provides the GUC parameter autovacuum to specify whether to enable the autovacuum function of the database. + +If **autovacuum** is set to **on**, the system will start the autovacuum thread to automatically analyze tables when the data volume in the table reaches the threshold. This is the autoanalyze function. + +- For an empty table, when the number of rows inserted to it is greater than 50, **ANALYZE** is automatically triggered. +- For a table containing data, the threshold is 50 + 10% x **reltuples**, where **reltuples** indicates the total number of rows in the table. 
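To make these thresholds concrete: for a **product_info** table that already holds about 100,000 rows, autoanalyze is triggered after roughly 50 + 10% x 100,000 = 10,050 rows have been inserted, updated, or deleted. The following illustrative query only reads the planner's row-count estimate from the **pg_class** catalog to show the arithmetic; the threshold itself is evaluated internally by the autovacuum thread, not by this SQL.

```sql
-- Illustrative only: approximate the autoanalyze trigger point for product_info.
-- reltuples is the row-count estimate maintained by ANALYZE/VACUUM.
SELECT relname,
       reltuples,
       50 + 0.1 * reltuples AS approx_autoanalyze_threshold
  FROM pg_class
 WHERE relname = 'product_info';
```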
+ +The autovacuum function also depends on the following two GUC parameters in addition to **autovacuum**: + +- track_counts: This parameter must be set to **on** to enable statistics collection about the database. +- autovacuum_max_workers: This parameter must be set to a value greater than **0** to specify the maximum number of concurrent autovacuum threads. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:** +> +> - The autoanalyze function supports the default sampling mode but not percentage sampling. +> - The autoanalyze function does not collect multi-column statistics, which only supports percentage sampling. +> - The autoanalyze function supports row-store and column-store tables and does not support foreign tables, temporary tables, unlogged tables, and TOAST tables. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/9-doing-VACUUM-to-a-table.md b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/9-doing-VACUUM-to-a-table.md new file mode 100644 index 0000000000000000000000000000000000000000..06de9a0db25db95529fbef3049d539a5294a90e2 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/importing-and-exporting-data/importing-data/9-doing-VACUUM-to-a-table.md @@ -0,0 +1,22 @@ +--- +title: Doing VACUUM to a Table +summary: Doing VACUUM to a Table +author: Guo Huan +date: 2021-03-04 +--- + +# Doing VACUUM to a Table + +If a large number of rows were updated or deleted during import, run **VACUUM FULL** before **ANALYZE**. A large number of UPDATE and DELETE operations generate huge disk page fragments, which reduces query efficiency. **VACUUM FULL** can restore disk page fragments and return them to the OS. + +Run the **VACUUM FULL** statement. + +Do **VACUUM FULL** to the **product_info** table. + +```sql +VACUUM FULL product_info +``` + +```sql +VACUUM +``` diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/1-introducing-mot/1-mot-introduction.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/1-introducing-mot/1-mot-introduction.md new file mode 100644 index 0000000000000000000000000000000000000000..4d56487bfd6dcf6b5058fceb5f4aa6bf19cb9e8d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/1-introducing-mot/1-mot-introduction.md @@ -0,0 +1,33 @@ +--- +title: MOT Introduction +summary: MOT Introduction +author: Zhang Cuiping +date: 2021-03-04 +--- + +# MOT Introduction + +MogDB introduces Memory-Optimized Tables (MOT) storage engine - a transactional row-based store (rowstore), that is optimized for many-core and large memory servers. MOT is a state-of-the-art production-grade feature (Beta release) of the MogDB database that provides greater performance for transactional workloads. MOT is fully ACID compliant and includes strict durability and high availability support. Businesses can leverage MOT for mission-critical, performance-sensitive Online Transaction Processing (OLTP) applications in order to achieve high performance, high throughput, low and predictable latency and high utilization of many-core servers. MOT is especially suited to leverage and scale-up when run on modern servers with multiple sockets and many-core processors, such as Huawei Taishan servers with ARM/Kunpeng processors and x86-based Dell or similar servers. 
+ +**Figure 1** Memory-Optimized Storage Engine Within MogDB + +![memory-optimized-storage-engine-within-opengauss](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-introduction-2.png) + +[Figure 1](#memoryoptimized) presents the Memory-Optimized Storage Engine component (in green) of MogDB database and is responsible for managing MOT and transactions. + +MOT tables are created side-by-side regular disk-based tables. MOT's effective design enables almost full SQL coverage and support for a full database feature-set, such as stored procedures and user-defined functions (excluding the features listed in **MOT SQL Coverage and Limitations** section). + +With data and indexes stored totally in-memory, a Non-Uniform Memory Access (NUMA)-aware design, algorithms that eliminate lock and latch contention and query native compilation, MOT provides faster data access and more efficient transaction execution. + +MOT's effective almost lock-free design and highly tuned implementation enable exceptional near-linear throughput scale-up on many-core servers - probably the best in the industry. + +Memory-Optimized Tables are fully ACID compliant, as follows: + +- **Atomicity -** An atomic transaction is an indivisible series of database operations that either all occur or none occur after a transaction has been completed (committed or aborted, respectively). +- **Consistency -** Every transaction leaves the database in a consistent (data integrity) state. +- **Isolation -** Transactions cannot interfere with each other. MOT supports repeatable-reads and read-committed isolation levels. In the next release, MOT will also support serializable isolation. See the **MOT Isolation Levels** section for more information. +- **Durability -** The effects of successfully completed (committed) transactions must persist despite crashes and failures. MOT is fully integrated with the WAL-based logging of MogDB. Both synchronous and asynchronous logging options are supported. MOT also uniquely supports synchronous + group commit with NUMA-awareness optimization. See the **MOT Durability Concepts** section for more information. + +The MOT Engine was published in the VLDB 2020 (an International Conference on ‘Very Large Data Bases" or VLDB): + +**Industrial-Strength OLTP Using Main Memory and Many Cores**, VLDB 2020 vol. 13 - [Paper](http://www.vldb.org/pvldb/vol13/p3099-avni.pdf), [Video on youtube](https://www.youtube.com/watch?v=xcAbww6x8wo), [Video on bilibili](https://www.bilibili.com/video/BV1MA411n7ef?p=97). 
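As noted above, MOT tables are created side by side with regular disk-based tables. The following minimal sketch shows an MOT table, created through the foreign-table DDL used by the MOT engine, coexisting with an ordinary disk-based table in the same database. The table and column names are illustrative assumptions; the authoritative syntax and its limitations are described in the **MOT Usage** section.

```sql
-- Illustrative sketch only; see the MOT Usage section for the exact DDL and limitations.
-- A memory-optimized (MOT) table:
CREATE FOREIGN TABLE mot_orders
(
    order_id    INT PRIMARY KEY,
    customer_id INT,
    amount      NUMERIC(10,2)
);

-- A regular disk-based table in the same database:
CREATE TABLE disk_orders
(
    order_id    INT PRIMARY KEY,
    customer_id INT,
    amount      NUMERIC(10,2)
);
```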
diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/1-introducing-mot/2-mot-features-and-benefits.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/1-introducing-mot/2-mot-features-and-benefits.md new file mode 100644 index 0000000000000000000000000000000000000000..1713c463ebe20e1f1b972747b3cee069dc6de4ec --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/1-introducing-mot/2-mot-features-and-benefits.md @@ -0,0 +1,22 @@ +--- +title: MOT Features and Benefits +summary: MOT Features and Benefits +author: Zhang Cuiping +date: 2021-03-04 +--- + +# MOT Features and Benefits + +MOT provide users with significant benefits in performance (query and transaction latency), scalability (throughput and concurrency) and in some cases cost (high resource utilization) - + +- **Low Latency -** Provides fast query and transaction response time +- **High Throughput -** Supports spikes and constantly high user concurrency +- **High Resource Utilization -** Utilizes hardware to its full extent + +Using MOT, applications are able to achieve more 2.5 to 4 times (2.5x - 4x) higher throughput. For example, in our TPC-C benchmarks (interactive transactions and synchronous logging) performed both on Huawei Taishan Kunpeng-based (ARM) servers and on Dell x86 Intel Xeon-based servers, MOT provides throughput gains that vary from 2.5x on a 2-socket server to 3.7x on a 4-socket server, reaching 4.8M (million) tpmC on an ARM 4-socket 256-cores server. + +The lower latency provided by MOT reduces transaction speed by 3x to 5.5x, as observed in TPC-C benchmarks. + +Additionally, MOT enables extremely high utilization of server resources when running under high load and contention, which is a well-known problem for all leading industry databases. Using MOT, utilization reaches 99% on 4-socket server, compared with much lower utilization observed when testing other industry leading databases. + +This abilities are especially evident and important on modern many-core servers. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/1-introducing-mot/3-mot-key-technologies.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/1-introducing-mot/3-mot-key-technologies.md new file mode 100644 index 0000000000000000000000000000000000000000..3bf4108c17a46edc1a760b6734314a8939a4dea6 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/1-introducing-mot/3-mot-key-technologies.md @@ -0,0 +1,24 @@ +--- +title: MOT Key Technologies +summary: MOT Key Technologies +author: Zhang Cuiping +date: 2021-03-04 +--- + +# MOT Key Technologies + +The following key MOT technologies enable its benefits: + +- **Memory Optimized Data Structures -** With the objective of achieving optimal high concurrent throughput and predictable low latency, all data and indexes are in memory, no intermediate page buffers are used and minimal, short-duration locks are used. Data structures and all algorithms have been specialized and optimized for in-memory design. +- **Lock-free Transaction Management -** The MOT storage engine applies an optimistic approach to achieving data integrity versus concurrency and high throughput. During a transaction, an MOT table does not place locks on any version of the data rows being updated, thus significantly reducing contention in some high-volume systems. 
Optimistic Concurrency Control (OCC) statements within a transaction are implemented without locks, and all data modifications are performed in a part of the memory that is dedicated to private transactions (also called *Private Transaction Memory*). This means that during a transaction, the relevant data is updated in the Private Transaction Memory, thus enabling lock-less reads and writes; and a very short duration lock is only placed at the Commit phase. For more details, see the **MOT Concurrency Control Mechanism** section. +- **Lock-free Index -** Because database data and indexes stored totally in-memory, having an efficient index data structure and algorithm is essential. The MOT Index is based on state-of-the-art Masstree a fast and scalable Key Value (KV) store for multi-core systems, implemented as a Trie of B+ trees. In this way, excellent performance is achieved on many-core servers and during high concurrent workloads. This index applies various advanced techniques in order to optimize performance, such as an optimistic lock approach, cache-line awareness and memory prefetching. +- **NUMA-aware Memory Management -** MOT memory access is designed with Non-Uniform Memory Access (NUMA) awareness. NUMA-aware algorithms enhance the performance of a data layout in memory so that threads access the memory that is physically attached to the core on which the thread is running. This is handled by the memory controller without requiring an extra hop by using an interconnect, such as Intel QPI. MOT's smart memory control module with pre-allocated memory pools for various memory objects improves performance, reduces locks and ensures stability. Allocation of a transaction's memory objects is always NUMA-local. Deallocated objects are returned to the pool. Minimal usage of OS malloc during transactions circumvents unnecessary locks. +- **Efficient Durability - Logging and Checkpoint -** Achieving disk persistence (also known as *durability*) is a crucial requirement for being ACID compliant (the **D** stands for Durability). All current disks (including the SSD and NVMe) are significantly slower than memory and thus are always the bottleneck of a memory-based database. As an in-memory storage engine with full durability support, MOT's durability design must implement a wide variety of algorithmic optimizations in order to ensure durability, while still achieving the speed and throughput objectives for which it was designed. These optimizations include - + - Parallel logging, which is also available in all MogDB disk tables + - Log buffering per transaction and lock-less transaction preparation + - Updating delta records, meaning only logging changes + - In addition to synchronous and asynchronous, innovative NUMA-aware group commit logging + - State-of-the-art database checkpoints (CALC) enable the lowest memory and computational overhead. +- **High SQL Coverage and Feature Set -** By extending and relying on the PostgreSQL Foreign Data Wrappers (FDW) + Index support, the entire range of SQL is covered, including stored procedures, user-defined functions and system function calls. You may refer to the **MOT SQL Coverage and Limitations** section for a list of the features that are not supported. 
+- **Queries Native Compilation using PREPARE Statements -** Queries and transaction statements can be executed in an interactive manner by using PREPARE client commands that have been precompiled into a native execution format (which are also known as *Code-Gen* or *Just-in-Time [JIT]* compilation). This achieves an average of 30% higher performance. Compilation and Lite Execution are applied when possible, and if not, applicable queries are processed using the standard execution path. A Cache Plan module (that has been optimized for OLTP) re-uses compilation results throughout an entire session (even using different bind settings), as well as across different sessions. +- **Seamless Integration of MOT and MogDB Database -** The MOT operates side by side the disk-based storage engine within an integrated envelope. MOT's main memory engine and disk-based storage engines co-exist side by side in order to support multiple application scenarios, while internally reusing database auxiliary services, such as a Write-Ahead Logging (WAL) Redo Log, Replication, Checkpointing, Recovery, High Availability and so on. Users benefit from the unified deployment, configuration and access of both disk-based tables and MOT tables. This provides a flexible and cost-efficient choice of which storage engine to use according to specific requirements. For example, to place highly performance-sensitive data that causes bottlenecks into memory. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/1-introducing-mot/4-mot-usage-scenarios.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/1-introducing-mot/4-mot-usage-scenarios.md new file mode 100644 index 0000000000000000000000000000000000000000..ef56b9ec8f19f87d2a511506afaed5bbae603f48 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/1-introducing-mot/4-mot-usage-scenarios.md @@ -0,0 +1,22 @@ +--- +title: MOT Usage Scenarios +summary: MOT Usage Scenarios +author: Zhang Cuiping +date: 2021-03-04 +--- + +# MOT Usage Scenarios + +MOT can significantly speed up an application's overall performance, depending on the characteristics of the workload. MOT improves the performance of transaction processing by making data access and transaction execution more efficient and minimizing redirections by removing lock and latch contention between concurrently executing transactions. + +MOT's extreme speed stems from the fact that it is optimized around concurrent in-memory usage management (not just because it is in memory). Data storage, access and processing algorithms were designed from the ground up to take advantage of the latest state of the art enhancements in in-memory and high-concurrency computing. + +MogDB enables an application to use any combination of MOT tables and standard disk-based tables. MOT is especially beneficial for enabling your most active, high-contention and performance-sensitive application tables that have proven to be bottlenecks and for tables that require a predictable low-latency access and high throughput. + +MOT tables can be used for a variety of application use cases, which include: + +- **High-throughput Transactions Processing -** This is the primary scenario for using MOT, because it supports large transaction volume that requires consistently low latency for individual transactions. Examples of such applications are real-time decision systems, payment systems, financial instrument trading, sports betting, mobile gaming, ad delivery and so on. 
+- **Acceleration of Performance Bottlenecks -** High contention tables can significantly benefit from using MOT, even when other tables are on disk. The conversion of such tables (in addition to related tables and tables that are referenced together in queries and transactions) result in a significant performance boost as the result of lower latencies, less contention and locks, and increased server throughput ability. +- **Elimination of Mid-Tier Cache -** Cloud and Mobile applications tend to have periodic or spikes of massive workload. Additionally, many of these applications have 80% or above read-workload, with frequent repetitive queries. To sustain the workload spikes, as well to provide optimal user experience by low-latency response time, applications sometimes deploy a mid-tier caching layer. Such additional layers increase development complexity and time, and also increase operational costs. MOT provides a great alternative, simplifying the application architecture with a consistent and high performance data store, while shortening development cycles and reducing CAPEX and OPEX costs. +- **Large-scale Data Streaming and Data Ingestion -** MOT tables enables large-scale streamlined data processing in the Cloud (for Mobile, M2M and IoT), Transactional Processing (TP), Analytical Processing (AP) and Machine Learning (ML). MOT tables are especially good at consistently and quickly ingesting large volumes of data from many different sources at the same time. The data can be later processed, transformed and moved in slower disk-based tables. Alternatively, MOT enables the querying of consistent and up-date data that enable real-time conclusions. In IoT and cloud applications with many real-time data streams, it is common to have special data ingestion and processing triers. For instance, an Apache Kafka cluster can be used to ingest data of 100,000 events/sec with a 10msec latency. A periodic batch processing task enriches and converts the collected data into an alternative format to be placed into a relational database for further analysis. MOT can support such scenarios (while eliminating the separate data ingestion tier) by ingesting data streams directly into MOT relational tables, ready for analysis and decisions. This enables faster data collection and processing, MOT eliminates costly tiers and slow batch processing, increases consistency, increases freshness of analyzed data, as well as lowers Total Cost of Ownership (TCO). +- **Lower TCO -** Higher resource efficiency and mid-tier elimination can save 30% to 90%. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/1-introducing-mot/5-mot-performance-benchmarks.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/1-introducing-mot/5-mot-performance-benchmarks.md new file mode 100644 index 0000000000000000000000000000000000000000..5a113129038e098d2736fb5b33db0614799e3493 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/1-introducing-mot/5-mot-performance-benchmarks.md @@ -0,0 +1,189 @@ +--- +title: MOT Performance Benchmarks +summary: MOT Performance Benchmarks +author: Zhang Cuiping +date: 2021-03-04 +--- + +# MOT Performance Benchmarks + +Our performance tests are based on the TPC-C Benchmark that is commonly used both by industry and academia. + +Ours tests used BenchmarkSQL (see **MOT Sample TPC-C Benchmark**) and generates the workload using interactive SQL commands, as opposed to stored procedures. 
+ +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** Using the stored procedures approach may produce even higher performance results because it involves significantly less networking roundtrips and database envelope SQL processing cycles. + +All tests that evaluated the performance of MogDB MOT vs DISK used synchronous logging and its optimized **group-commit=on** version in MOT. + +Finally, we performed an additional test in order to evaluate MOT's ability to quickly and ingest massive quantities of data and to serve as an alternative to a mid-tier data ingestion solutions. + +All tests were performed in June 2020. + +The following shows various types of MOT performance benchmarks. + +## MOT Hardware + +The tests were performed on servers with the following configuration and with 10Gbe networking - + +- ARM64/Kunpeng 920-based 2-socket servers, model Taishan 2280 v2 (total 128 Cores), 800GB RAM, 1TB NVMe disk. For a detailed server specification, see - OS: openEuler + +- ARM64/Kunpeng 960-based 4-socket servers, model Taishan 2480 v2 (total 256 Cores), 512GB RAM, 1TB NVMe disk. For a detailed server specification, see - OS: openEuler + +- x86-based Dell servers, with 2-sockets of Intel Xeon Gold 6154 CPU @ 3GHz with 18 Cores (72 Cores, with hyper-threading=on), 1TB RAM, 1TB SSD OS: CentOS 7.6 + +- x86-based SuperMicro server, with 8-sockets of Intel(R) Xeon(R) CPU E7-8890 v4 @ 2.20GHz 24 cores (total 384 Cores, with hyper-threading=on), 1TB RAM, 1.2TB SSD (Seagate 1200 SSD 200GB, SAS 12Gb/s). OS: Ubuntu 16.04.2 LTS + +- x86-based Huawei server, with 4-sockets of Intel(R) Xeon(R) CPU E7-8890 v4 2.2Ghz (total 96 Cores, with hyper-threading=on), 512GB RAM, SSD 2TB OS: CentOS 7.6 + +## MOT Results - Summary + +MOT provides higher performance than disk-tables by a factor of 2.5x to 4.1x and reaches 4.8 million tpmC on ARM/Kunpeng-based servers with 256 cores. The results clearly demonstrate MOT's exceptional ability to scale-up and utilize all hardware resources. Performance jumps as the quantity of CPU sockets and server cores increases. + +MOT delivers up to 30,000 tpmC/core on ARM/Kunpeng-based servers and up to 40,000 tpmC/core on x86-based servers. + +Due to a more efficient durability mechanism, in MOT the replication overhead of a Primary/Secondary High Availability scenario is 7% on ARM/Kunpeng and 2% on x86 servers, as opposed to the overhead in disk tables of 20% on ARM/Kunpeng and 15% on x86 servers. + +Finally, MOT delivers 2.5x lower latency, with TPC-C transaction response times of 2 to 7 times faster. + +## MOT High Throughput + +The following shows the results of various MOT table high throughput tests. + +### ARM/Kunpeng 2-Socket 128 Cores + +**Performance** + +The following figure shows the results of testing the TPC-C benchmark on a Huawei ARM/Kunpeng server that has two sockets and 128 cores. + +Four types of tests were performed - + +- Two tests were performed on MOT tables and another two tests were performed on MogDB disk-based tables. +- Two of the tests were performed on a Single node (without high availability), meaning that no replication was performed to a secondary node. The other two tests were performed on Primary/Secondary nodes (with high availability), meaning that data written to the primary node was replicated to a secondary node. + +MOT tables are represented in orange and disk-based tables are represented in blue. 
+ +**Figure 1** ARM/Kunpeng 2-Socket 128 Cores - Performance Benchmarks + +![arm-kunpeng-2-socket-128-cores-performance-benchmarks](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-10.png) + +The results showed that: + +- As expected, the performance of MOT tables is significantly greater than of disk-based tables in all cases. +- For a Single Node - 3.8M tpmC for MOT tables versus 1.5M tpmC for disk-based tables +- For a Primary/Secondary Node - 3.5M tpmC for MOT tables versus 1.2M tpmC for disk-based tables +- For production grade (high-availability) servers (Primary/Secondary Node) that require replication, the benefit of using MOT tables is even more significant than for a Single Node (without high-availability, meaning no replication). +- The MOT replication overhead of a Primary/Secondary High Availability scenario is 7% on ARM/Kunpeng and 2% on x86 servers, as opposed to the overhead of disk tables of 20% on ARM/Kunpeng and 15% on x86 servers. + +**Performance per CPU core** + +The following figure shows the TPC-C benchmark performance/throughput results per core of the tests performed on a Huawei ARM/Kunpeng server that has two sockets and 128 cores. The same four types of tests were performed (as described above). + +**Figure 2** ARM/Kunpeng 2-Socket 128 Cores - Performance per Core Benchmarks + +![arm-kunpeng-2-socket-128-cores-performance-per-core-benchmarks](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-11.png) + +The results showed that as expected, the performance of MOT tables is significantly greater per core than of disk-based tables in all cases. It also shows that for production grade (high-availability) servers (Primary/Secondary Node) that require replication, the benefit of using MOT tables is even more significant than for a Single Node (without high-availability, meaning no replication). + +### ARM/Kunpeng 4-Socket 256 Cores + +The following demonstrates MOT's excellent concurrency control performance by showing the tpmC per quantity of connections. + +**Figure 3** ARM/Kunpeng 4-Socket 256 Cores - Performance Benchmarks + +![arm-kunpeng-4-socket-256-cores-performance-benchmarks](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-12.png) + +The results show that performance increases significantly even when there are many cores and that peak performance of 4.8M tpmC is achieved at 768 connections. + +### x86-based Servers + +- **8-Socket 384 Cores** + +The following demonstrates MOT’s excellent concurrency control performance by comparing the tpmC per quantity of connections between disk-based tables and MOT. This test was performed on an x86 server with eight sockets and 384 cores. The orange represents the results of the MOT table. + +**Figure 4** x86 8-Socket 384 Cores - Performance Benchmarks + +![x86-8-socket-384-cores-performance-benchmarks](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-13.png) + +The results show that MOT tables significantly outperform disk-based tables and have very highly efficient performance per core on a 386 core server, reaching over 3M tpmC / core. + +- **4-Socket 96 Cores** + +3.9 million tpmC was achieved by MOT on this 4-socket 96 cores server. The following figure shows a highly efficient MOT table performance per core reaching 40,000 tpmC / core. 
+ +**Figure 5** 4-Socket 96 Cores - Performance Benchmarks + +![4-socket-96-cores-performance-benchmarks](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-14.png) + +## MOT Low Latency + +The following was measured on ARM/Kunpeng 2-socket server (128 cores). The numbers scale is milliseconds (ms). + +**Figure 1** Low Latency (90th%) - Performance Benchmarks + +![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-15.png) + +MOT's average transaction speed is 2.5x, with MOT latency of 10.5 ms, compared to 23-25ms for disk tables. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The average was calculated by taking into account all TPC-C 5 transaction percentage distributions. For more information, you may refer to the description of TPC-C transactions in the **MOT Sample TPC-C Benchmark** section. + +**Figure 2** Low Latency (90th%, Transaction Average) - Performance Benchmarks + +![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-16.png) + +## MOT RTO and Cold-Start Time + +### High Availability Recovery Time Objective (RTO) + +MOT is fully integrated into MogDB, including support for high-availability scenarios consisting of primary and secondary deployments. The WAL Redo Log's replication mechanism replicates changes into the secondary database node and uses it for replay. + +If a Failover event occurs, whether it is due to an unplanned primary node failure or due to a planned maintenance event, the secondary node quickly becomes active. The amount of time that it takes to recover and replay the WAL Redo Log and to enable connections is also referred to as the Recovery Time Objective (RTO). + +**The RTO of MogDB, including the MOT, is less than 10 seconds.** + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The Recovery Time Objective (RTO) is the duration of time and a service level within which a business process must be restored after a disaster in order to avoid unacceptable consequences associated with a break in continuity. In other words, the RTO is the answer to the question: "How much time did it take to recover after notification of a business process disruption?" + +In addition, as shown in the **MOT High Throughput** section in MOT the replication overhead of a Primary/Secondary High Availability scenario is only 7% on ARM/Kunpeng servers and 2% on x86 servers, as opposed to the replication overhead of disk-tables, which is 20% on ARM/Kunpeng and 15% on x86 servers. + +### Cold-Start Recovery Time + +Cold-start Recovery time is the amount of time it takes for a system to become fully operational after a stopped mode. In memory databases, this includes the loading of all data and indexes into memory, thus it depends on data size, hardware bandwidth, and on software algorithms to process it efficiently. + +Our MOT tests using ARM servers with NVMe disks demonstrate the ability to load **100 GB of database checkpoint in 40 seconds (2.5 GB/sec)**. Because MOT does not persist indexes and therefore they are created at cold-start, the actual size of the loaded data + indexes is approximately 50% more. Therefore, can be converted to **MOT cold-start time of Data + Index capacity of 150GB in 40 seconds,** or **225 GB per minute (3.75 GB/sec)**. 
+ +The following figure demonstrates cold-start process and how long it takes to load data into a MOT table from the disk after a cold start. + +**Figure 1** Cold-Start Time - Performance Benchmarks + +![cold-start-time-performance-benchmarks](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-17.png) + +- **Database Size -** The total amount of time to load the entire database (in GB) is represented by the blue line and the **TIME (sec)** Y axis on the left. +- **Throughput -** The quantity of database GB throughput per second is represented by the orange line and the **Throughput GB/sec** Y axis on the right. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The performance demonstrated during the test is very close to the bandwidth of the SSD hardware. Therefore, it is feasible that higher (or lower) performance may be achieved on a different platform. + +## MOT Resource Utilization + +The following figure shows the resource utilization of the test performed on a x86 server with four sockets, 96 cores and 512GB RAM server. It demonstrates that a MOT table is able to efficiently and consistently consume almost all available CPU resources. For example, it shows that almost 100% CPU percentage utilization is achieved for 192 cores and 3.9M tpmC. + +- **tmpC -** Number of TPC-C transactions completed per minute is represented by the orange bar and the **tpmC** Y axis on the left. +- **CPU % Utilization -** The amount of CPU utilization is represented by the blue line and the **CPU %** Y axis on the right. + +**Figure 1** Resource Utilization - Performance Benchmarks + +![resource-utilization-performance-benchmarks](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-18.png) + +## MOT Data Ingestion Speed + +This test simulates realtime data streams arriving from massive IoT, cloud or mobile devices that need to be quickly and continuously ingested into the database on a massive scale. + +- The test involved ingesting large quantities of data, as follows - + + - 10 million rows were sent by 500 threads, 2000 rounds, 10 records (rows) in each insert command, each record was 200 bytes. + - The client and database were on different machines. Database server - x86 2-socket, 72 cores. + +- Performance Results + + - **Throughput - 10,000** Records/Core or **2** MB/Core. + - **Latency - 2.8ms per a 10 records** bulk insert (includes client-server networking) + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-caution.gif) **CAUTION:** We are projecting that multiple additional, and even significant, performance improvements will be made by MOT for this scenario. Click **MOT Usage Scenarios** for more information about large-scale data streaming and data ingestion. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/1-using-mot-overview.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/1-using-mot-overview.md new file mode 100644 index 0000000000000000000000000000000000000000..a54764d553c93fc863a8b68b0005e8abd5bab3ce --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/1-using-mot-overview.md @@ -0,0 +1,18 @@ +--- +title: Using MOT Overview +summary: Using MOT Overview +author: Zhang Cuiping +date: 2021-03-04 +--- + +# Using MOT Overview + +MOT is automatically deployed as part of openGauss. 
You may refer to the **MOT Preparation** section for a description of how to estimate and plan required memory and storage resources in order to sustain your workload. The **MOT Deployment** section describes all the configuration settings in MOT, as well as non-mandatory options for server optimization. + +Using MOT tables is quite simple. The syntax of all MOT commands is the same as for disk-based tables and includes support for most of standard PostgreSQL SQL, DDL and DML commands and features, such as Stored Procedures. Only the create and drop table statements in MOT differ from the statements for disk-based tables in openGauss. You may refer to the **MOT Usage** section for a description of these two simple commands, to learn how to convert a disk-based table into an MOT table, to get higher performance using Query Native Compilation and PREPARE statements and for a description of external tool support and the limitations of the MOT engine. + +The **MOT Administration** section describes how to perform database maintenance, monitoring and analysis of logs and reported errors. Lastly, the **MOT Sample TPC-C Benchmark** section describes how to perform a standard TPC-C benchmark. + +- Read the following topics to learn how to use MOT - + + ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/using-mot-overview-2.png) diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/2-mot-preparation.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/2-mot-preparation.md new file mode 100644 index 0000000000000000000000000000000000000000..3cceb8b2a61ef33fce7dd877b24daca2f1e19efb --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/2-mot-preparation.md @@ -0,0 +1,206 @@ +--- +title: MOT Preparation +summary: MOT Preparation +author: Zhang Cuiping +date: 2021-03-04 +--- + +# MOT Preparation + +The following describes the prerequisites and the memory and storage planning to perform in order to prepare to use MOT. + +## MOT Prerequisites + +The following specifies the hardware and software prerequisites for using MogDB MOT. + +### Supported Hardware + +MOT can utilize state-of-the-art hardware, as well as support existing hardware platforms. Both x86 architecture and ARM by Huawei Kunpeng architecture are supported. + +MOT is fully aligned with the hardware supported by the MogDB database. For more information, see the *MogDB Installation Guide*. + +### CPU + +MOT delivers exceptional performance on many-core servers (scale-up). MOT significantly outperforms the competition in these environments and provides near-linear scaling and extremely high resource utilization. + +Even so, users can already start realizing MOT's performance benefits on both low-end, mid-range and high-end servers, starting from one or two CPU sockets, as well as four and even eight CPU sockets. Very high performance and resource utilization are also expected on very high-end servers that have 16 or even 32 sockets (for such cases, we recommend contacting Enmo support). + +### Memory + +MOT supports standard RAM/DRAM for its data and transaction management. All MOT tables’ data and indexes reside in-memory; therefore, the memory capacity must support the data capacity and still have space for further growth. For detailed information about memory requirements and planning, see the **MOT Memory and Storage Planning** section. 
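+
+For example, the process-level memory ceiling that all MOT memory must fit into can be checked from any SQL client (a minimal sketch; max_process_memory is the standard MogDB memory limit parameter) -
+
+```sql
+-- Display the overall memory limit of the database process;
+-- MOT global and local memory must fit within this value.
+SHOW max_process_memory;
+```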
+ +### Storage IO + +MOT is a durable database and uses persistent storage (disk/SSD/NVMe drive[s]) for transaction log operations and periodic checkpoints. + +We recommend using a storage device with low latency, such as SSD with a RAID-1 configuration, NVMe or any enterprise-grade storage system. When appropriate hardware is used, the database transaction processing and contention are the bottleneck, not the IO. + +For detailed memory requirements and planning, see the **MOT Memory and Storage Planning** section. + +Supported Operating Systems + +MOT is fully aligned with the operating systems supported by MogDB. + +MOT supports both bare-metal and virtualized environments that run the following operating systems on a bare-metal server or virtual machine - + +- **x86 -** CentOS 7.6 and EulerOS 2.0 +- **ARM -** openEuler and EulerOS + +### OS Optimization + +MOT does not require any special modifications or the installation of new software. However, several optional optimizations can enhance performance. You may refer to the **MOT Server Optimization - x86** and **MOT Server Optimization - ARM Huawei Taishan 2P/4P** sections for a description of the optimizations that enable maximal performance. + +## MOT Memory and Storage Planning + +This section describes the considerations and guidelines for evaluating, estimating and planning the quantity of memory and storage capacity to suit your specific application needs. This section also describes the various data aspects that affect the quantity of required memory, such as the size of data and indexes for the planned tables, memory to sustain transaction management and how fast the data is growing. + +### MOT Memory Planning + +MOT belongs to the in-memory database class (IMDB) in which all tables and indexes reside entirely in memory. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** Memory storage is volatile, meaning that it requires power to maintain the stored information. Disk storage is persistent, meaning that it is written to disk, which is non-volatile storage. MOT uses both, having all data in memory, while persisting (by WAL logging) transactional changes to disk with strict consistency (in synchronous logging mode). + +Sufficient physical memory must exist on the server in order to maintain the tables in their initial state, as well as to accommodate the related workload and growth of data. All this is in addition to the memory that is required for the traditional disk-based engine, tables and sessions that support the workload of disk-based tables. Therefore, planning ahead for enough memory to contain them all is essential. + +Even so, you can get started with whatever amount of memory you have and perform basic tasks and evaluation tests. Later, when you are ready for production, the following issues should be addressed. + +- **Memory Configuration Settings** + + Similar to standard PG , the memory of the MogDB database process is controlled by the upper limit in its max_process_memory setting, which is defined in the postgres.conf file. The MOT engine and all its components and threads, reside within the MogDB process. Therefore, the memory allocated to MOT also operates within the upper boundary defined by max_process_memory for the entire MogDB database process. + + The amount of memory that MOT can reserve for itself is defined as a portion of max_process_memory. It is either a percentage of it or an absolute value that is less than it. 
This portion is defined in the mot.conf configuration file by the _mot__memory settings. + + The portion of max_process_memory that can be used by MOT must still leave at least 2 GB available for the PG (MogDB) envelope. Therefore, in order to ensure this, MOT verifies the following during database startup - + + ``` + (max_mot_global_memory + max_mot_local_memory) + 2GB < max_process_memory + ``` + + If this limit is breached, then MOT memory internal limits are adjusted in order to provide the maximum possible within the limitations described above. This adjustment is performed during startup and calculates the value of MOT max memory accordingly. + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** MOT max memory is a logically calculated value of either the configured settings or their adjusted values of (max_mot_global_memory + max_mot_local_memory). + + In this case, a warning is issued to the server log, as shown below - + + **Warning Examples** + + Two messages are reported - the problem and the solution. + + The following is an example of a warning message reporting the problem - + + ``` + [WARNING] MOT engine maximum memory definitions (global: 9830 MB, local: 1843 MB, session large store: 0 MB, total: 11673 MB) breach GaussDB maximum process memory restriction (12288 MB) and/or total system memory (64243 MB). MOT values shall be adjusted accordingly to preserve required gap (2048 MB). + ``` + + The following is an example of a warning message indicating that MOT is automatically adjusting the memory limits - + + ``` + [WARNING] Adjusting MOT memory limits: global = 8623 MB, local = 1617 MB, session large store = 0 MB, total = 10240 MB + ``` + + This is the only place that shows the new memory limits. + + Additionally, MOT does not allow the insertion of additional data when the total memory usage approaches the chosen memory limits. The threshold for determining when additional data insertions are no longer allowed, is defined as a percentage of MOT max memory (which is a calculated value, as described above). The default is 90, meaning 90%. Attempting to add additional data over this threshold returns an error to the user and is also registered in the database log file. + +- **Minimum and Maximum** + + In order to secure memory for future operations, MOT pre-allocates memory based on the minimum global and local settings. The database administrator should specify the minimum amount of memory required for the MOT tables and sessions to sustain their workload. This ensures that this minimal memory is allocated to MOT even if another excessive memory-consuming application runs on the same server as the database and competes with the database for memory resources. The maximum values are used to limit memory growth. + +- **Global and Local** + + The memory used by MOT is comprised of two parts - + + - **Global Memory -** Global memory is a long-term memory pool that contains the data and indexes of MOT tables. It is evenly distributed across NUMA-nodes and is shared by all CPU cores. + - **Local Memory -** Local memory is a memory pool used for short-term objects. Its primary consumers are sessions handling transactions. These sessions are storing data changes in the part of the memory dedicated to the relevant specific transaction (known as *transaction private memory*). Data changes are moved to the global memory at the commit phase. Memory object allocation is performed in NUMA-local manner in order to achieve the lowest possible latency. 
+ + Deallocated objects are put back in the relevant memory pools. Minimal use of operating system memory allocation (malloc) functions during transactions circumvents unnecessary locks and latches. + + The allocation of these two memory parts is controlled by the dedicated **min/max_mot_global_memory** and **min/max_mot_local_memory** settings. If MOT global memory usage gets too close to this defined maximum, then MOT protects itself and does not accept new data. Attempts to allocate memory beyond this limit are denied and an error is reported to the user. + +- **Minimum Memory Requirements** + + To get started and perform a minimal evaluation of MOT performance, there are a few requirements. + + Make sure that the **max_process_memory** (as defined in **postgres.conf**) has sufficient capacity for MOT tables and sessions (configured by **mix/max_mot_global_memory** and **mix/max_mot_local_memory**), in addition to the disk tables buffer and extra memory. For simple tests, the default **mot.conf** settings can be used. + +- **Actual Memory Requirements During Production** + + In a typical OLTP workload, with 80:20 read:write ratio on average, MOT memory usage per table is 60% higher than in disk-based tables (this includes both the data and the indexes). This is due to the use of more optimal data structures and algorithms that enable faster access, with CPU-cache awareness and memory-prefetching. + + The actual memory requirement for a specific application depends on the quantity of data, the expected workload and especially on the data growth. + +- **Max Global Memory Planning - Data + Index Size** + + To plan for maximum global memory - + + 1. Determine the size of a specific disk table (including both its data and all its indexes). The following statistical query can be used to determine the data size of the **customer** table and the **customer_pkey** index size - + - **Data size -** select pg_relation_size(‘customer'); + - **Index -** select pg_relation_size('customer_pkey'); + 2. Add 60%, which is the common requirement in MOT relative to the current size of the disk-based data and index. + 3. Add an additional percentage for the expected growth of data. For example - + + 5% monthly growth = 80% yearly growth (1.05^12). Thus, in order to sustain a year's growth, allocate 80% more memory than is currently used by the tables. + + This completes the estimation and planning of the max_mot_global_memory value. The actual setting can be defined either as an absolute value or a percentage of the Postgres max_process_memory. The exact value is typically finetuned during deployment. + +- **Max Local Memory Planning - Concurrent Session Support** + + Local memory needs are primarily a function of the quantity of concurrent sessions. The typical OLTP workload of an average session uses up to 8 MB. This should be multiplied by the quantity of sessions and then a little bit extra should be added. + + A memory calculation can be performed in this manner and then finetuned, as follows - + + ``` + SESSION_COUNT * SESSION_SIZE (8 MB) + SOME_EXTRA (100MB should be enough) + ``` + + The default specifies 15% of Postgres's max_process_memory, which by default is 12 GB. This equals 1.8 GB, which is sufficient for 230 sessions, which is the requirement for the max_mot_local memory. The actual setting can be defined either in absolute values or as a percentage of the Postgres max_process_memory. The exact value is typically finetuned during deployment. 
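+
+  For example, if roughly 500 concurrent sessions are expected (an illustrative figure, not a recommendation), the formula above gives -
+
+  ```
+  500 * 8 MB + 100 MB = 4,100 MB  ->  set max_mot_local_memory to at least ~4 GB
+  ```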
+ + **Unusually Large Transactions** + + Some transactions are unusually large because they apply changes to a large number of rows. This may increase a single session's local memory up to the maximum allowed limit, which is 1 GB. For example - + + ``` + delete from SOME_VERY_LARGE_TABLE; + ``` + + Take this scenario into consideration when configuring the max_mot_local_memory setting, as well as during application development. + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** You may refer to the **MEMORY (MOT)** section for more information about configuration settings. + +### Storage IO + +MOT is a memory-optimized, persistent database storage engine. A disk drive(s) is required for storing the Redo Log (WAL) and a periodic checkpoint. + +It is recommended to use a storage device with low latency, such as SSD with a RAID-1 configuration, NVMe or any enterprise-grade storage system. When appropriate hardware is used, the database transaction processing and contention are the bottleneck, not the IO. + +Since the persistent storage is much slower than RAM memory, the IO operations (logging and checkpoint) can create a bottleneck for both an in-memory and memory-optimized databases. However, MOT has a highly efficient durability design and implementation that is optimized for modern hardware (such as SSD and NVMe). In addition, MOT has minimized and optimized writing points (for example, by using parallel logging, a single log record per transaction and NUMA-aware transaction group writing) and has minimized the data written to disk (for example, only logging the delta or updated columns of the changed records and only logging a transaction at the commit phase). + +### Required Capacity + +The required capacity is determined by the requirements of checkpointing and logging, as described below - + +- **Checkpointing** + + A checkpoint saves a snapshot of all the data to disk. + + Twice the size of all data should be allocated for checkpointing. There is no need to allocate space for the indexes for checkpointing + + Checkpointing = 2x the MOT Data Size (rows only, index is not persistent). + + Twice the size is required because a snapshot is saved to disk of the entire size of the data, and in addition, the same amount of space should be allocated for the checkpoint that is in progress. When a checkpoint process finishes, the previous checkpoint files are deleted. + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** In the next MogDB release, MOT will have an incremental checkpoint feature, which will significantly reduce this storage capacity requirement. + +- **Logging** + + MOT table log records are written to the same database transaction log as the other records of disk-based tables. + + The size of the log depends on the transactional throughput, the size of the data changes and the time between checkpoints (at each time checkpoint the Redo Log is truncated and starts to expand again). + + MOT tables use less log bandwidth and have lower IO contention than disk-based tables. This is enabled by multiple mechanisms. + + For example, MOT does not log every operation before a transaction has been completed. It is only logged at the commit phase and only the updated delta record is logged (not full records like for disk-based tables). + + In order to ensure that the log IO device does not become a bottleneck, the log file must be placed on a drive that has low latency. 
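+
+  As an illustrative capacity sketch (the 50 GB data size is an assumption, not a measurement), the required disk capacity could be planned as follows -
+
+  ```
+  Checkpointing: 2 x 50 GB MOT data = 100 GB
+  Logging:       Redo Log volume generated between two checkpoints (workload dependent)
+  Total:         100 GB + Redo Log space, placed on a low-latency drive
+  ```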
+ + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** You may refer to the **STORAGE (MOT)** section for more information about configuration settings. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/3-mot-deployment.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/3-mot-deployment.md new file mode 100644 index 0000000000000000000000000000000000000000..1bf1320b9f6b91492427e40529ee6ffe2eb1c133 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/3-mot-deployment.md @@ -0,0 +1,660 @@ +--- +title: MOT Deployment +summary: MOT Deployment +author: Zhang Cuiping +date: 2021-03-04 +--- + +# MOT Deployment + +The following sections describe various mandatory and optional settings for optimal deployment. + +## MOT Server Optimization - x86 + +Generally, databases are bounded by the following components - + +- **CPU -** A faster CPU speeds up any CPU-bound database. +- **Disk -** High-speed SSD/NVME speeds up any I/O-bound database. +- **Network -** A faster network speeds up any **SQL\\Net**-bound database. + +In addition to the above, the following general-purpose server settings are used by default and may significantly affect a database's performance. + +MOT performance tuning is a crucial step for ensuring fast application functionality and data retrieval. MOT can utilize state-of-the-art hardware, and therefore it is extremely important to tune each system in order to achieve maximum throughput. + +The following are optional settings for optimizing MOT database performance running on an Intel x86 server. These settings are optimal for high throughput workloads - + +### BIOS + +- Hyper Threading - ON + + Activation (HT=ON) is highly recommended. + + We recommend turning hyper threading ON while running OLTP workloads on MOT. When hyper-threading is used, some OLTP workloads demonstrate performance gains of up to40%. + +### OS Environment Settings + +- NUMA + + Disable NUMA balancing, as described below. MOT performs its own memory management with extremely efficient NUMA-awareness, much more than the default methods used by the operating system. + + ``` + echo 0 > /proc/sys/kernel/numa_balancing + ``` + +- Services + + Disable Services, as described below - + + ``` + service irqbalance stop # MANADATORY + service sysmonitor stop # OPTIONAL, performance + service rsyslog stop # OPTIONAL, performance + ``` + +- Tuned Service + + The following section is mandatory. + + The server must run the throughput-performance profile - + + ``` + [...]$ tuned-adm profile throughput-performance + ``` + + The **throughput-performance** profile is broadly applicable tuning that provides excellent performance across a variety of common server workloads. + + Other less suitable profiles for MogDB and MOT server that may affect MOT's overall performance are - balanced, desktop, latency-performance, network-latency, network-throughput and powersave. + +- Sysctl + + The following lists the recommended operating system settings for best performance. 
+ + - Add the following settings to /etc/sysctl.conf and run sysctl -p + + ```bash + net.ipv4.ip_local_port_range = 9000 65535 + kernel.sysrq = 1 + kernel.panic_on_oops = 1 + kernel.panic = 5 + kernel.hung_task_timeout_secs = 3600 + kernel.hung_task_panic = 1 + vm.oom_dump_tasks = 1 + kernel.softlockup_panic = 1 + fs.file-max = 640000 + kernel.msgmnb = 7000000 + kernel.sched_min_granularity_ns = 10000000 + kernel.sched_wakeup_granularity_ns = 15000000 + kernel.numa_balancing=0 + vm.max_map_count = 1048576 + net.ipv4.tcp_max_tw_buckets = 10000 + net.ipv4.tcp_tw_reuse = 1 + net.ipv4.tcp_tw_recycle = 1 + net.ipv4.tcp_keepalive_time = 30 + net.ipv4.tcp_keepalive_probes = 9 + net.ipv4.tcp_keepalive_intvl = 30 + net.ipv4.tcp_retries2 = 80 + kernel.sem = 250 6400000 1000 25600 + net.core.wmem_max = 21299200 + net.core.rmem_max = 21299200 + net.core.wmem_default = 21299200 + net.core.rmem_default = 21299200 + #net.sctp.sctp_mem = 94500000 915000000 927000000 + #net.sctp.sctp_rmem = 8192 250000 16777216 + #net.sctp.sctp_wmem = 8192 250000 16777216 + net.ipv4.tcp_rmem = 8192 250000 16777216 + net.ipv4.tcp_wmem = 8192 250000 16777216 + net.core.somaxconn = 65535 + vm.min_free_kbytes = 26351629 + net.core.netdev_max_backlog = 65535 + net.ipv4.tcp_max_syn_backlog = 65535 + #net.sctp.addip_enable = 0 + net.ipv4.tcp_syncookies = 1 + vm.overcommit_memory = 0 + net.ipv4.tcp_retries1 = 5 + net.ipv4.tcp_syn_retries = 5 + ``` + + - Update the section of /etc/security/limits.conf to the following - + + ```bash + soft nofile 100000 + hard nofile 100000 + ``` + + The **soft** and a **hard** limit settings specify the quantity of files that a process may have opened at once. The soft limit may be changed by each process running these limits up to the hard limit value. + +- Disk/SSD + + The following describes how to ensure that disk R/W performance is suitable for database synchronous commit mode. + + To do so, test your disk bandwidth using the following + + ``` + [...]$ sync; dd if=/dev/zero of=testfile bs=1M count=1024; sync + 1024+0 records in + 1024+0 records out + 1073741824 bytes (1.1 GB) copied, 1.36034 s, 789 MB/s + ``` + + In case the disk bandwidth is significantly below the above number (789 MB/s), it may create a performance bottleneck for MogDB, and especially for MOT. + +### Network + +Use a 10Gbps network or higher. + +To verify, use iperf, as follows - + +``` +Server side: iperf -s +Client side: iperf -c +``` + +- rc.local - Network Card Tuning + + The following optional settings have a significant effect on performance - + + 1. Copy set_irq_affinity.sh from to /var/scripts/. + + 2. Put in /etc/rc.d/rc.local and run chmod in order to ensure that the following script is executed during boot - + + ```bash + chmod +x /etc/rc.d/rc.local + var/scripts/set_irq_affinity.sh -x all + ethtool -K gro off + ethtool -C adaptive-rx on adaptive-tx on + Replace with the network card, i.e. ens5f1 + ``` + +## MOT Server Optimization - ARM Huawei Taishan 2P/4P + +The following are optional settings for optimizing MOT database performance running on an ARM/Kunpeng-based Huawei Taishan 2280 v2 server powered by 2-sockets with a total of 256 Cores and Taishan 2480 v2 server powered by 4-sockets with a total of 256 Cores. + +Unless indicated otherwise, the following settings are for both client and server machines - + +### BIOS + +Modify related BIOS settings, as follows - + +1. Select **BIOS** - **Advanced** - **MISC Config**. Set **Support Smmu** to **Disabled**. + +2. 
Select **BIOS** - **Advanced** - **MISC Config**. Set **CPU Prefetching Configuration** to **Disabled**. + + ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-deployment-1.png) + +3. Select **BIOS** - **Advanced** - **Memory Config**. Set **Die Interleaving** to **Disabled**. + + ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-deployment-2.png) + +4. Select **BIOS** - **Advanced** - **Performance Config**. Set **Power Policy** to **Performance**. + + ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-deployment-3.png) + +### OS - Kernel and Boot + +- The following operating system kernel and boot parameters are usually configured by a sysadmin. + + Configure the kernel parameters, as follows - + + ```bash + net.ipv4.ip_local_port_range = 9000 65535 + kernel.sysrq = 1 + kernel.panic_on_oops = 1 + kernel.panic = 5 + kernel.hung_task_timeout_secs = 3600 + kernel.hung_task_panic = 1 + vm.oom_dump_tasks = 1 + kernel.softlockup_panic = 1 + fs.file-max = 640000 + kernel.msgmnb = 7000000 + kernel.sched_min_granularity_ns = 10000000 + kernel.sched_wakeup_granularity_ns = 15000000 + kernel.numa_balancing=0 + vm.max_map_count = 1048576 + net.ipv4.tcp_max_tw_buckets = 10000 + net.ipv4.tcp_tw_reuse = 1 + net.ipv4.tcp_tw_recycle = 1 + net.ipv4.tcp_keepalive_time = 30 + net.ipv4.tcp_keepalive_probes = 9 + net.ipv4.tcp_keepalive_intvl = 30 + net.ipv4.tcp_retries2 = 80 + kernel.sem = 32000 1024000000 500 32000 + kernel.shmall = 52805669 + kernel.shmmax = 18446744073692774399 + sys.fs.file-max = 6536438 + net.core.wmem_max = 21299200 + net.core.rmem_max = 21299200 + net.core.wmem_default = 21299200 + net.core.rmem_default = 21299200 + net.ipv4.tcp_rmem = 8192 250000 16777216 + net.ipv4.tcp_wmem = 8192 250000 16777216 + net.core.somaxconn = 65535 + vm.min_free_kbytes = 5270325 + net.core.netdev_max_backlog = 65535 + net.ipv4.tcp_max_syn_backlog = 65535 + net.ipv4.tcp_syncookies = 1 + vm.overcommit_memory = 0 + net.ipv4.tcp_retries1 = 5 + net.ipv4.tcp_syn_retries = 5 + ##NEW + kernel.sched_autogroup_enabled=0 + kernel.sched_min_granularity_ns=2000000 + kernel.sched_latency_ns=10000000 + kernel.sched_wakeup_granularity_ns=5000000 + kernel.sched_migration_cost_ns=500000 + vm.dirty_background_bytes=33554432 + kernel.shmmax=21474836480 + net.ipv4.tcp_timestamps = 0 + net.ipv6.conf.all.disable_ipv6=1 + net.ipv6.conf.default.disable_ipv6=1 + net.ipv4.tcp_keepalive_time=600 + net.ipv4.tcp_keepalive_probes=3 + kernel.core_uses_pid=1 + ``` + +- Tuned Service + + The following section is mandatory. + + The server must run a throughput-performance profile - + + ``` + [...]$ tuned-adm profile throughput-performance + ``` + + The **throughput-performance** profile is broadly applicable tuning that provides excellent performance across a variety of common server workloads. + + Other less suitable profiles for MogDB and MOT server that may affect MOT's overall performance are - balanced, desktop, latency-performance, network-latency, network-throughput and powersave. + +- Boot Tuning + + Add **iommu.passthrough=1** to the **kernel boot arguments**. + + When operating in **pass-through** mode, the adapter does require **DMA translation to the memory,** which improves performance. + +## MOT Configuration Settings + +MOT is provided preconfigured to creating working MOT Tables. 
For best results, it is recommended to customize the MOT configuration (defined in the file named mot.conf) according to your application's specific requirements and your preferences. + +This file is read-only upon server startup. If you edit this file while the system is running, then the server must be reloaded in order for the changes to take effect. + +The mot.conf file is located in the same folder as the postgres.conf configuration file. + +Read the **General Guidelines** section and then review and configure the following sections of the mot.conf file, as needed. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The topics listed above describe each of the setting sections in the mot.conf file. In addition to the above topics, for an overview of all the aspects of a specific MOT feature (such as Recovery), you may refer to the relevant topic of this user manual. For example, the mot.conf file has a Recovery section that contains settings that affect MOT recovery and this is described in the **MOT Recovery** section that is listed above. In addition, for a full description of all aspects of Recovery, you may refer to the **MOT Recovery** section of the Administration chapter of this user manual. Reference links are also provided in each relevant section of the descriptions below. + +The following topics describe each section in the mot.conf file and the settings that it contains, as well as the default value of each. + +### General Guidelines + +The following are general guidelines for editing the mot.conf file. + +- Each setting appears with its default value as follows - + + ``` + # name = value + ``` + +- Blank/white space is acceptable. + +- Comments are indicated by placing a number sign (#) anywhere on a line. + +- The default values of each setting appear as a comment throughout this file. + +- In case a parameter is uncommented and a new value is placed, the new setting is defined. + +- Changes to the mot.conf file are applied only at the start or reload of the database server. + +Memory Units are represented as follows - + +- KB - Kilobytes +- MB - Megabytes +- GB - Gigabytes +- TB - Terabytes + +If no memory units are specified, then bytes are assumed. + +Some memory units are represented as a percentage of the **max_process_memory** setting that is configured in **postgresql.conf**. For example - **20%**. + +Time units are represented as follows - + +- us - microseconds (or micros) +- ms - milliseconds (or millis) +- s - seconds (or secs) +- min - minutes (or mins) +- h - hours +- d - days + +If no time units are specified, then microseconds are assumed. + +### REDO LOG (MOT) + +- **enable_group_commit = false** + + Specifies whether to use group commit. + + This option is only relevant when MogDB is configured to use synchronous commit, meaning only when the synchronous_commit setting in postgresql.conf is configured to any value other than off. + +- **group_commit_size = 16** + +- **group_commit_timeout = 10 ms** + + This option is only relevant when the MOT engine has been configured to **Synchronous Group Commit** logging. This means that the synchronous_commit setting in postgresql.conf is configured to true and the enable_group_commit parameter in the mot.conf configuration file is configured to true. + + Defines which of the following determines when a group of transactions is recorded in the WAL Redo Log - + + **group_commit_size** - The quantity of committed transactions in a group. 
For example, **16** means that when 16 transactions in the same group have been committed by their client application, then an entry is written to disk in the WAL Redo Log for each of the 16 transactions. + + **group_commit_timeout** - A timeout period in ms. For example, **10** means that after 10 ms, an entry is written to disk in the WAL Redo Log for each of the transactions in the same group that have been committed by their client application in the lats 10 ms. + + A commit group is closed after either the configured number of transactions has arrived or after the configured timeout period since the group was opened. After the group is closed, all the transactions in the group wait for a group flush to complete execution and then notify the client that each transaction has ended. + + You may refer to **MOT Logging - WAL Redo Log** section for more information about the WAL Redo Log and synchronous group commit logging. + +### CHECKPOINT (MOT) + +- **checkpoint_dir =** + + Specifies the directory in which checkpoint data is to be stored. The default location is in the data folder of each data node. + +- **checkpoint_segsize = 16 MB** + + Specifies the segment size used during checkpoint. Checkpoint is performed in segments. When a segment is full, it is serialized to disk and a new segment is opened for the subsequent checkpoint data. + +- **checkpoint_workers = 3** + + Specifies the number of workers to use during checkpoint. + + Checkpoint is performed in parallel by several MOT engine workers. The quantity of workers may substantially affect the overall performance of the entire checkpoint operation, as well as the operation of other running transactions. To achieve a shorter checkpoint duration, a larger number of workers should be used, up to the optimal number (which varies based on the hardware and workload). However, be aware that if this number is too large, it may negatively impact the execution time of other running transactions. Keep this number as low as possible to minimize the effect on the runtime of other running transactions, but at the cost of longer checkpoint duration. + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** You may refer to the **MOT Checkpoints** section for more information about configuration settings. + +### RECOVERY (MOT) + +- **checkpoint_recovery_workers = 3** + + Specifies the number of workers (threads) to use during checkpoint data recovery. Each MOT engine worker runs on its own core and can process a different table in parallel by reading it into memory. For example, while the default is three-course, you might prefer to set this parameter to the number of cores that are available for processing. After recovery these threads are stopped and killed. + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** You may refer to the **MOT Recovery** section for more information about configuration settings. + +### STATISTICS (MOT) + +- **enable_stats = false** + + Configures periodic statistics for printing. + +- **print_stats_period = 10 minute** + + Configures the time period for printing a summary statistics report. + +- **print_full_stats_period = 1 hours** + + Configures the time period for printing a full statistics report. + + The following settings configure the various sections included in the periodic statistics report. If none of them are configured, then the statistics report is suppressed. 
+ +- **enable_log_recovery_stats = false** + + Log recovery statistics contain various Redo Log recovery metrics. + +- **enable_db_session_stats = false** + + Database session statistics contain transaction events, such commits, rollbacks and so on. + +- **enable_network_stats = false** + + Network statistics contain connection/disconnection events. + +- **enable_log_stats = false** + + Log statistics contain details regarding the Redo Log. + +- **enable_memory_stats = false** + + Memory statistics contain memory-layer details. + +- **enable_process_stats = false** + + Process statistics contain total memory and CPU consumption for the current process. + +- **enable_system_stats = false** + + System statistics contain total memory and CPU consumption for the entire system. + +- **enable_jit_stats = false** + + JIT statistics contain information regarding JIT query compilation and execution. + +### ERROR LOG (MOT) + +- **log_level = INFO** + + Configures the log level of messages issued by the MOT engine and recorded in the Error log of the database server. Valid values are PANIC, ERROR, WARN, INFO, TRACE, DEBUG, DIAG1 and DIAG2. + +- **Log.COMPONENT.LOGGER.log_level=LOG_LEVEL** + + Configures specific loggers using the syntax described below. + + For example, to configure the TRACE log level for the ThreadIdPool logger in system component, use the following syntax - + + ``` + Log.System.ThreadIdPool.log_level=TRACE + ``` + + To configure the log level for all loggers under some component, use the following syntax - + + ``` + Log.COMPONENT.log_level=LOG_LEVEL + ``` + + For example - + + ``` + Log.System.log_level=DEBUG + ``` + +### MEMORY (MOT) + +- **enable_numa = true** + + Specifies whether to use NUMA-aware memory allocation. + + When disabled, all affinity configurations are disabled as well. + + MOT engine assumes that all the available NUMA nodes have memory. If the machine has some special configuration in which some of the NUMA nodes have no memory, then the MOT engine initialization and hence the database server startup will fail. In such machines, it is recommended that this configuration value be set to false, in order to prevent startup failures and let the MOT engine to function normally without using NUMA-aware memory allocation. + +- **affinity_mode = fill-physical-first** + + Configures the affinity mode of threads for the user session and internal MOT tasks. + + When a thread pool is used, this value is ignored for user sessions, as their affinity is governed by the thread pool. However, it is still used for internal MOT tasks. + + Valid values are **fill-socket-first**, **equal-per-socket**, **fill-physical-first** and **none** - + + - **Fill-socket-first** attaches threads to cores in the same socket until the socket is full and then moves to the next socket. + - **Equal-per-socket** spreads threads evenly among all sockets. + - **Fill-physical-first** attaches threads to physical cores in the same socket until all physical cores are employed and then moves to the next socket. When all physical cores are used, then the process begins again with hyper-threaded cores. + - **None** disables any affinity configuration and lets the system scheduler determine on which core each thread is scheduled to run. + +- **lazy_load_chunk_directory = true** + + Configures the chunk directory mode that is used for memory chunk lookup. 
+ + **Lazy** mode configures the chunk directory to load parts of it on demand, thus reducing the initial memory footprint (from 1 GB to 1 MB approximately). However, this may result in minor performance penalties and errors in extreme conditions of memory distress. In contrast, using a **non-lazy** chunk directory allocates an additional 1 GB of initial memory, produces slightly higher performance and ensures that chunk directory errors are avoided during memory distress. + +- **reserve_memory_mode = virtual** + + Configures the memory reservation mode (either **physical** or **virtual**). + + Whenever memory is allocated from the kernel, this configuration value is consulted to determine whether the allocated memory is to be resident (**physical**) or not (**virtual**). This relates primarily to preallocation, but may also affect runtime allocations. For **physical** reservation mode, the entire allocated memory region is made resident by forcing page faults on all pages spanned by the memory region. Configuring **virtual** memory reservation may result in faster memory allocation (particularly during preallocation), but may result in page faults during the initial access (and thus may result in a slight performance hit) and more sever errors when physical memory is unavailable. In contrast, physical memory allocation is slower, but later access is both faster and guaranteed. + +- **store_memory_policy = compact** + + Configures the memory storage policy (**compact** or **expanding**). + + When **compact** policy is defined, unused memory is released back to the kernel, until the lower memory limit is reached (see **min_mot_memory** below). In **expanding** policy, unused memory is stored in the MOT engine for later reuse. A **compact** storage policy reduces the memory footprint of the MOT engine, but may occasionally result in minor performance degradation. In addition, it may result in unavailable memory during memory distress. In contrast, **expanding** mode uses more memory, but results in faster memory allocation and provides a greater guarantee that memory can be re-allocated after being de-allocated. + +- **chunk_alloc_policy = auto** + + Configures the chunk allocation policy for global memory. + + MOT memory is organized in chunks of 2 MB each. The source NUMA node and the memory layout of each chunk affect the spread of table data among NUMA nodes, and therefore can significantly affect the data access time. When allocating a chunk on a specific NUMA node, the allocation policy is consulted. + + Available values are **auto**, **local**, **page-interleaved**, **chunk-interleaved** and **native** - + + - **Auto** policy selects a chunk allocation policy based on the current hardware. + - **Local** policy allocates each chunk on its respective NUMA node. + - **Page-interleaved** policy allocates chunks that are composed of interleaved memory 4-kilobyte pages from all NUMA nodes. + - **Chunk-interleaved** policy allocates chunks in a round robin fashion from all NUMA nodes. + - **Native** policy allocates chunks by calling the native system memory allocator. + +- **chunk_prealloc_worker_count = 8** + + Configures the number of workers per NUMA node participating in memory preallocation. + +- **max_mot_global_memory = 80%** + + Configures the maximum memory limit for the global memory of the MOT engine. + + Specifying a percentage value relates to the total defined by **max_process_memory** configured in **postgresql.conf**. 
+ + The MOT engine memory is divided into global (long-term) memory that is mainly used to store user data and local (short-term) memory that is mainly used by user sessions for local needs. + + Any attempt to allocate memory beyond this limit is denied and an error is reported to the user. Ensure that the sum of **max_mot_global_memory** and **max_mot_local_memory** do not exceed the **max_process_memory** configured in **postgresql.conf**. + +- **min_mot_global_memory = 0 MB** + + Configures the minimum memory limit for the global memory of the MOT engine. + + Specifying a percentage value relates to the total defined by the **max_process_memory** configured in **postgresql.conf**. + + This value is used for the preallocation of memory during startup, as well as to ensure that a minimum amount of memory is available for the MOT engine during its normal operation. When using **compact** storage policy (see **store_memory_policy** above), this value designates the lower limit under which memory is not released back to the kernel, but rather kept in the MOT engine for later reuse. + +- **max_mot_local_memory = 15%** + + Configures the maximum memory limit for the local memory of the MOT engine. + + Specifying a percentage value relates to the total defined by the **max_process_memory** configured in **postgresql.conf**. + + MOT engine memory is divided into global (long-term) memory that is mainly used to store user data and local (short-term) memory that is mainly used by user session for local needs. + + Any attempt to allocate memory beyond this limit is denied and an error is reported to the user. Ensure that the sum of **max_mot_global_memory** and **max_mot_local_memory** do not exceed the **max_process_memory** configured in **postgresql.conf**. + +- **min_mot_local_memory = 0 MB** + + Configures the minimum memory limit for the local memory of the MOT engine. + + Specifying a percentage value relates to the total defined by the **max_process_memory** configured in **postgresql.conf**. + + This value is used for preallocation of memory during startup, as well as to ensure that a minimum amount of memory is available for the MOT engine during its normal operation. When using compact storage policy (see **store_memory_policy** above), this value designates the lower limit under which memory is not released back to the kernel, but rather kept in the MOT engine for later reuse. + +- **max_mot_session_memory = 0 MB** + + Configures the maximum memory limit for a single session in the MOT engine. + + Typically, sessions in the MOT engine can allocate as much local memory as needed, so long as the local memory limit is not exceeded. To prevent a single session from taking too much memory, and thereby denying memory from other sessions, this configuration item is used to restrict small session-local memory allocations (up to 1,022 KB). + + Make sure that this configuration item does not affect large or huge session-local memory allocations. + + A value of zero denotes no restriction on any session-local small allocations per session, except for the restriction arising from the local memory allocation limit configured by **max_mot_local_memory**. + + Note: Percentage values cannot be set for this configuration item. + +- **min_mot_session_memory = 0 MB** + + Configures the minimum memory reservation for a single session in the MOT engine. 
+ + This value is used to preallocate memory during session creation, as well as to ensure that a minimum amount of memory is available for the session to perform its normal operation. + + Note: Percentage values cannot be set for this configuration item. + +- **session_large_buffer_store_size = 0 MB** + + Configures the large buffer store for sessions. + + When a user session executes a query that requires a lot of memory (for example, when using many rows), the large buffer store is used to increase the certainty level that such memory is available and to serve this memory request more quickly. Any memory allocation for a session exceeding 1,022 KB is considered as a large memory allocation. If the large buffer store is not used or is depleted, such allocations are treated as huge allocations that are served directly from the kernel. + + Note: Percentage values cannot be set for this configuration item. + +- **session_large_buffer_store_max_object_size = 0 MB** + + Configures the maximum object size in the large allocation buffer store for sessions. + + Internally, the large buffer store is divided into objects of varying sizes. This value is used to set an upper limit on objects originating from the large buffer store, as well as to determine the internal division of the buffer store into objects of various size. + + This size cannot exceed 1⁄8 of the **session_large_buffer_store_size**. If it does, it is adjusted to the maximum possible. + + Note: Percentage values cannot be set for this configuration item. + +- **session_max_huge_object_size = 1 GB** + + Configures the maximum size of a single huge memory allocation made by a session. + + Huge allocations are served directly from the kernel and therefore are not guaranteed to succeed. + + This value also pertains to global (meaning not session-related) memory allocations. + + Note: Percentage values cannot be set for this configuration item. + +### GARBAGE COLLECTION (MOT) + +- **enable_gc = true** + + Specifies whether to use the Garbage Collector (GC). + +- **reclaim_threshold = 512 KB** + + Configures the memory threshold for the garbage collector. + + Each session manages its own list of to-be-reclaimed objects and performs its own garbage collection during transaction commitment. This value determines the total memory threshold of objects waiting to be reclaimed, above which garbage collection is triggered for a session. + + In general, the trade-off here is between un-reclaimed objects vs garbage collection frequency. Setting a low value keeps low levels of un-reclaimed memory, but causes frequent garbage collection that may affect performance. Setting a high value triggers garbage collection less frequently, but results in higher levels of un-reclaimed memory. This setting is dependent upon the overall workload. + +- **reclaim_batch_size = 8000** + + Configures the batch size for garbage collection. + + The garbage collector reclaims memory from objects in batches, in order to restrict the number of objects being reclaimed in a single garbage collection pass. The intent of this approach is to minimize the operation time of a single garbage collection pass. + +- **high_reclaim_threshold = 8 MB** + + Configures the high memory threshold for garbage collection. + + Because garbage collection works in batches, it is possible that a session may have many objects that can be reclaimed, but which were not. 
In such situations, in order to prevent garbage collection lists from becoming too bloated, this value is used to continue reclaiming objects within a single pass, even though that batch size limit has been reached, until the total size of the still-waiting-to-be-reclaimed objects is less than this threshold, or there are no more objects eligible for reclamation. + +### JIT (MOT) + +- **enable_mot_codegen = true** + + Specifies whether to use JIT query compilation and execution for planned queries. + + JIT query execution enables JIT-compiled code to be prepared for a prepared query during its planning phase. The resulting JIT-compiled function is executed whenever the prepared query is invoked. JIT compilation usually takes place in the form of LLVM. On platforms where LLVM is not natively supported, MOT provides a software-based fallback called Tiny Virtual Machine (TVM). + +- **force_mot_pseudo_codegen = false** + + Specifies whether to use TVM (pseudo-LLVM) even though LLVM is supported on the current platform. + + On platforms where LLVM is not natively supported, MOT automatically defaults to TVM. + + On platforms where LLVM is natively supported, LLVM is used by default. This configuration item enables the use of TVM for JIT compilation and execution on platforms on which LLVM is supported. + +- **enable_mot_codegen_print = false** + + Specifies whether to print emitted LLVM/TVM IR code for JIT-compiled queries. + +- **mot_codegen_limit = 100** + + Limits the number of JIT queries allowed per user session. + +### Default mot.conf + +The minimum settings and configuration specify to point the **postgresql.conf** file to the location of the **mot.conf** file - + +``` +postgresql.conf +mot_config_file = '/tmp/gauss/mot.conf' +``` + +Ensure that the value of the max_process_memory setting is sufficient to include the global (data and index) and local (sessions) memory of MOT tables. + +The default content of **mot.conf** is sufficient to get started. The settings can be optimized later. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/4-mot-usage.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/4-mot-usage.md new file mode 100644 index 0000000000000000000000000000000000000000..29124b8864c6fe5daf92407a5c77ac1f5d4be8a8 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/4-mot-usage.md @@ -0,0 +1,496 @@ +--- +title: MOT Usage +summary: MOT Usage +author: Zhang Cuiping +date: 2021-03-04 +--- + +# MOT Usage + +Using MOT tables is quite simple and is described in the few short sections below. + +MogDB enables an application to use of MOT tables and standard disk-based tables. You can use MOT tables for your most active, high-contention and throughput-sensitive application tables or you can use MOT tables for all your application's tables. + +The following commands describe how to create MOT tables and how to convert existing disk-based tables into MOT tables in order to accelerate an application's database-related performance. MOT is especially beneficial when applied to tables that have proven to be bottlenecks. 
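+
+One hedged way to identify such candidate tables is to examine table activity statistics before converting anything (a sketch that uses the standard statistics view; adapt the criteria to your own workload analysis) -
+
+```sql
+-- Tables with the highest combined read and write activity are often good MOT candidates.
+SELECT relname,
+       coalesce(seq_scan, 0) + coalesce(idx_scan, 0)
+         + coalesce(n_tup_ins, 0) + coalesce(n_tup_upd, 0) + coalesce(n_tup_del, 0) AS total_activity
+FROM pg_stat_user_tables
+ORDER BY total_activity DESC
+LIMIT 10;
+```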
+
+The following is a simple overview of the tasks related to working with MOT tables:
+
+- Granting User Permissions
+- Creating/Dropping an MOT Table
+- Creating an Index for an MOT Table
+- Converting a Disk Table into an MOT Table
+- Query Native Compilation
+- Retrying an Aborted Transaction
+- MOT External Support Tools
+- MOT SQL Coverage and Limitations
+
+## Granting User Permissions
+
+The following describes how to assign a database user permission to access the MOT storage engine. This is performed only once per database user, and is usually done during the initial configuration phase.
+
+> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The granting of user permissions is required because MOT is integrated into the MogDB database by using and extending the Foreign Data Wrapper (FDW) mechanism, which requires granting user access permissions.
+
+To enable a specific user to create and access MOT tables (DDL, DML, SELECT) -
+
+Run the following statement only once -
+
+```sql
+GRANT USAGE ON FOREIGN SERVER mot_server TO ;
+```
+
+Keywords are not case-sensitive.
+
+## Creating/Dropping an MOT Table
+
+Creating a Memory Optimized Table (MOT) is very simple. Only the create and drop table statements in MOT differ from the statements for disk-based tables in MogDB. The syntax of **all other** commands for SELECT, DML and DDL is the same for MOT tables as for MogDB disk-based tables.
+
+- To create an MOT table -
+
+  ```sql
+  create FOREIGN table test(x int) [server mot_server];
+  ```
+
+- Always use the FOREIGN keyword to refer to MOT tables.
+
+- The [server mot_server] part is optional when creating an MOT table because MOT is an integrated engine, not a separate server.
+
+- The above is an extremely simple example that creates a table named **test** with a single integer column named **x**. In the next section (**Creating an Index**), a more realistic example is provided.
+
+- MOT tables cannot be created if incremental checkpoint is enabled in postgresql.conf, so set enable_incremental_checkpoint to off before creating MOT tables.
+
+- To drop an MOT table named test -
+
+  ```sql
+  drop FOREIGN table test;
+  ```
+
+For a description of the limitations of supported features for MOT tables, such as data types, see the **MOT SQL Coverage and Limitations** section.
+
+## Creating an Index for an MOT Table
+
+Standard PostgreSQL create and drop index statements are supported.
+
+For example -
+
+```sql
+create index text_index1 on test(x);
+```
+
+The following is a complete example of creating an index for the ORDER table in a TPC-C workload -
+
+```sql
+create FOREIGN table bmsql_oorder (
+  o_w_id       integer   not null,
+  o_d_id       integer   not null,
+  o_id         integer   not null,
+  o_c_id       integer   not null,
+  o_carrier_id integer,
+  o_ol_cnt     integer,
+  o_all_local  integer,
+  o_entry_d    timestamp,
+  primary key (o_w_id, o_d_id, o_id)
+);
+
+create index bmsql_oorder_index1 on bmsql_oorder(o_w_id, o_d_id, o_c_id, o_id);
+```
+
+> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** There is no need to specify the **FOREIGN** keyword before the MOT table name, because it is only required for the create and drop table commands.
+
+For MOT index limitations, see the Index subsection under the _SQL Coverage and Limitations_ section.
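+
+Dropping an MOT index likewise uses the standard syntax. For example, the index created above could be removed as follows (a usage sketch only) -
+
+```sql
+drop index bmsql_oorder_index1;
+```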
+ +## Converting a Disk Table into an MOT Table + +The direct conversion of disk tables into MOT tables is not yet possible, meaning that no ALTER TABLE statement yet exists that converts a disk-based table into an MOT table. + +The following describes how to manually perform a few steps in order to convert a disk-based table into an MOT table, as well as how the **gs_dump** tool is used to export data and the **gs_restore** tool is used to import data. + +### Prerequisite Check + +Check that the schema of the disk table to be converted into an MOT table contains all required columns. + +Check whether the schema contains any unsupported column data types, as described in the _Unsupported Data Types_ section. + +If a specific column is not supported, then it is recommended to first create a secondary disk table with an updated schema. This schema is the same as the original table, except that all the unsupported types have been converted into supported types. + +Afterwards, use the following script to export this secondary disk table and then import it into an MOT table. + +### Converting + +To covert a disk-based table into an MOT table, perform the following - + +1. Suspend application activity. +2. Use **gs_dump** tool to dump the table’s data into a physical file on disk. Make sure to use the **data only**. +3. Rename your original disk-based table. +4. Create an MOT table with the same table name and schema. Make sure to use the create FOREIGN keyword to specify that it will be an MOT table. +5. Use **gs_restore** to load/restore data from the disk file into the database table. +6. Visually/manually verify that all the original data was imported correctly into the new MOT table. An example is provided below. +7. Resume application activity. + +**IMPORTANT Note** **-** In this way, since the table name remains the same, application queries and relevant database stored-procedures will be able to access the new MOT table seamlessly without code changes. Please note that MOT does not currently support cross-engine multi-table queries (such as by using Join, Union and sub-query) and cross-engine multi-table transactions. Therefore, if an original table is accessed somewhere in a multi-table query, stored procedure or transaction, you must either convert all related disk-tables into MOT tables or alter the relevant code in the application or the database. + +### Conversion Example + +Let's say that you have a database name **benchmarksql** and a table named **customer** (which is a disk-based table) to be migrated it into an MOT table. + +To migrate the customer table into an MOT table, perform the following - + +1. Check your source table column types. Verify that all types are supported by MOT, refer to section *Unsupported Data Types*. + + ```sql + benchmarksql-# \d+ customer + Table "public.customer" + Column | Type | Modifiers | Storage | Stats target | Description + --------+---------+-----------+---------+--------------+------------- + x | integer | | plain | | + y | integer | | plain | | + Has OIDs: no + Options: orientation=row, compression=no + ``` + +2. Check your source table data. + + ```sql + benchmarksql=# select * from customer; + x | y + ---+--- + 1 | 2 + 3 | 4 + (2 rows) + ``` + +3. Dump table data only by using **gs_dump**. 
+
+   ```sql
+   $ gs_dump -Fc benchmarksql -a --table customer -f customer.dump
+   gs_dump[port='15500'][benchmarksql][2020-06-04 16:45:38]: dump database benchmarksql successfully
+   gs_dump[port='15500'][benchmarksql][2020-06-04 16:45:38]: total time: 332 ms
+   ```
+
+4. Rename the source table.
+
+   ```sql
+   benchmarksql=# alter table customer rename to customer_bk;
+   ALTER TABLE
+   ```
+
+5. Create the MOT table to be exactly the same as the source table.
+
+   ```sql
+   benchmarksql=# create foreign table customer (x int, y int);
+   CREATE FOREIGN TABLE
+   benchmarksql=# select * from customer;
+    x | y
+   ---+---
+   (0 rows)
+   ```
+
+6. Import the source dump data into the new MOT table and check that the data was imported successfully.
+
+   ```sql
+   $ gs_restore -C -d benchmarksql customer.dump
+   restore operation successful
+   total time: 24 ms
+
+   benchmarksql=# select * from customer;
+    x | y
+   ---+---
+    1 | 2
+    3 | 4
+   (2 rows)
+
+   benchmarksql=# \d
+                              List of relations
+    Schema |    Name     |     Type      | Owner  |             Storage
+   --------+-------------+---------------+--------+----------------------------------
+    public | customer    | foreign table | aharon |
+    public | customer_bk | table         | aharon | {orientation=row,compression=no}
+   (2 rows)
+   ```
+
+## Query Native Compilation
+
+An additional feature of MOT is the ability to prepare and parse *pre-compiled full queries* in a native format (using a PREPARE statement) before they are needed for execution.
+
+This native format can later be executed (using an EXECUTE command) more efficiently. This type of execution is much quicker because the native format bypasses multiple database processing layers during execution and thus enables better performance.
+
+This division of labor avoids repetitive parse analysis operations. In this way, queries and transaction statements are executed in an interactive manner. This feature is sometimes called *Just-In-Time (JIT)* query compilation.
+
+### Query Compilation - PREPARE Statement
+
+To use MOT's native query compilation, call the PREPARE client statement before the query is executed. This instructs MOT to pre-compile the query and/or to pre-load previously pre-compiled code from a cache.
+
+The following is an example of PREPARE syntax in SQL -
+
+```sql
+PREPARE name [ ( data_type [, ...] ) ] AS statement
+```
+
+PREPARE creates a prepared statement in the database server, which is a server-side object that can be used to optimize performance.
+
+### Execute Command
+
+When an EXECUTE command is subsequently issued, the prepared statement is parsed, analyzed, rewritten and executed. This division of labor avoids repetitive parse analysis operations, while enabling the execution plan to depend on the specific provided setting values.
+
+The following is an example of how to invoke a PREPARE and then an EXECUTE statement in a Java application.
+
+```java
+conn = DriverManager.getConnection(connectionUrl, connectionUser, connectionPassword);
+
+// Example 1: PREPARE without bind settings
+String query = "SELECT * FROM getusers";
+PreparedStatement prepStmt1 = conn.prepareStatement(query);
+ResultSet rs1 = prepStmt1.executeQuery();
+while (rs1.next()) {…}
+
+// Example 2: PREPARE with bind settings
+String sqlStmt = "SELECT * FROM employees where first_name=? 
+ and last_name like ?";
+PreparedStatement prepStmt2 = conn.prepareStatement(sqlStmt);
+prepStmt2.setString(1, "Mark"); // first name "Mark"
+prepStmt2.setString(2, "%n%"); // last name contains a letter "n"
+ResultSet rs2 = prepStmt2.executeQuery();
+while (rs2.next()) {…}
+```
+
+The following describes the supported and unsupported features of MOT compilation.
+
+### Supported Queries for Lite Execution
+
+The following query types are suitable for lite execution -
+
+- Simple point queries -
+  - SELECT (including SELECT for UPDATE)
+  - UPDATE
+  - DELETE
+- INSERT query
+- Range UPDATE queries that refer to a full prefix of the primary key
+- Range SELECT queries that refer to a full prefix of the primary key
+- JOIN queries where one or both parts collapse to a point query
+- JOIN queries that refer to a full prefix of the primary key in each joined table
+
+### Unsupported Queries for Lite Execution
+
+Any special query attribute disqualifies a query for Lite Execution. In particular, if any of the following conditions apply, then the query is declared as unsuitable for Lite Execution. You may refer to the Unsupported Queries for Native Compilation and Lite Execution section for more information.
+
+It is important to emphasize that if a query statement does not fit native compilation and lite execution, no error is reported to the client and the query is still executed in a normal and standard manner.
+
+For more information about MOT native compilation capabilities, see either the section about Query Native Compilation or the more detailed information in the Query Native Compilation (JIT) section.
+
+## Retrying an Aborted Transaction
+
+In Optimistic Concurrency Control (OCC), such as the one used by MOT, no locks are placed on a record during a transaction (using any isolation level) until the COMMIT phase. This is a powerful advantage that significantly increases performance. Its drawback is that an update may fail if another session attempts to update the same record, which results in the entire transaction being aborted. These so-called *Update Conflicts* are detected by MOT at commit time by a version checking mechanism.
+
+> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** A similar abort happens on engines using pessimistic concurrency control, such as standard PG and the MogDB disk-based tables, when the SERIALIZABLE or REPEATABLE-READ isolation levels are used.
+
+Such update conflicts are quite rare in common OLTP scenarios and are especially rare in our experience with MOT. However, because there is still a chance that they may happen, developers should consider resolving this issue using transaction retry code.
+
+The following describes how to retry a table command after multiple sessions attempt to update the same table simultaneously. You may refer to the OCC vs 2PL Differences by Example section for more detailed information. The following example is taken from the TPC-C payment transaction. 
+ +```java +int commitAborts = 0; + +while (commitAborts < RETRY_LIMIT) { + + try { + stmt =db.stmtPaymentUpdateDistrict; + stmt.setDouble(1, 100); + stmt.setInt(2, 1); + stmt.setInt(3, 1); + stmt.executeUpdate(); + + db.commit(); + + break; + } + catch (SQLException se) { + if(se != null && se.getMessage().contains("could not serialize access due to concurrent update")) { + log.error("commmit abort = " + se.getMessage()); + commitAborts++; + continue; + }else { + db.rollback(); + } + + break; + } +} +``` + +## MOT External Support Tools + +The following external MogDB tools have been modified in order to support MOT. Make sure to use the most recent version of each. An overview describing MOT-related usage is provided below. For a full description of these tools and their usage, refer to the MogDB Tools Reference document. + +### gs_ctl (Full and Incremental) + +This tool is used to create a standby server from a primary server, as well as to synchronize a server with another copy of the same server after their timelines have diverged. + +At the end of the operation, the latest MOT checkpoint is fetched by the tool, taking into consideration the **checkpoint_dir** configuration setting value. + +The checkpoint is fetched from the source server's **checkpoint_dir** to the destination server's **checkpoint_dir**. + +Currently, MOT does not support an incremental checkpoint. Therefore, the gs_ctl incremental build does not work in an incremental manner for MOT, but rather in FULL mode. The Postgres (disk-tables) incremental build can still be done incrementally. + +### gs_basebackup + +gs_basebackup is used to prepare base backups of a running server, without affecting other database clients. + +The MOT checkpoint is fetched at the end of the operation as well. However, the checkpoint's location is taken from **checkpoint_dir** in the source server and is transferred to the data directory of the source in order to back it up correctly. + +### gs_dump + +gs_dump is used to export the database schema and data to a file. It also supports MOT tables. + +### gs_restore + +gs_restore is used to import the database schema and data from a file. It also supports MOT tables. + +## MOT SQL Coverage and Limitations + +MOT design enables almost complete coverage of SQL and future feature sets. For example, standard Postgres SQL is mostly supported, as well common database features, such as stored procedures and user defined functions. + +The following describes the various types of SQL coverages and limitations - + +### Unsupported Features + +The following features are not supported by MOT - + +- Engine Interop - No cross-engine (Disk+MOT) queries, views or transactions. Planned for 2021. +- MVCC, Isolation - No snapshot/serializable isolation. Planned for 2021. +- Native Compilation (JIT) - Limited SQL coverage. Also, JIT compilation of stored procedures is not supported. +- LOCAL memory is limited to 1 GB. A transaction can only change data of less than 1 GB. +- Capacity (Data+Index) is limited to available memory. Anti-caching + Data Tiering will be available in the future. +- No full-text search index. +- Do not support Logical copy. + +In addition, the following are detailed lists of various general limitations of MOT tables, MOT indexes, Query and DML syntax and the features and limitations of Query Native Compilation. 
+ +### MOT Table Limitations + +The following lists the functionality limitations of MOT tables - + +- Partition by range +- AES encryption +- Stream operations +- User-defined types +- Sub-transactions +- DML triggers +- DDL triggers +- Collations other than "C" or "POSIX" + +### Unsupported Table DDLs + +- Alter table +- Create table, like including +- Create table as select +- Partition by range +- Create table with no-logging clause +- DEFERRABLE primary key +- Reindex +- Tablespace +- Create schema with subcommands + +### Unsupported Data Types + +- UUID +- User-Defined Type (UDF) +- Array data type +- NVARCHAR2(n) +- Clob +- Name +- Blob +- Raw +- Path +- Circle +- Reltime +- Bit varying(10) +- Tsvector +- Tsquery +- JSON +- Box +- Text +- Line +- Point +- LSEG +- POLYGON +- INET +- CIDR +- MACADDR +- Smalldatetime +- BYTEA +- Bit +- Varbit +- OID +- Money +- Any unlimited varchar/character varying +- HSTORE + +### UnsupportedIndex DDLs and Index + +- Create index on decimal/numeric + +- Create index on nullable columns + +- Create index, index per table > 9 + +- Create index on key size > 256 + + The key size includes the column size in bytes + a column additional size, which is an overhead required to maintain the index. The below table lists the column additional size for different column types. + + Additionally, in case of non-unique indexes an extra 8 bytes is required. + + Thus, the following pseudo code calculates the **key size**: + + ```java + keySize =0; + + for each (column in index){ + keySize += (columnSize + columnAddSize); + } + if (index is non_unique) { + keySize += 8; + } + ``` + + | Column Type | Column Size | Column Additional Size | + | :---------- | :---------- | :--------------------- | + | varchar | N | 4 | + | tinyint | 1 | 1 | + | smallint | 2 | 1 | + | int | 4 | 1 | + | bigint | 8 | 1 | + | float | 4 | 2 | + | float8 | 8 | 3 | + + Types that are not specified in above table, the column additional size is zero (for instance timestamp). + +### Unsupported DMLs + +- Merge into +- Select into +- Lock table +- Copy from table +- Upsert + +### Unsupported Queries for Native Compilation and Lite Execution + +- The query refers to more than two tables +- The query has any one of the following attributes - + - Aggregation on non-primitive types + - Window functions + - Sub-query sub-links + - Distinct-ON modifier (distinct clause is from DISTINCT ON) + - Recursive (WITH RECURSIVE was specified) + - Modifying CTE (has INSERT/UPDATE/DELETE in WITH) + +In addition, the following clauses disqualify a query from lite execution - + +- Returning list +- Group By clause +- Grouping sets +- Having clause +- Windows clause +- Distinct clause +- Sort clause that does not conform to native index order +- Set operations +- Constraint dependencies diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/5-mot-administration.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/5-mot-administration.md new file mode 100644 index 0000000000000000000000000000000000000000..5b2171fe07ada57ec189d6b63e0a420bb9c4c604 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/5-mot-administration.md @@ -0,0 +1,419 @@ +--- +title: MOT Administration +summary: MOT Administration +author: Zhang Cuiping +date: 2021-03-04 +--- + +# MOT Administration + +The following describes various MOT administration topics. + +## MOT Durability + +Durability refers to long-term data protection (also known as *disk persistence*). 
Durability means that stored data does not suffer from any kind of degradation or corruption, so that data is never lost or compromised. Durability ensures that data and the MOT engine are restored to a consistent state after a planned shutdown (for example, for maintenance) or an unplanned crash (for example, a power failure).
+
+Memory storage is volatile, meaning that it requires power to maintain the stored information. Disk storage, on the other hand, is non-volatile, meaning that it does not require power to maintain stored information, thus, it can survive a power shutdown. MOT uses both types of storage - it has all data in memory, while persisting transactional changes to disk in the **WAL Redo Log** and maintaining frequent periodic **MOT Checkpoints** in order to ensure data recovery in case of shutdown.
+
+The user must ensure sufficient disk space for the logging and Checkpointing operations. A separate drive can be used for the Checkpoint to improve performance by reducing disk I/O load.
+
+You may refer to the **MOT Key Technologies** section for an overview of how durability is implemented in the MOT engine.
+
+**To configure durability -**
+
+To ensure strict consistency, configure the synchronous_commit parameter to **On** in the postgresql.conf configuration file.
+
+**MOT's WAL Redo Log and Checkpoints enable durability, as described below -**
+
+### MOT Logging - WAL Redo Log
+
+To ensure durability, MOT is fully integrated with MogDB's Write-Ahead Logging (WAL) mechanism, so that MOT persists data in WAL records using MogDB's XLOG interface. This means that every addition, update and deletion to an MOT table's record is recorded as an entry in the WAL. This ensures that the most current data state can be regenerated and recovered from this non-volatile log. For example, if three new rows were added to a table, two were deleted and one was updated, then six entries would be recorded in the log.
+
+MOT log records are written to the same WAL as the other records of MogDB disk-based tables.
+
+MOT only logs an operation at the transaction commit phase.
+
+MOT only logs the updated delta record in order to minimize the amount of data written to disk.
+
+During recovery, data is loaded from the last known or a specific Checkpoint, and then the WAL Redo Log is used to complete the data changes that occurred from that point forward.
+
+The WAL (Redo Log) retains all the table row modifications until a Checkpoint is performed (as described above). The log can then be truncated in order to reduce recovery time and to save disk space.
+
+**Note** - In order to ensure that the log IO device does not become a bottleneck, the log file must be placed on a drive that has low latency.
+
+### MOT Logging Types
+
+Two synchronous transaction logging options and one asynchronous transaction logging option are supported (these are also supported by the standard MogDB disk engine). MOT also supports synchronous Group Commit logging with NUMA-awareness optimization, as described below.
+
+According to your configuration, one of the following types of logging is implemented -
+
+- **Synchronous Redo Logging**
+
+  The **Synchronous Redo Logging** option is the simplest and most strict redo logger. When a transaction is committed by a client application, the transaction redo entries are recorded in the WAL (Redo Log), as follows -
+
+  1. While a transaction is in progress, it is stored in the MOT's memory.
+  2. 
After a transaction finishes and the client application sends a Commit command, the transaction is locked and then written to the WAL Redo Log on the disk. This means that while the transaction log entries are being written to the log, the client application is still waiting for a response.
+  3. As soon as the transaction's entire buffer is written to the log, the changes to the data in memory take place and then the transaction is committed. After the transaction has been committed, the client application is notified that the transaction is complete.
+
+  **Summary**
+
+  The **Synchronous Redo Logging** option is the safest and most strict because it ensures total synchronization of the client application and the WAL Redo log entries for each transaction as it is committed, thus ensuring total durability and consistency with absolutely no data loss. This logging option prevents the situation where a client application might mark a transaction as successful, when it has not yet been persisted to disk.
+
+  The downside of the **Synchronous Redo Logging** option is that it is the slowest logging mechanism of the three options. This is because a client application must wait until all data is written to disk and because of the frequent disk writes (which typically slow down the database).
+
+- **Group Synchronous Redo Logging**
+
+  The **Group Synchronous Redo Logging** option is very similar to the **Synchronous Redo Logging** option, because it also ensures total durability with absolutely no data loss and total synchronization of the client application and the WAL (Redo Log) entries. The difference is that the **Group Synchronous Redo Logging** option writes *groups of transaction redo entries* to the WAL Redo Log on the disk at the same time, instead of writing each and every transaction as it is committed. Using Group Synchronous Redo Logging reduces the amount of disk I/Os and thus improves performance, especially when running a heavy workload.
+
+  The MOT engine performs synchronous Group Commit logging with Non-Uniform Memory Access (NUMA)-awareness optimization by automatically grouping transactions according to the NUMA socket of the core on which the transaction is running.
+
+  You may refer to the **NUMA Awareness Allocation and Affinity** section for more information about NUMA-aware memory access.
+
+  When a transaction commits, a group of entries is recorded in the WAL Redo Log, as follows -
+
+  1. While a transaction is in progress, it is stored in the memory. The MOT engine groups transactions in buckets according to the NUMA socket of the core on which the transaction is running. This means that all the transactions running on the same socket are grouped together and that multiple groups will be filling in parallel according to the core on which the transaction is running.
+
+     Writing transactions to the WAL is more efficient in this manner because all the buffers from the same socket are written to disk together.
+
+     **Note** - Each thread runs on a single core/CPU which belongs to a single socket and each thread only writes to the socket of the core on which it is running.
+
+  2. After a transaction finishes and the client application sends a Commit command, the transaction redo log entries are serialized together with other transactions that belong to the same group.
+
+  3. 
After the configured criteria are fulfilled for a specific group of transactions (quantity of committed transactions or timeout period as describes in the **REDO LOG (MOT)** section), the transactions in this group are written to the WAL on the disk. This means that while these log entries are being written to the log, the client applications that issued the commit are waiting for a response. + + 4. As soon as all the transaction buffers in the NUMA-aware group have been written to the log, all the transactions in the group are performing the necessary changes to the memory store and the clients are notified that these transactions are complete. + + **Summary** + + The **Group Synchronous Redo Logging** option is a an extremely safe and strict logging option because it ensures total synchronization of the client application and the WAL Redo log entries; thus ensuring total durability and consistency with absolutely no data loss. This logging option prevents the situation where a client application might mark a transaction as successful, when it has not yet been persisted to disk. + + On one hand this option has fewer disk writes than the **Synchronous Redo Logging** option, which may mean that it is faster. The downside is that transactions are locked for longer, meaning that they are locked until after all the transactions in the same NUMA memory have been written to the WAL Redo Log on the disk. + + The benefits of using this option depend on the type of transactional workload. For example, this option benefits systems that have many transactions (and less so for systems that have few transactions, because there are few disk writes anyway). + +- **Asynchronous Redo Logging** + + The **Asynchronous Redo Logging** option is the fastest logging method, However, it does not ensure no data loss, meaning that some data that is still in the buffer and was not yet written to disk may get lost upon a power failure or database crash. When a transaction is committed by a client application, the transaction redo entries are recorded in internal buffers and written to disk at preconfigured intervals. The client application does not wait for the data being written to disk. It continues to the next transaction. This is what makes asynchronous redo logging the fastest logging method. + + When a transaction is committed by a client application, the transaction redo entries are recorded in the WAL Redo Log, as follows - + + 1. While a transaction is in progress, it is stored in the MOT's memory. + 2. After a transaction finishes and the client application sends a Commit command, the transaction redo entries are written to internal buffers, but are not yet written to disk. Then changes to the MOT data memory take place and the client application is notified that the transaction is committed. + 3. At a preconfigured interval, a redo log thread running in the background collects all the buffered redo log entries and writes them to disk. + + **Summary** + + The Asynchronous Redo Logging option is the fastest logging option because it does not require the client application to wait for data being written to disk. In addition, it groups many transactions redo entries and writes them together, thus reducing the amount of disk I/Os that slow down the MOT engine. + + The downside of the Asynchronous Redo Logging option is that it does not ensure that data will not get lost upon a crash or failure. 
Data that was committed, but was not yet written to disk, is not durable on commit and thus cannot be recovered in case of a failure. The Asynchronous Redo Logging option is most relevant for applications that are willing to sacrifice data recovery (consistency) for performance.
+
+### Configuring Logging
+
+Two synchronous transaction logging options and one asynchronous transaction logging option are supported by the standard MogDB disk engine.
+
+To configure logging -
+
+Whether synchronous or asynchronous transaction logging is performed is configured with the synchronous_commit parameter **(On = Synchronous)** in the postgresql.conf configuration file.
+
+If a synchronous mode of transaction logging has been selected (synchronous_commit = **On**, as described above), then the enable_group_commit parameter in the mot.conf configuration file determines whether the **Group Synchronous Redo Logging** option or the **Synchronous Redo Logging** option is used. For **Group Synchronous Redo Logging**, you must also define in the mot.conf file which of the following thresholds determines when a group of transactions is recorded in the WAL -
+
+- group_commit_size **-** The quantity of committed transactions in a group. For example, **16** means that when 16 transactions in the same group have been committed by a client application, then an entry is written to disk in the WAL Redo Log for all 16 transactions.
+
+- group_commit_timeout **-** A timeout period in ms. For example, **10** means that after 10 ms, an entry is written to disk in the WAL Redo Log for each of the transactions in the same group that have been committed by their client application in the last 10 ms.
+
+  > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** You may refer to the **REDO LOG (MOT)** section for more information about configuration settings.
+
+### MOT Checkpoints
+
+A Checkpoint is the point in time at which all the data of a table's rows is saved in files on persistent storage in order to create a full durable database image. It is a snapshot of the data at a specific point in time.
+
+A Checkpoint is required in order to reduce a database's recovery time by shortening the quantity of WAL (Redo Log) entries that must be replayed in order to ensure durability. Checkpoints also reduce the storage space required to keep all the log entries.
+
+If there were no Checkpoints, then in order to recover a database, all the WAL redo entries would have to be replayed from the beginning of time, which could take days or weeks depending on the quantity of records in the database. Checkpoints record the current state of the database and enable old redo entries to be discarded.
+
+Checkpoints are essential during recovery scenarios (especially for a cold start). First, the data is loaded from the last known or a specific Checkpoint, and then the WAL is used to complete the data changes that occurred since then.
+
+For example - If the same table row is modified 100 times, then 100 entries are recorded in the log. When Checkpoints are used, even if a specific table row was modified 100 times, it is recorded in the Checkpoint a single time. After the recording of a Checkpoint, recovery can be performed on the basis of that Checkpoint, and only the WAL Redo Log entries that occurred since the Checkpoint need to be replayed.
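+
+Before moving on to recovery, the following is a minimal configuration sketch that pulls together the logging settings described above. The parameter names (synchronous_commit in postgresql.conf; enable_group_commit, group_commit_size and group_commit_timeout in mot.conf) are the ones referenced in this section; the values are simply the examples used above, not recommendations, and the exact value syntax and defaults should be confirmed against the **REDO LOG (MOT)** section.
+
+```
+# postgresql.conf - On selects synchronous logging, Off selects asynchronous logging
+synchronous_commit = on
+
+# mot.conf - with synchronous logging selected, choose Group Synchronous Redo Logging
+enable_group_commit = true
+
+# A group is flushed to the WAL Redo Log when either threshold is reached
+group_commit_size = 16       # 16 committed transactions in the same group
+group_commit_timeout = 10    # or 10 ms, whichever comes first
+```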
+
+## MOT Recovery
+
+The main objective of MOT Recovery is to restore the data and the MOT engine to a consistent state after a planned shutdown (for example, for maintenance) or an unplanned crash (for example, after a power failure).
+
+MOT recovery is performed automatically with the recovery of the rest of the MogDB database and is fully integrated into the MogDB recovery process (also called a *Cold Start*).
+
+MOT recovery consists of two stages -
+
+**Checkpoint Recovery -** First, data must be recovered from the latest Checkpoint file on disk by loading it into memory rows and creating indexes.
+
+**WAL Redo Log Recovery -** Afterwards, the recent data (which was not captured in the Checkpoint) must be recovered from the WAL Redo Log by replaying records that were added to the log since the Checkpoint that was used in the Checkpoint Recovery (described above).
+
+The WAL Redo Log recovery is managed and triggered by MogDB.
+
+To configure recovery -
+
+- While WAL recovery is performed in a serial manner, the Checkpoint recovery can be configured to run in a multi-threaded manner (meaning in parallel by multiple workers).
+- Configure the **checkpoint_recovery_workers** parameter in the **mot.conf** file, which is described in the **RECOVERY (MOT)** section.
+
+## MOT Replication and High Availability
+
+Since MOT is integrated into MogDB and uses/supports its replication and high availability, both synchronous and asynchronous replication are supported out of the box.
+
+The MogDB gs_ctl tool is used for availability control and to operate the cluster. This includes gs_ctl switchover, gs_ctl failover, gs_ctl build and so on.
+
+You may refer to the MogDB Tools Reference document for more information.
+
+To configure replication and high availability, refer to the relevant MogDB documentation.
+
+## MOT Memory Management
+
+For planning and finetuning, see the **MOT Memory and Storage Planning** and **MOT Configuration Settings** sections.
+
+## MOT Vacuum
+
+Use VACUUM for garbage collection and optionally to analyze a database, as follows -
+
+- [PG]
+
+  In PostgreSQL (PG), VACUUM reclaims storage occupied by dead tuples. In normal PG operation, tuples that are deleted or that are made obsolete by an update are not physically removed from their table. They remain present until a VACUUM is done. Therefore, it is necessary to perform a VACUUM periodically, especially on frequently updated tables.
+
+- [MOT Extension]
+
+  MOT tables do not need a periodic VACUUM operation, since dead/empty tuples are re-used by new ones. MOT tables require VACUUM operations only when their size is significantly reduced and they are not expected to grow back to their original size in the near future.
+
+  For example, an application that periodically (for example, once a week) deletes most of a table's data, while the insertion of new data takes days and does not necessarily reach the same quantity of rows. In such cases, it makes sense to activate the VACUUM.
+
+  The VACUUM operation on MOT tables is always transformed into a VACUUM FULL with an exclusive table lock.
+
+- Supported Syntax and Limitations
+
+  Activation of the VACUUM operation is performed in a standard manner.
+
+  ```sql
+  VACUUM [FULL | ANALYZE] [ table ];
+  ```
+
+  Only the FULL and ANALYZE VACUUM options are supported. The VACUUM operation can only be performed on an entire MOT table.
+ + The following PG vacuum options are not supported: + + - FREEZE + - VERBOSE + - Column specification + - LAZY mode (partial table scan) + + Additionally, the following functionality is not supported + + - AUTOVACUUM + +## MOT Statistics + +Statistics are intended for performance analysis or debugging. It is uncommon to turn them ON in a production environment (by default, they are OFF). Statistics are primarily used by database developers and to a lesser degree by database users. + +There is some impact on performance, particularly on the server. Impact on the user is negligible. + +The statistics are saved in the database server log. The log is located in the data folder and named **postgresql-DATE-TIME.log**. + +Refer to **STATISTICS (MOT)** for detailed configuration options. + +## MOT Monitoring + +All syntax for monitoring of PG-based FDW tables is supported. This includes Table or Index sizes (as described below). In addition, special functions exist for monitoring MOT memory consumption, including MOT Global Memory, MOT Local Memory and a single client session. + +### Table and Index Sizes + +The size of tables and indexes can be monitored by querying pg_relation_size. + +For example + +**Data Size** + +```sql +select pg_relation_size('customer'); +``` + +**Index** + +```sql +select pg_relation_size('customer_pkey'); +``` + +### MOT GLOBAL Memory Details + +Check the size of MOT global memory, which includes primarily the data and indexes. + +```sql +select * from mot_global_memory_detail(); +``` + +Result - + +```sql +numa_node | reserved_size | used_size +----------------+----------------+------------- +-1 | 194716368896 | 25908215808 +0 | 446693376 | 446693376 +1 | 452984832 | 452984832 +2 | 452984832 | 452984832 +3 | 452984832 | 452984832 +4 | 452984832 | 452984832 +5 | 364904448 | 364904448 +6 | 301989888 | 301989888 +7 | 301989888 | 301989888 +``` + +Where - + +- -1 is the total memory. +- 0..7 are NUMA memory nodes. + +### MOT LOCAL Memory Details + +Check the size of MOT local memory, which includes session memory. + +```sql +select * from mot_local_memory_detail(); +``` + +Result - + +```sql +numa_node | reserved_size | used_size +----------------+----------------+------------- +-1 | 144703488 | 144703488 +0 | 25165824 | 25165824 +1 | 25165824 | 25165824 +2 | 18874368 | 18874368 +3 | 18874368 | 18874368 +4 | 18874368 | 18874368 +5 | 12582912 | 12582912 +6 | 12582912 | 12582912 +7 | 12582912 | 12582912 +``` + +Where - + +- -1 is the total memory. +- 0..7 are NUMA memory nodes. + +### Session Memory + +Memory for session management is taken from the MOT local memory. + +Memory usage by all active sessions (connections) is possible using the following query - + +```sql +select * from mot_session_memory_detail(); +``` + +Result - + +```sql +sessid | total_size | free_size | used_size +----------------------------------------+-----------+----------+---------- +1591175063.139755603855104 | 6291456 | 1800704 | 4490752 + +``` + +Legend - + +- **total_size -** is allocated for the session +- **free_size -** not in use +- **used_size -** In actual use + +The following query enables a DBA to determine the state of local memory used by the current session - + +```sql +select * from mot_session_memory_detail() + where sessid = pg_current_sessionid(); +``` + +Result - + +![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-administration-1.png) + +## MOT Error Messages + +Errors may be caused by a variety of scenarios. 
All errors are logged in the database server log file. In addition, user-related errors are returned to the user as part of the response to the query, transaction or stored procedure execution or to a database administration action.
+
+- Errors reported in the Server log include - Function, Entity, Context, Error message, Error description and Severity.
+- Errors reported to users are translated into standard PostgreSQL error codes and may consist of an MOT-specific message and description.
+
+The following lists the error messages, error descriptions and error codes. The error code is actually an internal code and is not logged or returned to users.
+
+### Errors Written to the Log File
+
+All errors are logged in the database server log file. The following lists the errors that are written to the database server log file and are **not** returned to the user. The log is located in the data folder and named **postgresql-DATE-TIME.log**.
+
+**Table 1** Errors Written Only to the Log File
+
+| Message in the Log | Error Internal Code |
+| :---------------------------------- | :------------------------------- |
+| Error code denoting success | MOT_NO_ERROR 0 |
+| Out of memory | MOT_ERROR_OOM 1 |
+| Invalid configuration | MOT_ERROR_INVALID_CFG 2 |
+| Invalid argument passed to function | MOT_ERROR_INVALID_ARG 3 |
+| System call failed | MOT_ERROR_SYSTEM_FAILURE 4 |
+| Resource limit reached | MOT_ERROR_RESOURCE_LIMIT 5 |
+| Internal logic error | MOT_ERROR_INTERNAL 6 |
+| Resource unavailable | MOT_ERROR_RESOURCE_UNAVAILABLE 7 |
+| Unique violation | MOT_ERROR_UNIQUE_VIOLATION 8 |
+| Invalid memory allocation size | MOT_ERROR_INVALID_MEMORY_SIZE 9 |
+| Index out of range | MOT_ERROR_INDEX_OUT_OF_RANGE 10 |
+| Error code unknown | MOT_ERROR_INVALID_STATE 11 |
+
+### Errors Returned to the User
+
+The following lists the errors that are written to the database server log file and are returned to the user.
+
+MOT returns PG standard error codes to the envelope using a Return Code (RC). Some RCs cause the generation of an error message to the user who is interacting with the database.
+
+The PG code (described below) is returned internally by MOT to the database envelope, which reacts to it according to standard PG behavior.
+
+> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** %s, %u and %lu in the message are replaced by relevant error information, such as a query, table name or other information. - %s - String - %u - Number - %lu - Number
+
+**Table 2** Errors Returned to the User and Logged to the Log File
+
+| Short and Long Description Returned to the User | PG Code | Internal Error Code |
+| :---------------------------------------------------- | :------------------------------ | :------------------------------ |
+| Success. Denotes success. | ERRCODE_SUCCESSFUL_COMPLETION | RC_OK = 0 |
+| Failure. Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_ERROR = 1 |
+| Unknown error has occurred. Denotes aborted operation. | ERRCODE_FDW_ERROR | RC_ABORT |
+| Column definition of %s is not supported. Column type %s is not supported yet. | ERRCODE_INVALID_COLUMN_DEFINITION | RC_UNSUPPORTED_COL_TYPE |
+| Column definition of %s is not supported. Column type Array of %s is not supported yet. | ERRCODE_INVALID_COLUMN_DEFINITION | RC_UNSUPPORTED_COL_TYPE_ARR |
+| Column size %d exceeds max tuple size %u. Column definition of %s is not supported. 
| ERRCODE_FEATURE_NOT_SUPPORTED | RC_EXCEEDS_MAX_ROW_SIZE | +| Column name %s exceeds max name size %u.Column definition of %s is not supported. | ERRCODE_INVALID_COLUMN_DEFINITION | RC_COL_NAME_EXCEEDS_MAX_SIZE | +| Column size %d exceeds max size %u.Column definition of %s is not supported. | ERRCODE_INVALID_COLUMN_DEFINITION | RC_COL_SIZE_INVLALID | +| Cannot create table.Cannot add column %s; as the number of declared columns exceeds the maximum declared columns. | ERRCODE_FEATURE_NOT_SUPPORTED | RC_TABLE_EXCEEDS_MAX_DECLARED_COLS | +| Cannot create index.Total column size is greater than maximum index size %u. | ERRCODE_FDW_KEY_SIZE_EXCEEDS_MAX_ALLOWED | RC_INDEX_EXCEEDS_MAX_SIZE | +| Cannot create index.Total number of indexes for table %s is greater than the maximum number of indexes allowed %u. | ERRCODE_FDW_TOO_MANY_INDEXES | RC_TABLE_EXCEEDS_MAX_INDEXES | +| Cannot execute statement.Maximum number of DDLs per transaction reached the maximum %u. | ERRCODE_FDW_TOO_MANY_DDL_CHANGES_IN_TRANSACTION_NOT_ALLOWED | RC_TXN_EXCEEDS_MAX_DDLS | +| Unique constraint violationDuplicate key value violates unique constraint \"%s\"".Key %s already exists. | ERRCODE_UNIQUE_VIOLATION | RC_UNIQUE_VIOLATION | +| Table \"%s\" does not exist. | ERRCODE_UNDEFINED_TABLE | RC_TABLE_NOT_FOUND | +| Index \"%s\" does not exist. | ERRCODE_UNDEFINED_TABLE | RC_INDEX_NOT_FOUND | +| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_LOCAL_ROW_FOUND | +| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_LOCAL_ROW_NOT_FOUND | +| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_LOCAL_ROW_DELETED | +| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_INSERT_ON_EXIST | +| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_INDEX_RETRY_INSERT | +| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_INDEX_DELETE | +| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_LOCAL_ROW_NOT_VISIBLE | +| Memory is temporarily unavailable. | ERRCODE_OUT_OF_LOGICAL_MEMORY | RC_MEMORY_ALLOCATION_ERROR | +| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_ILLEGAL_ROW_STATE | +| Null constraint violated.NULL value cannot be inserted into non-null column %s at table %s. | ERRCODE_FDW_ERROR | RC_NULL_VIOLATION | +| Critical error.Critical error: %s. | ERRCODE_FDW_ERROR | RC_PANIC | +| A checkpoint is in progress - cannot truncate table. | ERRCODE_FDW_OPERATION_NOT_SUPPORTED | RC_NA | +| Unknown error has occurred. | ERRCODE_FDW_ERROR | RC_MAX_VALUE | +| <recovery message> | - | ERRCODE_CONFIG_FILE_ERROR | +| <recovery message> | - | ERRCODE_INVALID_TABLE_DEFINITION | +| Memory engine - Failed to perform commit prepared. | - | ERRCODE_INVALID_TRANSACTION_STATE | +| Invalid option <option name> | - | ERRCODE_FDW_INVALID_OPTION_NAME | +| Invalid memory allocation request size. | - | ERRCODE_INVALID_PARAMETER_VALUE | +| Memory is temporarily unavailable. | - | ERRCODE_OUT_OF_LOGICAL_MEMORY | +| Could not serialize access due to concurrent update. | - | ERRCODE_T_R_SERIALIZATION_FAILURE | +| Alter table operation is not supported for memory table.Cannot create MOT tables while incremental checkpoint is enabled.Re-index is not supported for memory tables. | - | ERRCODE_FDW_OPERATION_NOT_SUPPORTED | +| Allocation of table metadata failed. | - | ERRCODE_OUT_OF_MEMORY | +| Database with OID %u does not exist. | - | ERRCODE_UNDEFINED_DATABASE | +| Value exceeds maximum precision: %d. | - | ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE | +| You have reached a maximum logical capacity %lu of allowed %lu. 
| - | ERRCODE_OUT_OF_LOGICAL_MEMORY | diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/6-mot-sample-tpcc-benchmark.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/6-mot-sample-tpcc-benchmark.md new file mode 100644 index 0000000000000000000000000000000000000000..e0fb87fe182d30dd8e48c27573f27c42bab06efc --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/2-using-mot/6-mot-sample-tpcc-benchmark.md @@ -0,0 +1,116 @@ +--- +title: MOT Sample TPC-C Benchmark +summary: MOT Sample TPC-C Benchmark +author: Zhang Cuiping +date: 2021-03-04 +--- + +# MOT Sample TPC-C Benchmark + +## TPC-C Introduction + +The TPC-C Benchmark is an industry standard benchmark for measuring the performance of Online Transaction Processing (OLTP) systems. It is based on a complex database and a number of different transaction types that are executed on it. TPC-C is both a hardware-independent and a software-independent benchmark and can thus be run on every test platform. An official overview of the benchmark model can be found at the tpc.org website here - . + +The database consists of nine tables of various structures and thus also nine types of data records. The size and quantity of the data records varies per table. A mix of five concurrent transactions of varying types and complexities is executed on the database, which are largely online or in part queued for deferred batch processing. Because these tables compete for limited system resources, many system components are stressed and data changes are executed in a variety of ways. + +**Table 1** TPC-C Database Structure + +| Table | Number of Entries | +| :--------- | :--------------------------------------- | +| Warehouse | n | +| Item | 100,000 | +| Stock | n x 100,000 | +| District | n x 10 | +| Customer | 3,000 per district, 30,000 per warehouse | +| Order | Number of customers (initial value) | +| New order | 30% of the orders (initial value) | +| Order line | ~ 10 per order | +| History | Number of customers (initial value) | + +The transaction mix represents the complete business processing of an order - from its entry through to its delivery. More specifically, the provided mix is designed to produce an equal number of new-order transactions and payment transactions and to produce a single delivery transaction, a single order-status transaction and a single stock-level transaction for every ten new-order transactions. + +**Table 2** TPC-C Transactions Ratio + +| Transaction Level ≥ 4% | Share of All Transactions | +| :--------------------- | :------------------------ | +| TPC-C New order | ≤ 45% | +| Payment | ≥ 43% | +| Order status | ≥ 4% | +| Delivery | ≥ 4% (batch) | +| Stock level | ≥ 4% | + +There are two ways to execute the transactions - **as stored procedures** (which allow higher throughput) and in **standard interactive SQL mode**. + +**Performance Metric - tpm-C** + +The tpm-C metric is the number of new-order transactions executed per minute. Given the required mix and a wide range of complexity and types among the transactions, this metric most closely simulates a comprehensive business activity, not just one or two transactions or computer operations. For this reason, the tpm-C metric is considered to be a measure of business throughput. + +The tpm-C unit of measure is expressed as transactions-per-minute-C, whereas "C" stands for TPC-C specific benchmark. 
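+
+For example, in a hypothetical two-minute benchmark run that commits 120,000 transactions in total with the required mix, roughly 45% of them (about 54,000) are new-order transactions, which yields a score of approximately 27,000 tpm-C.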
+ +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The official TPC-C Benchmark specification can be accessed at - . Some of the rules of this specification are generally not fulfilled in the industry, because they are too strict for industry reality. For example, Scaling rules - (a) tpm-C / Warehouse must be >9 and <12.86 (implying that a very high warehouses rate is required in order to achieve a high tpm-C rate, which also means that an extremely large database and memory capacity are required); and (b) 10x terminals x Warehouses (implying a huge quantity of simulated clients). + +## System-Level Optimization + +Follow the instructions in the **MOT Server Optimization - x86** section. The following section describes the key system-level optimizations for deploying the MogDB database on a Huawei Taishan server and on a Euler 2.8 operating system for ultimate performance. + +## BenchmarkSQL - An Open-Source TPC-C Tool + +For example, to test TPCC, the **BenchmarkSQL** can be used, as follows - + +- Download **benchmarksql** from the following link - +- The schema creation scripts in the **benchmarksql** tool need to be adjusted to MOT syntax and unsupported DDLs need to be avoided. The adjusted scripts can be directly downloaded from the following link - . The contents of this tar file includes sql.common.mogdb.mot folder and jTPCCTData.java file as well as a sample configuration file postgresql.conf and a TPCC properties file props.mot for reference. +- Place the sql.common.mogdb.mot folder in the same level as sql.common under run folder and replace the file src/client/jTPCCTData.java with the downloaded java file. +- Edit the file runDatabaseBuild.sh under run folder to remove **extraHistID** from **AFTER_LOAD** list to avoid unsupported alter table DDL. +- Replace the JDBC driver under lib/postgres folder with the MogDB JDBC driver available from the following link - . + +The only change done in the downloaded java file (compared to the original one) was to comment the error log printing for serialization and duplicate key errors. These errors are normal in case of MOT, since it uses Optimistic Concurrency Control (OCC) mechanism. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The benchmark test is executed using a standard interactive SQL mode without stored procedures. + +## Running the Benchmark + +Anyone can run the benchmark by starting up the server and running the **benchmarksql** scripts. + +To run the benchmark - + +1. Go to the **benchmarksql** run folder and rename sql.common to sql.common.orig. +2. Create a link sql.common to sql.common.mogdb.mot in order to test MOT. +3. Start up the database server. +4. Configure the props.pg file in the client. +5. Run the benchmark. + +## Results Report + +- Results in CLI + + BenchmarkSQL results should appear as follows - + + ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-sample-tpcc-benchmark-1.jpg) + + Over time, the benchmark measures and averages the committed transactions. The example above benchmarks for two minutes. + + The score is **2.71M tmp-C** (new-orders per-minute), which is 45% of the total committed transactions, meaning the **tpmTOTAL**. 
+ +- Detailed Result Report + + The following is an example of a detailed result report - + + **Figure 1** Detailed Result Report + + ![detailed-result-report](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-sample-tpcc-benchmark-2.png) + + ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-sample-tpcc-benchmark-3.png) + + BenchmarkSQL collects detailed performance statistics and operating system performance data (if configured). + + This information can show the latency of the queries, and thus expose bottlenecks related to storage/network/CPU. + +- Results of TPC-C of MOT on Huawei Taishan 2480 + + Our TPC-C benchmark dated 01-May-2020 with an MogDB database installed on Taishan 2480 server (a 4-socket ARM/Kunpeng server), achieved a throughput of 4.79M tpm-C. + + A near linear scalability was demonstrated, as shown below - + + **Figure 2** Results of TPC-C of MOT on Huawei Taishan 2480 + + ![results-of-tpc-c-of-mot-on-huawei-taishan-2480](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-sample-tpcc-benchmark-4.png) diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-1.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-1.md new file mode 100644 index 0000000000000000000000000000000000000000..3b0a39a439eda8d332f6527826490c6da449b2f3 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-1.md @@ -0,0 +1,90 @@ +--- +title: MOT Scale-up Architecture +summary: MOT Scale-up Architecture +author: Zhang Cuiping +date: 2021-03-04 +--- + +# MOT Scale-up Architecture + +To **scale up** means to add additional cores to the *same machine* in order to add computing power. To scale up refers to the most common traditional form of adding computing power in a machine that has a single pair of controllers and multiple cores. Scale-up architecture is limited by the scalability limits of a machine’s controller. + +## Technical Requirements + +MOT has been designed to achieve the following - + +- **Linear Scale-up -** MOT delivers a transactional storage engine that utilizes all the cores of a single NUMA architecture server in order to provide near-linear scale-up performance. This means that MOT is targeted to achieve a direct, near-linear relationship between the quantity of cores in a machine and the multiples of performance increase. + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The near-linear scale-up results achieved by MOT significantly outperform all other existing solutions, and come as close as possible to achieving optimal results, which are limited by the physical restrictions and limitations of hardware, such as wires. + +- **No Maximum Number of Cores Limitation -** MOT does not place any limits on the maximum quantity of cores. This means that MOT is scalable from a single core up to 1,000s of cores, with minimal degradation per additional core, even when crossing NUMA socket boundaries. + +- **Extremely High Transactional Throughout -** MOT delivers a transactional storage engine that can achieve extremely high transactional throughout compared with any other OLTP vendor on the market. + +- **Extremely Low Transactional Latency -** MOT delivers a transactional storage engine that can reach extremely low transactional latency compared with any other OLTP vendor on the market. 
+
+- **Seamless Integration with and Leveraging of MogDB -** MOT integrates its transactional engine in a standard and seamless manner with the MogDB product. In this way, MOT reuses maximum functionality from the MogDB layers that are situated on top of its transactional storage engine.
+
+## Design Principles
+
+To achieve the requirements described above (especially in an environment with many cores), our storage engine's architecture implements the following techniques and strategies -
+
+- **Data and indexes only reside in memory**.
+- **Data and indexes are not laid out with physical partitions** (because these might achieve lower performance for certain types of applications).
+- Transaction concurrency control is based on **Optimistic Concurrency Control (OCC)** without any centralized contention points. See the **MOT Concurrency Control Mechanism** section for more information about OCC.
+- **Parallel Redo Logs (ultimately per core)** are used to efficiently avoid a central locking point.
+- **Indexes are lock-free**. See the **MOT Indexes** section for more information about lock-free indexes.
+- **NUMA-aware memory allocation** is used to avoid cross-socket access, especially for session lifecycle objects. See the **NUMA Awareness Allocation and Affinity** section for more information about NUMA-awareness.
+- **A customized MOT memory management allocator** with pre-cached object pools is used to avoid expensive runtime allocation and extra points of contention. This dedicated MOT memory allocator makes memory allocation more efficient by pre-accessing relatively large chunks of memory from the operating system and then divvying it out to the MOT as needed.
+
+## Integration using Foreign Data Wrappers (FDW)
+
+MOT complies with and leverages MogDB's standard extensibility mechanism - Foreign Data Wrapper (FDW), as shown in the following diagram.
+
+The PostgreSQL Foreign Data Wrapper (FDW) feature enables the creation of foreign tables in an MOT database that are proxies for some other data source, such as Oracle, MySQL, PostgreSQL and so on. When a query is made on a foreign table, the FDW queries the external data source and returns the results, as if they were coming from a table in your database.
+
+MogDB relies on the PostgreSQL Foreign Data Wrapper (FDW) and index support so that SQL is entirely covered, including stored procedures, user-defined functions and system function calls.
+
+**Figure 1** MOT Architecture
+
+![mot-architecture](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-scale-up-architecture-2.png)
+
+In the diagram above, the MOT engine is represented in green, while the existing MogDB (based on Postgres) components are represented in the top part of this diagram in blue. As you can see, the Foreign Data Wrapper (FDW) mediates between the MOT engine and the MogDB components.
+
+**MOT-Related FDW Customizations**
+
+Integrating MOT through FDW enabled the reuse of the uppermost MogDB layers' functionality and therefore significantly shortened MOT's time-to-market without compromising SQL coverage. 
+ +However, the original FDW mechanism in MogDB was not designed for storage engine extensions, and therefore lacks the following essential functionalities - + +- Index awareness of foreign tables to be calculated in the query planning phase +- Complete DDL interfaces +- Complete transaction lifecycle interfaces +- Checkpoint interfaces +- Redo Log interface +- Recovery interfaces +- Vacuum interfaces + +In order to support all the missing functionalities, the SQL layer and FDW interface layer were extended to provide the necessary infrastructure in order to enable the plugging in of the MOT transactional storage engine. + +## Result - Linear Scale-up + +The following shows the results achieved by the MOT design principles and implementation described above. + +To the best of our knowledge, MOT outperforms all existing industry-grade OLTP databases in transactional throughput of ACID-compliant workloads. + +MogDB and MOT have been tested on the following many-core systems with excellent performance scalability results. The tests were performed both on x86 Intel-based and ARM/Kunpeng-based many-core servers. You may refer to the **MOT Performance Benchmarks** section for more detailed performance review. + +Our TPC-C benchmark dated June 2020 tested an MogDB MOT database on a Taishan 2480 server. A 4-socket ARM/Kunpeng server, achieved throughput of 4.8 M tpmC. The following graph shows the near-linear nature of the results, meaning that it shows a significant increase in performance correlating to the increase of the quantity of cores - + +**Figure 2** TPC-C on ARM (256 Cores) + +![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-12.png) + +The following is an additional example that shows a test on an x86-based server also showing CPU utilization. + +**Figure 3** tpmC vs CPU Usage + +![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-performance-benchmarks-18.png) + +The chart shows that MOT demonstrates a significant performance increase correlation with an increase of the quantity of cores. MOT consumes more and more of the CPU correlating to the increase of the quantity of cores. Other industry solutions do not increase and sometimes show slightly degraded performance, which is a well-known problem in the database industry that affects customers’ CAPEX and OPEX expenses and operational efficiency. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-2.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-2.md new file mode 100644 index 0000000000000000000000000000000000000000..aab31c2a03baeef5b1ebf3c8cc0b8fbbe8e77365 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-2.md @@ -0,0 +1,179 @@ +--- +title: MOT Concurrency Control Mechanism +summary: MOT Concurrency Control Mechanism +author: Zhang Cuiping +date: 2021-03-04 +--- + +# MOT Concurrency Control Mechanism + +After investing extensive research to find the best concurrency control mechanism, we concluded that SILO based on OCC is the best ACID-compliant OCC algorithm for MOT. SILO provides the best foundation for MOT's challenging requirements. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** MOT is fully Atomicity, Consistency, Isolation, Durability (ACID)-compliant, as described in the **MOT Introduction** section. 
+
+
+The following topics describe MOT's concurrency control mechanism -
+
+## MOT Local and Global Memory
+
+SILO manages both a local memory and a global memory, as shown in Figure 1.
+
+- **Global** memory is long-term memory that is shared by all cores and is used primarily to store all the table data and indexes.
+- **Local** memory is short-term memory that is used primarily by sessions for handling transactions and for storing data changes in memory that is private to the transaction until the commit phase.
+
+When a transaction change is required, SILO handles the copying of all that transaction's data from the global memory into the local memory. Minimal locks are placed on the global memory according to the OCC approach, so that contention time in the global shared memory is kept to a minimum. After the transaction's changes have been completed, this data is pushed back from the local memory to the global memory.
+
+The basic interactive transactional flow with our SILO-enhanced concurrency control is shown in the figure below -
+
+**Figure 1** Private (Local) Memory (for each transaction) and a Global Memory (for all the transactions of all the cores)
+
+![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-concurrency-control-mechanism-2.png)
+
+For more details, refer to the Industrial-Strength OLTP Using Main Memory and Many-cores paper, listed in the references of the **Comparison - Disk vs. MOT** section.
+
+## MOT SILO Enhancements
+
+SILO in its basic algorithm flow outperformed many other ACID-compliant OCCs that we tested in our research experiments. However, in order to make it a product-grade mechanism, we had to enhance it with many essential functionalities that were missing in the original design, such as -
+
+- Added support for interactive-mode transactions, where transactions run statement by statement from the client side rather than as a single step on the server side
+- Added optimistic inserts
+- Added support for non-unique indexes
+- Added support for read-after-write in transactions so that users can see their own changes before they are committed
+- Added support for lockless cooperative garbage collection
+- Added support for lockless checkpoints
+- Added support for fast recovery
+- Added support for two-phase commit in a distributed deployment
+
+Adding these enhancements without breaking the scalable characteristic of the original SILO was very challenging.
+
+## MOT Isolation Levels
+
+Even though MOT is fully ACID-compliant (as described in the section), not all isolation levels are supported in MogDB 2.1. The following table describes all isolation levels, as well as what is and what is not supported by MOT.
+
+**Table 1** Isolation Levels
+
+| Isolation Level | Description |
+| :--------------- | :----------------------------------------------------------- |
+| READ UNCOMMITTED | **Not supported by MOT.** |
+| READ COMMITTED | **Supported by MOT.**
The READ COMMITTED isolation level guarantees that any data that is read was already committed when it was read. It simply prevents the reader from seeing any intermediate, uncommitted (dirty) reads. Data may still be changed after it has been read, so READ COMMITTED does not guarantee that a transaction re-issuing the same read will find the same data. |
+| SNAPSHOT | **Not supported by MOT.**
The SNAPSHOT isolation level makes the same guarantees as SERIALIZABLE, except that concurrent transactions can modify the data. Instead, it forces every reader to see its own version of the world (its own snapshot). This makes it easy to program with and highly scalable, because it does not block concurrent updates. However, in many implementations this isolation level requires higher server resources. |
+| REPEATABLE READ | **Supported by MOT.**
REPEATABLE READ is a higher isolation level that (in addition to the guarantees of the READ COMMITTED isolation level) guarantees that any data that is read cannot change. If a transaction reads the same data again, it will find the same previously read data in place, unchanged and available to be read.
Because of the optimistic model, concurrent transactions are not prevented from updating rows read by this transaction. Instead, at commit time this transaction validates that the REPEATABLE READ isolation level has not been violated. If it has, this transaction is rolled back and must be retried. | +| SERIALIZABLE | **Not supported by MOT**.
Serializable isolation makes an even stronger guarantee. In addition to everything that the REPEATABLE READ isolation level guarantees, it also guarantees that no new data can be seen by a subsequent read.
It is named SERIALIZABLE because the isolation is so strict that it is almost like having the transactions run in series rather than concurrently. |
+
+The following table shows the concurrency side effects enabled by the different isolation levels.
+
+**Table 2** Concurrency Side Effects Enabled by Isolation Levels
+
+| Isolation Level | Dirty Read | Non-repeatable Read | Phantom |
+| :--------------- | :---------- | :------------------ | :------ |
+| READ UNCOMMITTED | Yes | Yes | Yes |
+| READ COMMITTED | No | Yes | Yes |
+| REPEATABLE READ | No | No | Yes |
+| SNAPSHOT | No | No | No |
+| SERIALIZABLE | No | No | No |
+
+In an upcoming release, MogDB MOT will also support the SNAPSHOT and SERIALIZABLE isolation levels.
+
+## MOT Optimistic Concurrency Control
+
+The Concurrency Control Module (CC Module for short) provides all the transactional requirements for the Main Memory Engine. The primary objective of the CC Module is to provide the Main Memory Engine with support for various isolation levels.
+
+### Optimistic OCC vs. Pessimistic 2PL
+
+The functional difference between Pessimistic 2PL (2-Phase Locking) and Optimistic Concurrency Control (OCC) lies in their pessimistic versus optimistic approaches to transaction integrity.
+
+Disk-based tables use a pessimistic approach, which is the most commonly used database method. The MOT Engine uses an optimistic approach.
+
+The primary functional difference between the pessimistic approach and the optimistic approach is that if a conflict occurs -
+
+- The pessimistic approach causes the client to wait.
+- The optimistic approach causes one of the transactions to fail, so that the failed transaction must be retried by the client.
+
+**Optimistic Concurrency Control Approach (Used by MOT)**
+
+The **Optimistic Concurrency Control (OCC)** approach detects conflicts as they occur, and performs validation checks at commit time.
+
+The optimistic approach has less overhead and is usually more efficient, partly because transaction conflicts are uncommon in most applications.
+
+The functional difference between the optimistic and pessimistic approaches is larger when the REPEATABLE READ isolation level is enforced and is largest for the SERIALIZABLE isolation level.
+
+**Pessimistic Approaches (Not used by MOT)**
+
+The **Pessimistic Concurrency Control** (2PL or 2-Phase Locking) approach uses locks to block potential conflicts before they occur. A lock is applied when a statement is executed and released when the transaction is committed. Disk-based row-stores use this approach (with the addition of Multi-version Concurrency Control [MVCC]).
+
+In 2PL algorithms, while a transaction is writing a row, no other transaction can access it; and while a row is being read, no other transaction can overwrite it. Each row is locked at access time for both reading and writing, and the lock is released at commit time. These algorithms require a scheme for handling and avoiding deadlock. Deadlock can be detected by calculating cycles in a wait-for graph. Deadlock can be avoided by maintaining time ordering (TSO) or by some kind of back-off scheme.
+
+**Encounter Time Locking (ETL)**
+
+Another approach is Encounter Time Locking (ETL), where reads are handled in an optimistic manner, but writes lock the data that they access. As a result, writes from different ETL transactions are aware of each other and can decide to abort. 
It has been empirically verified that ETL improves the performance of OCC in two ways - + +- First, ETL detects conflicts early on and often increases transaction throughput. This is because transactions do not perform useless operations, because conflicts discovered at commit time (in general) cannot be solved without aborting at least one transaction. +- Second, encounter-time locking Reads-After-Writes (RAW) are handled efficiently without requiring expensive or complex mechanisms. + +**Conclusion** + +OCC is the fastest option for most workloads. This finding has also been observed in our preliminary research phase. + +One of the reasons is that when every core executes multiple threads, a lock is likely to be held by a swapped thread, especially in interactive mode. Another reason is that pessimistic algorithms involve deadlock detection (which introduces overhead) and usually uses read-write locks (which are less efficient than standard spin-locks). + +We have chosen Silo because it was simpler than other existing options, such as TicToc, while maintaining the same performance for most workloads. ETL is sometimes faster than OCC, but it introduces spurious aborts which may confuse a user, in contrast to OCC which aborts only at commit. + +### OCC vs 2PL Differences by Example + +The following shows the differences between two user experiences - Pessimistic (for disk-based tables) and Optimistic (MOT tables) when sessions update the same table simultaneously. + +In this example, the following table test command is run - + +``` +table "TEST" - create table test (x int, y int, z int, primary key(x)); +``` + +This example describes two aspects of the same test - user experience (operations in the example) and retry requirements. + +**Example Pessimistic Approach - Used in Disk-based Tables** + +The following is an example of the Pessimistic approach (which is not Mot). Any Isolation Level may apply. + +The following two sessions perform a transaction that attempts to update a single table. + +A WAIT LOCK action occurs and the client experience is that session #2 is *stuck* until Session #1 has completed a COMMIT. Only afterwards, is Session #2 able to progress. + +However, when this approach is used, both sessions succeed and no abort occurs (unless SERIALIZABLE or REPEATABLE-READ isolation level is applied), which results in the entire transaction needing to be retried. + +**Table 1** Pessimistic Approach Code Example + +| | Session 1 | Session 2 | +| :--- | :------------------------------- | :----------------------------------------------------------- | +| t0 | Begin | Begin | +| t1 | update test set y=200 where x=1; | | +| t2 | y=200 | Update test set y=300 where x=1; - Wait on lock | +| t4 | Commit | | +| | | Unlock | +| | | Commit(in READ-COMMITTED this will succeed, in SERIALIZABLE it will fail) | +| | | y = 300 | + +**Example Optimistic Approach - Used in MOT** + +The following is an example of the Optimistic approach. + +It describes the situation of creating an MOT table and then having two concurrent sessions updating that same MOT table simultaneously - + +``` +create foreign table test (x int, y int, z int, primary key(x)); +``` + +- The advantage of OCC is that there are no locks until COMMIT. +- The disadvantage of using OCC is that the update may fail if another session updates the same record. If the update fails (in all supported isolation levels), an entire SESSION #2 transaction must be retried. 
+- Update conflicts are detected by the kernel at commit time by using a version checking mechanism. +- SESSION #2 will not wait in its update operation and will be aborted because of conflict detection at commit phase. + +**Table 2** Optimistic Approach Code Example - Used in MOT + +| | Session 1 | Session 2 | +| :--- | :------------------------------- | :------------------------------- | +| t0 | Begin | Begin | +| t1 | update test set y=200 where x=1; | | +| t2 | y=200 | Update test set y=300 where x=1; | +| t4 | Commit | y = 300 | +| | | Commit | +| | | ABORT | +| | | y = 200 | diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-3.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-3.md new file mode 100644 index 0000000000000000000000000000000000000000..3af831038f8bfda7238c36d37d0f6be5afea3f97 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-3.md @@ -0,0 +1,59 @@ +--- +title: Extended FDW and Other MogDB Features +summary: Extended FDW and Other MogDB Features +author: Zhang Cuiping +date: 2021-03-04 +--- + +# Extended FDW and Other MogDB Features + +MogDB is based on PostgreSQL, which does not have a built-in storage engine adapter, such as MySQL handlerton. To enable the integration of the MOT storage engine into MogDB, we have leveraged and extended the existing Foreign Data Wrapper (FDW) mechanism. With the introduction of FDW into PostgreSQL 9.1, externally managed databases can now be accessed in a way that presents these foreign tables and data sources as united, locally accessible relations. + +In contrast, the MOT storage engine is embedded inside MogDB and its tables are managed by it. Access to tables is controlled by the MogDB planner and executor. MOT gets logging and checkpointing services from MogDB and participates in the MogDB recovery process in addition to other processes. + +We refer to all the components that are in use or are accessing the MOT storage engine as the *Envelope*. + +The following figure shows how the MOT storage engine is embedded inside MogDB and its bi-directional access to database functionality. + +**Figure 1** MOT Storage Engine Embedded inside MogDB - FDW Access to External Databases + +![mot-architecture](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-scale-up-architecture-2.png) + +We have extended the capabilities of FDW by extending and modifying the FdwRoutine structure in order to introduce features and calls that were not required before the introduction of MOT. For example, support for The following new features was added - Add Index, Drop Index/Table, Truncate, Vacuum and Table/Index Memory Statistics. A significant emphasis was put on integration with MogDB logging, replication and checkpointing mechanisms in order to provide consistency for cross-table transactions through failures. In this case, the MOT itself sometimes initiates calls to MogDB functionality through the FDW layer. + +## Creating Tables and Indexes + +In order to support the creation of MOT tables, standard FDW syntax was reused. + +For example, create FOREIGN table. + +The MOT FDW mechanism passes the instruction to the MOT storage engine for actual table creation. Similarly, we support index creation (create index …). This feature was not previously available in FDW, because it was not needed since its tables are managed externally. 
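+
+As a brief illustration of the reused syntax (a sketch only - the table and index names below are hypothetical):
+
+```
+-- An MOT table is declared using the standard foreign-table syntax
+create foreign table orders (id int, customer int, amount int, primary key (id));
+-- An index on the MOT table is created using the regular create index syntax
+create index orders_customer_idx on orders (customer);
+```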
+ +To support both in MOT FDW, the **ValidateTableDef** function actually creates the specified table. It also handles the index creation of that relation, as well as DROP TABLE and DROP INDEX, in addition to VACUUM and ALTER TABLE, which were not previously supported in FDW. + +## Index Usage for Planning and Execution + +A query has two phases - **Planning** and **Execution**. During the Planning phase (which may take place once per multiple executions), the best index for the scan is chosen. This choice is made based on the matching query's WHERE clauses, JOIN clauses and ORDER BY conditions. During execution, a query iterates over the relevant table rows and performs various tasks, such as update or delete, per iteration. An insert is a special case where the table adds the row to all indexes and no scanning is required. + +- **Planner -** In standard FDW, a query is passed for execution to a foreign data source. This means that index filtering and the actual planning (such as the choice of indexes) is not performed locally in the database, rather it is performed in the external data source. Internally, the FDW returns a general plan to the database planner. MOT tables are handled in a similar manner as disk tables. This means that relevant MOT indexes are filtered and matched, and the indexes that minimize the set of traversed rows are selected and are added to the plan. +- **Executor -** The Query Executor uses the chosen MOT index in order to iterate over the relevant rows of the table. Each row is inspected by the MogDB envelope, and according to the query conditions, an update or delete is called to handle the relevant row. + +## Durability, Replication and High Availability + +A storage engine is responsible for storing, reading, updating and deleting data in the underlying memory and storage systems. The logging, checkpointing and recovery are not handled by the storage engine, especially because some transactions encompass multiple tables with different storage engines. Therefore, in order to persist and replicate data, the high-availability facilities from the MogDB envelope are used as follows - + +- **Durability -** In order to ensure Durability, the MOT engine persists data by Write-Ahead Logging (WAL) records using the MogDB's XLOG interface. This also provides the benefits of MogDB's replication capabilities that use the same APIs. You may refer to the **MOT Durability Concepts** for more information. +- **Checkpointing -** A MOT Checkpoint is enabled by registering a callback to the MogDB Checkpointer. Whenever a general database Checkpoint is performed, the MOT Checkpoint process is called as well. MOT keeps the Checkpoint's Log Sequence Number (LSN) in order to be aligned with MogDB recovery. The MOT Checkpointing algorithm is highly optimized and asynchronous and does not stop concurrent transactions. You may refer to the **MOT Checkpoint Concepts** for more information. +- **Recovery -** Upon startup, MogDB first calls an MOT callback that recovers the MOT Checkpoint by loading into memory rows and creating indexes, followed by the execution of the WAL recovery by replaying records according to the Checkpoint's LSN. The MOT Checkpoint is recovered in parallel using multiple threads - each thread reads a different data segment. This makes MOT Checkpoint recovery quite fast on many-core hardware, though it is still potentially slower compared to disk-based tables where only WAL records are replayed. You may refer to the **MOT Recovery Concepts** for more information. 
+ +## VACUUM and DROP + +In order to maximize MOT functionality, we added support for VACUUM, DROP TABLE and DROP INDEX. All three execute with an exclusive table lock, meaning without allowing concurrent transactions on the table. The system VACUUM calls a new FDW function to perform the MOT vacuuming, while DROP was added to the ValidateTableDef() function. + +## Deleting Memory Pools + +Each index and table tracks all the memory pools that it uses. A DROP INDEX command is used to remove metadata. Memory pools are deleted as a single consecutive block. The MOT VACUUM only compacts used memory, because memory reclamation is performed continuously in the background by the epoch-based Garbage Collector (GC). In order to perform the compaction, we switch the index or the table to new memory pools, traverse all the live data, delete each row and insert it using the new pools and finally delete the pools as is done for a drop. + +## Query Native Compilation (JIT) + +The FDW adapter to MOT engine also contains a lite execution path that employs Just-In-Time (JIT) compiled query execution using the LLVM compiler. More information about MOT Query Native Compilation can be found in the **Query Native Compilation (JIT)** section. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-4.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-4.md new file mode 100644 index 0000000000000000000000000000000000000000..0194a6b5c0ca52d4ddac1cb6ecc1fff8bd75f2b5 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-4.md @@ -0,0 +1,22 @@ +--- +title: NUMA Awareness Allocation and Affinity +summary: NUMA Awareness Allocation and Affinity +author: Zhang Cuiping +date: 2021-03-04 +--- + +# NUMA Awareness Allocation and Affinity + +Non-Uniform Memory Access (NUMA) is a computer memory design used in multiprocessing, where the memory access time depends on the memory location relative to the processor. Under NUMA, a processor can take advantage of NUMA by preferring to access its own local memory (which is faster), rather than accessing non-local memory (meaning that it will prefer **not** to access the local memory of another processor or memory shared between processors). + +MOT memory access has been designed with NUMA awareness. This means that MOT is aware that memory is not uniform and achieves best performance by accessing the quickest and most local memory. + +The benefits of NUMA are limited to certain types of workloads, particularly on servers where the data is often strongly associated with certain tasks or users. + +In-memory database systems running on NUMA platforms face several issues, such as the increased latency and the decreased bandwidth when accessing remote main memory. To cope with these NUMA-related issues, NUMA awareness must be considered as a major design principle for the fundamental architecture of a database system. + +To facilitate quick operation and make efficient use of NUMA nodes, MOT allocates a designated memory pool for rows per table and for nodes per index. Each memory pool is composed from 2 MB chunks. A designated API allocates these chunks from a local NUMA node, from pages coming from all nodes or in a round-robin fashion, where each chunk is allocated on the next node. By default, pools of shared data are allocated in a round robin fashion in order to balance access, while not splitting rows between different NUMA nodes. 
However, thread private memory is allocated from a local node. It must also be verified that a thread always operates in the same NUMA node. + +**Summary** + +MOT has a smart memory control module that has preallocated memory pools intended for various types of memory objects. This smart memory control improves performance, reduces locks and ensures stability. The allocation of the memory objects of a transaction is always NUMA-local, ensuring optimal performance for CPU memory access and resulting in low latency and reduced contention. Deallocated objects go back to the memory pool. Minimized use of OS malloc functions during transactions circumvents unnecessary locks. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-5.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-5.md new file mode 100644 index 0000000000000000000000000000000000000000..016a23dd0c18a64b8c7ced456d9edc851899fa95 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-5.md @@ -0,0 +1,41 @@ +--- +title: MOT Indexes +summary: MOT Indexes +author: Zhang Cuiping +date: 2021-03-04 +--- + +# MOT Indexes + +MOT Index is a lock-free index based on state-of-the-art Masstree, which is a fast and scalable Key Value (KV) store for multicore systems, implemented as tries of B+ trees. It achieves excellent performance on many-core servers and high concurrent workloads. It uses various advanced techniques, such as an optimistic lock approach, cache-awareness and memory prefetching. + +After comparing various state-of-the-art solutions, we chose Masstree for the index because it demonstrated the best overall performance for point queries, iterations and modifications. Masstree is a combination of tries and a B+ tree that is implemented to carefully exploit caching, prefetching, optimistic navigation and fine-grained locking. It is optimized for high contention and adds various optimizations to its predecessors, such as OLFIT. However, the downside of a Masstree index is its higher memory consumption. While row data consumes the same memory size, the memory per row per each index (primary or secondary) is higher on average by 16 bytes - 29 bytes in the lock-based B-Tree used in disk-based tables vs. 45 bytes in MOT's Masstree. + +Our empirical experiments showed that the combination of the mature lock-free Masstree implementation and our robust improvements to Silo have provided exactly what we needed in that regard. + +Another challenge was making an optimistic insertion into a table with multiple indexes. + +The Masstree index is at the core of MOT memory layout for data and index management. Our team enhanced and significantly improved Masstree and submitted some of the key contributions to the Masstree open source. These improvements include - + +- Dedicated memory pools per index - Efficient allocation and fast index drop +- Global GC for Masstree - Fast, on-demand memory reclamation +- Masstree iterator implementation with access to an insertion key +- ARM architecture support + +We contributed our Masstree index improvements to the Masstree open-source implementation, which can be found here - . + +MOT's main innovation was to enhance the original Masstree data structure and algorithm, which did not support Non-Unique Indexes (as a Secondary index). You may refer to the **Non-unique Indexes** section for the design details. 
+ +MOT supports both Primary, Secondary and Keyless indexes (subject to the limitations specified in the **Unsupported Index DDLs and Index**section). + +## Non-unique Indexes + +A non-unique index may contain multiple rows with the same key. Non-unique indexes are used solely to improve query performance by maintaining a sorted order of data values that are used frequently. For example, a database may use a non-unique index to group all people from the same family. However, the Masstree data structure implementation does not allow the mapping of multiple objects to the same key. Our solution for enabling the creation of non-unique indexes (as shown in the figure below) is to add a symmetry-breaking suffix to the key, which maps the row. This added suffix is the pointer to the row itself, which has a constant size of 8 bytes and a value that is unique to the row. When inserting into a non-unique index, the insertion of the sentinel always succeeds, which enables the row allocated by the executing transaction to be used. This approach also enable MOT to have a fast, reliable, order-based iterator for a non-unique index. + +**Figure 1** Non-unique Indexes + +![non-unique-indexes](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-indexes-2.png) + +The structure of an MOT table T that has three rows and two indexes is depicted in the figure above. The rectangles represent data rows, and the indexes point to sentinels (the elliptic shapes) which point to the rows. The sentinels are inserted into unique indexes with a key and into non-unique indexes with a key + a suffix. The sentinels facilitate maintenance operations so that the rows can be replaced without touching the index data structure. In addition, there are various flags and a reference count embedded in the sentinel in order to facilitate optimistic inserts. + +When searching a non-unique secondary index, the required key (for example, the family name) is used. The fully concatenated key is only used for insert and delete operations. Insert and delete operations always get a row as a parameter, thereby making it possible to create the entire key and to use it in the execution of the deletion or the insertion of the specific row for the index. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-6.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-6.md new file mode 100644 index 0000000000000000000000000000000000000000..9f4b3ea27c7c53f0a5f2c8c27275ee0a142c853d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-6.md @@ -0,0 +1,204 @@ +--- +title: MOT Durability Concepts +summary: MOT Durability Concepts +author: Zhang Cuiping +date: 2021-03-04 +--- + +# MOT Durability Concepts + +Durability refers to long-term data protection (also known as *disk persistence*). Durability means that stored data does not suffer from any kind of degradation or corruption, so that data is never lost or compromised. Durability ensures that data and the MOT engine are restored to a consistent state after a planned shutdown (for example, for maintenance) or an unplanned crash (for example, a power failure). + +Memory storage is volatile, meaning that it requires power to maintain the stored information. Disk storage, on the other hand, is non-volatile, meaning that it does not require power to maintain stored information, thus, it can survive a power shutdown. 
MOT uses both types of storage - it has all data in memory, while persisting transactional changes to disk (**MOT Durability**) and maintaining frequent periodic **MOT Checkpoints** in order to ensure data recovery in case of a shutdown.
+
+The user must ensure sufficient disk space for the logging and Checkpointing operations. A separate drive can be used for the Checkpoint to improve performance by reducing disk I/O load.
+
+You may refer to the **MOT Key Technologies** section for an overview of how durability is implemented in the MOT engine.
+
+MOT's WAL Redo Log and Checkpoints enable durability, as described below -
+
+- **MOT Logging - WAL Redo Log Concepts**
+- **MOT Checkpoint Concepts**
+
+## MOT Logging - WAL Redo Log Concepts
+
+### Overview
+
+Write-Ahead Logging (WAL) is a standard method for ensuring data durability. The main concept of WAL is that changes to data files (where tables and indexes reside) are only written after those changes have been logged, meaning only after the log records that describe the changes have been flushed to permanent storage.
+
+MOT is fully integrated with the MogDB envelope logging facilities. In addition to durability, another benefit of this method is the ability to use the WAL for replication purposes.
+
+Three logging methods are supported - two standard methods, Synchronous and Asynchronous, which are also supported by the standard MogDB disk engine, and a Group-Commit option that MOT provides with a special NUMA-awareness optimization. Group-Commit provides the best performance while maintaining ACID properties.
+
+To ensure Durability, MOT is fully integrated with MogDB's Write-Ahead Logging (WAL) mechanism, so that MOT persists data in WAL records using MogDB's XLOG interface. This means that every addition, update, and deletion to an MOT table's record is recorded as an entry in the WAL. This ensures that the most current data state can be regenerated and recovered from this non-volatile log. For example, if three new rows were added to a table, two were deleted and one was updated, then six entries would be recorded in the log.
+
+- MOT log records are written to the same WAL as the other records of MogDB disk-based tables.
+
+- MOT only logs an operation at the transaction commit phase.
+
+- MOT only logs the updated delta record in order to minimize the amount of data written to disk.
+
+- During recovery, data is loaded from the last known or a specific Checkpoint; and then the WAL Redo log is used to complete the data changes that occur from that point forward.
+
+- The WAL (Redo Log) retains all the table row modifications until a Checkpoint is performed (as described above). The log can then be truncated in order to reduce recovery time and to save disk space.
+
+  > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** In order to ensure that the log IO device does not become a bottleneck, the log file must be placed on a drive that has low latency.
+
+### Logging Types
+
+Two synchronous transaction logging options and one asynchronous transaction logging option are supported (these are also supported by the standard MogDB disk engine). MOT also supports synchronous Group Commit logging with NUMA-awareness optimization, as described below.
+
+According to your configuration, one of the following types of logging is implemented:
+
+- **Synchronous Redo Logging**
+
+  The **Synchronous Redo Logging** option is the simplest and most strict redo logger. 
When a transaction is committed by a client application, the transaction redo entries are recorded in the WAL (Redo Log), as follows - + + 1. While a transaction is in progress, it is stored in the MOT’s memory. + 2. After a transaction finishes and the client application sends a **Commit** command, the transaction is locked and then written to the WAL Redo Log on the disk. This means that while the transaction log entries are being written to the log, the client application is still waiting for a response. + 3. As soon as the transaction's entire buffer is written to the log, the changes to the data in memory take place and then the transaction is committed. After the transaction has been committed, the client application is notified that the transaction is complete. + +- **Technical Description** + + When a transaction ends, the SynchronousRedoLogHandler serializes its transaction buffer and write it to the XLOG iLogger implementation. + + **Figure 1** Synchronous Logging + + ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-durability-concepts-6.png) + + **Summary** + + The **Synchronous Redo Logging** option is the safest and most strict because it ensures total synchronization of the client application and the WAL Redo log entries for each transaction as it is committed; thus ensuring total durability and consistency with absolutely no data loss. This logging option prevents the situation where a client application might mark a transaction as successful, when it has not yet been persisted to disk. + + The downside of the **Synchronous Redo Logging** option is that it is the slowest logging mechanism of the three options. This is because a client application must wait until all data is written to disk and because of the frequent disk writes (which typically slow down the database). + +- **Group Synchronous Redo Logging** + + The **Group Synchronous Redo Logging** option is very similar to the **Synchronous Redo Logging** option, because it also ensures total durability with absolutely no data loss and total synchronization of the client application and the WAL (Redo Log) entries. The difference is that the **Group Synchronous Redo Logging** option writes _groups of transaction_redo entries to the WAL Redo Log on the disk at the same time, instead of writing each and every transaction as it is committed. Using Group Synchronous Redo Logging reduces the amount of disk I/Os and thus improves performance, especially when running a heavy workload. + + The MOT engine performs synchronous Group Commit logging with Non-Uniform Memory Access (NUMA)-awareness optimization by automatically grouping transactions according to the NUMA socket of the core on which the transaction is running. + + You may refer to the **NUMA Awareness Allocation and Affinity** section for more information about NUMA-aware memory access. + + When a transaction commits, a group of entries are recorded in the WAL Redo Log, as follows - + + 1. While a transaction is in progress, it is stored in the memory. The MOT engine groups transactions in buckets according to the NUMA socket of the core on which the transaction is running. This means that all the transactions running on the same socket are grouped together and that multiple groups will be filling in parallel according to the core on which the transaction is running. 
+ + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > + > Each thread runs on a single core/CPU which belongs to a single socket and each thread only writes to the socket of the core on which it is running. + + 2. After a transaction finishes and the client application sends a Commit command, the transaction redo log entries are serialized together with other transactions that belong to the same group. + + 3. After the configured criteria are fulfilled for a specific group of transactions (quantity of committed transactions or timeout period as describes in the **REDO LOG (MOT)** section), the transactions in this group are written to the WAL on the disk. This means that while these log entries are being written to the log, the client applications that issued the commit are waiting for a response. + + 4. As soon as all the transaction buffers in the NUMA-aware group have been written to the log, all the transactions in the group are performing the necessary changes to the memory store and the clients are notified that these transactions are complete. + + Writing transactions to the WAL is more efficient in this manner because all the buffers from the same socket are written to disk together. + + **Technical Description** + + The four colors represent 4 NUMA nodes. Thus each NUMA node has its own memory log enabling a group commit of multiple connections. + + **Figure 2** Group Commit - with NUMA-awareness + + ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-durability-concepts-7.png) + + **Summary** + + The **Group Synchronous Redo Logging** option is a an extremely safe and strict logging option because it ensures total synchronization of the client application and the WAL Redo log entries; thus ensuring total durability and consistency with absolutely no data loss. This logging option prevents the situation where a client application might mark a transaction as successful, when it has not yet been persisted to disk. + + On one hand this option has fewer disk writes than the **Synchronous Redo Logging** option, which may mean that it is faster. The downside is that transactions are locked for longer, meaning that they are locked until after all the transactions in the same NUMA memory have been written to the WAL Redo Log on the disk. + + The benefits of using this option depend on the type of transactional workload. For example, this option benefits systems that have many transactions (and less so for systems that have few transactions, because there are few disk writes anyway). + +- **Asynchronous Redo Logging** + + The **Asynchronous Redo Logging** option is the fastest logging method, However, it does not ensure no data loss, meaning that some data that is still in the buffer and was not yet written to disk may get lost upon a power failure or database crash. When a transaction is committed by a client application, the transaction redo entries are recorded in internal buffers and written to disk at preconfigured intervals. The client application does not wait for the data being written to disk. It continues to the next transaction. This is what makes asynchronous redo logging the fastest logging method. + + When a transaction is committed by a client application, the transaction redo entries are recorded in the WAL Redo Log, as follows - + + 1. While a transaction is in progress, it is stored in the MOT's memory. + 2. 
After a transaction finishes and the client application sends a Commit command, the transaction redo entries are written to internal buffers, but are not yet written to disk. Then changes to the MOT data memory take place and the client application is notified that the transaction is committed. + 3. At a preconfigured interval, a redo log thread running in the background collects all the buffered redo log entries and writes them to disk. + + **Technical Description** + + Upon transaction commit, the transaction buffer is moved (pointer assignment - not a data copy) to a centralized buffer and a new transaction buffer is allocated for the transaction. The transaction is released as soon as its buffer is moved to the centralized buffer and the transaction thread is not blocked. The actual write to the log uses the Postgres walwriter thread. When the walwriter timer elapses, it first calls the AsynchronousRedoLogHandler (via registered callback) to write its buffers and then continues with its logic and flushes the data to the XLOG. + + **Figure 3** Asynchronous Logging + + ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-durability-concepts-8.png) + + **Summary** + + The Asynchronous Redo Logging option is the fastest logging option because it does not require the client application to wait for data being written to disk. In addition, it groups many transactions redo entries and writes them together, thus reducing the amount of disk I/Os that slow down the MOT engine. + + The downside of the Asynchronous Redo Logging option is that it does not ensure that data will not get lost upon a crash or failure. Data that was committed, but was not yet written to disk, is not durable on commit and thus cannot be recovered in case of a failure. The Asynchronous Redo Logging option is most relevant for applications that are willing to sacrifice data recovery (consistency) over performance. + + Logging Design Details + + The following describes the design details of each persistence-related component in the In-Memory Engine Module. + + **Figure 4** Three Logging Options + + ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-durability-concepts-9.png) + + The RedoLog component is used by both by backend threads that use the In-Memory Engine and by the WAL writer in order to persist their data. Checkpoints are performed using the Checkpoint Manager, which is triggered by the Postgres checkpointer. + +- **Logging Design Overview** + + Write-Ahead Logging (WAL) is a standard method for ensuring data durability. WAL's central concept is that changes to data files (where tables and indexes reside) are only written after those changes have been logged, meaning after the log records that describe these changes have been flushed to permanent storage. + + The MOT Engine uses the existing MogDB logging facilities, enabling it also to participate in the replication process. + +- **Per-transaction Logging** + + In the In-Memory Engine, the transaction log records are stored in a transaction buffer which is part of the transaction object (TXN). The transaction buffer is logged during the calls to addToLog() - if the buffer exceeds a threshold it is then flushed and reused. When a transaction commits and passes the validation phase (OCC SILO**[Comparison - Disk vs. MOT] validation)** or aborts for some reason, the appropriate message is saved in the log as well in order to make it possible to determine the transaction's state during a recovery. 
+
+
+  **Figure 5** Per-transaction Logging
+
+  ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/mot-durability-concepts-10.png)
+
+  Parallel Logging is performed by both the MOT and disk engines. However, the MOT engine enhances this design with a log-buffer per transaction, lockless preparation and a single log record.
+
+- **Exception Handling**
+
+  The persistence module handles exceptions by using the Postgres error reporting infrastructure (ereport). An error message is recorded in the system log for each error condition. In addition, the error is reported to the envelope using Postgres's built-in error reporting infrastructure.
+
+  The following exceptions are reported by this module -
+
+  **Table 1** Exception Handling
+
+  | Exception Condition | Exception Code | Scenario | Resulting Outcome |
+  | :----------------------------------- | :----------------------------- | :----------------------------------------------------------- | :--------------------- |
+  | WAL write failure | ERRCODE_FDW_ERROR | Any case in which the WAL write fails | Transaction terminates |
+  | File IO error: write, open and so on | ERRCODE_IO_ERROR | Checkpoint - Called on any file access error | FATAL - process exits |
+  | Out of Memory | ERRCODE_INSUFFICIENT_RESOURCES | Checkpoint - Local memory allocation failures | FATAL - process exits |
+  | Logic, DB errors | ERRCODE_INTERNAL_ERROR | Checkpoint - algorithm failure or failure to retrieve table data or indexes | FATAL - process exits |
+
+## MOT Checkpoint Concepts
+
+In MogDB, a Checkpoint is a snapshot of a point in the sequence of transactions at which it is guaranteed that the heap and index data files have been updated with all information written before the checkpoint.
+
+At the time of a Checkpoint, all dirty data pages are flushed to disk and a special checkpoint record is written to the log file.
+
+MOT does not store its data in the same way as MogDB - the data is stored directly in memory - so the concept of dirty pages does not exist for MOT.
+
+For this reason, we have researched and implemented the CALC algorithm, which is described in the paper Low-Overhead Asynchronous Checkpointing in Main-Memory Database Systems, SIGMOD 2016, from Yale University.
+
+### CALC Checkpoint Algorithm - Low Overhead in Memory and Compute
+
+The checkpoint algorithm provides the following benefits -
+
+- **Reduced Memory Usage -** At most two copies of each record are stored at any time. Memory usage is minimized by storing only a single physical copy of a record while its live and stable versions are equal or when no checkpoint is actively being recorded.
+- **Low Overhead -** CALC's overhead is smaller than that of other asynchronous checkpointing algorithms.
+- **Uses Virtual Points of Consistency -** CALC does not require quiescing of the database in order to achieve a physical point of consistency.
+
+### Checkpoint Activation
+
+MOT checkpoints are integrated into the MogDB envelope's Checkpoint mechanism. The Checkpoint process can be triggered manually by executing the **CHECKPOINT;** command or automatically according to the envelope's Checkpoint triggering settings (time/size).
+
+Checkpoint configuration is performed in the mot.conf file - see the **CHECKPOINT (MOT)** section. 
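+
+As a minimal illustration, a manual trigger is simply the standard SQL command; MOT tables are included in the resulting Checkpoint automatically:
+
+```
+-- Manually trigger a checkpoint; the MOT Checkpoint process is
+-- invoked as part of the general database checkpoint.
+CHECKPOINT;
+```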
diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-7.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-7.md new file mode 100644 index 0000000000000000000000000000000000000000..51d26498ba1d6d6eced781cf922dd5f0e8411cd9 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-7.md @@ -0,0 +1,24 @@ +--- +title: MOT Recovery Concepts +summary: MOT Recovery Concepts +author: Zhang Cuiping +date: 2021-03-04 +--- + +# MOT Recovery Concepts + +The MOT Recovery Module provides all the required functionality for recovering the MOT tables data. The main objective of the Recovery module is to restore the data and the MOT engine to a consistent state after a planned (maintenance for example) shut down or an unplanned (power failure for example) crash. + +MogDB database recovery, which is also sometimes called a *Cold Start*, includes MOT tables and is performed automatically with the recovery of the rest of the database. The MOT Recovery Module is seamlessly and fully integrated into the MogDB recovery process. + +MOT recovery has two main stages - Checkpoint Recovery and WAL Recovery (Redo Log). + +MOT checkpoint recovery is performed before the envelope's recovery takes place. This is done only at cold-start events (start of a PG process). It recovers the metadata first (schema) and then inserts all the rows from the current valid checkpoint, which is done in parallel by checkpoint_recovery_workers, each working on a different table. The indexes are created during the insert process. + +When checkpointing a table, it is divided into 16MB chunks, so that multiple recovery workers can recover the table in parallel. This is done in order to speed-up the checkpoint recovery, it is implemented as a multi-threaded procedure where each thread is responsible for recovering a different segment. There are no dependencies between different segments therefore there is no contention between the threads and there is no need to use locks when updating table or inserting new rows. + +WAL records are recovered as part of the envelope's WAL recovery. MogDB envelope iterates through the XLOG and performs the necessary operation based on the xlog record type. In case of entry with record type MOT, the envelope forwards it to MOT RecoveryManager for handling. The xlog entry will be ignored by MOT recovery, if it is 'too old' - its LSN is older than the checkpoint's LSN (Log Sequence Number). + +In an active-standby deployment, the standby server is always in a Recovery state for an automatic WAL recovery process. + +The MOT recovery parameters are set in the mot.conf file explained in the **[MOT Recovery](5-mot-administration#mot-recovery)** section. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-8.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-8.md new file mode 100644 index 0000000000000000000000000000000000000000..661cac1f9337f39acb9c6093215b9d02c18de05e --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-8.md @@ -0,0 +1,73 @@ +--- +title: MOT Query Native Compilation (JIT) +summary: MOT Query Native Compilation (JIT) +author: Zhang Cuiping +date: 2021-03-04 +--- + +# MOT Query Native Compilation (JIT) + +MOT enables you to prepare and parse *pre-compiled full queries* in a native format (using a **PREPARE** statement) before they are needed for execution. 
+
+
+This native format can later be executed (using an **EXECUTE** command) much more efficiently, because during execution the native format bypasses multiple database processing layers. This division of labor avoids repetitive parse analysis operations. The Lite Executor module is responsible for executing **prepared** queries and has a much faster execution path than the regular generic plan performed by the envelope. This is achieved using Just-In-Time (JIT) compilation via LLVM. In addition, a solution with potentially similar performance is provided in the form of pseudo-LLVM.
+
+The following is an example of a **PREPARE** syntax in SQL:
+
+```
+PREPARE name [ ( data_type [, ...] ) ] AS statement
+```
+
+The following is an example of how to invoke a PREPARE and then an EXECUTE statement in a Java application -
+
+```
+conn = DriverManager.getConnection(connectionUrl, connectionUser, connectionPassword);
+
+// Example 1: PREPARE without bind settings
+String query = "SELECT * FROM getusers";
+PreparedStatement prepStmt1 = conn.prepareStatement(query);
+ResultSet rs1 = prepStmt1.executeQuery();
+while (rs1.next()) {…}
+
+// Example 2: PREPARE with bind settings
+String sqlStmt = "SELECT * FROM employees where first_name=? and last_name like ?";
+PreparedStatement prepStmt2 = conn.prepareStatement(sqlStmt);
+prepStmt2.setString(1, "Mark"); // first name "Mark"
+prepStmt2.setString(2, "%n%"); // last name contains a letter "n"
+ResultSet rs2 = prepStmt2.executeQuery();
+while (rs2.next()) {…}
+```
+
+## Prepare
+
+**PREPARE** creates a prepared statement. A prepared statement is a server-side object that can be used to optimize performance. When the **PREPARE** statement is executed, the specified statement is parsed, analyzed and rewritten.
+
+If the tables mentioned in the query statement are MOT tables, the MOT compilation takes charge of the object preparation and performs a special optimization by compiling the query into IR byte code based on LLVM.
+
+Whenever a new query compilation is required, the query is analyzed and properly tailored IR byte code is generated for the query using the utility GsCodeGen object and standard LLVM JIT API (IRBuilder). After byte-code generation is completed, the code is JIT-compiled into a separate LLVM module. The compiled code results in a C function pointer that can later be invoked for direct execution. Note that this C function can be invoked concurrently by many threads, as long as each thread provides a distinct execution context (details are provided below). Each such execution context is referred to as *JIT Context*.
+
+To improve performance further, MOT JIT applies a caching policy for its LLVM code results, enabling them to be reused for the same queries across different sessions.
+
+## Execute
+
+When an EXECUTE command is issued, the prepared statement (described above) is planned and executed. This division of labor avoids repetitive parse analysis work, while enabling the execution plan to depend on the specific setting values supplied.
+
+When the resulting execute query command reaches the database, it uses the corresponding IR byte code which is executed directly and more efficiently within the MOT engine. This is referred to as *Lite Execution*.
+
+In addition, for availability, the Lite Executor maintains a preallocated pool of JIT sources. Each session preallocates its own session-local pool of JIT context objects (used for repeated executions of precompiled queries). 
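+
+At the SQL level, the flow described above can be sketched as follows (the statement name and predicate are illustrative only, and the test table is assumed to be an MOT table such as the one created earlier in this guide):
+
+```
+-- PREPARE parses and analyzes the statement; for qualifying MOT queries it is
+-- also compiled to LLVM code at this point.
+PREPARE get_by_x (int) AS SELECT * FROM test WHERE x = $1;
+-- EXECUTE reuses the prepared (and, for MOT, natively compiled) form.
+EXECUTE get_by_x (1);
+```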
+ +For more details you may refer to the Supported Queries for Lite Execution and Unsupported Queries for Lite Execution sections. + +## JIT Compilation Comparison - MogDB Disk-based vs. MOT Tables + +Currently, MogDB contains two main forms of JIT / CodeGen query optimizations for its disk-based tables - + +- Accelerating expression evaluation, such as in WHERE clauses, target lists, aggregates and projections. +- Inlining small function invocations. + +These optimizations are partial (in the sense they do not optimize the entire interpreted operator tree or replace it altogether) and are targeted mostly at CPU-bound complex queries, typically seen in OLAP use cases. The execution of queries is performed in a pull-model (Volcano-style processing) using an interpreted operator tree. When activated, the compilation is performed at each query execution. At the moment, caching of the generated LLVM code and its reuse across sessions and queries is not yet provided. + +In contrast, MOT JIT optimization provides LLVM code for entire queries that qualify for JIT optimization by MOT. The resulting code is used for direct execution over MOT tables, while the interpreted operator model is abandoned completely. The result is *practically* handwritten LLVM code that has been generated for an entire specific query execution. + +Another significant conceptual difference is that MOT LLVM code is only generated for prepared queries during the PREPARE phase of the query, rather than at query execution. This is especially important for OLTP scenarios due to the rather short runtime of OLTP queries, which cannot allow for code generation and relatively long query compilation time to be performed during each query execution. + +Finally, in PostgreSQL the activation of a PREPARE implies the reuse of the resulting plan across executions with different parameters in the same session. Similarly, the MOT JIT applies a caching policy for its LLVM code results, and extends it for reuse across different sessions. Thus, a single query may be compiled just once and its LLVM code may be reused across many sessions, which again is beneficial for OLTP scenarios. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-9.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-9.md new file mode 100644 index 0000000000000000000000000000000000000000..6aac4a5c5884d754728cbc0b9b253ee0aeae0d2e --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/3-concepts-of-mot/3-9.md @@ -0,0 +1,69 @@ +--- +title: Comparison - Disk vs. MOT +summary: Comparison - Disk vs. MOT +author: Zhang Cuiping +date: 2021-03-04 +--- + +# Comparison - Disk vs. MOT + +The following table briefly compares the various features of the MogDB disk-based storage engine and the MOT storage engine. + +**Table 1** Comparison - Disk-based vs. 
MOT
+
+| Feature | MogDB Disk Store | MogDB MOT Engine |
+| :--------------------------- | :------------------ | :---------------------------- |
+| Intel x86 + Kunpeng ARM | Yes | Yes |
+| SQL and Feature-set Coverage | 100% | 98% |
+| Scale-up (Many-cores, NUMA) | Low Efficiency | High Efficiency |
+| Throughput | High | Extremely High |
+| Latency | Low | Extremely Low |
+| Distributed (Cluster Mode) | Yes | Yes |
+| Isolation Levels | RC+SI, RR, Serializable | RC, RR, RC+SI (in V2 release) |
+| Concurrency Control | Pessimistic | Optimistic |
+| Data Capacity (Data + Index) | Unlimited | Limited to DRAM |
+| Native Compilation | No | Yes |
+| Replication, Recovery | Yes | Yes |
+| Replication Options | 2 (sync, async) | 3 (sync, async, group-commit) |
+
+**Legend -**
+
+- RR = Repeatable Reads
+- RC = Read Committed
+- SI = Snapshot Isolation
+
+## Appendices
+
+## References
+
+[1] Y. Mao, E. Kohler, and R. T. Morris. Cache craftiness for fast multicore key-value storage. In Proc. 7th ACM European Conference on Computer Systems (EuroSys), Apr. 2012.
+
+[2] K. Ren, T. Diamond, D. J. Abadi, and A. Thomson. Low-overhead asynchronous checkpointing in main-memory database systems. In Proceedings of the 2016 ACM SIGMOD International Conference on Management of Data, 2016.
+
+[3] .
+
+[4] .
+
+[5] Tu, S., Zheng, W., Kohler, E., Liskov, B., and Madden, S. Speedy transactions in multicore in-memory databases. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles (New York, NY, USA, 2013), SOSP '13, ACM, pp. 18-32.
+
+[6] H. Avni et al. Industrial-Strength OLTP Using Main Memory and Many-cores, VLDB 2020.
+
+[7] Bernstein, P. A., and Goodman, N. Concurrency control in distributed database systems. ACM Comput. Surv. 13, 2 (1981), 185-221.
+
+[8] Felber, P., Fetzer, C., and Riegel, T. Dynamic performance tuning of word-based software transactional memory. In Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2008, Salt Lake City, UT, USA, February 20-23, 2008 (2008), pp. 237-246.
+
+[9] Appuswamy, R., Anadiotis, A., Porobic, D., Iman, M., and Ailamaki, A. Analyzing the impact of system architecture on the scalability of OLTP engines for high-contention workloads. PVLDB 11, 2 (2017), 121-134.
+
+[10] R. Sherkat, C. Florendo, M. Andrei, R. Blanco, A. Dragusanu, A. Pathak, P. Khadilkar, N. Kulkarni, C. Lemke, S. Seifert, S. Iyer, S. Gottapu, R. Schulze, C. Gottipati, N. Basak, Y. Wang, V. Kandiyanallur, S. Pendap, D. Gala, R. Almeida, and P. Ghosh. Native store extension for SAP HANA. PVLDB, 12(12): 2047-2058, 2019.
+
+[11] X. Yu, A. Pavlo, D. Sanchez, and S. Devadas. Tictoc: Time traveling optimistic concurrency control. In Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 26 - July 01, 2016, pages 1629-1642, 2016.
+
+[12] V. Leis, A. Kemper, and T. Neumann. The adaptive radix tree: Artful indexing for main-memory databases. In C. S. Jensen, C. M. Jermaine, and X. Zhou, editors, 29th IEEE International Conference on Data Engineering, ICDE 2013, Brisbane, Australia, April 8-12, 2013, pages 38-49. IEEE Computer Society, 2013.
+
+[13] S. K. Cha, S. Hwang, K. Kim, and K. Kwon. Cache-conscious concurrency control of main-memory indexes on shared-memory multiprocessor systems. In P. M. G. Apers, P. Atzeni, S. Ceri, S. Paraboschi, K. Ramamohanarao, and R. T. 
Snodgrass, editors, VLDB 2001, Proceedings of 27th International Conference on Very Large Data Bases, September 11-14, 2001, Roma, Italy, pages 181-190. Morga Kaufmann, 2001. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/4-appendix/1-references.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/4-appendix/1-references.md new file mode 100644 index 0000000000000000000000000000000000000000..4055b8f1162ff4f4b79927bdb5d3fcb2172f3c85 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/4-appendix/1-references.md @@ -0,0 +1,40 @@ +--- +title: References +summary: References +author: Zhang Cuiping +date: 2021-05-18 +--- + +# References + +[1] Y. Mao, E. Kohler, and R. T. Morris. Cache craftiness for fast multicore key-value storage. In Proc. 7th ACM European Conference on Computer Systems (EuroSys), Apr. 2012. + +[2] K. Ren, T. Diamond, D. J. Abadi, and A. Thomson. Low-overhead asynchronous checkpointing in main-memory database systems. In Proceedings of the 2016 ACM SIGMOD International Conference on Management of Data, 2016. + +[3] . + +[4] . + +[5] Tu, S., Zheng, W., Kohler, E., Liskov, B., and Madden, S. Speedy transactions in multicore in-memory databases. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles (New York, NY, USA, 2013), SOSP ’13, ACM, pp. 18-32. + +[6] H. Avni at al. Industrial-Strength OLTP Using Main Memory and Many-cores, VLDB 2020. + +[7] Bernstein, P. A., and Goodman, N. Concurrency control in distributed database systems. ACM Comput. Surv. 13, 2 (1981), 185-221. + +[8] Felber, P., Fetzer, C., and Riegel, T. Dynamic performance tuning of word-based software transactional memory. In Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2008, Salt Lake City, UT, USA, February 20-23, 2008 (2008), + +pp. 237-246. + +[9] Appuswamy, R., Anadiotis, A., Porobic, D., Iman, M., and Ailamaki, A. Analyzing the impact of system architecture on the scalability of OLTP engines for high-contention workloads. PVLDB 11, 2 (2017), + +121-134. + +[10] R. Sherkat, C. Florendo, M. Andrei, R. Blanco, A. Dragusanu, A. Pathak, P. Khadilkar, N. Kulkarni, C. Lemke, S. Seifert, S. Iyer, S. Gottapu, R. Schulze, C. Gottipati, N. Basak, Y. Wang, V. Kandiyanallur, S. Pendap, D. Gala, R. Almeida, and P. Ghosh. Native store extension for SAP HANA. PVLDB, 12(12): + +2047-2058, 2019. + +[11] X. Yu, A. Pavlo, D. Sanchez, and S. Devadas. Tictoc: Time traveling optimistic concurrency control. In Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 26 - July 01, 2016, pages 1629-1642, 2016. + +[12] V. Leis, A. Kemper, and T. Neumann. The adaptive radix tree: Artful indexing for main-memory databases. In C. S. Jensen, C. M. Jermaine, and X. Zhou, editors, 29th IEEE International Conference on Data Engineering, ICDE 2013, Brisbane, Australia, April 8-12, 2013, pages 38-49. IEEE Computer Society, 2013. + +[13] S. K. Cha, S. Hwang, K. Kim, and K. Kwon. Cache-conscious concurrency control of main-memory indexes on shared-memory multiprocessor systems. In P. M. G. Apers, P. Atzeni, S. Ceri, S. Paraboschi, K. Ramamohanarao, and R. T. Snodgrass, editors, VLDB 2001, Proceedings of 27th International Conference on Very Large Data Bases, September 11-14, 2001, Roma, Italy, pages 181-190. Morga Kaufmann, 2001. 
diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/4-appendix/2-glossary.md b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/4-appendix/2-glossary.md new file mode 100644 index 0000000000000000000000000000000000000000..f2d1576038d266439c9d9238fcf119133a52ce34 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/mot-engine/4-appendix/2-glossary.md @@ -0,0 +1,59 @@ +--- +title: Glossary +summary: Glossary +author: Zhang Cuiping +date: 2021-05-18 +--- + +# Glossary + +| Acronym | Definition/Description | +| :------ | :----------------------------------------------------------- | +| 2PL | 2-Phase Locking | +| ACID | Atomicity, Consistency, Isolation, Durability | +| AP | Analytical Processing | +| ARM | Advanced RISC Machine, a hardware architecture alternative to x86 | +| CC | Concurrency Control | +| CPU | Central Processing Unit | +| DB | Database | +| DBA | Database Administrator | +| DBMS | Database Management System | +| DDL | Data Definition Language. Database Schema management language | +| DML | Data Modification Language | +| ETL | Extract, Transform, Load or Encounter Time Locking | +| FDW | Foreign Data Wrapper | +| GC | Garbage Collector | +| HA | High Availability | +| HTAP | Hybrid Transactional-Analytical Processing | +| IoT | Internet of Things | +| IM | In-Memory | +| IMDB | In-Memory Database | +| IR | Intermediate Representation of a source code, used in compilation and optimization | +| JIT | Just In Time | +| JSON | JavaScript Object Notation | +| KV | Key Value | +| LLVM | Low-Level Virtual Machine, refers to a compilation code or queries to IR | +| M2M | Machine-to-Machine | +| ML | Machine Learning | +| MM | Main Memory | +| MO | Memory Optimized | +| MOT | Memory Optimized Tables storage engine (SE), pronounced as /em/ /oh/ /tee/ | +| MVCC | Multi-Version Concurrency Control | +| NUMA | Non-Uniform Memory Access | +| OCC | Optimistic Concurrency Control | +| OLTP | Online Transaction Processing | +| PG | PostgreSQL | +| RAW | Reads-After-Writes | +| RC | Return Code | +| RTO | Recovery Time Objective | +| SE | Storage Engine | +| SQL | Structured Query Language | +| TCO | Total Cost of Ownership | +| TP | Transactional Processing | +| TPC-C | An On-Line Transaction Processing Benchmark | +| Tpm-C | Transactions-per-minute-C. A performance metric for TPC-C benchmark that counts new-order transactions. | +| TVM | Tiny Virtual Machine | +| TSO | Time Sharing Option | +| UDT | User-Defined Type | +| WAL | Write Ahead Log | +| XLOG | A PostgreSQL implementation of transaction logging (WAL - described above) | diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/primary-and-standby-management.md b/product/en/docs-mogdb/v3.0/administrator-guide/primary-and-standby-management.md new file mode 100644 index 0000000000000000000000000000000000000000..b4db2942936e716277a950a4dd71d2f395faac75 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/primary-and-standby-management.md @@ -0,0 +1,126 @@ +--- +title: Primary and Standby Management +summary: Primary and Standby Management +author: Guo Huan +date: 2021-03-11 +--- + +# Primary and Standby Management + +## Scenarios + +During MogDB database running, the database administrator needs to manually perform an primary/standby switchover on the database node. For example, after a primary/standby database node failover, you need to restore the original primary/standby roles, or you need to manually perform a primary/standby switchover due to a hardware fault. 
A cascaded standby server cannot be directly switched to a primary server. You must perform a switchover or failover to change the cascaded standby server to a standby server, and then to a primary server. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> +> - The primary/standby switchover is a maintenance operation. Ensure that the MogDB database is normal and perform the switchover after all services are complete. +> - When the ultimate RTO is enabled, cascaded standby servers are not supported. The standby server cannot be connected when the ultimate RTO is enabled. As a result, the cascaded standby server cannot synchronize data. + +## Procedure + +1. Log in to any database node as the OS user **omm** and run the following command to check the primary/standby status: + + ```bash + gs_om -t status --detail + ``` + +2. Log in to the standby node to be switched to the primary node as the OS user **omm** and run the following command: + + ```bash + gs_ctl switchover -D /home/omm/cluster/dn1/ + ``` + + **/home/omm/cluster/dn1/** is the data directory of the standby database node. + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:** For the same database, you cannot perform a new primary/standby switchover if the previous switchover has not completed. If a switchover is performed when the host thread is processing services, the thread cannot stop, and switchover timeout will be reported. Actually, the switchover is ongoing in the background and will complete after the thread finishes service processing and stops. For example, when a host is deleting a large partitioned table, it may fail to respond to the switchover request. + +3. After the switchover is successful, run the following command to record the information about the current primary and standby nodes: + + ```bash + gs_om -t refreshconf + ``` + +## Examples + +Run the following command to switch the standby database instance to the primary database instance: + +1. Queries database status. + + ```bash + $ gs_om -t status --detail + [ Cluster State ] + + cluster_state : Normal + redistributing : No + current_az : AZ_ALL + + [ Datanode State ] + + node node_ip port instance state + -------------------------------------------------------------------------------------------------- + 1 pekpopgsci00235 10.244.62.204 5432 6001 /home/omm/cluster/dn1/ P Primary Normal + 2 pekpopgsci00238 10.244.61.81 5432 6002 /home/omm/cluster/dn1/ S Standby Normal + ``` + +2. Log in to the standby node and perform a primary/standby switchover. In addition, after a cascaded standby node is switched over, the cascaded standby server becomes a standby server, and the original standby server becomes a cascaded standby server. + + ```bash + $ gs_ctl switchover -D /home/omm/cluster/dn1/ + [2020-06-17 14:28:01.730][24438][][gs_ctl]: gs_ctl switchover ,datadir is -D "/home/omm/cluster/dn1" + [2020-06-17 14:28:01.730][24438][][gs_ctl]: switchover term (1) + [2020-06-17 14:28:01.768][24438][][gs_ctl]: waiting for server to switchover............ + [2020-06-17 14:28:11.175][24438][][gs_ctl]: done + [2020-06-17 14:28:11.175][24438][][gs_ctl]: switchover completed (/home/omm/cluster/dn1) + ``` + +3. Save the information about the primary and standby nodes in the database. + + ```bash + $ gs_om -t refreshconf + Generating dynamic configuration file for all nodes. + Successfully generated dynamic configuration file. 
+ ``` + +## Troubleshooting + +If a switchover fails, troubleshoot the problem according to the log information. For details, see [Log Reference](11-log-reference). + +## Exception Handling + +Exception handling rules are as follows: + +- A switchover takes a long time under high service loads. In this case, no further operation is required. + +- When standby nodes are being built, a primary node can be demoted to a standby node only after sending logs to one of the standby nodes. As a result, the primary/standby switchover takes a long time. In this case, no further operation is required. However, you are not advised to perform a primary/standby switchover during the build process. + +- During a switchover, due to network faults and high disk usage, it is possible that the primary and standby instances are disconnected, or two primary nodes exist in a single pair. In this case, perform the following steps: + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-warning.gif) **WARNING:** After two primary nodes appear, perform the following steps to restore the normal primary/standby state: Otherwise, data loss may occur. + +1. Run the following commands to query the current instance status of the database: + + ```bash + gs_om -t status --detail + ``` + + The query result shows that the status of two instances is **Primary**, which is abnormal. + +2. Determine the node that functions as the standby node and run the following command on the node to stop the service: + + ```bash + gs_ctl stop -D /home/omm/cluster/dn1/ + ``` + +3. Run the following command to start the standby node in standby mode: + + ```bash + gs_ctl start -D /home/omm/cluster/dn1/ -M standby + ``` + +4. Save the information about the primary and standby nodes in the database. + + ```bash + gs_om -t refreshconf + ``` + +5. Check the database status and ensure that the instance status is restored. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/0-starting-and-stopping-mogdb.md b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/0-starting-and-stopping-mogdb.md new file mode 100644 index 0000000000000000000000000000000000000000..f3119e28321f8ee75f64a8e35624acb8479d0841 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/0-starting-and-stopping-mogdb.md @@ -0,0 +1,65 @@ +--- +title: Starting and Stopping MogDB +summary: Starting and Stopping MogDB +author: Guo Huan +date: 2021-06-24 +--- + +# Starting and Stopping MogDB + +## Starting MogDB + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Run the following command to start MogDB: + + ```bash + gs_om -t start + ``` + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** An HA cluster must be started in HA mode. If the cluster is started in standalone mode, you need to restore the HA relationship by running the **gs_ctl build** command. For details about how to use the **gs_ctl** tool, see [gs_ctl](4-gs_ctl). + +## Stopping MogDB + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Run the following command to stop MogDB: + + ```bash + gs_om -t stop + ``` + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** For details about how to start and stop nodes and availability zones (AZs), see [gs_om](8-gs_om). + +## Examples + +Start MogDB: + +```bash +gs_om -t start +Starting cluster. 
+========================================= +========================================= +Successfully started. +``` + +Stop MogDB: + +```bash +gs_om -t stop +Stopping cluster. +========================================= +Successfully stopped cluster. +========================================= +End stop cluster. +``` + +## Troubleshooting + +If starting or stopping MogDB fails, troubleshoot the problem based on log information. For details, see [Log Reference](11-log-reference). + +If the startup fails due to timeout, you can run the following command to set the startup timeout interval, which is 300s by default: + +```bash +gs_om -t start --time-out=300 +``` diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/1-routine-maintenance-check-items.md b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/1-routine-maintenance-check-items.md new file mode 100644 index 0000000000000000000000000000000000000000..dd068915ad840f3d7216da7eaa0f437b646888d7 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/1-routine-maintenance-check-items.md @@ -0,0 +1,165 @@ +--- +title: 日常运维 +summary: 日常运维 +author: Zhang Cuiping +date: 2021-03-04 +--- + +# Routine Maintenance Check Items + +## Checking MogDB Status + +MogDB provides tools to check database and instance status, ensuring that databases and instances are running properly to provide data services. + +- Check instance status. + + ```bash + gs_check -U omm -i CheckClusterState + ``` + +- Check parameters. + + ```sql + mogdb=# SHOW parameter_name; + ``` + + In the above command, **parameter_name** needs to be replaced with a specific parameter name. +- Modify parameters. + + ```bash + gs_guc reload -D /mogdb/data/dbnode -c "paraname=value" + ``` + +## Checking Lock Information + +The lock mechanism is an important method to ensure data consistency. Information check helps learn database transactions and database running status. + +- Query lock information in the database. + + ```sql + mogdb=# SELECT * FROM pg_locks; + ``` + +- Query the status of threads waiting to acquire locks. + + ```sql + mogdb=# SELECT * FROM pg_thread_wait_status WHERE wait_status = 'acquire lock'; + ``` + +- Kill a system process. + + Search for a system process that is running and run the following command to end the process: + + ``` + ps ux + kill -9 pid + ``` + +## Collecting Event Statistics + +Long-time running of SQL statements will occupy a lot of system resources. You can check event occurrence time and occupied memory to learn about database running status. + +- Query the time points about an event. + + Run the following command to query the thread start time, transaction start time, SQL start time, and status change time of the event: + + ```sql + mogdb=# SELECT backend_start,xact_start,query_start,state_change FROM pg_stat_activity; + ``` + +- Query the number of sessions on the current server. + + ```sql + mogdb=# SELECT count(*) FROM pg_stat_activity; + ``` + +- Collect system-level statistics. + + Run the following command to query information about the session that uses the maximum memory: + + ```sql + mogdb=# SELECT * FROM pv_session_memory_detail() ORDER BY usedsize desc limit 10; + ``` + +## Checking Objects + +Tables, indexes, partitions, and constraints are key storage objects of a database. A database administrator needs to routinely maintain key information and these objects. + +- View table details. + + ```sql + mogdb=# \d+ table_name + ``` + +- Query table statistics. 
+ + ```sql + mogdb=# SELECT * FROM pg_statistic; + ``` + +- View index details. + + ```sql + mogdb=# \d+ index_name + ``` + +- Query partitioned table information. + + ```sql + mogdb=# SELECT * FROM pg_partition; + ``` + +- Collect statistics. + + Run the **ANALYZE** statement to collect related statistics on the database. + + Run the **VACUUM** statement to reclaim space and update statistics. + +- Query constraint information. + + ```sql + mogdb=# SELECT * FROM pg_constraint; + ``` + +## Checking an SQL Report + +Run the **EXPLAIN** statement to view execution plans. + +## Backing Up Data + +Never forget to back up data. During the routine work, the backup execution and backup data validity need to be checked to ensure data security and encryption security. + +- Export a specified user. + + ```bash + gs_dump dbname -p port -f out.sql -U user_name -W password + ``` + +- Export a schema. + + ```bash + gs_dump dbname -p port -n schema_name -f out.sql + ``` + +- Export a table. + + ```bash + gs_dump dbname -p port -t table_name -f out.sql + ``` + +## Checking Basic Information + +Basic information includes versions, components, and patches. Periodic database information checks and records are important for database life cycle management. + +- Check version information. + + ```sql + mogdb=# SELECT version(); + ``` + +- Check table size and database size. + + ```sql + mogdb=# SELECT pg_table_size('table_name'); + mogdb=# SELECT pg_database_size('database_name'); + ``` diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/10-data-security-maintenance-suggestions.md b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/10-data-security-maintenance-suggestions.md new file mode 100644 index 0000000000000000000000000000000000000000..210a1641747139b252c816d9c04e17b3bf2c5472 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/10-data-security-maintenance-suggestions.md @@ -0,0 +1,29 @@ +--- +title: 日常运维 +summary: 日常运维 +author: Zhang Cuiping +date: 2021-03-04 +--- + +# Data Security Maintenance Suggestions + +To ensure data security in MogDB Kernel and prevent accidents, such as data loss and illegal data access, read this section carefully. + +**Preventing Data Loss** + +You are advised to plan routine physical backup and store backup files in a reliable medium. If a serious error occurs in the system, you can use the backup files to restore the system to the state at the backup point. + +**Preventing Illegal Data Access** + +- You are advised to manage database users based on their permission hierarchies. A database administrator creates users and grants permissions to the users based on service requirements to ensure users properly access the database. +- You are advised to deploy MogDB Kernel servers and clients (or applications developed based on the client library) in trusted internal networks. If the servers and clients must be deployed in an untrusted network, enable SSL encryption before services are started to ensure data transmission security. Note that enabling the SSL encryption function compromises database performance. + +**Preventing System Logs from Leaking Personal Data** + +- Delete personal data before sending debug logs to others for analysis. + + **NOTE:** The log level **log_min_messages** is set to **DEBUG**x (*x* indicates the debug level and the value ranges from 1 to 5). The information recorded in debug logs may contain personal data. 
+ +- Delete personal data before sending system logs to others for analysis. If the execution of a SQL statement fails, the error SQL statement will be recorded in a system log by default. SQL statements may contain personal data. + +- Set **log_min_error_statement** to **PANIC** to prevent error SQL statements from being recorded in system logs. However, once the function is disabled, it is difficult to locate fault causes if faults occur. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/11-log-reference.md b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/11-log-reference.md new file mode 100644 index 0000000000000000000000000000000000000000..8ed5bd8cb699791e23428997bf3aa4087c236cd5 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/11-log-reference.md @@ -0,0 +1,137 @@ +--- +title: Log Reference +summary: Log Reference +author: Guo Huan +date: 2021-06-24 +--- + +# Log Reference + +## Log Overview + +During database running, a large number of logs are generated, including write-ahead logs (WALs, also called Xlogs) for ensuring database security and reliability and run logs and operation logs for daily database maintenance. If the database is faulty, you can refer to these logs to locate the fault and restore the database. + +**Log Type** + +The following table describes details about log types. + +**Table 1** Log types + +| Type | Description | +| :-------------- | :----------------------------------------------------------- | +| System log | Logs generated during database running. They are used to record abnormal process information. | +| Operation log | Logs generated when a client tool (such as **gs_guc**) is operating databases. | +| Trace log | Logs generated after the database debug switch is enabled. They are used to analyze database exceptions. | +| Black box log | Logs generated when the database system breaks down. You can analyze the process context when the fault occurs based on the heap and stack information in the logs to facilitate fault locating. A black box dumps stack, heap, and register information about processes and threads when a system breaks down. | +| Audit log | Logs used to record some of the database user operations after the database audit function is enabled. | +| WAL | Logs used to restore a damaged database. They are also called redo logs. You are advised to routinely back up WALs. | +| Performance log | Logs used to record the status of physical resources and the performance of access to external resources (such as disks, OBS and Hadoop clusters). | + +## System Logs + +System logs include those generated by database nodes when MogDB is running, and those generated when MogDB is deployed. If an error occurs during MogDB running, you can locate the cause and troubleshoot it based on system logs. + +**Log Storage Directory** + +Run logs of database nodes are stored in the corresponding folders in the **/var/log/mogdb/username/pg_log** directory. + +Logs generated during OM MogDB installation and uninstallation are stored in the **/var/log/mogdb/username/om** directory. + +**Log Naming Rules** + +The name format of database node run logs is: + +**postgresql-creation time.log** + +By default, a new log file is generated at 0:00 every day, or when the latest log file exceeds 16 MB or a database instance (database node) is restarted. 
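For a quick check of the newest run log, the minimal sketch below can be used; it assumes the default **/var/log/mogdb/username/pg_log** layout described above with **omm** as the OS user, and simply searches for error-level entries:

```bash
# List the most recent database node run logs (assumes OS user omm and the
# default /var/log/mogdb/<username>/pg_log directory described above)
ls -t /var/log/mogdb/omm/pg_log/postgresql-*.log | head -3

# Scan the newest run log for ERROR and FATAL entries
grep -E "ERROR|FATAL" "$(ls -t /var/log/mogdb/omm/pg_log/postgresql-*.log | head -1)" | tail -20
```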
+ +**Log Content Description** + +Content of a line in a database node log: + +Date+Time+Time zone+Username+Database name+Session ID+Log level+Log content + +## Operation Logs + +Operation logs are generated when database tools are used by a database administrator or invoked by a cluster. If the cluster is faulty, you can backtrack user operations on the database and reproduce the fault based on the operation logs. + +**Log Storage Directory** + +The default path is **\$GAUSSLOG/bin**. If the environmental variable **$GAUSSLOG** does not exist or its value is empty, the log information generated for a tool will be displayed, but not recorded in the log file of the tool. + +The default value of **$GAUSSLOG** is **/var/log/mogdb/username**. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** If a database is deployed using the OM script, the log path is **/var/log/mogdb/username**. + +**Log Naming Rules** + +The log file name format is as follows: + +- **tool name-log creation time.log** +- **tool name-log creation time-current.log** + +**tool name-log creation time.log** is a historical log file, and **tool name-log creation time-current.log** is a current log file. + +If the size of a log file exceeds 16 MB, the next time the tool is invoked, the log file is renamed in the historical log file name format, and a new log file is generated at the current time point. + +For example, **gs_guc-2015-01-16_183728-current.log** is renamed as **gs_guc-2015-01-16_183728.log**, and **gs_guc-2015-01-17_142216-current.log** is generated. + +**Maintenance Suggestions** + +You are advised to dump expired logs periodically to save disk space and prevent important logs from being lost. + +## Audit Logs + +After the audit function is enabled, a large number of audit logs will be generated, which occupy large storage space. You can customize an audit log maintenance policy based on the size of available storage space. + +For details, see "Database Security Management > Configuring Database Audit > Maintaining Audit Logs" in the *Developer Guide*. + +## WALs + +In a system using write-ahead logs (WALs or Xlogs), all data file modifications are written to a log before they are applied. That is, the corresponding log must be written into a permanent memory before a data file is modified. You can use WALs to restore the cluster if the system crashes. + +**Log Storage Directory** + +Take a DN as an example. Its WALs are stored in the **/mogdb/data/data_dn/pg_xlog** directory. + +**/mogdb/data/data_dn** is the data directory of a node in the cluster. + +**Log Naming Rules** + +Log files are saved as segment files. Each segment is 16 MB and is divided into multiple 8 KB pages. The name of a WAL file consists of 24 hexadecimal characters. Each name has three parts, with each part having eight hexadecimal characters. The first part indicates the time line, the second part indicates the log file identifier, and the third part indicates the file segment identifier. A time line starts from 1, and a log file identifier and a file segment identifier start from 0. + +For example, the name of the first transaction log is **000000010000000000000000**. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The numbers in each part are used in ascending order in succession. Exhausting all available numbers takes a long time, and the numbers will start from zero again after they reach the maximum. 
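As an illustration of the naming rule, the sketch below lists a few segment files in the example data directory used in this section and breaks one name into its three parts (the directory path is the example path from above, not a fixed value):

```bash
# List the first few WAL segment files in the example data directory
ls /mogdb/data/data_dn/pg_xlog | sort | head -3

# A segment name such as 000000010000000000000002 is read as three 8-character parts:
#   00000001 -> time line
#   00000000 -> log file identifier
#   00000002 -> file segment identifier
```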
+ +**Log Content Description** + +The content of WALs depends on the types of recorded transactions. WALs can be used to restore a system after the system breaks down. + +By default, MogDB Kernal reads WALs for system restoration during each startup. + +**Maintenance Suggestions** + +WALs are important for database restoration. You are advised to routinely back up WALs. + +## Performance Logs + +Performance logs focus on the access performance of external resources. Performance logs are used to record the status of physical resources and the performance of access to external resources (such as disks, OBS and Hadoop clusters). When a performance issue occurs, you can locate the cause using performance logs, which greatly improves troubleshooting efficiency. + +**Log Storage Directory** + +The performance logs of database are stored in the directories under **$GAUSSLOG/gs_profile**. + +**Log Naming Rules** + +The name format ofdatabase performance logs is: + +**postgresql-creation time.prf** + +By default, a new log file is generated at 0:00 every day, or when the latest log file exceeds 20 MB or a database instance (CN or DN) is restarted. + +**Log Content Description** + +Content of a line in a database log: + +**Host name+Date+Time+Instance name+Thread number+Log content** diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/2-checking-os-parameters.md b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/2-checking-os-parameters.md new file mode 100644 index 0000000000000000000000000000000000000000..29132b31d7177ad64fc80925c402ca0bf0f9a511 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/2-checking-os-parameters.md @@ -0,0 +1,178 @@ +--- +title: 日常运维 +summary: 日常运维 +author: Zhang Cuiping +date: 2021-03-04 +--- + +# Checking OS Parameters + +## Check Method + +Use the **gs_checkos** tool provided by MogDB to check the OS status. + +**Prerequisites** + +- The hardware and network are working properly. +- The trust relationship of user **root** among the hosts is normal. +- Only user **root** is authorized to run the **gs_checkos** command. + +**Procedure** + +1. Log in to a server as user **root**. + +2. Run the following command to check OS parameters of servers where the MogDB nodes are deployed: + + ``` + gs_checkos -i A + ``` + + Check the OS parameters to ensure that MogDB has passed the pre-installation check and can efficiently operate after it is installed. + +**Examples** + +Before running the **gs_checkos** command, execute pre-processing scripts by running **gs_preinstall** to prepare the environment. The following uses parameter **A** as an example: + +``` +gs_checkos -i A +Checking items: + A1. [ OS version status ] : Normal + A2. [ Kernel version status ] : Normal + A3. [ Unicode status ] : Normal + A4. [ Time zone status ] : Normal + A5. [ Swap memory status ] : Normal + A6. [ System control parameters status ] : Normal + A7. [ File system configuration status ] : Normal + A8. [ Disk configuration status ] : Normal + A9. [ Pre-read block size status ] : Normal + A10.[ IO scheduler status ] : Normal + A11.[ Network card configuration status ] : Normal + A12.[ Time consistency status ] : Warning + A13.[ Firewall service status ] : Normal + A14.[ THP service status ] : Normal +Total numbers:14. Abnormal numbers:0. Warning number:1. +``` + +The following uses parameter **B** as an example: + +``` +gs_checkos -i B +Setting items: + B1. [ Set system control parameters ] : Normal + B2. 
[ Set file system configuration value ] : Normal + B3. [ Set pre-read block size value ] : Normal + B4. [ Set IO scheduler value ] : Normal + B5. [ Set network card configuration value ] : Normal + B6. [ Set THP service ] : Normal + B7. [ Set RemoveIPC value ] : Normal + B8. [ Set Session Process ] : Normal +Total numbers:6. Abnormal numbers:0. Warning number:0. +``` + +## Exception Handling + +If you use the **gs_checkos** tool to check the OS and the command output shows **Abnormal**, run the following command to view detailed error information: + +``` +gs_checkos -i A --detail +``` + +The **Abnormal** state cannot be ignored because the OS in this state affects cluster installation. The **Warning** state does not affect cluster installation and thereby can be ignored. + +- If the check result for OS version status (**A1**) is **Abnormal**, replace OSs out of the mixed programming scope with those within the scope. + +- If the check result for kernel version status (**A2**) is **Warning**, the platform kernel versions in the cluster are inconsistent. + +- If the check result for Unicode status (**A3**) is **Abnormal**, set the same character set for all the hosts. You can add **export LANG=***unicode* to the **/etc/profile** file. + + ``` + vim /etc/profile + ``` + +- If the check result for time zone status (**A4**) is **Abnormal**, set the same time zone for all the hosts. You can copy the time zone file in the **/usr/share/zoneinfo/** directory as the **/etc/localtime** file. + + ``` + cp /usr/share/zoneinfo/$primary time zone/$secondary time zone /etc/localtime + ``` + +- If the check result for swap memory status (**A5**) is **Abnormal**, a possible cause is that the swap memory is larger than the physical memory. You can troubleshoot this issue by reducing the swap memory or increasing the physical memory. + +- If the check result for system control parameter status (**A6**) is **Abnormal**, troubleshoot this issue in either of the following two ways: + + - Run the following command: + + ``` + gs_checkos -i B1 + ``` + + - Modify the **/etc/sysctl.conf** file based on the error message and run **sysctl -p** to make it take effect. + + ``` + vim /etc/sysctl.conf + ``` + +- If the check result for file system configuration status (**A7**) is **Abnormal**, run the following command to troubleshoot this issue: + + ``` + gs_checkos -i B2 + ``` + +- If the check result for disk configuration status (**A8**) is **Abnormal**, set the disk mounting format to **rw,noatime,inode64,allocsize=16m**. + + Run the **man mount** command to mount the XFS parameter: + + ``` + rw,noatime,inode64,allocsize=16m + ``` + + You can also set the XFS parameter in the **/etc/fstab** file. For example: + + ``` + /dev/data /data xfs rw,noatime,inode64,allocsize=16m 0 0 + ``` + +- If the check result for pre-read block size status (**A9**) is **Abnormal**, run the following command to troubleshoot this issue: + + ``` + gs_checkos -i B3 + ``` + +- If the check result for I/O scheduling status (**A10**) is **Abnormal**, run the following command to troubleshoot this issue: + + ``` + gs_checkos -i B4 + ``` + +- If the check result for NIC configuration status (**A11**) is **Warning**, run the following command to troubleshoot this issue: + + ``` + gs_checkos -i B5 + ``` + +- If the check result for time consistency status (**A12**) is **Abnormal**, verify that the NTP service has been installed and started and has synchronized time from the NTP clock. 
+ +- If the check result for firewall status (**A13**) is **Abnormal**, disable the firewall. Run the following commands: + + - SUSE: + + ``` + SuSEfirewall2 stop + ``` + + - RedHat7: + + ``` + systemctl disable firewalld + ``` + + - RedHat6: + + ``` + service iptables stop + ``` + +- If the check result for THP service status (**A14**) is **Abnormal**, run the following command to troubleshoot this issue: + + ``` + gs_checkos -i B6 + ``` diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/3-checking-mogdb-health-status.md b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/3-checking-mogdb-health-status.md new file mode 100644 index 0000000000000000000000000000000000000000..a9195068419d5e15303786dd892327848ac2eb98 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/3-checking-mogdb-health-status.md @@ -0,0 +1,645 @@ +--- +title: Checking MogDB Health Status +summary: Checking MogDB Health Status +author: Zhang Cuiping +date: 2021-03-04 +--- + +# Checking MogDB Health Status + +## Check Method + +Use the **gs_check** tool provided by MogDB to check the MogDB health status. + +**Precautions** + +- Only user **root** is authorized to check new nodes added during cluster scale-out. In other cases, the check can be performed only by user **omm**. +- Parameter **-i** or **-e** must be set. **-i** specifies a single item to be checked, and **-e** specifies an inspection scenario where multiple items will be checked. +- If **-i** is not set to a **root** item or no such items are contained in the check item list of the scenario specified by **-e**, you do not need to enter the name or password of user **root**. +- You can run **-skip-root-items** to skip **root** items. +- Check the consistency between the new node and existing nodes. Run the **gs_check** command on an existing node and specify the **-hosts** parameter. The IP address of the new node needs to be written into the **hosts** file. + +**Procedure** + +Method 1: + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Run the following command to check the MogDB database status: + + ```bash + gs_check -i CheckClusterState + ``` + + In the command, **-i** indicates the check item and is case-sensitive. The format is **-i CheckClusterState**, **-i CheckCPU** or **-i CheckClusterState,CheckCPU**. + + Checkable items are listed in "Table 1 MogDB status checklist" in "Tool Reference > Server Tools > [gs_check](1-gs_check)". You can create a check item as needed. + +Method 2: + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Run the following command to check the MogDB database health status: + + ```bash + gs_check -e inspect + ``` + + In the command, **-e** indicates the inspection scenario and is case-sensitive. The format is **-e inspect** or **-e upgrade**. + + The inspection scenarios include **inspect** (routine inspection), **upgrade** (inspection before upgrade), **Install** (install inspection ), **binary_upgrade** (inspection before in-place upgrade), **slow_node** (node inspection), **longtime** (time-consuming inspection) and **health** (health inspection). You can create an inspection scenario as needed. + +The MogDB inspection is performed to check MogDB status during MogDB running or to check the environment and conditions before critical operations, such as upgrade or scale-out. 
For details about the inspection items and scenarios, see "Server Tools > gs_check > MogDB status checks" in the *MogDB Tool Reference*. + +**Examples** + +Check result of a single item: + +```bash +perfadm@lfgp000700749:/opt/huawei/perfadm/tool/script> gs_check -i CheckCPU +Parsing the check items config file successfully +Distribute the context file to remote hosts successfully +Start to health check for the cluster. Total Items:1 Nodes:3 + +Checking... [=========================] 1/1 +Start to analysis the check result +CheckCPU....................................OK +The item run on 3 nodes. success: 3 + +Analysis the check result successfully +Success. All check items run completed. Total:1 Success:1 Failed:0 +For more information please refer to /opt/mogdb/tools/script/gspylib/inspection/output/CheckReport_201902193704661604.tar.gz +``` + +Local execution result: + +```bash +perfadm@lfgp000700749:/opt/huawei/perfadm/tool/script> gs_check -i CheckCPU -L + +2017-12-29 17:09:29 [NAM] CheckCPU +2017-12-29 17:09:29 [STD] Check the CPU usage of the host. If the value of idle is greater than 30% and the value of iowait is less than 30%, this item passes the check. Otherwise, this item fails the check. +2017-12-29 17:09:29 [RST] OK + +2017-12-29 17:09:29 [RAW] +Linux 4.4.21-69-default (lfgp000700749) 12/29/17 _x86_64_ + +17:09:24 CPU %user %nice %system %iowait %steal %idle +17:09:25 all 0.25 0.00 0.25 0.00 0.00 99.50 +17:09:26 all 0.25 0.00 0.13 0.00 0.00 99.62 +17:09:27 all 0.25 0.00 0.25 0.13 0.00 99.37 +17:09:28 all 0.38 0.00 0.25 0.00 0.13 99.25 +17:09:29 all 1.00 0.00 0.88 0.00 0.00 98.12 +Average: all 0.43 0.00 0.35 0.03 0.03 99.17 +``` + +Check result of a scenario: + +```bash +[perfadm@SIA1000131072 Check]$ gs_check -e inspect +Parsing the check items config file successfully +The below items require root privileges to execute:[CheckBlockdev CheckIOrequestqueue CheckIOConfigure CheckCheckMultiQueue CheckFirewall CheckSshdService CheckSshdConfig CheckCrondService CheckBootItems CheckFilehandle CheckNICModel CheckDropCache] +Please enter root privileges user[root]:root +Please enter password for user[root]: +Please enter password for user[root] on the node[10.244.57.240]: +Check root password connection successfully +Distribute the context file to remote hosts successfully +Start to health check for the cluster. Total Items:57 Nodes:2 + +Checking... [ ] 21/57 +Checking... [=========================] 57/57 +Start to analysis the check result +CheckClusterState...........................OK +The item run on 2 nodes. success: 2 + +CheckDBParams...............................OK +The item run on 1 nodes. success: 1 + +CheckDebugSwitch............................OK +The item run on 2 nodes. success: 2 + +CheckDirPermissions.........................OK +The item run on 2 nodes. success: 2 + +CheckReadonlyMode...........................OK +The item run on 1 nodes. success: 1 + +CheckEnvProfile.............................OK +The item run on 2 nodes. success: 2 (consistent) +The success on all nodes value: +GAUSSHOME /usr1/mogdb/app +LD_LIBRARY_PATH /usr1/mogdb/app/lib +PATH /usr1/mogdb/app/bin + + +CheckBlockdev...............................OK +The item run on 2 nodes. success: 2 + +CheckCurConnCount...........................OK +The item run on 1 nodes. success: 1 + +CheckCursorNum..............................OK +The item run on 1 nodes. success: 1 + +CheckPgxcgroup..............................OK +The item run on 1 nodes. 
success: 1 + +CheckDiskFormat.............................OK +The item run on 2 nodes. success: 2 + +CheckSpaceUsage.............................OK +The item run on 2 nodes. success: 2 + +CheckInodeUsage.............................OK +The item run on 2 nodes. success: 2 + +CheckSwapMemory.............................OK +The item run on 2 nodes. success: 2 + +CheckLogicalBlock...........................OK +The item run on 2 nodes. success: 2 + +CheckIOrequestqueue.....................WARNING +The item run on 2 nodes. warning: 2 +The warning[host240,host157] value: +On device (vdb) 'IO Request' RealValue '256' ExpectedValue '32768' +On device (vda) 'IO Request' RealValue '256' ExpectedValue '32768' + +CheckMaxAsyIOrequests.......................OK +The item run on 2 nodes. success: 2 + +CheckIOConfigure............................OK +The item run on 2 nodes. success: 2 + +CheckMTU....................................OK +The item run on 2 nodes. success: 2 (consistent) +The success on all nodes value: +1500 + +CheckPing...................................OK +The item run on 2 nodes. success: 2 + +CheckRXTX...................................NG +The item run on 2 nodes. ng: 2 +The ng[host240,host157] value: +NetWork[eth0] +RX: 256 +TX: 256 + + +CheckNetWorkDrop............................OK +The item run on 2 nodes. success: 2 + +CheckMultiQueue.............................OK +The item run on 2 nodes. success: 2 + +CheckEncoding...............................OK +The item run on 2 nodes. success: 2 (consistent) +The success on all nodes value: +LANG=en_US.UTF-8 + +CheckFirewall...............................OK +The item run on 2 nodes. success: 2 + +CheckKernelVer..............................OK +The item run on 2 nodes. success: 2 (consistent) +The success on all nodes value: +3.10.0-957.el7.x86_64 + +CheckMaxHandle..............................OK +The item run on 2 nodes. success: 2 + +CheckNTPD...................................OK +host240: NTPD service is running, 2020-06-02 17:00:28 +host157: NTPD service is running, 2020-06-02 17:00:06 + + +CheckOSVer..................................OK +host240: The current OS is centos 7.6 64bit. +host157: The current OS is centos 7.6 64bit. + +CheckSysParams..........................WARNING +The item run on 2 nodes. warning: 2 +The warning[host240,host157] value: +Warning reason: variable 'net.ipv4.tcp_retries1' RealValue '3' ExpectedValue '5'. +Warning reason: variable 'net.ipv4.tcp_syn_retries' RealValue '6' ExpectedValue '5'. + +CheckTHP....................................OK +The item run on 2 nodes. success: 2 + +CheckTimeZone...............................OK +The item run on 2 nodes. success: 2 (consistent) +The success on all nodes value: ++0800 + +CheckCPU....................................OK +The item run on 2 nodes. success: 2 + +CheckSshdService............................OK +The item run on 2 nodes. success: 2 + +Warning reason: UseDNS parameter is not set; expected: no + +CheckCrondService...........................OK +The item run on 2 nodes. success: 2 + +CheckStack..................................OK +The item run on 2 nodes. success: 2 (consistent) +The success on all nodes value: +8192 + +CheckSysPortRange...........................OK +The item run on 2 nodes. success: 2 + +CheckMemInfo................................OK +The item run on 2 nodes. success: 2 (consistent) +The success on all nodes value: +totalMem: 31.260929107666016G + +CheckHyperThread............................OK +The item run on 2 nodes. 
success: 2 + +CheckTableSpace.............................OK +The item run on 1 nodes. success: 1 + +CheckSysadminUser...........................OK +The item run on 1 nodes. success: 1 + + +CheckGUCConsistent..........................OK +All DN instance guc value is consistent. + +CheckMaxProcMemory..........................OK +The item run on 1 nodes. success: 1 + +CheckBootItems..............................OK +The item run on 2 nodes. success: 2 + +CheckHashIndex..............................OK +The item run on 1 nodes. success: 1 + +CheckPgxcRedistb............................OK +The item run on 1 nodes. success: 1 + +CheckNodeGroupName..........................OK +The item run on 1 nodes. success: 1 + +CheckTDDate.................................OK +The item run on 1 nodes. success: 1 + +CheckDilateSysTab...........................OK +The item run on 1 nodes. success: 1 + +CheckKeyProAdj..............................OK +The item run on 2 nodes. success: 2 + +CheckProStartTime.......................WARNING +host157: +STARTED COMMAND +Tue Jun 2 16:57:18 2020 /usr1/dmuser/dmserver/metricdb1/server/bin/mogdb --single_node -D /usr1/dmuser/dmb1/data -p 22204 +Mon Jun 1 16:15:15 2020 /usr1/mogdb/app/bin/mogdb -D /usr1/mogdb/data/dn1 -M standby + + +CheckFilehandle.............................OK +The item run on 2 nodes. success: 2 + +CheckRouting................................OK +The item run on 2 nodes. success: 2 + +CheckNICModel...............................OK +The item run on 2 nodes. success: 2 (consistent) +The success on all nodes value: +version: 1.0.1 +model: Red Hat, Inc. Virtio network device + + +CheckDropCache..........................WARNING +The item run on 2 nodes. warning: 2 +The warning[host240,host157] value: +No DropCache process is running + +CheckMpprcFile..............................NG +The item run on 2 nodes. ng: 2 +The ng[host240,host157] value: +There is no mpprc file + +Analysis the check result successfully +Failed. All check items run completed. Total:57 Success:50 Warning:5 NG:2 +For more information please refer to /usr1/mogdb/tool/script/gspylib/inspection/output/CheckReport_inspect611.tar.gz +``` + +## Exception Handling + +Troubleshoot exceptions detected in the inspection by following instructions in this section. + +**Table 1** Check of MogDB running status + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| Check Item | Abnormal Status | Solution |
| :--- | :--- | :--- |
| CheckClusterState (Checks the MogDB status.) | MogDB or MogDB instances are not started. | Run the following command to start MogDB and instances:<br/>`gs_om -t start` |
| CheckClusterState (Checks the MogDB status.) | The status of MogDB or MogDB instances is abnormal. | Check the status of hosts and instances and troubleshoot this issue based on the status information:<br/>`gs_check -i CheckClusterState` |
| CheckDBParams (Checks database parameters.) | Database parameters have incorrect values. | Use the gs_guc tool to set the parameters to specified values. |
| CheckDebugSwitch (Checks debug logs.) | The log level is incorrect. | Use the gs_guc tool to set **log_min_messages** to specified content. |
| CheckDirPermissions (Checks directory permissions.) | The permission for a directory is incorrect. | Change the directory permission to a specified value (750 or 700):<br/>`chmod 750 DIR` |
| CheckReadonlyMode (Checks the read-only mode.) | The read-only mode is enabled. | Verify that the usage of the disk where database nodes are located does not exceed the threshold (60% by default) and no other O&M operations are performed:<br/>`gs_check -i CheckDataDiskUsage`<br/>`ps ux`<br/>Then use the gs_guc tool to disable the read-only mode of MogDB:<br/>`gs_guc reload -N all -I all -c 'default_transaction_read_only = off'` |
| CheckEnvProfile (Checks environment variables.) | Environment variables are inconsistent. | Update the environment variable information. |
| CheckBlockdev (Checks pre-read blocks.) | The size of a pre-read block is not 16384 KB. | Use the gs_checkos tool to set the size of the pre-read block to 16384 KB and write the setting into the auto-startup file:<br/>`gs_checkos -i B3` |
| CheckCursorNum (Checks the number of cursors.) | The number of cursors fails to be checked. | Check whether the database is properly connected and whether the MogDB status is normal. |
| CheckPgxcgroup (Checks the data redistribution status.) | There are pgxc_group tables that have not been redistributed. | Proceed with the redistribution:<br/>`gs_expand`, `gs_shrink` |
| CheckDiskFormat (Checks disk configurations.) | Disk configurations are inconsistent between nodes. | Configure disk specifications to be consistent between nodes. |
| CheckSpaceUsage (Checks the disk space usage.) | Disk space is insufficient. | Clear or expand the disk for the directory. |
| CheckInodeUsage (Checks the disk index usage.) | Disk indexes are insufficient. | Clear or expand the disk for the directory. |
| CheckSwapMemory (Checks the swap memory.) | The swap memory is greater than the physical memory. | Reduce or disable the swap memory. |
| CheckLogicalBlock (Checks logical blocks.) | The size of a logical block is not 512 KB. | Use the gs_checkos tool to set the size of the logical block to 512 KB and write the setting into the auto-startup file:<br/>`gs_checkos -i B4` |
| CheckIOrequestqueue (Checks I/O requests.) | The requested I/O is not 32768. | Use the gs_checkos tool to set the requested I/O to 32768 and write the setting into the auto-startup file:<br/>`gs_checkos -i B4` |
| CheckCurConnCount (Checks the number of current connections.) | The number of current connections exceeds 90% of the allowed maximum number of connections. | Break idle primary database node connections. |
| CheckMaxAsyIOrequests (Checks the maximum number of asynchronous requests.) | The maximum number of asynchronous requests is less than 104857600 or (Number of database instances on the current node x 1048576). | Use the gs_checkos tool to set the maximum number of asynchronous requests to the larger one between 104857600 and (Number of database instances on the current node x 1048576):<br/>`gs_checkos -i B4` |
| CheckMTU (Checks MTU values.) | MTU values are inconsistent between nodes. | Set the MTU value on each node to 1500 or 8192:<br/>`ifconfig eth* MTU 1500` |
| CheckIOConfigure (Checks I/O configurations.) | The I/O mode is not deadline. | Use the gs_checkos tool to set the I/O mode to deadline and write the setting into the auto-startup file:<br/>`gs_checkos -i B4` |
| CheckRXTX (Checks the RX/TX value.) | The NIC RX/TX value is not 4096. | Use the gs_checkos tool to set the NIC RX/TX value to 4096 for MogDB:<br/>`gs_checkos -i B5` |
| CheckPing (Checks whether the network connection is normal.) | There are MogDB IP addresses that cannot be pinged. | Check the network settings, network status, and firewall status between the abnormal IP addresses. |
| CheckNetWorkDrop (Checks the network packet loss rate.) | The network packet loss rate is greater than 1%. | Check the network load and status between the corresponding IP addresses. |
| CheckMultiQueue (Checks the NIC multi-queue function.) | Multiqueue is not enabled for the NIC, and NIC interruptions are not bound to different CPU cores. | Enable multiqueue for the NIC, and bind NIC interruptions to different CPU cores. |
| CheckEncoding (Checks the encoding format.) | Encoding formats are inconsistent between nodes. | Write the same encoding format into /etc/profile for each node:<br/>`echo "export LANG=XXX" >> /etc/profile` |
| CheckActQryCount (Checks the archiving mode.) | The archiving mode is enabled, and the archiving directory is not under the primary database node directory. | Disable archiving mode or set the archiving directory to be under the primary database node directory. |
| CheckFirewall (Checks the firewall.) | The firewall is enabled. | Disable the firewall:<br/>`systemctl disable firewalld.service` |
| CheckKernelVer (Checks kernel versions.) | Kernel versions are inconsistent between nodes. | - |
| CheckMaxHandle (Checks the maximum number of file handles.) | The maximum number of handles is less than 1000000. | Set the soft and hard limits in the 91-nofile.conf or 90-nofile.conf file to 1000000:<br/>`gs_checkos -i B2` |
| CheckNTPD (Checks the time synchronization service.) | The NTPD service is disabled or the time difference is greater than 1 minute. | Enable the NTPD service and set the time to be consistent. |
| CheckSysParams (Checks OS parameters.) | OS parameter settings do not meet requirements. | Use the gs_checkos tool or manually set parameters to values meeting requirements:<br/>`gs_checkos -i B1`<br/>`vim /etc/sysctl.conf` |
| CheckTHP (Checks the THP service.) | The THP service is disabled. | Use the gs_checkos tool to enable the THP service:<br/>`gs_checkos -i B6` |
| CheckTimeZone (Checks time zones.) | Time zones are inconsistent between nodes. | Set time zones to be consistent between nodes:<br/>`cp /usr/share/zoneinfo/$primary time zone/$secondary time zone /etc/localtime` |
| CheckCPU (Checks the CPU.) | CPU usage is high or I/O waiting time is too long. | Upgrade CPUs or improve disk performance. |
| CheckSshdService (Checks the SSHD service.) | The SSHD service is disabled. | Enable the SSHD service and write the setting into the auto-startup file:<br/>`service sshd start`<br/>`echo "service sshd start" >> initFile` |
| CheckSshdConfig (Checks SSHD configurations.) | The SSHD service is incorrectly configured. | Reconfigure the SSHD service:<br/>`PasswordAuthentication=no; MaxStartups=1000; UseDNS=yes; ClientAliveInterval=10800/ClientAliveInterval=0`<br/>Then restart the service:<br/>`service sshd start` |
| CheckCrondService (Checks the Crond service.) | The Crond service is disabled. | Install and enable the Crond service. |
| CheckStack (Checks the stack size.) | The stack size is less than 3072. | Use the gs_checkos tool to set the stack size to 3072 and restart the processes with a smaller stack size:<br/>`gs_checkos -i B2` |
| CheckSysPortRange (Checks OS port configurations.) | OS IP ports are not within the required port range or MogDB ports are within the OS IP port range. | Set the OS IP ports within 26000 to 65535 and set the MogDB ports beyond the OS IP port range:<br/>`vim /etc/sysctl.conf` |
| CheckMemInfo (Checks the memory information.) | Memory sizes are inconsistent between nodes. | Use physical memory of the same specifications between nodes. |
| CheckHyperThread (Checks the hyper-threading.) | The CPU hyper-threading is disabled. | Enable the CPU hyper-threading. |
| CheckTableSpace (Checks tablespaces.) | The tablespace path is nested with the MogDB path or nested with the path of another tablespace. | Migrate tablespace data to the tablespace with a valid path. |
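After applying a fix, the affected items can be re-checked individually before repeating the full inspection. A minimal sketch, assuming the WARNING and NG items reported in the earlier sample output:

```bash
# Re-run only the items that previously reported WARNING or NG
# (comma-separated item names, as described in the Check Method section)
gs_check -i CheckIOrequestqueue,CheckSysParams,CheckRXTX

# Repeat the full routine inspection once the individual items pass
gs_check -e inspect
```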
+ +## Querying Status + +### Background + +MogDB allows you to view the status of the entire MogDB. The query result shows whether the database or a single host is running properly. + +### Prerequisites + +The database has started. + +### Procedure + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Run the following command to query the database status: + + ```bash + $ gs_om -t status --detail + ``` + + Table 1 describes parameters in the query result. + + To query the instance status on a host, add **-h** to the command. For example: + + ```bash + $ gs_om -t status -h plat2 + ``` + + **plat2** indicates the name of the host to be queried. + +### Parameter Description + +**Table 1** Node role description + +| Field | Description | Value | +| :------------ | :----------------------------------------------------------- | :----------------------------------------------------------- | +| cluster_state | The database status, which indicates whether the entire database is running properly. | **Normal**: The database is available and the data has redundancy backup. All the processes are running and the primary/standby relationship is normal.**Unavailable**: The database is unavailable.**Degraded**: The database is available and faulty database nodes and primary database nodes exist. | +| node | Host name. | Specifies the name of the host where the instance is located. If multiple AZs exist, the AZ IDs will be displayed. | +| node_ip | Host IP Address. | Specifies the IP address of the host where the instance is located. | +| instance | Instance ID. | Specifies the instance ID. | +| state | Instance role | **Normal**: a single host instance.**Primary**: The instance is a primary instance.**Standby**: The instance is a standby instance.**Cascade Standby**: The instance is a cascaded standby instance.**Secondary**: The instance is a secondary instance.**Pending**: The instance is in the quorum phase.**Unknown**: The instance status is unknown.**Down**: The instance is down.**Abnormal**: The node is abnormal.**Manually stopped**: The node has been manually stopped. | + +Each role has different states, such as startup and connection. The states are described as follows: + +**Table 2** Node state description + +| State | Description | +| :------------- | :----------------------------------------------------------- | +| Normal | The node starts up normally. | +| Need repair | The node needs to be restored. | +| Starting | The node is starting up. | +| Wait promoting | The node is waiting for upgrade. For example, after the standby node sends an upgrade request to the primary node, the standby node is waiting for the response from the primary node. | +| Promoting | The standby node is being upgraded to the primary node. | +| Demoting | The node is being downgraded, for example, the primary node is being downgraded to the standby node. | +| Building | The standby node fails to be started and needs to be rebuilt. | +| Catchup | The standby node is catching up with the primary node. | +| Coredump | The node program breaks down. | +| Unknown | The node status is unknown. | + +If a node is in **Need repair** state, you need to rebuild the node to restore it. Generally, the reasons for rebuilding a node are as follows: + +**Table 3** Node rebuilding causes + +| State | Description | +| :-------------------- | :----------------------------------------------------------- | +| Normal | The node starts up normally. 
| +| WAL segment removed | WALs of the primary node do not exist, and logs of the standby node are later than those of the primary node. | +| Disconnect | Standby node cannot be connected to the primary node. | +| Version not matched | The binary versions of the primary and standby nodes are inconsistent. | +| Mode not matched | Nodes do not match the primary and standby roles. For example, two standby nodes are connected. | +| System id not matched | The database system IDs of the primary and standby nodes are inconsistent. The system IDs of the primary and standby nodes must be the same. | +| Timeline not matched | The log timelines are inconsistent. | +| Unknown | Unknown cause. | + +## Examples + +View the database status details, including instance status. + +```bash +$ gs_om -t status --detail +[ Cluster State ] + +cluster_state : Normal +redistributing : No +current_az : AZ_ALL + +[ Datanode State ] + + node node_ip port instance state +----------------------------------------------------------------------------------------------------- +1 pekpopgsci00235 10.244.62.204 5432 6001 /opt/mogdb/cluster/data/dn1 P Primary Normal +2 pekpopgsci00238 10.244.61.81 5432 6002 /opt/mogdb/cluster/data/dn1 S Standby Normal +``` diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/4-checking-database-performance.md b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/4-checking-database-performance.md new file mode 100644 index 0000000000000000000000000000000000000000..716235a4f6ecfa79477eb0bc7bfb364be6feed21 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/4-checking-database-performance.md @@ -0,0 +1,83 @@ +--- +title: 日常运维 +summary: 日常运维 +author: Zhang Cuiping +date: 2021-03-04 +--- + +# Checking Database Performance + +## Check Method + +Use the **gs_checkperf** tool provided by MogDB to check hardware performance. + +**Prerequisites** + +- MogDB is running properly. +- Services are running properly on the database. + +**Procedure** + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Run the following command to check the MogDB database performance: + + ``` + gs_checkperf + ``` + +For details about performance statistical items, see "Table 1 Performance check items" in "Tool Reference > Server Tools > [gs_checkperf](3-gs_checkperf)". + +**Examples** + +Simple performance statistical result is displayed on the screen as follows: + +``` +gs_checkperf -i pmk -U omm +Cluster statistics information: + Host CPU busy time ratio : 1.43 % + MPPDB CPU time % in busy time : 1.88 % + Shared Buffer Hit ratio : 99.96 % + In-memory sort ratio : 100.00 % + Physical Reads : 4 + Physical Writes : 25 + DB size : 70 MB + Total Physical writes : 25 + Active SQL count : 2 + Session count : 3 +``` + +## Exception Handling + +After you use the **gs_checkperf** tool to check the cluster performance, if the performance is abnormal, troubleshoot the issue by following instructions in this section. + +**Table 1** Cluster-level performance status + +| Abnormal Status | Solution | +| ---------------------------------- | ------------------------------------------------------------ | +| High CPU usage of hosts | 1. Add high-performance CPUs, or replace current CPUs with them.2. Run the **top** command to check which system processes cause high CPU usage, and run the **kill** command to stop unused processes.
`top` | +| High CPU usage of MogDB Kernel | 1. Add high-performance CPUs, or replace current CPUs with them.
2. Run the **top** command to check which database processes cause high CPU usage, and run the **kill** command to stop unused processes.
`top`
3. Use the **gs_expand** tool to add new hosts to lower the CPU usage. | +| Low hit ratio of the shared memory | 1. Expand the memory.
2. Run the following command to open the OS configuration file **/etc/sysctl.conf** and increase the value of **kernel.shmmax**.
`vim /etc/sysctl.conf` | +| Low in-memory sort ratio | Expand the memory. | +| High I/O and disk usage | 1. Replace current disks with high-performance ones.
2. Adjust the data layout to evenly distribute I/O requests to all the physical disks.
3. Run **VACUUM FULL** for the entire database.
`vacuum full;`
4. Clean up the disk space.
5. Reduce the number of concurrent connections. | +| Transaction statistics | Query the **pg_stat_activity** system catalog and disconnect unnecessary connections. (Log in to the database and run the **mogdb=# \d+ pg_stat_activity;** command.) | + +**Table 2** Node-level performance status + +| Abnormal Status | Solution | +| ----------------- | ------------------------------------------------------------ | +| High CPU usage | 1. Add high-performance CPUs, or replace current CPUs with them.
2. Run the **top** command to check which system processes cause high CPU usage, and run the **kill** command to stop unused processes.
`top` | +| High memory usage | Expand or clean up the memory. | +| High I/O usage | 1. Replace current disks with high-performance ones.
2. Clean up the disk space.
3. Use memory read/write to replace as much disk I/O as possible, putting frequently accessed files or data in the memory. | + +**Table 3** Session/process-level performance status + +| Abnormal Status | Solution | +| ------------------------------- | ------------------------------------------------------------ | +| High CPU, memory, and I/O usage | Check which processes cause high CPU, memory, or I/O usage. If they are unnecessary processes, kill them; otherwise, analyze the specific cause of high usage. For example, if SQL statement execution occupies much memory, check whether the SQL statements need optimization. | + +**Table 4** SSD performance status + +| Abnormal Status | Solution | +| -------------------- | ------------------------------------------------------------ | +| SSD read/write fault | Run the following command to check whether SSD is faulty. If yes, analyze the specific cause.
`gs_checkperf -i SSD -U omm` | diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/5-checking-and-deleting-logs.md b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/5-checking-and-deleting-logs.md new file mode 100644 index 0000000000000000000000000000000000000000..92358a20f3da94e394a7aa98ac21cd2daf2bb51d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/5-checking-and-deleting-logs.md @@ -0,0 +1,160 @@ +--- +title: Checking and Deleting Logs +summary: Checking and Deleting Logs +author: Zhang Cuiping +date: 2021-03-04 +--- + +# Checking and Deleting Logs + +You are advised to check OS logs and database run logs monthly for monitoring system status and troubleshooting, and to delete database run logs monthly for saving disk space. + +## Checking OS Logs + +You are advised to monthly check OS logs to detect and prevent potential OS problems. + +**Procedure** + +Run the following command to check OS log files: + +``` +vim /var/log/messages +``` + +(Pay attention to words like **kernel**, **error**, and **fatal** in logs generated within the last month and handle the problems based on the alarm information.) + +## Checking MogDB Run Logs + +A database can still run when errors occur during the execution of some operations. However, data may be inconsistent before and after the error occurrences. Therefore, you are advised to monthly check MogDB run logs to detect potential problems in time. + +**Prerequisites** + +- The host used for collecting logs is running properly, and the network connection is normal. Database installation users trust each other. +- An OS tool (for example, **gstack**) that the log collection tool requires has been installed. If it is not installed, an error message is displayed, and this collection item is skipped. + +**Procedure** + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Run the following command to collect database logs: + + ``` + gs_collector --begin-time="20160616 01:01" --end-time="20160616 23:59" + ``` + + In the command, **20160616 01:01** indicates the start time of the log and **20160616 23:59** indicates the end time of the log. + +3. Based on command output in [2](#2), access the related log collection directory, decompress collected database logs, and check these logs. + + Assume that collected logs are stored in **/opt/mogdb/tmp/gaussdba_mppdb/collector_20160726_105158.tar.gz**. + + ``` + tar -xvzf /opt/mogdb/tmp/gaussdba_mppdb/collector_20160726_105158.tar.gz + cd /opt/mogdb/tmp/gaussdba_mppdb/collector_20160726_105158 + ``` + +**Examples** + +- Run the **gs_collector** command together with parameters **-begin-time** and **-end-time**: + + ```bash + gs_collector --begin-time="20160616 01:01" --end-time="20160616 23:59" + ``` + + If information similar to the following is displayed, the logs have been archived: + + ``` + Successfully collected files + All results are stored in /tmp/gaussdba_mppdb/collector_20160616_175615.tar.gz. + ``` + +- Run the **gs_collector** command together with parameters **-begin-time**, **-end-time**, and **-h**: + + ```bash + gs_collector --begin-time="20160616 01:01" --end-time="20160616 23:59" -h plat2 + ``` + + If information similar to the following is displayed, the logs have been archived: + + ``` + Successfully collected files + All results are stored in /tmp/gaussdba_mppdb/collector_20160616_190225.tar.gz. 
+ ``` + +- Run the **gs_collector** command together with parameters **-begin-time**, **-end-time**, and **-f**: + + ```bash + gs_collector --begin-time="20160616 01:01" --end-time="20160616 23:59" -f /opt/software/mogdb/output + ``` + + If information similar to the following is displayed, the logs have been archived: + + ``` + Successfully collected files + All results are stored in /opt/software/mogdb/output/collector_20160616_190511.tar.gz. + ``` + +- Run the **gs_collector** command together with parameters **-begin-time**, **-end-time**, and **-keyword**: + + ```bash + gs_collector --begin-time="20160616 01:01" --end-time="20160616 23:59" --keyword="os" + ``` + + If information similar to the following is displayed, the logs have been archived: + + ``` + Successfully collected files. + All results are stored in /tmp/gaussdba_mppdb/collector_20160616_190836.tar.gz. + ``` + +- Run the **gs_collector** command together with parameters **-begin-time**, **-end-time**, and **-o**: + + ```bash + gs_collector --begin-time="20160616 01:01" --end-time="20160616 23:59" -o /opt/software/mogdb/output + ``` + + If information similar to the following is displayed, the logs have been archived: + + ``` + Successfully collected files. + All results are stored in /opt/software/mogdb/output/collector_20160726_113711.tar.gz. + ``` + +- Run the **gs_collector** command together with parameters **-begin-time**, **-end-time**, and **-l** (the file name extension must be .log): + + ```bash + gs_collector --begin-time="20160616 01:01" --end-time="20160616 23:59" -l /opt/software/mogdb/logfile.log + ``` + + If information similar to the following is displayed, the logs have been archived: + + ``` + Successfully collected files. + All results are stored in /opt/software/mogdb/output/collector_20160726_113711.tar.gz. + ``` + +## Cleaning Run Logs + +A large number of run logs will be generated during database running and occupy huge disk space. You are advised to delete expired run logs and retain logs generated within one month. + +**Procedure** + +1. Log in as the OS user **omm** to any host in the MogDB Kernel cluster. + +2. Clean logs. + + a. Back up logs generated over one month ago to other disks. + + b. Access the directory where logs are stored. + + ``` + cd $GAUSSLOG + ``` + + c. Access the corresponding sub-directory and run the following command to delete logs generated one month ago: + + ``` + rm log name + ``` + + The naming convention of a log file is **mogdb-**year*-*month*-*day_**HHMMSS**. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/6-checking-time-consistency.md b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/6-checking-time-consistency.md new file mode 100644 index 0000000000000000000000000000000000000000..52090c929f44951f2426a79667402e65da2c4e05 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/6-checking-time-consistency.md @@ -0,0 +1,52 @@ +--- +title: Checking Time Consistency +summary: Checking Time Consistency +author: Zhang Cuiping +date: 2021-03-04 +--- + +# Checking Time Consistency + +Database transaction consistency is guaranteed by a logical clock and is not affected by OS time. However, OS time inconsistency will lead to problems, such as abnormal backend O&M and monitoring functions. Therefore, you are advised to monthly check time consistency among nodes. + +**Procedure** + +1. Log in as the OS user **omm** to any host in the MogDB Kernel cluster. + +2. 
Create a configuration file for recording each cluster node. (You can specify the *mpphosts* file directory randomly. It is recommended that the file be stored in the **/tmp** directory.) + + ```bash + vim /tmp/mpphosts + ``` + + Add the host name of each node. + + ``` + plat1 + plat2 + plat3 + ``` + +3. Save the configuration file. + + ``` + :wq! + ``` + +4. Run the following command and write the time on each node into the **/tmp/sys_ctl-os1.log** file: + + ``` + for ihost in `cat /tmp/mpphosts`; do ssh -n -q $ihost "hostname;date"; done > /tmp/sys_ctl-os1.log + ``` + +5. Check time consistency between the nodes based on the command output. The time difference should not exceed 30s. + + ``` + cat /tmp/sys_ctl-os1.log + plat1 + Thu Feb 9 16:46:38 CST 2017 + plat2 + Thu Feb 9 16:46:49 CST 2017 + plat3 + Thu Feb 9 16:46:14 CST 2017 + ``` diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/7-checking-the-number-of-application-connections.md b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/7-checking-the-number-of-application-connections.md new file mode 100644 index 0000000000000000000000000000000000000000..f5f77f016e8644dfe1729ae93bf5b9de1bab1ba8 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/7-checking-the-number-of-application-connections.md @@ -0,0 +1,130 @@ +--- +title: 日常运维 +summary: 日常运维 +author: Zhang Cuiping +date: 2021-03-04 +--- + +# Checking the Number of Application Connections + +If the number of connections between applications and the database exceeds the maximum value, new connections cannot be established. You are advised to daily check the number of connections, release idle connections in time, or increase the allowed maximum number of connections. + +**Procedure** + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Run the following command to connect to the database: + + ``` + gsql -d mogdb -p 8000 + ``` + + **mogdb** is the name of the database to be connected, and **8000** is the port number of the database primary node. + + If information similar to the following is displayed, the connection succeeds: + + ``` + gsql ((MogDB x.x.x build 56189e20) compiled at 2022-01-07 18:47:53 commit 0 last mr ) + Non-SSL connection (SSL connection is recommended when requiring high-security) + Type "help" for help. + + mogdb=# + ``` + +3. Run the following SQL statement to check the number of connections: + + ``` + mogdb=# SELECT count(*) FROM (SELECT pg_stat_get_backend_idset() AS backendid) AS s; + ``` + + Information similar to the following is displayed. **2** indicates that two applications are connected to the database. + + ``` + count + ------- + 2 + (1 row) + ``` + +4. View the allowed maximum connections. + + ``` + mogdb=# SHOW max_connections; + ``` + + Information similar to the following is displayed. **200** indicates the currently allowed maximum number of connections. + + ``` + max_connections + ----------------- + 200 + (1 row) + ``` + +## Exception Handling + +If the number of connections in the command output is close to the value of **max_connections** of the database, delete existing connections or change the upper limit based on site requirements. + +1. Run the following SQL statement to view information about connections whose **state** is set to **idle**, and **state_change** column is not updated for a long time. 
+ + ``` + mogdb=# SELECT * FROM pg_stat_activity where state='idle' order by state_change; + ``` + + Information similar to the following is displayed: + + ``` + datid | datname | pid | usesysid | usename | application_name | client_addr + | client_hostname | client_port | backend_start | xact_start | quer + y_start | state_change | waiting | enqueue | state | resource_pool + | query + -------+----------+-----------------+----------+----------+------------------+--------------- + -+-----------------+-------------+-------------------------------+------------+-------------- + -----------------+-------------------------------+---------+---------+-------+--------------- + +---------------------------------------------- + 13626 | mogdb | 140390162233104 | 10 | gaussdba | | + | | -1 | 2016-07-15 14:08:59.474118+08 | | 2016-07-15 14 + :09:04.496769+08 | 2016-07-15 14:09:04.496975+08 | f | | idle | default_pool + | select count(group_name) from pgxc_group; + 13626 | mogdb | 140390132872976 | 10 | gaussdba | cn_5002 | 10.180.123.163 + | | 48614 | 2016-07-15 14:11:16.014871+08 | | 2016-07-15 14 + :21:17.346045+08 | 2016-07-15 14:21:17.346095+08 | f | | idle | default_pool + | SET SESSION AUTHORIZATION DEFAULT;RESET ALL; + (2 rows) + ``` + +2. Release idle connections. + + Check each connection and release them after obtaining approval from the users of the connections. Run the following SQL command to release a connection using **pid** obtained in the previous step: + + ``` + mogdb=# SELECT pg_terminate_backend(140390132872976); + ``` + + Information similar to the following is displayed: + + ``` + mogdb=# SELECT pg_terminate_backend(140390132872976); + pg_terminate_backend + ---------------------- + t + (1 row) + ``` + + If no connections can be released, go to the next step. + +3. Increase the maximum number of connections. + + ``` + gs_guc set -D /mogdb/data/dbnode -c "max_connections= 800" + ``` + + **800** is the new maximum value. + +4. Restart database services to make the new settings take effect. + + > **NOTE:** The restart results in operation interruption. Properly plan the restart to avoid affecting users. + + ``` + gs_om -t stop && gs_om -t start + ``` diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/8-routinely-maintaining-tables.md b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/8-routinely-maintaining-tables.md new file mode 100644 index 0000000000000000000000000000000000000000..e99aeb3acb4adde43e4ac875e1a4a08978bd6b58 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/8-routinely-maintaining-tables.md @@ -0,0 +1,111 @@ +--- +title: 日常运维 +summary: 日常运维 +author: Zhang Cuiping +date: 2021-03-04 +--- + +# Routinely Maintaining Tables + +To ensure proper database running, after insert and delete operations, you need to routinely run **VACUUM FULL** and **ANALYZE** as appropriate for customer scenarios and update statistics to obtain better performance. + +**Related Concepts** + +You need to routinely run **VACUUM**, **VACUUM FULL**, and **ANALYZE** to maintain tables, because: + +- **VACUUM FULL** can be used to reclaim disk space occupied by updated or deleted data and combine small-size data files. +- **VACUUM** can be used to maintain a visualized mapping for each table to track pages that contain arrays visible to other active transactions. A common index scan uses the mapping to obtain the corresponding arrays and check whether the arrays are visible to the current transaction. 
If the arrays cannot be obtained, capture a batch of arrays to check the visibility. Therefore, updating the visualized mapping of a table can accelerate unique index scans. +- Running **VACUUM** can avoid original data loss caused by duplicate transaction IDs when the number of executed transactions exceeds the database threshold. +- **ANALYZE** can be used to collect statistics on tables in databases. The statistics are stored in the system catalog **PG_STATISTIC**. Then the query optimizer uses the statistics to work out the most efficient execution plan. + +**Procedure** + +1. Run the **VACUUM** or **VACUUM FULL** command to reclaim disk space. + + - **VACUUM**: + + Run **VACUUM** for a table. + + ``` + mogdb=# VACUUM customer; + ``` + + ``` + VACUUM + ``` + + This statement can be concurrently executed with database operation commands, including **SELECT**, **INSERT**, **UPDATE**, and **DELETE**; excluding **ALTER TABLE**. + + Run **VACUUM** for the table partition. + + ``` + mogdb=# VACUUM customer_par PARTITION ( P1 ); + ``` + + ``` + VACUUM + ``` + + - **VACUUM FULL**: + + ``` + mogdb=# VACUUM FULL customer; + ``` + + ``` + VACUUM + ``` + + During the command running, exclusive locks need to be added to the table and all other database operations need to be suspended. + +2. Run **ANALYZE** to update statistics. + + ``` + mogdb=# ANALYZE customer; + ``` + + ``` + ANALYZE + ``` + + Run **ANALYZE VERBOSE** to update statistics and display table information. + + ``` + mogdb=# ANALYZE VERBOSE customer; + ``` + + ``` + ANALYZE + ``` + + You can run **VACUUM ANALYZE** at the same time to optimize the query. + + ``` + mogdb=# VACUUM ANALYZE customer; + ``` + + ``` + VACUUM + ``` + + > **NOTE:** **VACUUM** and **ANALYZE** cause a substantial increase in I/O traffic, which may affect other active sessions. Therefore, you are advised to set the cost-based vacuum delay feature by specifying the **vacuum_cost_delay** parameter. For details, see "GUC Parameters > Resource Consumption > Cost-based Vacuum Delay" in the *Developer Guide*. + +3. Delete a table. + + ``` + mogdb=# DROP TABLE customer; + mogdb=# DROP TABLE customer_par; + mogdb=# DROP TABLE part; + ``` + + If the following information is displayed, the tables have been deleted: + + ``` + DROP TABLE + ``` + +**Maintenance Suggestions** + +- Routinely run **VACUUM FULL** for large tables. If the database performance deteriorates, run **VACUUM FULL** for the entire database. If the database performance is stable, you are advised to run **VACUUM FULL** monthly. +- Routinely run **VACUUM FULL** on system catalogs, especially **PG_ATTRIBUTE**. +- Enable automatic vacuum processes (**AUTOVACUUM**) in the system. The processes automatically run the **VACUUM** and **ANALYZE** statements to reclaim the record space marked as the deleted state and update statistics in the table. 
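To complement the last suggestion, the following is a minimal sketch of how you might confirm and enable automatic vacuuming. The port, database name, and data directory are placeholders reused from other examples in this guide and must be replaced with your own values.

```bash
# Check whether automatic vacuum is currently enabled (placeholder port/database).
gsql -d mogdb -p 8000 -c "SHOW autovacuum;"

# If it is off, enable it; the data directory below is only an example path.
gs_guc reload -D /mogdb/data/dbnode -c "autovacuum=on"
```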
diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/9-routinely-recreating-an-index.md b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/9-routinely-recreating-an-index.md new file mode 100644 index 0000000000000000000000000000000000000000..cfa38546f63ac7c8e1dd8d508e74e86c15c59190 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/9-routinely-recreating-an-index.md @@ -0,0 +1,68 @@ +--- +title: 日常运维 +summary: 日常运维 +author: Zhang Cuiping +date: 2021-03-04 +--- + +# Routinely Recreating an Index + +## **Background** + +When data deletion is repeatedly performed in the database, index keys will be deleted from the index pages, resulting in index bloat. Recreating an index routinely improves query efficiency. + +The database supports B-tree indexes. Recreating a B-tree index routinely helps improve query efficiency. + +- If a large amount of data is deleted, index keys on the index pages will be deleted. As a result, the number of index pages reduces and index bloat occurs. Recreating an index helps reclaim wasted space. +- In a newly created index, pages with adjacent logical structures tend to have adjacent physical structures. Therefore, a new index achieves a higher access speed than an index that has been updated for multiple times. + +**Methods** + +Use either of the following two methods to recreate an index: + +- Run the **DROP INDEX** statement to delete the index and run the **CREATE INDEX** statement to create an index. + + When you delete an index, a temporary exclusive lock is added in the parent table to block related read/write operations. During index creation, the write operation is locked, whereas the read operation is not locked and can use only sequential scans. + +- Run **REINDEX** to recreate an index. + + - When you run the **REINDEX TABLE** statement to recreate an index, an exclusive lock is added to block related read/write operations. + - When you run the **REINDEX INTERNAL TABLE** statement to recreate an index for a **desc** table (such as column-store **cudesc** table), an exclusive lock is added to block related read/write operations on the table. + +**Procedure** + +Assume the ordinary index **areaS_idx** exists in the **area_id** column of the imported table **areaS**. Use either of the following two methods to recreate an index: + +- Run the **DROP INDEX** statement to delete the index and run the **CREATE INDEX** statement to create an index. + + 1. Delete the index. + + ``` + mogdb=# DROP INDEX areaS_idx; + DROP INDEX + ``` + + 2. Create an index + + ``` + mogdb=# CREATE INDEX areaS_idx ON areaS (area_id); + CREATE INDEX + ``` + +- Run **REINDEX** to recreate an index. + + - Run **REINDEX TABLE** to recreate an index. + + ``` + mogdb=# REINDEX TABLE areaS; + REINDEX + ``` + + - Run **REINDEX INTERNAL TABLE** to recreate an index for a **desc** table (such as column-store **cudesc** table). + + ``` + mogdb=# REINDEX INTERNAL TABLE areaS; + REINDEX + ``` + +> **NOTE:** Before you recreate an index, you can increase the values of **maintenance_work_mem** and **psort_work_mem** to accelerate the index recreation. 
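As a concrete illustration of the note above, the sketch below raises **maintenance_work_mem** only for the current session before rebuilding the example index. The connection parameters and the 1GB value are placeholders, not recommendations.

```bash
# Placeholder connection; enlarge the session's maintenance memory, then rebuild
# the example index. Size the value to the memory your host can actually spare.
gsql -d mogdb -p 8000 << 'EOF'
SET maintenance_work_mem = '1GB';
REINDEX TABLE areaS;
EOF
```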
diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/using-the-gsql-client-for-connection.md b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/using-the-gsql-client-for-connection.md new file mode 100644 index 0000000000000000000000000000000000000000..b63da74edcbfb683a7104550b704f18daa6eb249 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/routine-maintenance/using-the-gsql-client-for-connection.md @@ -0,0 +1,212 @@ +--- +title: Using the gsql Client for Connection +summary: Using the gsql Client for Connection +author: Zhang Cuiping +date: 2021-04-14 +--- + +# Using the gsql Client for Connection + +## Confirming Connection Information + +You can use a client tool to connect a database through the primary node of the database. Before the connection, obtain the IP address of the primary node of the database and the port number of the server where the primary node of the database is deployed. + +1. Log in to the primary node of the database as the OS user **omm**. + +2. Run the **gs_om -t status --detail** command to query instances in the MogDB cluster. + + ```bash + gs_om -t status --detail + + [ Datanode State ] + + node node_ip instance state + --------------------------------------------------------------------------------- + 1 mogdb-kernel-0005 172.16.0.176 6001 /mogdb/data/db1 P Primary Normal + ``` + + For example, the server IP address where the primary node of the database is deployed is 172.16.0.176. The data path of the primary node of the database is **/mogdb/data/db1**. + +3. Confirm the port number of the primary node of the database. + + View the port number in the **postgresql.conf** file in the data path of the primary database node obtained in step 2. The command is as follows: + + ```bash + cat /mogdb/data/db1/postgresql.conf | grep port + + port = 26000 # (change requires restart) + #comm_sctp_port = 1024 # Assigned by installation (change requires restart) + #comm_control_port = 10001 # Assigned by installation (change requires restart) + # supported by the operating system: + # e.g. 'localhost=10.145.130.2 localport=12211 remotehost=10.145.130.3 remoteport=12212, localhost=10.145.133.2 localport=12213 remotehost=10.145.133.3 remoteport=12214' + # e.g. 'localhost=10.145.130.2 localport=12311 remotehost=10.145.130.4 remoteport=12312, localhost=10.145.133.2 localport=12313 remotehost=10.145.133.4 remoteport=12314' + # %r = remote host and port + alarm_report_interval = 10 + support_extended_features=true + ``` + + **26000** in the first line is the port number of the primary database node. + +
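If you only need the port value itself, a small shell sketch like the following extracts it from **postgresql.conf**. The data path is the example one returned by **gs_om** above and will differ in your environment.

```bash
# Print only the active "port" setting; commented lines such as #comm_sctp_port are skipped.
grep -E "^[[:space:]]*port[[:space:]]*=" /mogdb/data/db1/postgresql.conf | awk '{print $3}'
```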
+ +### Installing the gsql Client + +On the host, upload the client tool package and configure environment variables for the **gsql** client. + +1. Log in to the host where the client resides as any user. + +2. Run the following command to create the **/opt/mogdb/tools** directory: + + ```bash + mkdir /opt/mogdb/tools + ``` + +3. Obtain the file **MogDB-x.x.x-openEuler-64bit-tools.tar.gz** from the software installation package and upload it to the **/opt/mogdb/tools** directory. + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > + > - The software package is located where you put it before installation. Set it based on site requirements. + > - The tool package name may vary in different OSs. Select the tool package suitable for your OS. + +4. Run the following commands to decompress the package: + + ```bash + cd /opt/mogdb/tools + tar -zxvf MogDB-x.x.x-openEuler-64bit-tools.tar.gz + ``` + +5. Set environment variables. + + Run the following command to open the **~/.bashrc** file: + + ```bash + vi ~/.bashrc + ``` + + Enter the following content and run **:wq!** to save and exit. + + ```bash + export PATH=/opt/mogdb/tools/bin:$PATH + export LD_LIBRARY_PATH=/opt/mogdb/tools/lib:$LD_LIBRARY_PATH + ``` + +6. Run the following command to make the environment variables take effect: + + ```bash + source ~/.bashrc + ``` + +
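To confirm that the client has been installed and that the environment variables take effect, you can run a quick check similar to the following:

```bash
# The binary should resolve to /opt/mogdb/tools/bin/gsql after sourcing ~/.bashrc.
which gsql
# Print the client version to confirm the tool package is usable.
gsql -V
```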
+ +## Connecting to a Database Using gsql + +
+ +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> +> - By default, if a client is idle state after connecting to a database, the client automatically disconnects from the database in the duration specified by **session_timeout**. To disable the timeout setting, set **session_timeout** to **0**. + +
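As a minimal sketch of that note, the idle-session timeout could be disabled with **gs_guc**; the data directory below is an example and must match your own instance.

```bash
# Disable the idle-session timeout described in the note above (example data directory).
gs_guc reload -D /mogdb/data/db1 -c "session_timeout=0"
```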
+ +### Connecting to a Database Locally + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Connect to a database. + + After the database is installed, a database named **postgres** is generated by default. When connecting to a database for the first time, you can connect to this database. + + Run the following command to connect to the **postgres** database: + + ```bash + gsql -d postgres -p 26000 + ``` + + **postgres** is the name of the database to be connected, and **26000** is the port number of the database primary node. Replace the values as required. + + If information similar to the following is displayed, the connection succeeds: + + ```sql + gsql ((MogDB x.x.x build 56189e20) compiled at 2022-01-07 18:47:53 commit 0 last mr ) + Non-SSL connection (SSL connection is recommended when requiring high-security) + Type "help" for help. + + postgres=# + ``` + + User **omm** is the administrator, and **postgres=#** is displayed. If you log in to and connect to the database as a common user, **postgres=>** is displayed. + + **Non-SSL connection** indicates that the database is not connected in SSL mode. If high security is required, connect to the database in SSL mode. + +3. Exit the database. + + ```sql + postgres=# \q + ``` + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> +> - When connecting to the database locally as user **omm**, no password is required. This is due to the default setting in the **pg_hba.conf** file that allows the local machine to connect in the **trust** way. +> - For details about the client authentication methods, see the [Client Access Authentication](1-client-access-authentication) chapter. + +
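After the local connection works, you can also execute a single statement without entering the interactive shell, which is convenient for scripted checks; the port is the example value used in this section.

```bash
# Run a single statement non-interactively and exit.
gsql -d postgres -p 26000 -c "SELECT version();"
```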
+ +## Connecting to a Database Remotely + +
+ +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> +> - Due to security restrictions, you can not remotely connect to the database as user **omm**. + +
+ +### Configuring a Whitelist Using gs_guc (Update pg_hba.conf) + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Configure the client authentication mode and enable the client to connect to the host as user **jack**. User **omm** cannot be used for remote connection. + + Assume you are to allow the client whose IP address is **172.16.0.245** to access the current host. + + ```sql + gs_guc set -N all -I all -h "host all jack 172.16.0.245/24 sha256" + ``` + + **NOTICE:** + + - Before using user **jack**, connect to the database locally and run the following command in the database to create user **jack**: + + ```sql + postgres=# CREATE USER jack PASSWORD 'Test@123'; + ``` + + - **-N all** indicates all hosts in MogDB. + + - **-I all** indicates all instances on the host. + + - **-h** specifies statements that need to be added in the **pg_hba.conf** file. + + - **all** indicates that a client can connect to any database. + + - **jack** indicates the user that accesses the database. + + - **172.16.0.245/24** indicates that only the client whose IP address is **172.16.0.245** can connect to the host. The specified IP address must be different from those used in MogDB. **24** indicates that there are 24 bits whose value is 1 in the subnet mask. That is, the subnet mask is 255.255.255.0. + + - **sha256** indicates that the password of user **jack** is encrypted using the SHA-256 algorithm. + + This command adds a rule to the **pg_hba.conf** file corresponds to the primary node of the database. The rule is used to authenticate clients that access primary node. + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > + > + For details about the client authentication methods, see the [Client Access Authentication](1-client-access-authentication) chapter. + +3. Connect to a database. + + After the database is installed, a database named **postgres** is generated by default. When connecting to a database for the first time, you can connect to this database. + + ```bash + gsql -d postgres -h 172.16.0.176 -U jack -p 26000 -W Test@123 + ``` + + **postgres** is the name of the database, **172.16.0.176** is the IP address of the server where the primary node of the database resides, **jack** is the user of the database, **26000** is the port number of the CN, and **Test@123** is the password of user **jack**. diff --git a/product/en/docs-mogdb/v3.0/administrator-guide/upgrade-guide.md b/product/en/docs-mogdb/v3.0/administrator-guide/upgrade-guide.md new file mode 100644 index 0000000000000000000000000000000000000000..3bd3f29a6e15d8c7f443f890e691de2148f04833 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/administrator-guide/upgrade-guide.md @@ -0,0 +1,347 @@ +--- +title: Upgrade Guide +summary: Upgrade Guide +author: Zhang Cuiping +date: 2021-09-27 +--- + +# Upgrade Guide + +## Overview + +This document provides guidance on version upgrade and rollback process. It also offers common problem resolving and troubleshooting methods. + +## Intended Audience + +This document is mainly intended for upgrade operators. They must have the following experience and skills: + +- Be familiar with the networking of the current network and versions of related NEs (network elements). +- Have maintenance experience of the related devices and be familiar with their operation and maintenance methods. + +## Upgrade Scheme + +This section provides guidance on selection of the upgrade modes. 
+ +The user determines whether to upgrade the current system according to the new features of MogDB and database situations. + +The supported upgrade modes include in-place upgrade and gray upgrade. The upgrade strategies include major upgrade and minor upgrade. + +After the upgrade mode is determined, the system will automatically determine and choose the suitable upgrade strategy. + +* In-place upgrade: All services need to be stopped during the upgrade. All nodes are upgraded at a time. + +* Gray upgrade: supports full-service operations. All nodes are also upgraded at a time. (Currently, only the gray upgrade from version 1.1.0 to 2.0 and above is supported.) + +## Version Requirements Before the Upgrade (Upgrade Path) + +[Table 1](#biaoyi) lists the MogDB upgrade version requirements. + +**Table 1** Version requirements before the upgrade (upgrade path) + +| Version | Description | +| -------------------- | ------------------------------------------------------------ | +| MogDB 1.1.0 or later | Be able to be upgraded to any version later than MogDB 1.1.0. | + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** You can run the following command to check the version before the upgrade: +> +> ```bash +> gsql -V | --version +> ``` + +## Impact and Constraints + +The following precautions need to be considered during the upgrade: + +- The upgrade cannot be performed with capacity expansion and reduction concurrently. +- VIP (virtual IP) is not supported. +- During the upgrade, you are not allowed to modify the **wal_level**, **max_connections**, **max_prepared_transactions**, and **max_locks_per_transaction** GUC parameters. Otherwise, the instance will be started abnormally after rollback. +- It is recommended that the upgrade is performed when the database system is under the light workload. You can determine the off-peak hours according to your experience, such as holidays and festivals. +- Before the upgrade, make sure that the database is normal. You can run the **gs_om -t status** command to check the database status. If the returned value of **cluster_state** is **Normal**, the database is normal. +- Before the upgrade, make sure that mutual trust is established between database nodes. You can run the **ssh hostname** command on any node to connect to another node to verify whether the mutual trust has been established. If mutual connection between any two nodes does not require a password, the mutual trust is normal. (Generally, when the database status is normal, mutual trust is normal.) +- Before and after the upgrade, the database deployment mode must be kept consistent. Before the upgrade, the database deployment mode will be verified. If it is changed after the upgrade, an error will occur. +- Before the upgrade, make sure that the OS is normal. You can check the OS status using the **gs_checkos** tool. +- In-place upgrade requires stopping of services. Gray upgrade supports full-service operations. +- The database is running normally and the data of the primary domain name (DN) is fully synchronized to the standby DN. +- During the upgrade, the kerberos is not allowed to be enabled. +- You are not allowed to modify the **version.cfg** file decompressed from the installation package. +- During the upgrade, if an error causes upgrade failure, you need to perform rollback operations manually. The next upgrade can be performed only after the rollback is successful. 
+- After the rollback, if the next upgrade is successful, GUC parameters set before the upgrade is submitted will become invalid. +- During the upgrade, you are not allowed to set GUC parameters manually. +- During the gray upgrade, service interruption will occur and lasts less than 10s. +- During the upgrade, OM operations can be performed only when the kernel and OM versions are consistent. This consistency refers that the kernel code and OM code are from the same software package. If the pre-installation script of the installation package is executed but the upgrade is not performed, or the pre-installation script of the baseline package after the rollback is not performed, the kernel code will be inconsistent with the OM code. +- During the upgrade, if new fields are added to a system table but they cannot be found by running the **\d** command after the upgrade, you can run the **select** command to check the new fields. +- The upgrade is allowed only when the value of **enable_stream_replication** is **on**. +- During the gray upgrade, the number of concurrent read/write services must be less than 200. +- If the MOT is used in MogDB 1.1.0, MogDB 1.1.0 cannot be upgraded to MogDB 2.1. + +## Upgrade Process + +This section describes the upgrade process. + +**Figure 1** Upgrade process + +![21](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/upgrade-guide-2.png) + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif)**NOTE:** The time listed in the following table is for reference only. The actual time required depends on the upgrade environment. + +**Table 2** Estimated upgrade efficiency + +
| Procedure | Recommended Start Time | Time Required (Day/Hour/Minute) | Service Interruption Time | Remarks |
| :--- | :--- | :--- | :--- | :--- |
| Perform the pre-upgrade preparations and check operations. | One day before the upgrade | About 2 to 3 hours | No impact on services | Pre-upgrade check, data backup, and software package verification |
| Perform the upgrade. | Off-peak hours | The time is mainly spent in starting and stopping the database and modifying the system table of each database. The upgrade usually takes less than 30 minutes. | The service interruption time is the same as the upgrade time. Generally, the time taken is not greater than 30 minutes. | Performed based on the Upgrade Guide |
| Verify the upgrade. | Off-peak hours | About 30 minutes | The service interruption time is the same as the upgrade verification time, about 30 minutes. | - |
| Submit the upgrade. | Off-peak hours | The upgrade submission usually takes less than 10 minutes. | The service interruption time is the same as the upgrade submission time. Generally, the time taken is not greater than 10 minutes. | - |
| Roll back the upgrade. | Off-peak hours | The rollback usually takes less than 30 minutes. | The service interruption time is the same as the rollback time. Generally, the time taken is not greater than 30 minutes. | - |
+ +## Pre-Upgrade Preparations and Check + +### Pre-Upgrade Preparations and Checklist + +**Table 3** Pre-upgrade preparations and checklist + +
| No. | Item to Be Prepared for the Upgrade | Preparation Content | Recommended Start Time | Time Required (Day/Hour/Minute) |
| :--- | :--- | :--- | :--- | :--- |
| 1 | Collect node information. | Obtain the name, IP address, and passwords of users **root** and **omm** of the related database nodes. | One day before the upgrade | 1 hour |
| 2 | Set remote login as user **root**. | Set the configuration file that allows remote login as user **root**. | One day before the upgrade | 2 hours |
| 3 | Back up data. | For details, see the **Backup and Restoration** section in the *Administrator Guide*. | One day before the upgrade | The time taken varies depending on the volume of data to be backed up and the backup strategy. |
| 4 | Obtain and verify the installation package. | Obtain the installation package and verify the package integrity. | One day before the upgrade | 0.5 hour |
| 5 | Perform the health check. | Check the OS status using the **gs_checkos** tool. | One day before the upgrade | 0.5 hour |
| 6 | Check the disk usage of each database node. | Check the disk usage by running the **df** command. | One day before the upgrade | 0.5 hour |
| 7 | Check the database status. | Check the database status using the **gs_om** tool. | One day before the upgrade | 0.5 hour |
+ +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif)**NOTE:** **Time Required** varies depends on the environment, including data volume, server performance, and other factors. + +### Collecting Node Information + +You can contact the system administrator to obtain the environment information, such as name, IP address, and passwords of users **root** and **omm** of the database node. + +**Table 4** Node information + + + + + + + + + + + + + + + + + + +
| No. | Node Name | IP Address of the Node | Password of User **root** | Password of User **omm** | Remarks |
| :--- | :--- | :--- | :--- | :--- | :--- |
| 1 | - | - | - | - | - |
+ +### Backing Up Data + +Once the upgrade fails, services will be affected. Therefore, you need to back up data in advance so that services can be quickly restored once the risk occurs. + +For details about data backup, see the **Backup and Restoration** section in the *Administrator Guide*. + +### Obtaining the Installation Package + +You can obtain the installation package from [this website](https://www.mogdb.io/en/downloads/mogdb/). + +### Performing the Health Check + +The **gs_checkos** tool can be used to check the OS status. + +**Prerequisites** + +- The current hardware and network environment is normal. +- The mutual trust between the **root** users of all hosts is normal. +- The **gs_checkos** command can be executed only as user **root**. + +**Procedure** + +1. Log in to the primary database node as user **root**. + +2. Run the following command to check the server OS parameters: + + ``` + # gs_checkos -i A + ``` + + Checking the OS parameters aims to ensure that the database can be pre-installed normally and can be run safely and efficiently after being installed. + +#### Checking the Disk Usage of the Database Node + +It is recommended that the upgrade is performed when the disk usage of the database node is less than 80%. + +#### Checking the Database Status + +This section introduces how to check the database status. + +**Procedure** + +1. Log in to the primary database node as user **omm** and run the **source** command to reload environment variables. + + ``` + # su - omm + $ source /home/omm/.bashrc + ``` + +2. Run the following command to check the database status: + + ```bash + gs_om -t status + ``` + +3. Ensure that the database status is normal. + +## Upgrade Procedure + +This section introduces details about in-place upgrade and gray upgrade. + +**Procedure** + +1. Log in to the primary database node as user **root**. + +2. Create a directory for storing the new package. + + ``` + # mkdir -p /opt/software/mogdb_upgrade + ``` + +3. Upload the new package to the **/opt/software/mogdb_upgrade** directory and decompress the package. + +4. Found the **script** file. + + ``` + # cd /opt/software/mogdb_upgrade/script + ``` + +5. Before the in-place or gray upgrade, execute the pre-installation script by running the **gs_preinstall** command. + + ``` + # ./gs_preinstall -U omm -G dbgrp -X /opt/software/mogdb/clusterconfig.xml + ``` + +6. Switch to user **omm**. + + ``` + # su - omm + ``` + +7. After ensuring that the database status is normal, run the required command to perform the in-place upgrade or gray upgrade. + + Example one: Execute the **gs_upgradectl** script to perform the in-place upgrade. + + ```bash + gs_upgradectl -t auto-upgrade -X /opt/software/mogdb/clusterconfig.xml + ``` + + Example two: Execute the **gs_upgradectl** script to perform the gray upgrade. + + ```bash + gs_upgradectl -t auto-upgrade -X /opt/software/mogdb/clusterconfig.xml --grey + ``` + +## Upgrade Verification + +This section introduces upgrade verification and provides detailed use cases and operations. + +### Verifying the Checklist of the Project + +**Table 5** Verification item checklist + +
| No. | Verification Item | Check Standard | Check Result |
| :--- | :--- | :--- | :--- |
| 1 | Version check | Check whether the version is correct after the upgrade. | - |
| 2 | Health check | Use the **gs_checkos** tool to check the OS status. | - |
| 3 | Database status | Use the **gs_om** tool to check the database status. | - |
+ +### Querying the Upgrade Version + +This section introduces how to check the version. + +**Procedure** + +1. Log in to the primary database node as user **omm** and run the **source** command to reload environment variables. + + ``` + # su - omm + $ source /home/omm/.bashrc + ``` + +2. Run the following command to check the version information of all nodes: + + ```bash + gs_ssh -c "gsql -V" + ``` + +### Checking the Database Status + +This section introduces how to check the database status. + +**Procedure** + +1. Log in to the primary database node as user **omm**. + + ``` + # su - omm + ``` + +2. Run the following command to check the database status: + + ```bash + gs_om -t status + ``` + + If the value of **cluster_state** is **Normal**, the database is normal. + +## Upgrade Submission + +After the upgrade, if the verification is successful, the subsequent operation is to submit the upgrade. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** Once the upgrade is submitted, it cannot be rolled back. + +**Procedure** + +1. Log in to the primary database node as user **omm**. + + ``` + # su - omm + ``` + +2. Run the following command to submit the upgrade: + + ```bash + gs_upgradectl -t commit-upgrade -X /opt/software/mogdb/clusterconfig.xml + ``` + +## Version Rollback + +This section introduces how to roll back the upgrade. + +**Procedure** + +1. Log in to the primary database node as user **omm**. + + ``` + # su - omm + ``` + +2. Run the following command to perform the rollback operation (rolling back the kernel code). After the rollback, if you need to keep the versions of the kernel code and OM code consistent, execute the pre-installation script in the old package. (For details, see the [execute the pre-installtion script](#qianzhijiaoben) step.) + + ```bash + gs_upgradectl -t auto-rollback -X /opt/software/mogdb/clusterconfig.xml + ``` + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** If the database is abnormal, run the following command to perform the forcible rollback operation: + > + > ```bash + > gs_upgradectl -t auto-rollback -X /opt/software/mogdb/clusterconfig.xml --force + > ``` + +3. Check the version after the rollback. + + ```bash + gs_om -V | --version + ``` + + If the upgrade fails, perform the following operations to resolve the issue: + + a. Check whether the environment is abnormal. + + For example, the disk is fully occupied, the network is faulty, or the installation package or upgrade version is incorrect. After the problem is located and resolved, try to perform the upgrade again. + + b. If no environment issue is found or the upgrade fails again, collect related logs and contact technical engineers. + + Run the following command to collect logs: + + ```bash + gs_collector -begin-time='20200724 00:00' -end-time='20200725 00:00' + ``` + + If permitted, you are advised to retain the environment. 
diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/1-core-fault-locating/1-core-dump-occurs-due-to-full-disk-space.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/1-core-fault-locating/1-core-dump-occurs-due-to-full-disk-space.md new file mode 100644 index 0000000000000000000000000000000000000000..abd30c482ce2a2bdca8c4fa5d27d1c13ca2b654f --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/1-core-fault-locating/1-core-dump-occurs-due-to-full-disk-space.md @@ -0,0 +1,22 @@ +--- +title: Core Dump Occurs due to Full Disk Space +summary: Core Dump Occurs due to Full Disk Space +author: Guo Huan +date: 2021-05-24 +--- + +# Core Dump Occurs due to Full Disk Space + +## Symptom + +When TPC-C is running, the disk space is full during injection. As a result, a core dump occurs on the MogDB process, as shown in the following figure. + +![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/core-dump-occurs-due-to-full-disk-space.png) + +## Cause Analysis + +When the disk is full, Xlog logs cannot be written. The program exits through the panic log. + +## Procedure + +Externally monitor the disk usage and periodically clean up the disk. diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/1-core-fault-locating/2-core-dump-occurs-due-to-incorrect-settings-of-guc-parameter-log-directory.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/1-core-fault-locating/2-core-dump-occurs-due-to-incorrect-settings-of-guc-parameter-log-directory.md new file mode 100644 index 0000000000000000000000000000000000000000..2a20343f1a3807a7fc5d06ddba0470b1d22c3b53 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/1-core-fault-locating/2-core-dump-occurs-due-to-incorrect-settings-of-guc-parameter-log-directory.md @@ -0,0 +1,20 @@ +--- +title: Core Dump Occurs Due to Incorrect Settings of GUC Parameter log_directory +summary: Core Dump Occurs Due to Incorrect Settings of GUC Parameter log_directory +author: Guo Huan +date: 2021-05-24 +--- + +# Core Dump Occurs Due to Incorrect Settings of GUC Parameter log_directory + +## Symptom + +After the database process is started, a core dump occurs and no log is recorded. + +## Cause Analysis + +The directory specified by GUC parameter **log_directory** cannot be read or you do not have permissions to access this directory. As a result, the verification fails during the database startup, and the program exits through the panic log. + +## Procedure + +Set **log_directory** to a valid directory. For details, see [log_directory](1-logging-destination#log_directory). 
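A minimal recovery sketch, assuming the instance is stopped and the data directory is **/mogdb/data/db1** (replace it with your own): reset **log_directory** to a writable location and start the instance again.

```bash
# Point log_directory back to the default directory inside the data path (example path).
gs_guc set -D /mogdb/data/db1 -c "log_directory='pg_log'"
# Start the instance again after the parameter has been corrected.
gs_om -t start
```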
diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/1-core-fault-locating/3-core-dump-occurs-when-removeipc-is-enabled.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/1-core-fault-locating/3-core-dump-occurs-when-removeipc-is-enabled.md new file mode 100644 index 0000000000000000000000000000000000000000..cb04b6b8901602d143d16484b6965a77c0740eaf --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/1-core-fault-locating/3-core-dump-occurs-when-removeipc-is-enabled.md @@ -0,0 +1,24 @@ +--- +title: Core Dump Occurs when RemoveIPC Is Enabled +summary: Core Dump Occurs when RemoveIPC Is Enabled +author: Guo Huan +date: 2021-05-24 +--- + +# Core Dump Occurs when RemoveIPC Is Enabled + +## Symptom + +The **RemoveIPC** parameter in the OS configuration is set to **yes**. The database breaks down during running, and the following log information is displayed: + +``` +FATAL: semctl(1463124609, 3, SETVAL, 0) failed: Invalid argument +``` + +## Cause Analysis + +If **RemoveIPC** is set to **yes**, the OS deletes the IPC resources (shared memory and semaphore) when the corresponding user exits. As a result, the IPC resources used by the MogDB server are cleared, causing the database to break down. + +## Procedure + +Set **RemoveIPC** to **no**. For details, see [Modifying OS Configuration](3-modifying-os-configuration) in the *Installation Guide*. diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/1-core-fault-locating/4-core-dump-occurs-after-installation-on-x86.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/1-core-fault-locating/4-core-dump-occurs-after-installation-on-x86.md new file mode 100644 index 0000000000000000000000000000000000000000..0258da542e7bfc06585c3b580c02d35d6b3e2a8d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/1-core-fault-locating/4-core-dump-occurs-after-installation-on-x86.md @@ -0,0 +1,26 @@ +--- +title: Core Dump Occurs After Installation on x86 +summary: Core Dump Occurs After Installation on x86 +author: Guo Huan +date: 2021-12-09 +--- + +# Core Dump Occurs After Installation on x86 + +## Symptom + +The core dump occurs after the installation of MogDB on x86 architecture machine is completed, and the error in the following figure displays + +![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/core-dump-occurs-after-installation-on-x86-1.png) + +## Cause Analysis + +The x86 architecture does not include the rdtscp instruction set, and the deployment of MogDB fails. This problem is common in the case of virtualized installation of Linux server on local windows system, but the virtualization version is too low. + +## Procedure + +Run the `lscpu | grep rdtscp` command to see if the rdtscp instruction set is supported. + +![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/administrator-guide/core-dump-occurs-after-installation-on-x86-2.png) + +Support for this instruction set can be enabled through the host's admin side settings. Set the cloud host CPU mode to **host-passthrough**, and then reboot. 
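If you prefer, the check can be wrapped in a small script and run before installation; this is only a convenience sketch around the **lscpu** command shown above.

```bash
# Report whether the CPU exposes the rdtscp instruction required by MogDB.
if lscpu | grep -qw rdtscp; then
    echo "rdtscp available: this host can run MogDB"
else
    echo "rdtscp missing: switch the VM CPU mode (for example, host-passthrough) and reboot"
fi
```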
diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/10-disk-space-usage-reaches-the-threshold.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/10-disk-space-usage-reaches-the-threshold.md new file mode 100644 index 0000000000000000000000000000000000000000..0e7ce312879f0bd51cc8211964064716025083a8 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/10-disk-space-usage-reaches-the-threshold.md @@ -0,0 +1,58 @@ +--- +title: Disk Space Usage Reaches the Threshold and the Database Becomes Read-only +summary: Disk Space Usage Reaches the Threshold and the Database Becomes Read-only +author: Guo Huan +date: 2021-05-24 +--- + +# Disk Space Usage Reaches the Threshold and the Database Becomes Read-only + +## Symptom + +The following error is reported when a non-read-only SQL statement is executed: + +``` +ERROR: cannot execute %s in a read-only transaction. +``` + +An error is reported when some non-read-only SQL statements (such as insert, update, create table as, create index, alter table, and copy from) are executed: + +``` +canceling statement due to default_transaction_read_only is on. +``` + +## Cause Analysis + +After the disk space usage reaches the threshold, the database enters the read-only mode. In this mode, only read-only statements can be executed. + +## Procedure + +1. Use either of the following methods to connect to the database in maintenance mode: + + - Method 1 + + ``` + gsql -d mogdb -p 8000 -r -m + ``` + + - Method 2 + + ``` + gsql -d mogdb -p 8000 -r + ``` + + After the connection is successful, run the following command: + + ``` + set xc_maintenance_mode=on; + ``` + +2. Run the **DROP** or **TRUNCATE** statement to delete user tables that are no longer used until the disk space usage falls below the threshold. + + Deleting user tables can only temporarily relieve the insufficient disk space. To permanently solve the problem, expand the disk space. + +3. Disable the read-only mode of the database as user **omm**. + + ``` + gs_guc reload -D /mogdb/data/dbnode -c "default_transaction_read_only=off" + ``` diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/11-slow-response-to-a-query-statement.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/11-slow-response-to-a-query-statement.md new file mode 100644 index 0000000000000000000000000000000000000000..822bcaccf015e99f44753af86dfd2b5e5f78993c --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/11-slow-response-to-a-query-statement.md @@ -0,0 +1,48 @@ +--- +title: Slow Response to a Query Statement +summary: Slow Response to a Query Statement +author: Guo Huan +date: 2021-05-24 +--- + +# Slow Response to a Query Statement + +## Symptom + +After a query statement has been executed, no response is returned for a long time. + +## Cause Analysis + +- The query statement is complex and requires a long time for execution. +- The query statement is blocked. + +## Procedure + +1. Log in to the host as the OS user **omm**. + +2. Run the following command to connect to the database: + + ``` + gsql -d mogdb -p 8000 + ``` + + **mogdb** is the name of the database, and **8000** is the port number. + +3. Check for the query statements that are executed for a long time in the system. 
+ + ``` + SELECT timestampdiff(minutes, query_start, current_timestamp) AS runtime, datname, usename, query FROM pg_stat_activity WHERE state != 'idle' ORDER BY 1 desc; + ``` + + Query statements are returned, sorted by execution time length in descending order. The first record is the query statement that takes the long time for execution. + + Alternatively, you can use the [TIMESTAMPDIFF](8-date-and-time-processing-functions-and-operators) function to set **current_timestamp** and **query_start** to be greater than a threshold to identify query statements that are executed for a duration longer than this threshold. The first parameter of **timestampdiff** is the time difference unit. For example, execute the following statement to query the statements whose execution lasts more than 2 minutes: + + ``` + SELECT query FROM pg_stat_activity WHERE timestampdiff(minutes, query_start, current_timestamp) > 2; + ``` + +4. Analyze the status of the query statements that were run for a long time. + + - If the query statement is normal, wait until the execution of the query statement is complete. + - If the query statement is blocked, rectify the fault by referring to [Analyzing Whether a Query Statement Is Blocked](14-analyzing-whether-a-query-statement-is-blocked). diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/12-analyzing-the-status-of-a-query-statement.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/12-analyzing-the-status-of-a-query-statement.md new file mode 100644 index 0000000000000000000000000000000000000000..08a8c1b897a07b5d2bdb0eb4650356f8df914a10 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/12-analyzing-the-status-of-a-query-statement.md @@ -0,0 +1,57 @@ +--- +title: Analyzing the Status of a Query Statement +summary: Analyzing the Status of a Query Statement +author: Guo Huan +date: 2021-05-24 +--- + +# Analyzing the Status of a Query Statement + +## Symptom + +Some query statements are executed for an excessively long time in the system. You need to analyze the status of the query statements. + +## Procedure + +1. Log in to the host as the OS user **omm**. + +2. Run the following command to connect to the database: + + ``` + gsql -d mogdb -p 8000 + ``` + + **mogdb** is the name of the database, and **8000** is the port number. + +3. Set the parameter **track_activities** to **on**. + + ``` + SET track_activities = on; + ``` + + The database collects the running information about active queries only if the parameter is set to **on**. + +4. View the running query statements. The **pg_stat_activity** view is used as an example here. + + ``` + SELECT datname, usename, state, query FROM pg_stat_activity; + datname | usename | state | query + ----------+---------+--------+------- + mogdb | omm | idle | + mogdb | omm | active | + (2 rows) + ``` + + If the **state** column is **idle**, the connection is idle and requires a user to enter a command. To identify only active query statements, run the following command: + + ``` + SELECT datname, usename, state, query FROM pg_stat_activity WHERE state != 'idle'; + ``` + +5. Analyze whether a query statement is in the active or blocked state. Run the following command to view a query statement in the block state: + + ``` + SELECT datname, usename, state, query FROM pg_stat_activity WHERE waiting = true; + ``` + + The query statement is displayed. 
It is requesting a lock resource that may be held by another session, and is waiting for the lock resource to be released by the session. diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/13-forcibly-terminating-a-session.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/13-forcibly-terminating-a-session.md new file mode 100644 index 0000000000000000000000000000000000000000..6b04991aafec8b6a6c13c082500bde2b7abc7fee --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/13-forcibly-terminating-a-session.md @@ -0,0 +1,62 @@ +--- +title: Forcibly Terminating a Session +summary: Forcibly Terminating a Session +author: Guo Huan +date: 2021-05-24 +--- + +# Forcibly Terminating a Session + +## Symptom + +In some cases, the administrator must forcibly terminate abnormal sessions to keep the system healthy. + +## Procedure + +1. Log in to the host as the OS user **omm**. + +2. Run the following command to connect to the database: + + ``` + gsql -d mogdb -p 8000 + ``` + + **mogdb** is the name of the database, and **8000** is the port number. + +3. Find the thread ID of the faulty session from the current active session view. + + ``` + SELECT datid, pid, state, query FROM pg_stat_activity; + ``` + + A command output similar to the following is displayed, where the pid value indicates the thread ID of the session. + + ``` + datid | pid | state | query + -------+-----------------+--------+------ + 13205 | 139834762094352 | active | + 13205 | 139834759993104 | idle | + (2 rows) + ``` + +4. Terminate the session using its thread ID. + + ``` + SELECT pg_terminate_backend(139834762094352); + ``` + + If information similar to the following is displayed, the session is successfully terminated: + + ``` + pg_terminate_backend + --------------------- + t + (1 row) + ``` + + If a command output similar to the following is displayed, a user is attempting to terminate the session, and the session will be reconnected rather than being terminated. + + ``` + FATAL: terminating connection due to administrator command + FATAL: terminating connection due to administrator command The connection to the server was lost. Attempting reset: Succeeded. + ``` diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/14-analyzing-whether-a-query-statement-is-blocked.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/14-analyzing-whether-a-query-statement-is-blocked.md new file mode 100644 index 0000000000000000000000000000000000000000..226fa3ff060dd542d60b36b79093a86c2d24054f --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/14-analyzing-whether-a-query-statement-is-blocked.md @@ -0,0 +1,56 @@ +--- +title: Analyzing Whether a Query Statement Is Blocked +summary: Analyzing Whether a Query Statement Is Blocked +author: Guo Huan +date: 2021-05-24 +--- + +# Analyzing Whether a Query Statement Is Blocked + +## Symptom + +During database running, query statements are blocked in some service scenarios. As a result, the query statements are executed for an excessively long time. + +## Cause Analysis + +A query statement uses a lock to protect the data objects that it wants to access. If the data objects have been locked by another session, the query statement will be blocked and wait for the session to complete operation and release the lock resource. 
The data objects requiring locks include tables and tuples. + +## Procedure + +1. Log in to the host as the OS user **omm**. + +2. Run the following command to connect to the database: + + ``` + gsql -d mogdb -p 8000 + ``` + + **mogdb** is the name of the database, and **8000** is the port number. + +3. Find the thread ID of the faulty session from the current active session view. + + ``` + SELECT w.query AS waiting_query, w.pid AS w_pid, w.usename AS w_user, l.query AS locking_query, l.pid AS l_pid, l.usename AS l_user, t.schemaname || '.' || t.relname AS tablename FROM pg_stat_activity w JOIN pg_locks l1 ON w.pid = l1.pid AND NOT l1.granted JOIN pg_locks l2 ON l1.relation = l2.relation AND l2.granted JOIN pg_stat_activity l ON l2.pid = l.pid JOIN pg_stat_user_tables t ON l1.relation = t.relid WHERE w.waiting = true; + ``` + +4. Terminate the session using its thread ID. + + ``` + SELECT pg_terminate_backend(139834762094352); + ``` + + If information similar to the following is displayed, the session is successfully terminated: + + ``` + pg_terminate_backend + --------------------- + t + (1 row) + ``` + + If a command output similar to the following is displayed, a user is attempting to terminate the session, and the session will be reconnected rather than being terminated. + + ``` + FATAL: terminating connection due to administrator command + FATAL: terminating connection due to administrator command The connection to the server was lost. Attempting reset: Succeeded. + ``` diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/15-low-query-efficiency.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/15-low-query-efficiency.md new file mode 100644 index 0000000000000000000000000000000000000000..479c9e35dc1ba78b9fe3a2238afa62cdb4a1c7e1 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/15-low-query-efficiency.md @@ -0,0 +1,32 @@ +--- +title: Low Query Efficiency +summary: Low Query Efficiency +author: Guo Huan +date: 2021-05-24 +--- + +# Low Query Efficiency + +## Symptom + +A query task that used to take a few milliseconds to complete is now requiring several seconds, and that used to take several seconds is now requiring even half an hour. + +## Procedure + +Perform the following procedure to locate the cause. + +1. Run the **analyze** command to analyze the database. + + Run the **analyze** command to update statistics such as data sizes and attributes in all tables. You are advised to perform the operation with light job load. If the query efficiency is improved or restored after the command execution, the **autovacuum** process does not function well that requires further analysis. + +2. Check whether the query statement returns unnecessary information. + + For example, if a query statement is used to query all records in a table with the first 10 records being used, then it is quick to query a table with 50 records. However, if a table contains 50000 records, the query efficiency decreases. If an application requires only a part of data information but the query statement returns all information, add a LIMIT clause to the query statement to restrict the number of returned records. In this way, the database optimizer can optimize space and improve query efficiency. + +3. Check whether the query statement still has a low response even when it is solely executed. 
+ + Run the query statement when there are no or only a few other query requests in the database, and observe the query efficiency. If the efficiency is high, the previous issue is possibly caused by a heavily loaded host in the database system or + +4. Check the same query statement repeatedly to check the query efficiency. + + One major cause of low query efficiency is that the required information is not cached in the memory or is replaced by other query requests due to insufficient memory resources. This can be verified by running the same query statement repeatedly and the query efficiency increases gradually. diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/16-lock-wait-timeout-is-displayed.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/16-lock-wait-timeout-is-displayed.md new file mode 100644 index 0000000000000000000000000000000000000000..1944729224a8780c1a0ac7da777588ae5fc7d96a --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/16-lock-wait-timeout-is-displayed.md @@ -0,0 +1,25 @@ +--- +title: Lock wait timeout Is Displayed When a User Executes an SQL Statement +summary: Lock wait timeout Is Displayed When a User Executes an SQL Statement +author: Guo Huan +date: 2021-05-24 +--- + +# "Lock wait timeout" Is Displayed When a User Executes an SQL Statement + +## Symptom + +"Lock wait timeout" is displayed when a user executes an SQL statement. + +``` +ERROR: Lock wait timeout: thread 140533638080272 waiting for ShareLock on relation 16409 of database 13218 after 1200000.122 ms ERROR: Lock wait timeout: thread 140533638080272 waiting for AccessExclusiveLock on relation 16409 of database 13218 after 1200000.193 ms +``` + +## Cause Analysis + +Lock waiting times out in the database. + +## Procedure + +- After detecting such errors, the database automatically retries the SQL statements. The number of retries is controlled by **max_query_retry_times**. +- To analyze the cause of the lock wait timeout, find the SQL statements that time out in the **pg_locks** and **pg_stat_activity** system catalogs. diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/17-table-size-does-not-change.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/17-table-size-does-not-change.md new file mode 100644 index 0000000000000000000000000000000000000000..ce32f744d1816aee73e45f868e0120127061d697 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/17-table-size-does-not-change.md @@ -0,0 +1,39 @@ +--- +title: Table Size Does not Change After VACUUM FULL Is Executed on the Table +summary: Table Size Does not Change After VACUUM FULL Is Executed on the Table +author: Guo Huan +date: 2021-05-24 +--- + +# Table Size Does not Change After VACUUM FULL Is Executed on the Table + +## Symptom + +A user runs the **VACUUM FULL** command to clear a table, the table size does not change. + +## Cause Analysis + +Assume the table is named **table_name**. Possible causes are as follows: + +- No data has been deleted from the **table_name** table. Therefore, the execution of **VACUUM FULL table_name** does not cause the table size to change. +- Concurrent transactions exist during the execution of **VACUUM FULL table_name**. As a result, recently deleted data may be skipped when clearing the table. 
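One way to confirm whether **VACUUM FULL** actually reclaimed space is to compare the table size before and after the operation. The sketch below uses the standard size functions and the **table_name** placeholder from above:

```
SELECT pg_size_pretty(pg_relation_size('table_name'));
VACUUM FULL table_name;
SELECT pg_size_pretty(pg_relation_size('table_name'));
```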
+ +## Procedure + +For the second possible cause, use either of the following methods: + +- Wait until all concurrent transactions are complete, and run the **VACUUM FULL table_name** command again. + +- If the table size still does not change, ensure no service operations are performed on the table, and then execute the following SQL statements to query the active transaction list status: + + ``` + select txid_current(); + ``` + + The current XID is obtained. Then, run the following command to check the active transaction list: + + ``` + select txid_current_snapshot(); + ``` + + If any XID in the active transaction list is smaller than the current transaction XID, stop the database and then start it. Run **VACUUM FULL** to clear the table again. diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/18-an-error-is-reported-when-the-table-partition-is-modified.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/18-an-error-is-reported-when-the-table-partition-is-modified.md new file mode 100644 index 0000000000000000000000000000000000000000..c5710586803c4dbb5d8c1ff210fb0c31e7b1c3ae --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/18-an-error-is-reported-when-the-table-partition-is-modified.md @@ -0,0 +1,46 @@ +--- +title: An Error Is Reported When the Table Partition Is Modified +summary: An Error Is Reported When the Table Partition Is Modified +author: Guo Huan +date: 2021-05-24 +--- + +# An Error Is Reported When the Table Partition Is Modified + +## Symptom + +When **ALTER TABLE PARTITION** is performed, the following error message is displayed: + +``` +ERROR:start value of partition "XX" NOT EQUAL up-boundary of last partition. +``` + +## Cause Analysis + +If the **ALTER TABLE PARTITION** statement involves both the DROP PARTITION operation and the ADD PARTITION operation, MogDB always performs the DROP PARTITION operation before the ADD PARTITION operation regardless of their orders. However, performing DROP PARTITION before ADD PARTITION causes a partition gap. As a result, an error is reported. + +## Procedure + +To prevent partition gaps, set **END** in DROP PARTITION to the value of **START** in ADD PARTITION. The following is an example: + +``` +-- Create a partitioned table partitiontest. +mogdb=# CREATE TABLE partitiontest +( +c_int integer, +c_time TIMESTAMP WITHOUT TIME ZONE +) +PARTITION BY range (c_int) +( +partition p1 start(100)end(108), +partition p2 start(108)end(120) +); +-- An error is reported when the following statements are used: +mogdb=# ALTER TABLE partitiontest ADD PARTITION p3 start(120)end(130), DROP PARTITION p2; +ERROR: start value of partition "p3" NOT EQUAL up-boundary of last partition. +mogdb=# ALTER TABLE partitiontest DROP PARTITION p2,ADD PARTITION p3 start(120)end(130); +ERROR: start value of partition "p3" NOT EQUAL up-boundary of last partition. 
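-- Both statements fail for the same reason: MogDB drops p2 (boundary [108, 120)) before adding p3,
-- so a p3 starting at 120 would leave the range [108, 120) uncovered. The new partition therefore
-- has to start at 108, the up-boundary of the last remaining partition (p1).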
+-- Change them as follows: +mogdb=# ALTER TABLE partitiontest ADD PARTITION p3 start(108)end(130), DROP PARTITION p2; +mogdb=# ALTER TABLE partitiontest DROP PARTITION p2,ADD PARTITION p3 start(108)end(130); +``` diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/19-different-data-is-displayed.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/19-different-data-is-displayed.md new file mode 100644 index 0000000000000000000000000000000000000000..2a916055ec0340fb50fb590742c8e98e5b69eafb --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/19-different-data-is-displayed.md @@ -0,0 +1,32 @@ +--- +title: Different Data Is Displayed for the Same Table Queried By Multiple Users +summary: Different Data Is Displayed for the Same Table Queried By Multiple Users +author: Guo Huan +date: 2021-05-24 +--- + +# Different Data Is Displayed for the Same Table Queried By Multiple Users + +## Symptom + +Two users log in to the same database human_resource and run the following statement separately to query the areas table, but obtain different results. + +``` +select count(*) from areas; +``` + +## Cause Analysis + +1. Check whether tables with same names are really the same table. In a relational database, a table is identified by three elements: **database**, **schema**, and **table**. In this issue, **database** is **human_resource** and **table** is **areas**. +2. Check whether schemas of tables with the same name are consistent. Log in as users **omm** and **user01** separately. It is found that **search_path** is **public** and **$user**, respectively. As **omm** is the cluster administrator, a schema having the same name as user **omm** will not be created by default. That is, all tables will be created in **public** if no schema is specified. However, when a common user, such as **user01**, is created, the same-name schema (**user01**) is created by default. That is, all tables are created in **user01** if the schema is not specified. +3. If different users access different data in the same table, check whether objects in the table have different access policies for different users. + +## Procedure + +- For the query of tables with the same name in different schemas, add the schema reference to the queried table. The format is as follows: + + ``` + schema.table + ``` + +- If different access policies result in different query results of the same table, you can query the **pg_rlspolicy** system catalog to determine the specific access rules. diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/2-when-the-tpcc-is-running.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/2-when-the-tpcc-is-running.md new file mode 100644 index 0000000000000000000000000000000000000000..bd1a3f9de1fcf92473c9d87a263152cf3f727fe3 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/2-when-the-tpcc-is-running.md @@ -0,0 +1,20 @@ +--- +title: When the TPC-C is running and a disk to be injected is full, the TPC-C stops responding. +summary: When the TPC-C is running and a disk to be injected is full, the TPC-C stops responding. 
+author: Guo Huan +date: 2021-05-24 +--- + +# When the TPC-C is running and a disk to be injected is full, the TPC-C stops responding + +## Symptom + +When the TPC-C is running and a disk to be injected is full, the TPC-C stops responding. After the fault is rectified, the TPC-C automatically continues to run. + +## Cause Analysis + +When the disk where the performance log (**gs_profile**) is located is full, the database cannot write data and enters the infinite waiting state. As a result, the TPC-C stops responding. After the disk space insufficiency fault is rectified, performance logs can be properly written, and the TPC-C is restored. + +## Procedure + +Externally monitor the disk usage and periodically clean up the disk. diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/20-when-a-user-specifies-only-an-index-name.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/20-when-a-user-specifies-only-an-index-name.md new file mode 100644 index 0000000000000000000000000000000000000000..58f381dd29aaef3d3b2e076f04919f108f66d6ec --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/20-when-a-user-specifies-only-an-index-name.md @@ -0,0 +1,47 @@ +--- +title: When a User Specifies Only an Index Name to Modify the Index, A Message Indicating That the Index Does Not Exist Is Displayed +summary: When a User Specifies Only an Index Name to Modify the Index, A Message Indicating That the Index Does Not Exist Is Displayed +author: Guo Huan +date: 2021-05-24 +--- + +# When a User Specifies Only an Index Name to Modify the Index, A Message Indicating That the Index Does Not Exist Is Displayed + +## Symptom + +When a User Specifies Only an Index Name to Modify the Index, A Message Indicating That the Index Does Not Exist Is Displayed The following provides an example: + +``` +-- Create a partitioned table index HR_staffS_p1_index1, without specifying index partitions. +CREATE INDEX HR_staffS_p1_index1 ON HR.staffS_p1 (staff_ID) LOCAL; +-- Create a partitioned table index HR_staffS_p1_index2, with index partitions specified. +CREATE INDEX HR_staffS_p1_index2 ON HR.staffS_p1 (staff_ID) LOCAL +( +PARTITION staff_ID1_index, +PARTITION staff_ID2_index TABLESPACE example3, +PARTITION staff_ID3_index TABLESPACE example4 +) TABLESPACE example; +-- Change the tablespace of index partition staff_ID2_index to example1. A message is displayed, indicating that the index does not exist. +ALTER INDEX HR_staffS_p1_index2 MOVE PARTITION staff_ID2_index TABLESPACE example1; +``` + +## Cause Analysis + +The possible reason is that the user is in the public schema instead of the hr schema. + +``` +-- Run the following command to validate the inference. It is found that the calling is successful. +ALTER INDEX hr.HR_staffS_p1_index2 MOVE PARTITION staff_ID2_index TABLESPACE example1; +-- Change the schema of the current session to hr. +ALTER SESSION SET CURRENT_SCHEMA TO hr; +-- Run the following command to modify the index: +ALTER INDEX HR_staffS_p1_index2 MOVE PARTITION staff_ID2_index TABLESPACE example1; +``` + +## Procedure + +Add a schema reference to a table, index, or view. 
The format is as follows: + +``` +schema.table +``` diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/21-reindexing-fails.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/21-reindexing-fails.md new file mode 100644 index 0000000000000000000000000000000000000000..d8920008ea20e994d71b262654841aa64d36719f --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/21-reindexing-fails.md @@ -0,0 +1,29 @@ +--- +title: Reindexing Fails +summary: Reindexing Fails +author: Guo Huan +date: 2021-05-24 +--- + +# Reindexing Fails + +## Symptom + +When an index of the desc table is damaged, a series of operations cannot be performed. The error information may be as follows: + +``` +index \"%s\" contains corrupted page at block + %u" ,RelationGetRelationName(rel),BufferGetBlockNumber(buf), please reindex it. +``` + +## Cause Analysis + +In actual operations, indexes may break down due to software or hardware faults. For example, if disk space is insufficient or pages are damaged after indexes are split, the indexes may be damaged. + +## Procedure + +If the table is a column-store table named **pg_cudesc_xxxxx_index**, the desc index table is damaged. Find the OID and table corresponding to the primary table based on the desc index table name, and run the following statement to recreate the cudesc index. + +``` +REINDEX INTERNAL TABLE name; +``` diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/22-an-error-occurs-during-integer-conversion.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/22-an-error-occurs-during-integer-conversion.md new file mode 100644 index 0000000000000000000000000000000000000000..e353708426e46193c8907c0aee48a18e075beac9 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/22-an-error-occurs-during-integer-conversion.md @@ -0,0 +1,24 @@ +--- +title: An Error Occurs During Integer Conversion +summary: An Error Occurs During Integer Conversion +author: Guo Huan +date: 2021-05-24 +--- + +# An Error Occurs During Integer Conversion + +## Symptom + +The following error is reported during integer conversion: + +``` +Invalid input syntax for integer: "13." +``` + +## Cause Analysis + +Some data types cannot be converted to the target data type. + +## Procedure + +Gradually narrow down the range of SQL statements to determine the data types that cannot be converted. 
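As a hedged illustration, the failing literal can be reproduced and tested with explicit casts in gsql; the intermediate cast through **numeric** is one possible fix when a trailing decimal point is the only problem:

```
-- Reproduces the error: a trailing dot is not valid integer input syntax.
SELECT CAST('13.' AS integer);

-- Possible workaround: cast to numeric first, then to integer.
SELECT CAST(CAST('13.' AS numeric) AS integer);
```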
diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/23-too-many-clients-already.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/23-too-many-clients-already.md new file mode 100644 index 0000000000000000000000000000000000000000..59e5b1d1fdd39c625a12458087e634f1266b8ef6 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/23-too-many-clients-already.md @@ -0,0 +1,50 @@ +--- +title: too many clients already Is Reported or Threads Failed To Be Created in High Concurrency Scenarios +summary: too many clients already Is Reported or Threads Failed To Be Created in High Concurrency Scenarios +author: Guo Huan +date: 2021-05-24 +--- + +# "too many clients already" Is Reported or Threads Failed To Be Created in High Concurrency Scenarios + +## Symptom + +When a large number of SQL statements are concurrently executed, the error message "sorry, too many clients already" is displayed or an error is reported, indicating that threads cannot be created or processes cannot be forked. + +## Cause Analysis + +These errors are caused by insufficient OS threads. Check **ulimit -u** in the OS. If the value is too small (for example, less than 32768), the errors are caused by the OS limitation. + +## Procedure + +Run **ulimit -u** to obtain the value of **max user processes** in the OS. + +``` +[root@MogDB36 mnt]# ulimit -u +unlimited +``` + +Use the following formula to calculate the minimum value: + +``` +value=max (32768, number of instances x 8192) +``` + +The number of instances refers to the total number of instances on the node. + +To set the minimum value, add the following two lines to the **/etc/security/limits.conf** file: + +``` +* hard nproc [value] +* soft nproc [value] +``` + +The file to be modified varies based on the OS. For versions later than CentOS6, modify the **/etc/security/limits.d/90-nofile.conf** file in the same way. + +Alternatively, you can run the following command to change the value. However, the change becomes invalid upon OS restart. To solve this problem, you can add **ulimit -u [value]** to the global environment variable file **/etc/profile**. + +``` +ulimit -u [values] +``` + +In high concurrency mode, enable the thread pool to control thread resources in the database. diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/24-b-tree-index-faults.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/24-b-tree-index-faults.md new file mode 100644 index 0000000000000000000000000000000000000000..c04a36d92e7b877ab9f758e8b740cc85e5f90ac1 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/24-b-tree-index-faults.md @@ -0,0 +1,68 @@ +--- +title: B-tree Index Faults +summary: B-tree Index Faults +author: Guo Huan +date: 2021-05-24 +--- + +# B-tree Index Faults + +## Symptom + +The following error message is displayed, indicating that the index is lost occasionally. + +``` +ERROR: index 'xxxx_index' contains unexpected zero page +Or +ERROR: index 'pg_xxxx_index' contains unexpected zero page +Or +ERROR: compressed data is corrupt +``` + +## Cause Analysis + +This type of error is caused by the index fault. The possible causes are as follows: + +- The index is unavailable due to software bugs or hardware faults. +- The index contains many empty pages or almost empty pages. 
+- During concurrent DDL execution, the network is intermittently disconnected. +- The index failed to be created when indexes are concurrently created. +- A network fault occurs when a DDL or DML operation is performed. + +## Procedure + +Run the REINDEX command to rebuild the index. + +1. Log in to the host as the OS user **omm**. + +2. Run the following command to connect to the database: + + ``` + gsql -d mogdb -p 8000 -r + ``` + +3. Rebuild the index. + + - During DDL or DML operations, if index problems occur due to software or hardware faults, run the following command to rebuild the index: + + ``` + REINDEX TABLE tablename; + ``` + + - If the error message contains **xxxx_index**, the index of a user table is faulty. **xxxx** indicates the name of the user table. Run either of the following commands to rebuild the index: + + ``` + REINDEX INDEX indexname; + ``` + + Or + + ``` + REINDEX TABLE tablename; + ``` + + - If the error message contains **pg_xxxx_index**, the index of the system catalog is faulty. Run the following command to rebuild the index: + + ``` + REINDEX SYSTEM databasename; + ``` diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/3-standby-node-in-the-need-repair-state.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/3-standby-node-in-the-need-repair-state.md new file mode 100644 index 0000000000000000000000000000000000000000..0859df7e1d3750689a73209ac8ee0d20ad6240b0 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/3-standby-node-in-the-need-repair-state.md @@ -0,0 +1,20 @@ +--- +title: Standby Node in the Need Repair (WAL) State +summary: Standby Node in the Need Repair (WAL) State +author: Guo Huan +date: 2021-05-24 +--- + +# Standby Node in the **Need Repair (WAL)** State + +## Symptom + +The **Need Repair (WAL)** fault occurs on a standby node of the MogDB. + +## Cause Analysis + +The primary and standby DB instances are disconnected due to network faults or insufficient disk space. As a result, logs are not synchronized between the primary and standby DB instances, and the database cluster fails to start. + +## Procedure + +Run the **gs_ctl build -D** command to rebuild the faulty node. For details, see the [build parameter](4-gs_ctl#6) in the *MogDB Tool Reference*. diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/4-insufficient-memory.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/4-insufficient-memory.md new file mode 100644 index 0000000000000000000000000000000000000000..1bdc8300c9e4d89c03ef3de875625d05f261ee43 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/4-insufficient-memory.md @@ -0,0 +1,20 @@ +--- +title: Insufficient Memory +summary: Insufficient Memory +author: Guo Huan +date: 2021-05-24 +--- + +# Insufficient Memory + +## Symptom + +The client or logs contain the error message **memory usage reach the max_dynamic_memory**. + +## Cause Analysis + +The possible cause is that the value of the GUC parameter **max_process_memory** is too small. This parameter limits the maximum memory that can be used by an MogDB instance. + +## Procedure + +Use the **gs_guc** tool to adjust the value of **max_process_memory**. Note that you need to restart the instance for the modification to take effect. 
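A possible adjustment sequence, assuming the data directory is **/mogdb/data/dbnode** (replace it with the actual path) and that 12 GB is an appropriate limit for the host:

```
gs_guc set -D /mogdb/data/dbnode -c "max_process_memory=12GB"
gs_ctl restart -D /mogdb/data/dbnode
```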
diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/5-service-startup-failure.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/5-service-startup-failure.md new file mode 100644 index 0000000000000000000000000000000000000000..ee94999f57c8d23d931e9cd8b3994580ce93337d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/5-service-startup-failure.md @@ -0,0 +1,88 @@ +--- +title: Service Startup Failure +summary: Service Startup Failure +author: Guo Huan +date: 2021-05-24 +--- + +# Service Startup Failure + +## Symptom + +The service startup failed. + +## Cause Analysis + +- Parameters are set to improper values, resulting in insufficient system resources in the database cluster, or parameter settings do not meet the internal restrictions in the cluster. +- The status of some DNs is abnormal. +- Permissions to modify directories are insufficient. For example, users do not have sufficient permissions for the **/tmp** directory or the data directory in the cluster. +- The configured port has been occupied. +- The system firewall is enabled. +- The trust relationship between servers of the database in the cluster is abnormal. + +## Procedure + +- Check whether the parameter configurations are improper or meet internal constraints. + + - Log in to the node that cannot be started. Check the run logs and check whether the resources are insufficient or whether the parameter configurations meet internal constraints. For example, if the message "Out of memory" or the following error information is displayed, the resources are insufficient, the startup fails, or the configuration parameters do not meet the internal constraints. + + ``` + FATAL: hot standby is not possible because max_connections = 10 is a lower setting than on the master server (its value was 100) + ``` + + - Check whether the GUC parameters are set to proper values. For example, check parameters, such as **shared_buffers**, **effective_cache_size**, and **bulk_write_ring_size** that consume much resources, or parameter **max_connections** that cannot be easily set to a value that is less than its last value. For details about how to view and set GUC parameters, see Configuring Running Parameters. + +- Check whether the status of some DNs is abnormal. Check the status of each primary and standby instances in the current cluster using **gs_om -t status -detail**. + + - If the status of all the instances on a host is abnormal, replace the host. + + - If the status of an instance is **Unknown**, **Pending**, or **Down**, log in to the node where the instance resides as a cluster user to view the instance log and identify the cause. For example: + + ``` + 2014-11-27 14:10:07.022 CST 140720185366288 FATAL: database "postgres" does not exist 2014-11-27 14:10:07.022 CST 140720185366288 DETAIL: The database subdirectory "base/ 13252" is missing. + ``` + + If the preceding information is displayed in a log, files stored in the data directory where the DN resides are damaged, and the instance cannot be queried. You cannot execute normal queries to this instance. + +- Check whether users have sufficient directory permissions. For example, users do not have sufficient permissions for the **/tmp** directory or the data directory in the cluster. + + - Determine the directory for which users have insufficient permissions. + - Run the **chmod** command to modify directory permissions as required. 
The database user must have read/write permissions for the **/tmp** directory. To modify permissions for data directories, refer to the settings for data directories with sufficient permissions. + +- Check whether the configured ports have been occupied. + + - Log in to the node that cannot be started and check whether the instance process exists. + + - If the instance process does not exist, view the instance log to check the exception reasons. For example: + + ``` + 2014-10-17 19:38:23.637 CST 139875904172320 LOG: could not bind IPv4 socket at the 0 time: Address already in use 2014-10-17 19:38:23.637 CST 139875904172320 HINT: Is another postmaster already running on port 40005? If not, wait a few seconds and retry. + ``` + + If the preceding information is displayed in a log, the TCP port on the DN has been occupied, and the instance cannot be started. + + ``` + 2015-06-10 10:01:50 CST 140329975478400 [SCTP MODE] WARNING: (sctp bind) bind(socket=9, [addr:0.0.0.0,port:1024]):Address already in use -- attempt 10/10 2015-06-10 10:01:50 CST 140329975478400 [SCTP MODE] ERROR: (sctp bind) Maximum bind() attempts. Die now... + ``` + + If the preceding information is displayed in a log, the SCTP port on the DN has been occupied, and the instance cannot be started. + +- Run **sysctl -a** to view the **net.ipv4.ip_local_port_range** parameter. If this port configured for this instance is within the range of the port number randomly occupied by the system, modify the value of **net.ipv4.ip_local_port_range**, ensuring that all the instance port numbers in the XML file are beyond this range. Check whether a port has been occupied: + + ``` + netstat -anop | grep Port number + ``` + + The following is an example: + + ``` + [root@MogDB36 ~]# netstat -anop | grep 15970 + tcp 0 0 127.0.0.1:15970 0.0.0.0:* LISTEN 3920251/mogdb off (0.00/0/0) + tcp6 0 0 ::1:15970 :::* LISTEN 3920251/mogdb off (0.00/0/0) + unix 2 [ ACC ] STREAM LISTENING 197399441 3920251/mogdb /tmp/.s.PGSQL.15970 + unix 3 [ ] STREAM CONNECTED 197461142 3920251/mogdb /tmp/.s.PGSQL.15970 + ``` + +- Check whether the system firewall is enabled. + +- Check whether the mutual trust relationship is abnormal. Reconfigure the mutual trust relationship between servers in the cluster. diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/6-error-no-space-left-on-device-is-displayed.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/6-error-no-space-left-on-device-is-displayed.md new file mode 100644 index 0000000000000000000000000000000000000000..e87b496e6a84ebfce718b53fc29c53f8f1753a5a --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/6-error-no-space-left-on-device-is-displayed.md @@ -0,0 +1,64 @@ +--- +title: Error:No space left on device Is Displayed +summary: Error:No space left on device Is Displayed +author: Guo Huan +date: 2021-05-24 +--- + +# "Error:No space left on device" Is Displayed + +## Symptom + +The following error message is displayed when the cluster is being used: + +``` +Error:No space left on device +``` + +## Cause Analysis + +The disk space is insufficient. + +## Procedure + +- Run the following command to check the disk usage. The **Avail** column indicates the available disk space, and the **Use%** column indicates the percentage of disk space that has been used. 
+ + ``` + [root@openeuler123 mnt]# df -h + Filesystem Size Used Avail Use% Mounted on + devtmpfs 255G 0 255G 0% /dev + tmpfs 255G 35M 255G 1% /dev/shm + tmpfs 255G 57M 255G 1% /run + tmpfs 255G 0 255G 0% /sys/fs/cgroup + /dev/mapper/openeuler-root 196G 8.8G 178G 5% / + tmpfs 255G 1.0M 255G 1% /tmp + /dev/sda2 9.8G 144M 9.2G 2% /boot + /dev/sda1 10G 5.8M 10G 1% /boot/efi + ``` + + The demand for remaining disk space depends on the increase in service data. Suggestions: + + - Check the disk space usage status, ensuring that the remaining space is sufficient for the growth of disk space for over one year. + - If the disk space usage exceeds 60%, you must clear or expand the disk space. + +- Run the following command to check the size of the data directory. + + ``` + du --max-depth=1 -h /mnt/ + ``` + + The following information is displayed. The first column shows the sizes of directories or files, and the second column shows all the sub-directories or files under the **/mnt/** directory. + + ``` + [root@MogDB36 mnt]# du --max-depth=1 -h /mnt + 83G /mnt/data3 + 71G /mnt/data2 + 365G /mnt/data1 + 518G /mnt + ``` + +- Clean up the disk space. You are advised to periodically back up audit logs to other storage devices. The recommended log retention period is one month. **pg_log** stores database process run logs which help database administrators locate faults. You can delete error logs if you view them every day and handle errors in time. + +- Delete useless data. Back up data that is not used frequently or used for a certain period of time to storage media with lower costs, and clean the backup data to free up disk space. + +- If the disk space is still insufficient, expand the disk capacity. diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/7-after-you-run-the-du-command.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/7-after-you-run-the-du-command.md new file mode 100644 index 0000000000000000000000000000000000000000..e2e8d4fabff6f1c8be852b6b410f681780b2f1b3 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/7-after-you-run-the-du-command.md @@ -0,0 +1,32 @@ +--- +title: After You Run the du Command to Query Data File Size In the XFS File System, the Query Result Is Greater than the Actual File Size +summary: After You Run the du Command to Query Data File Size In the XFS File System, the Query Result Is Greater than the Actual File Size +author: Guo Huan +date: 2021-05-24 +--- + +# After You Run the du Command to Query Data File Size In the XFS File System, the Query Result Is Greater than the Actual File Size + +## Symptom + +After you run the **du** command to query data file size in the cluster, the query result is probably greater than the actual file size. + +``` + du -sh file +``` + +## Cause Analysis + +The XFS file system has a pre-assignment mechanism. The file size is determined by the **allocsize** parameter. The file size displayed by the **du** command includes the pre-assigned disk space. + +## Procedure + +- Select the default value (64 KB) for the XFS file system mount parameter allocsize to eliminate the problem. + +- Add the **-apparent-size** parameter when using the **du** command to query the actual file size. + + ``` + du -sh file --apparent-size + ``` + +- If the XFS file system reclaims the pre-assigned space of a file, the **du** command displays the actual file size. 
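A quick comparison of the two sizes, plus a check of the mount options currently in effect (the file path and mount point below are placeholders):

```
# Size including XFS pre-allocation vs. the apparent (actual) size
du -sh /data/base/16385/152715
du -sh --apparent-size /data/base/16385/152715

# Check whether allocsize is set explicitly for the data file system
mount | grep /data
```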
diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/8-file-is-damaged-in-the-xfs-file-system.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/8-file-is-damaged-in-the-xfs-file-system.md new file mode 100644 index 0000000000000000000000000000000000000000..8b1f081fc8cfd54c85e9604e83e75f1aa420ec57 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/8-file-is-damaged-in-the-xfs-file-system.md @@ -0,0 +1,22 @@ +--- +title: File Is Damaged in the XFS File System +summary: File Is Damaged in the XFS File System +author: Guo Huan +date: 2021-05-24 +--- + +# File Is Damaged in the XFS File System + +## Symptom + +When a cluster is in use, error reports such as an input/output error or the structure needs cleaning generally do not occur in the XFS file system. + +## Cause Analysis + +The XFS file system is abnormal. + +## Procedure + +Try to mount or unmount the file system to check whether the problem can be solved. + +If the problem recurs, refer to the file system document, such as **xfs_repair**, and ask the system administrator to restore the file system. After the file system is repaired, run the **gs_ctl build** command to restore the damaged DNs. diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/9-primary-node-is-hung-in-demoting.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/9-primary-node-is-hung-in-demoting.md new file mode 100644 index 0000000000000000000000000000000000000000..36409ffe18838895b4fc9ae61c93f7622c57b37b --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-cases/9-primary-node-is-hung-in-demoting.md @@ -0,0 +1,24 @@ +--- +title: Primary Node Is Hung in Demoting During a Switchover +summary: Primary Node Is Hung in Demoting During a Switchover +author: Guo Huan +date: 2021-05-24 +--- + +# Primary Node Is Hung in Demoting During a Switchover + +## Symptom + +In a cluster deployed with one primary and multiple standby DNs, if system resources are insufficient and a switchover occurs, a node is hung in demoting. + +## Cause Analysis + +If system resources are insufficient, the third-party management thread cannot be created. As a result, the managed sub-threads cannot exit and the primary node is hung in demoting. + +## Procedure + +Run the following command to stop the process of the primary node so that the standby node can be promoted to primary: Perform the following operations only in the preceding scenario. + +``` + kill -9 PID +``` diff --git a/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-methods.md b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-methods.md new file mode 100644 index 0000000000000000000000000000000000000000..8f5c488c81e86c202774b2e3aed4f48a85cb860d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/common-faults-and-identification/common-fault-locating-methods.md @@ -0,0 +1,283 @@ +--- +title: Common Fault Locating Methods +summary: Common Fault Locating Methods +author: Guo Huan +date: 2021-05-24 +--- + +# Common Fault Locating Methods + +## **Locating OS Faults** + +If all instances on a node are abnormal, an OS fault may have occurred. + +Use one of the following methods to check whether any OS fault occurs: + +- Log in to the node using SSH or other remote login tools. 
If the login fails, run the **ping** command to check the network status. + + - If no response is returned, the server is down or being restarted, or its network connection is abnormal. + + The restart takes a long time (about 20 minutes) if the system crashes due to an OS kernel panic. Try to connect the host every 5 minutes. If the connection failed 20 minutes later, the server is down or the network connection is abnormal. In this case, contact the administrator to locate the fault on site. + + - If ping operations succeed but SSH login fails or commands cannot be executed, the server does not respond to external connections possibly because system resources are insufficient (for example, CPU or I/O resources are overloaded). In this case, try again. If the fault persists within 5 minutes, contact the administrator for further fault locating on site. + +- If login is successful but responses are slow, check the system running status, such as collecting system information as well as checking system version, hardware, parameter setting, and login users. The following are common commands for reference: + + - Use the **who** command to check online users. + + ``` + [root@MogDB36 ~]# who + root pts/0 2020-11-07 16:32 (10.70.223.238) + wyc pts/1 2020-11-10 09:54 (10.70.223.222) + root pts/2 2020-10-10 14:20 (10.70.223.238) + root pts/4 2020-10-09 10:14 (10.70.223.233) + root pts/5 2020-10-09 10:14 (10.70.223.233) + root pts/7 2020-10-31 17:03 (10.70.223.222) + root pts/9 2020-10-20 10:03 (10.70.220.85) + ``` + + - Use the **cat /etc/openEuler-release** and **uname -a** commands to check the system version and kernel information. + + ``` + [root@MogDB36 ~]# cat /etc/openEuler-release + openEuler release 20.03 (LTS) + [root@MogDB36 ~]# uname -a + Linux MogDB36 4.19.90-2003.4.0.0036.oe1.aarch64 #1 SMP Mon Mar 23 19:06:43 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux + [root@MogDB36 ~]# + ``` + + - Use the **sysctl -a** (run this command as user **root**) and **cat /etc/sysctl.conf** commands to obtain system parameter information. + + - Use the **cat /proc/cpuinfo** and **cat /proc/meminfo** commands to obtain CPU and memory information. + + ``` + [root@MogDB36 ~]# cat /proc/cpuinfo + processor : 0 + BogoMIPS : 200.00 + Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm + CPU implementer : 0x48 + CPU architecture: 8 + CPU variant : 0x1 + CPU part : 0xd01 + CPU revision : 0 + [root@MogDB36 ~]# cat /proc/meminfo + MemTotal: 534622272 kB + MemFree: 253322816 kB + MemAvailable: 369537344 kB + Buffers: 2429504 kB + Cached: 253063168 kB + SwapCached: 0 kB + Active: 88570624 kB + Inactive: 171801920 kB + Active(anon): 4914880 kB + Inactive(anon): 67011456 kB + Active(file): 83655744 kB + Inactive(file): 104790464 kB + ``` + + - Use the **top -H** command to query the CPU usage and check whether the CPU usage is high due to a specific process. If it is, use the **gdb** or **gstack** command to print the stack trace of this process and check whether this process is in an infinite loop. + + - Use the **iostat -x 1 3** command to query the I/O usage and check whether the I/O usage of the current disk is full. View the ongoing jobs to determine whether to handle the jobs with high I/O usage. + + - Use the **vmstat 1 3** command to query the memory usage in the current system and use the **top** command to obtain the processes with unexpectedly high memory usage. 
+ + - View the OS logs (**/var/log/messages**) or dmseg information as user **root** to check whether errors have occurred in the OS. + + - The watchdog of an OS is a mechanism to ensure that the OS runs properly or exits from the infinite loop or deadlock state. If the watchdog times out (the default value is 60s), the system resets. + +## **Locating Network Faults** + +When the database runs normally, the network layer is transparent to upper-layer users. However, during the long-term operation of a database cluster, network exceptions or errors may occur. Common exceptions caused by network faults are as follows: + +- Network error reported due to database startup failure. +- Abnormal status, for example, all instances on a host are in the **UnKnown** state, or all services are switched over to standby instances. +- Network connection failure. +- Network disconnection reported during database sql query. +- Process response failures during database connection or query execution. When a network fault occurs in a database, locate and analyze the fault by using network-related Linux command tools (such as **ping**, **ifconfig**, **netstat**, and **lsof**) and process stack viewers (such as **gdb** and **gstack**) based on database log information. This section lists common network faults and describes how to analyze and locate faults. + +Common faults are as follows: + +- Network error reported due to a startup failure + + **Symptom 1**: The log contains the following error information. The port may be listened on by another process. + + ``` + LOG: could not bind socket at the 10 time, is another postmaster already running on port 54000? + ``` + + **Solution**: Run the following command to check the process that listens on the port. Replace the port number with the actual one. + + ``` + [root@MogDB36 ~]# netstat -anop | grep 15970 + tcp 0 0 127.0.0.1:15970 0.0.0.0:* LISTEN 3920251/mogdb off (0.00/0/0) + tcp6 0 0 ::1:15970 :::* LISTEN 3920251/mogdb off (0.00/0/0) + unix 2 [ ACC ] STREAM LISTENING 197399441 3920251/mogdb /tmp/.s.PGSQL.15970 + unix 3 [ ] STREAM CONNECTED 197461142 3920251/mogdb /tmp/.s.PGSQL.15970 + + ``` + + Forcibly stop the process that is occupying the port or change the listening port of the database based on the query result. + + **Symptom 2**: When the **gs_om -t status -detail** command is used to query status, the command output shows that the connection between the primary and standby nodes is not established. + + **Solution**: In openEuler, run the **systemctl status firewalld.service** command to check whether the firewall is enabled on this node. If it is enabled, run the **systemctl stop firewalld.service** command to disable it. + + ``` + [root@MogDB36 mnt]# systemctl status firewalld.service + ●firewalld.service - firewalld - dynamic firewall daemon + Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled) + Active: inactive (dead) + Docs: man:firewalld(1) + ``` + + The command varies according to the operating system. You can run the corresponding command to view and modify the configuration. + +- The database is abnormal. + + **Symptom**: The following problems occur on a node: + + - All instances are in the **Unknown** state. + - All primary instances are switched to standby instances. + - Errors "Connection reset by peer" and "Connection timed out" are frequently displayed. 
+ + **Solution** + + - If you cannot connect to the faulty server through SSH, run the **ping** command on other servers to send data packages to the faulty server. If the ping operation succeeds, connection fails because resources such as memory, CPUs, and disks, on the faulty server are used up. + + - Connect to the faulty server through through SSH and run the **/sbin/ifconfig eth ?** command every other second (replace the question mark (?) with the number indicating the position of the NIC). Check value changes of **dropped** and **errors**. If they increase rapidly, the NIC or NIC driver may be faulty. + + ``` + [root@MogDB36 ~]# ifconfig enp125s0f0 + enp125s0f0: flags=4163 mtu 1500 + inet 10.90.56.36 netmask 255.255.255.0 broadcast 10.90.56.255 + inet6 fe80::7be7:8038:f3dc:f916 prefixlen 64 scopeid 0x20 + ether 44:67:47:7d:e6:84 txqueuelen 1000 (Ethernet) + RX packets 129344246 bytes 228050833914 (212.3 GiB) + RX errors 0 dropped 647228 overruns 0 frame 0 + TX packets 96689431 bytes 97279775245 (90.5 GiB) + TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 + ``` + + - Check whether the following parameters are correctly configured: + + ``` + net.ipv4.tcp_retries1 = 3 + net.ipv4.tcp_retries2 = 15 + ``` + +- Network connection failure. + + **Symptom 1**: A node fails to connect to other nodes, and the "Connection refused" error is reported in the log. + + **Solution** + + - Check whether the port is incorrectly configured, resulting in that the port used for connection is not the listening port of the peer end. Check whether the port number recorded in the **postgresql.conf** configuration file of the faulty node is the same as the listening port number of the peer end. + - Check whether the peer listening port is normal (for example, by running the **netstat -anp** command). + - Check whether the peer process exists. + + **Symptom 2**: When SQL operations are performed on the database, the connection descriptor fails to be obtained. The following error information is displayed: + + ``` + WARNING: 29483313: incomplete message from client:4905,9 + WARNING: 29483313: failed to receive connDefs at the time:1. + ERROR: 29483313: failed to get pooled connections + ``` + + In logs, locate and view the log content before the preceding error messages, which are generated due to incorrect active and standby information. Error details are displayed as follows. + + ``` + FATAL: dn_6001_6002: can not accept connection in pending mode. + FATAL: dn_6001_6002: the database system is starting up + FATAL: dn_6009_6010: can not accept connection in standby mode. + ``` + + **Solution** + + - Run the **gs_om -t status -detail** command to query the status and check whether an primary/standby switchover has occurred. Reset the instance status. + - In addition, check whether a core dump or restart occurs on the node that fails to be connected. In the om log, check whether restart occurs. + +- Network disconnection reported during database sql query. + + **Symptom 1**: The query fails, and the following error information is displayed: + + ``` + ERROR: dn_6065_6066: Failed to read response from Datanodes. Detail: Connection reset by peer. 
Local: dn_6065_6066 Remote: dn_6023_6024
+  ERROR: Failed to read response from Datanodes Detail: Remote close socket unexpectedly
+  ERROR: dn_6155_6156: dn_6151_6152: Failed to read vector response from Datanodes
+  ```
+
+  If the connection fails, the error information may be as follows:
+
+  ```
+  ERROR: Distribute Query unable to connect 10.145.120.79:14600 [Detail:stream connect connect() fail: Connection timed out
+  ERROR: Distribute Query unable to connect 10.144.192.214:12600 [Detail:receive accept response fail: Connection timed out
+  ```
+
+  **Solution**
+
+  1. Use **gs_check** to check whether the network configuration meets requirements. For details about the network check, see "Tool Reference > Server Tools > [gs_check](1-gs_check)" in the *Reference Guide*.
+  2. Check whether a process core dump, restart, or switchover has occurred.
+  3. If the problem persists, contact network technical engineers.
+
+## **Locating Disk Faults**
+
+Common disk faults include insufficient disk space, disk bad blocks, and unmounted disks. Disk faults such as unmounted disks damage the file system. The cluster management mechanism identifies this kind of fault and stops the instance, so the instance status becomes **Unknown**. However, disk faults such as insufficient disk space do not damage the file system. The cluster management mechanism cannot identify this kind of fault, and service processes exit abnormally when they access the faulty disk. Such failures can occur during database startup, checksum verification, page reads and writes, and page verification.
+
+- For faults that damage the file system, the instance status is **Unknown** when you view the host status. Perform the following operations to locate the disk fault:
+
+  - Check the logs. If the logs contain information similar to "data path disc writable test failed", the file system is damaged.
+
+  - A possible cause of file system damage is an unmounted disk. In this case, you can run the **ls -l** command and find that the permission on the data directory is abnormal.
+
+  - Another possible cause is that the disk has bad blocks. In this case, the OS rejects read and write operations to protect the file system. You can use a bad block check tool, for example, **badblocks**, to check whether bad blocks exist.
+
+  ```
+  [root@openeuler123 mnt]# badblocks /dev/sdb1 -s -v
+  Checking blocks 0 to 2147482623
+  Checking for bad blocks (read-only test): done
+  Pass completed, 0 bad blocks found. (0/0/0 errors)
+  ```
+
+- For faults that do not damage the file system, the service process reports an exception and exits when it accesses the faulty disk. Perform the following operations to locate the disk fault:
+
+  View the logs. The logs contain read and write errors, such as "No space left on device" and "invalid page header in block 122838 of relation base/16385/152715". Run the **df -h** command to check the disk space.
If the disk usage is 100% as shown below, the read and write errors are caused by insufficient disk space:
+
+  ```
+  [root@openeuler123 mnt]# df -h
+  Filesystem Size Used Avail Use% Mounted on
+  devtmpfs 255G 0 255G 0% /dev
+  tmpfs 255G 35M 255G 1% /dev/shm
+  tmpfs 255G 57M 255G 1% /run
+  tmpfs 255G 0 255G 0% /sys/fs/cgroup
+  /dev/mapper/openeuler-root 196G 8.8G 178G 5% /
+  tmpfs 255G 1.0M 255G 1% /tmp
+  /dev/sda2 9.8G 144M 9.2G 2% /boot
+  /dev/sda1 10G 5.8M 10G 1% /boot/efi
+  /dev/mapper/openeuler-home 1.5T 69G 1.4T 5% /home
+  tmpfs 51G 0 51G 0% /run/user/0
+  tmpfs 51G 0 51G 0% /run/user/1004
+  /dev/sdb1 2.0T 169G 1.9T 9% /data
+  ```
+
+## **Locating Database Faults**
+
+- Logs. Database logs record the operations (starting, running, and stopping) performed on servers. Database users can view the logs to quickly locate fault causes and rectify the faults accordingly.
+
+- Views. A database provides different views to display its internal status. When locating a fault, you can use the following views:
+
+  - **pg_stat_activity**: shows the status of each session on the current instance.
+  - **pg_thread_wait_status**: shows the wait events of each thread on the current instance.
+  - **pg_locks**: shows the status of locks on the current instance.
+
+- Core files. Abnormal termination of a database process triggers a core dump. A core dump file helps locate faults and determine fault causes. Once a core dump occurs during process running, collect the core file immediately for further analysis to locate the fault. Note the following:
+
+  - Core dumps affect OS performance, especially when they occur frequently.
+
+  - Core files occupy OS disk space. Therefore, after core files are discovered, locate and rectify the errors as soon as possible. The OS is delivered with a core dump mechanism. If this mechanism is enabled, a core file is generated for each core dump, which has an impact on OS performance and disk space.
+
+  - Set the path for generating core files by modifying the **/proc/sys/kernel/core_pattern** file.
+
+  ```
+  [root@openeuler123 mnt]# cat /proc/sys/kernel/core_pattern
+  /data/jenkins/workspace/MogDBInstall/dbinstall/cluster/corefile/core-%e-%p-%t
+  ```
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/1-1-stored-procedure.md b/product/en/docs-mogdb/v3.0/developer-guide/1-1-stored-procedure.md
new file mode 100644
index 0000000000000000000000000000000000000000..8ca0fae50f048d3a248f97b320dd97f7722446d2
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/1-1-stored-procedure.md
@@ -0,0 +1,16 @@
+---
+title: Stored Procedure
+summary: Stored Procedure
+author: Guo Huan
+date: 2021-03-04
+---
+
+# Stored Procedure
+
+In MogDB, business rules and logic can be saved as stored procedures.
+
+A stored procedure is a combination of SQL and PL/pgSQL. Stored procedures move the code that executes business rules from applications into the database, so that the stored code can be reused by multiple applications at the same time.
+
+For details about how to create and call a stored procedure, see [CREATE PROCEDURE](53-CREATE-PROCEDURE).
+
+PL/pgSQL functions, described in [User-defined Functions](user-defined-functions), are used in a similar way to stored procedures. For details, see the [PL/pgSQL-SQL Procedural Language](1-1-plpgsql-overview) section; unless otherwise specified, its contents apply to both stored procedures and user-defined functions.
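+
+As a minimal sketch of moving business logic into the database, the following example creates and calls a stored procedure. The table **staff** and the procedure **raise_salary** are hypothetical names used only for illustration:
+
+```sql
+-- Hypothetical table used only for this illustration.
+CREATE TABLE staff (id INT, salary NUMERIC);
+INSERT INTO staff VALUES (1, 1000);
+
+-- Encapsulate a simple business rule in a stored procedure.
+CREATE OR REPLACE PROCEDURE raise_salary(emp_id INT, delta NUMERIC)
+AS
+BEGIN
+    UPDATE staff SET salary = salary + delta WHERE id = emp_id;
+END;
+/
+
+-- Any application connected to the database can now reuse the same logic.
+CALL raise_salary(1, 100);
+```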
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/1-AI-features-overview.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/1-AI-features-overview.md
new file mode 100644
index 0000000000000000000000000000000000000000..23509eddd8b0c8b2a2151688327c2465b85849ef
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/1-AI-features-overview.md
@@ -0,0 +1,13 @@
+---
+title: Overview
+summary: Overview
+author: Guo Huan
+date: 2021-05-19
+---
+
+# Overview
+
+The history of artificial intelligence (AI) dates back to the 1950s, which is even longer than the history of database development. However, for a long time AI technology was not applied on a large scale because of various objective factors, and it even experienced several obvious troughs. With the further development of information technologies in recent years, the factors restricting AI development have gradually weakened, giving rise to the AI, big data, and cloud computing (ABC) technologies. The combination of AI and databases has been a trending research topic in the industry in recent years. Our database team began exploring this domain early and has achieved phased results. The AI feature submodule **dbmind** is relatively independent of the other database components. It can be divided into AI4DB and DB4AI.
+
+- AI4DB uses AI technologies to optimize database execution performance and to achieve database autonomy and O&M-free operation. It includes self-tuning, self-diagnosis, self-security, self-O&M, and self-healing.
+- DB4AI streamlines the E2E process from databases to AI applications and unifies the AI technology stack to deliver out-of-the-box usage, high performance, and cost savings. For example, you can use SQL-like statements to access functions such as recommendation systems, image retrieval, and time series forecasting, maximizing the advantages of MogDB, such as high parallelism and column store.
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/2-predictor-ai-query-time-forecasting/2-1-ai-query-time-forecasting-overview.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/2-predictor-ai-query-time-forecasting/2-1-ai-query-time-forecasting-overview.md
new file mode 100644
index 0000000000000000000000000000000000000000..feecfb2dcb7c94ed003e23d3a1a36c2bae5e29f4
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/2-predictor-ai-query-time-forecasting/2-1-ai-query-time-forecasting-overview.md
@@ -0,0 +1,12 @@
+---
+title: Overview
+summary: Overview
+author: Guo Huan
+date: 2021-05-19
+---
+
+# Overview
+
+Predictor is a query time prediction tool that leverages machine learning and has online learning capability. By continuously learning from the historical execution information collected in the database, Predictor can predict the execution time of a plan.
+
+To use this tool, you must start the AI Engine Python process for model training and inference.
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/2-predictor-ai-query-time-forecasting/2-2-environment-deployment.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/2-predictor-ai-query-time-forecasting/2-2-environment-deployment.md new file mode 100644 index 0000000000000000000000000000000000000000..ecfc23a2e893ec55250e68dd225f6daf16eeef70 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/2-predictor-ai-query-time-forecasting/2-2-environment-deployment.md @@ -0,0 +1,203 @@ +--- +title: Environment Deployment +summary: Environment Deployment +author: Guo Huan +date: 2021-05-19 +--- + +# Environment Deployment + +## Prerequisites + +MogDB is in the normal state. The user has logged in to MogDB with an authenticated identity. The executed SQL syntax is correct and no error is reported. In the historical performance data window, the number of MogDB concurrent tasks is stable, the structure and number of tables remain unchanged, the data volume changes smoothly, and the GUC parameters related to query performance remain unchanged. During prediction, the model has been trained and converged. The running environment of the AIEngine is stable. + +## Request Example + +The AIEngine process communicates with the kernel process using HTTPS. An example request is as follows: + +```bash +curl -X POST -d '{"modelName":"modelname"}' -H 'Content-Type: application/json' 'https://IP-address:port/request-API' +``` + +**Table 1** AIEngine external APIs + +| Request API | Description | +| :------------- | :------------------------------------------ | +| /check | Checks whether a model is properly started. | +| /configure | Sets model parameters. | +| /train | Trains a model. | +| /track_process | Views model training logs. | +| /setup | Loads historical models. | +| /predict | Predicts a model. | + +## Generating Certificates + +Before using the prediction function, you need to use OpenSSL to generate certificates required for authentication between the communication parties, ensuring communication security. + +1. Set up a certificate generation environment. The certificate file is stored in **$GAUSSHOME/CA**. + + - Copy the certificate generation script and related files. + + ```bash + cp path_to_predictor/install/ssl.sh $GAUSSHOME/ + cp path_to_predictor/install/ca_ext.txt $GAUSSHOME/ + ``` + + - Copy the configuration file **openssl.cnf** to **$GAUSSHOME**. + + ```bash + cp $GAUSSHOME/share/om/openssl.cnf $GAUSSHOME/ + ``` + + - Modify the configuration parameters in **openssl.conf**. + + ```bash + dir = $GAUSSHOME/CA/demoCA + default_md = sha256 + ``` + + - The certificate generation environment is ready. + +2. Generate a certificate and private key. + + ```bash + cd $GAUSSHOME + sh ssl.sh + ``` + + - Set the password as prompted, for example, **Test@123**. + + - The password must contain at least eight characters of at least three different types. + + ```bash + Please enter your password: + ``` + + - Set the options as prompted. + + ```bash + Certificate Details: + Serial Number: 1 (0x1) + Validity + Not Before: May 15 08:32:44 2020 GMT + Not After : May 15 08:32:44 2021 GMT + Subject: + countryName = CN + stateOrProvinceName = SZ + organizationName = HW + organizationalUnitName = GS + commonName = CA + X509v3 extensions: + X509v3 Basic Constraints: + CA:TRUE + Certificate is to be certified until May 15 08:32:44 2021 GMT (365 days) + Sign the certificate? [y/n]:y + 1 out of 1 certificate requests certified, commit? 
[y/n]y + ``` + + - Enter the IP address for starting the AIEngine, for example, **127.0.0.1**. + + ```bash + Please enter your aiEngine IP: 127.0.0.1 + ``` + + - Set the options as prompted. + + ```bash + Certificate Details: + Serial Number: 2 (0x2) + Validity + Not Before: May 15 08:38:07 2020 GMT + Not After : May 13 08:38:07 2030 GMT + Subject: + countryName = CN + stateOrProvinceName = SZ + organizationName = HW + organizationalUnitName = GS + commonName = 127.0.0.1 + X509v3 extensions: + X509v3 Basic Constraints: + CA:FALSE + Certificate is to be certified until May 13 08:38:07 2030 GMT (3650 days) + Sign the certificate? [y/n]:y + 1 out of 1 certificate requests certified, commit? [y/n]y + ``` + + - Enter the IP address for starting MogDB, for example, **127.0.0.1**. + + ```bash + Please enter your mog IP: 127.0.0.1 + ``` + + - Set the options as prompted. + + ```bash + Certificate Details: + Serial Number: 3 (0x3) + Validity + Not Before: May 15 08:41:46 2020 GMT + Not After : May 13 08:41:46 2030 GMT + Subject: + countryName = CN + stateOrProvinceName = SZ + organizationName = HW + organizationalUnitName = GS + commonName = 127.0.0.1 + X509v3 extensions: + X509v3 Basic Constraints: + CA:FALSE + Certificate is to be certified until May 13 08:41:46 2030 GMT (3650 days) + Sign the certificate? [y/n]:y + 1 out of 1 certificate requests certified, commit? [y/n]y + ``` + + - The related certificate and key have been generated. The content in **$GAUSSHOME/CA** is as follows: + + ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/environment-deployment-1.png) + +## Setting Up the Environment + +1. Copy the tool code folder to the target environment. + + - Assume that the installation directory is **$INSTALL_FOLDER**. + + - Assume that the destination directory is **/home/ai_user**. + + ```bash + scp -r $INSTALL_FOLDER/bin/dbmind/predictor ai_user@127.0.0.1:path_to_Predictor + ``` + +2. Copy the CA certificate folder to a directory in the AIEngine environment. + + ```bash + cp -r $GAUSSHOME/CA ai_user@127.0.0.1:path_to_CA + ``` + +3. Install the **predictor/install/requirements(-gpu).txt** tool. + + ```bash + With GPU: pip install -r requirements-gpu.txt + Without GPU: pip install -r requirements.txt + ``` + +## Starting AIEngine + +1. Switch to the AIEngine environment (that is, copy the target environment **ai_user** of the predictor). + + Set parameters in **predictor/python/settings.py**. + + ```bash + DEFAULT_FLASK_SERVER_HOST = '127.0.0.1' (running IP address of AIEngine) + DEFAULT_FLASK_SERVER_PORT = '5000' (running port number of AIEngine) + PATH_SSL = "path_to_CA" (CA folder path) + ``` + +2. Run the startup script of AIEngine. + + ```bash + python path_to_Predictor/python/run.py + ``` + + In this case, the AIEngine keeps enabled on the corresponding port and waits for the request of the time prediction function from the kernel. + + For details about how to initiate a time prediction command from the kernel, see the [Usage Guide](2-3-usage-guide). 
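+
+As a quick check after startup, you can also confirm from the database side that the AIEngine process is reachable by calling the **check_engine_status** function described in the [Usage Guide](2-3-usage-guide). The IP address and port below are the example values used in this section; replace them with your actual settings:
+
+```sql
+-- Verify that the kernel can reach the AIEngine listening at the configured address.
+select check_engine_status('127.0.0.1', 5000);
+```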
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/2-predictor-ai-query-time-forecasting/2-3-usage-guide.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/2-predictor-ai-query-time-forecasting/2-3-usage-guide.md new file mode 100644 index 0000000000000000000000000000000000000000..03e412ac9a9968c21f47cb2a5be4808c9e3adc0a --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/2-predictor-ai-query-time-forecasting/2-3-usage-guide.md @@ -0,0 +1,214 @@ +--- +title: Usage Guide +summary: Usage Guide +author: Guo Huan +date: 2021-05-19 +--- + +# Usage Guide + +## Data Collection + +1. Enable data collection. + + a. Set parameters related to the ActiveSQL operator. + + ```bash + enable_resource_track=on + resource_track_level=operator + enable_resource_record=on + resource_track_cost=10 (The default value is 100000.) + ``` + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > + > - The value of **resource_track_cost** must be smaller than the total query cost of the information to be collected. Only the information that meets the requirements can be collected. + > - Cgroup functions are available. + + b. Collect information. + + Execute the service query statement. + + View data collected in real time. + + ```sql + select * from gs_wlm_plan_operator_history; + ``` + + Expected result: All jobs that meet **resource_track_duration** and **resource_track_cost** are collected. + +2. Disable data collection. + + a. Set any of the following parameters related to the ActiveSQL operator. + + ```bash + enable_resource_track=off + resource_track_level=none + resource_track_level=query + ``` + + b. Execute the service query statement. + + Wait for 3 minutes and check the data on the current node. + + ```sql + select * from gs_wlm_plan_operator_info; + ``` + + Expected result: No new data is added to the tables and views. + +3. Persist data. + + a. Set parameters related to the ActiveSQL operator. + + ```bash + enable_resource_track=on + resource_track_level=operator + enable_resource_record=on + resource_track_duration=0 (The default value is 60s.) + resource_track_cost=10 (The default value is 100000.) + ``` + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > + > - The value of **resource_track_cost** must be smaller than the total query cost of the information to be collected. Only the information that meets the requirements can be collected. + > - Cgroup functions are available. + + b. Execute the service query statement. + + Wait for 3 minutes and check the data on the current node. + + ```sql + select * from gs_wlm_plan_operator_info; + ``` + + Expected result: All jobs that meet **resource_track_duration** and **resource_track_cost** are collected. + +## Model Management (System Administrators) + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> Model management operations can be performed only when the database is normal. + +1. Add a new model. + + INSERT INTO gs_opt_model values('……'); + + Example: + + ```sql + INSERT INTO gs_opt_model values('rlstm', 'model_name', 'datname', '127.0.0.1', 5000, 2000, 1, -1, 64, 512, 0 , false, false, '{S, T}', '{0,0}', '{0,0}', 'Text'); + ``` + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > + > - For details about model parameter settings, see [GS_OPT_MODEL](GS_OPT_MODEL). + > - Currently, only **rlstm** is supported in the **template_name** column. 
+   > - The values in the **datname** column must be the same as the name of the database used for model usage and training. Otherwise, the model cannot be used.
+   > - The values in the **model_name** column must meet the **unique** constraint.
+   > - For details about other parameter settings, see [Best Practices](2-4-best-practices).
+
+2. Modify model parameters.
+
+    ```sql
+    UPDATE gs_opt_model SET <column_name> = <column_value> WHERE model_name = <model_name>;
+    ```
+
+3. Delete a model.
+
+    ```sql
+    DELETE FROM gs_opt_model WHERE model_name = <model_name>;
+    ```
+
+4. Query the existing models and their status.
+
+    ```sql
+    SELECT * FROM gs_opt_model;
+    ```
+
+## Model Training (System Administrators)
+
+1. Add models and modify model parameters by following the steps in Model Management (System Administrators).
+
+    Example:
+
+    Add a model.
+
+    ```sql
+    INSERT INTO gs_opt_model values('rlstm', 'default', 'postgres', '127.0.0.1', 5000, 2000, 1, -1, 64, 512, 0 , false, false, '{S, T}', '{0,0}', '{0,0}', 'Text');
+    ```
+
+    Modify training parameters.
+
+    ```sql
+    UPDATE gs_opt_model SET <column_name> = <column_value> WHERE model_name = <model_name>;
+    ```
+
+2. Check that the database status is normal and historical data is collected properly before you perform the following operations:
+
+    Delete the original encoding data.
+
+    ```sql
+    DELETE FROM gs_wlm_plan_encoding_table;
+    ```
+
+    To encode data, specify the database name.
+
+    ```sql
+    SELECT gather_encoding_info('postgres');
+    ```
+
+    Start training.
+
+    ```sql
+    SELECT model_train_opt('rlstm', 'default');
+    ```
+
+3. View the model training status.
+
+    ```sql
+    SELECT * FROM track_model_train_opt('rlstm', 'default');
+    ```
+
+    The URL used by TensorBoard is returned.
+
+    ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/usage-guide-1.png)
+
+    Access the URL to view the model training status.
+
+    ![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/usage-guide-2.png)
+
+## Model Prediction
+
+> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:**
+>
+> - Model prediction can be performed only when the database status is normal and the specified model has been trained and converged.
+> - Currently, the labels of the model training parameters must contain the **S** label so that the **p-time** value can be displayed in **EXPLAIN**.
+>   Example: INSERT INTO gs_opt_model values('rlstm', 'default', 'postgres', '127.0.0.1', 5000, 1000, 1, -1, 50, 500, 0 , false, false, '{**S**, T}', '{0,0}', '{0,0}', 'Text');
+
+Call EXPLAIN.
+
+```sql
+explain (analyze on, predictor <model_name>)
+SELECT ...
+```
+
+Expected result:
+
+```bash
+Example: Row Adapter (cost=110481.35..110481.35 rows=100 p-time=99..182 width=100) (actual time=375.158..375.160 rows=2 loops=1)
+The p-time column indicates the predicted value of the label.
+```
+
+## Other Functions
+
+1. Check whether the AI Engine can be connected.
+
+    ```sql
+    mogdb=# select check_engine_status('aiEngine-ip-address', running-port);
+    ```
+
+2. Check the path for storing model logs on the AI Engine.
+ + ```sql + mogdb=# select track_model_train_opt('template_name', 'model_name'); + ``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/2-predictor-ai-query-time-forecasting/2-4-best-practices.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/2-predictor-ai-query-time-forecasting/2-4-best-practices.md new file mode 100644 index 0000000000000000000000000000000000000000..40dcde4dcfdf188e5fd28d70891e5532a6deeeb3 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/2-predictor-ai-query-time-forecasting/2-4-best-practices.md @@ -0,0 +1,32 @@ +--- +title: Best Practices +summary: Best Practices +author: Guo Huan +date: 2021-05-19 +--- + +# Best Practices + +For details about the parameters, see [GS_OPT_MODEL](GS_OPT_MODEL). + +**Table 1** + +| Model Parameter | Recommended Value | +| :--------------- | :----------------------------------------------------------- | +| template_name | rlstm | +| model_name | The value can be customized, for example, **open_ai**. The value must meet the unique constraint. | +| datname | Name of the database to be served, for example, **postgres**. | +| ip | IP address of the AI Engine, for example, **127.0.0.1**. | +| port | AI Engine listening port number, for example, **5000**. | +| max_epoch | Iteration times. A large value is recommended to ensure the convergence effect, for example, **2000**. | +| learning_rate | (0, 1] is a floating-point number. A large learning rate is recommended to accelerate convergence. | +| dim_red | Number of feature values to be reduced.
**-1**: Do not use PCA for dimension reduction. All features are supported.
Floating point number in the range of (0, 1]: A smaller value indicates a smaller training dimension and a faster convergence speed, but affects the training accuracy. | +| hidden_units | If the feature value dimension is high, you are advised to increase the value of this parameter to increase the model complexity. For example, set this parameter to **64**, **128**, and so on. | +| batch_size | You are advised to increase the value of this parameter based on the amount of encoded data to accelerate model convergence. For example, set this parameter to **256**, **512**, and so on. | +| Other parameters | See [GS_OPT_MODEL](GS_OPT_MODEL). | + +Recommended parameter settings: + +```sql +INSERT INTO gs_opt_model values('rlstm', 'open_ai', 'postgres', '127.0.0.1', 5000, 2000, 1, -1, 64, 512, 0, false, false, '{S, T}', '{0,0}', '{0,0}', 'Text'); +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/2-predictor-ai-query-time-forecasting/2-5-faqs.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/2-predictor-ai-query-time-forecasting/2-5-faqs.md new file mode 100644 index 0000000000000000000000000000000000000000..79a7dd5a453a7cf947d0346b2822d5551e6d8336 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/2-predictor-ai-query-time-forecasting/2-5-faqs.md @@ -0,0 +1,35 @@ +--- +title: FAQs +summary: FAQs +author: Guo Huan +date: 2021-05-19 +--- + +# FAQs + +## AI Engine Configuration + +- **AI Engine failed to be started**: Check whether the IP address and port are available and whether the CA certificate path exists. +- **AI Engine does not respond**: Check whether the CA certificates of the two communication parties are consistent. +- **Training and test failure**: Check whether the path for saving the model files exists and whether the training prediction file is correctly downloaded. +- **Changing the AI Engine IP address**: Regenerate the certificate by following the steps in [Generating Certificates](2-2-environment-deployment#generating-certificates). Enter the new IP address in Generate a certificate and private key. + +## Database Internal Errors + +Problem: AI Engine connection failed. + +``` +ERROR: AI engine connection failed. +CONTEXT: referenced column: model_train_opt +``` + +Solution: Check whether the AI Engine is started or restarted properly. Check whether the CA certificates of the communication parties are consistent. Check whether the IP address and port number in the model configuration match. + +Problem: The model does not exist. + +``` +ERROR: OPT_Model not found for model name XXX +CONTEXT: referenced column: track_model_train_opt +``` + +Solution: Check whether [GS_OPT_MODEL](GS_OPT_MODEL) contains the model specified in the **model_name** column in the statement. If the error is reported when the prediction function is used, check whether the model has been trained. 
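+
+For example, a check along the following lines confirms whether the model referenced in the failing statement is registered. The model name **open_ai** is the sample name used in [Best Practices](2-4-best-practices), and the selected columns follow the parameter list described there:
+
+```sql
+-- List the registered models and confirm that the model referenced in the failing
+-- statement exists and points to the expected AIEngine address.
+SELECT template_name, model_name, datname, ip, port
+FROM gs_opt_model
+WHERE model_name = 'open_ai';
+```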
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-1-x-tuner-overview.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-1-x-tuner-overview.md new file mode 100644 index 0000000000000000000000000000000000000000..118476296311abb940f5b068d87f16ad6bbb4f81 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-1-x-tuner-overview.md @@ -0,0 +1,10 @@ +--- +title: Overview +summary: Overview +author: Guo Huan +date: 2021-05-19 +--- + +# Overview + +X-Tuner is a parameter tuning tool integrated into databases. It uses AI technologies such as deep reinforcement learning and global search algorithms to obtain the optimal database parameter settings without manual intervention. This function is not forcibly deployed with the database environment. It can be independently deployed and run without the database installation environment. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-2-preparations.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-2-preparations.md new file mode 100644 index 0000000000000000000000000000000000000000..25a6791024ed4d764d878c9cb3e9e44c0f1ccdef --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-2-preparations.md @@ -0,0 +1,227 @@ +--- +title: Preparations +summary: Preparations +author: Guo Huan +date: 2021-10-21 +--- + +# Preparations + +
+
+## Prerequisites and Precautions
+
+- The database status is normal, the client can be properly connected, and data can be imported to the database, so that the tuning program can run the benchmark test to evaluate the optimization effect.
+- To use this tool, you need to specify the user who logs in to the database. This user must have sufficient permissions to obtain the required database status information.
+- If you log in to the database host as a Linux user, add **$GAUSSHOME/bin** to the **PATH** environment variable so that you can directly run database O&M tools, such as gsql, gs_guc, and gs_ctl.
+- The recommended Python version is Python 3.6 or later. The required dependencies have been installed in the running environment, so that the tuning program can be started properly. You can install a Python 3.6+ environment independently without setting it as a global environment variable. You are not advised to install the tool as the root user. If you install the tool as the root user and run it as another user, ensure that you have the read permission on the configuration file.
+- This tool can run in three modes. In **tune** and **train** modes, you need to configure the benchmark running environment and import data. The tool iteratively runs the benchmark to check whether the performance is improved after the parameters are modified.
+- In **recommend** mode, you are advised to run the command while the database is executing the workload to obtain more accurate real-time workload information.
+- By default, this tool provides benchmark running script samples for TPC-C, TPC-H, TPC-DS, and sysbench. If you use these benchmarks to perform pressure tests on the database system, you can modify or configure the corresponding configuration files. To adapt the tool to your own service scenarios, you need to write the script file that drives your customized benchmark based on the **template.py** file in the **benchmark** directory.
+
+ +## Principles + +The tuning program is a tool independent of the database kernel. The usernames and passwords for the database and instances are required to control the benchmark performance test of the database. Before starting the tuning program, ensure that the interaction in the test environment is normal, the benchmark test script can be run properly, and the database can be connected properly. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** If the parameters to be tuned include the parameters that take effect only after the database is restarted, the database will be restarted multiple times during the tuning. Exercise caution when using **train** and **tune** modes if the database is running jobs. + +X-Tuner can run in any of the following modes: + +- **recommend**: Log in to the database using the specified user name, obtain the feature information about the running workload, and generate a parameter recommendation report based on the feature information. Report improper parameter settings and potential risks in the current database. Output the currently running workload behavior and characteristics. Output the recommended parameter settings. In this mode, the database does not need to be restarted. In other modes, the database may need to be restarted repeatedly. +- **train**: Modify parameters and execute the benchmark based on the benchmark information provided by users. The reinforcement learning model is trained through repeated iteration so that you can load the model in **tune** mode for optimization. +- **tune**: Use an optimization algorithm to tune database parameters. Currently, two types of algorithms are supported: deep reinforcement learning and global search algorithm (global optimization algorithm). The deep reinforcement learning mode requires **train** mode to generate the optimized model after training. However, the global search algorithm does not need to be trained in advance and can be directly used for search and optimization. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:** If the deep reinforcement learning algorithm is used in **tune** mode, a trained model must be available, and the parameters for training the model must be the same as those in the parameter list (including max and min) for tuning. + +**Figure 1** X-Tuner structure + +![x-tuner-structure](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/preparations-1.png) + +Figure 1 X-Tuner architecture shows the overall architecture of the X-Tuner. The X-Tuner system can be divided into the following parts: + +- DB: The DB_Agent module is used to abstract database instances. It can be used to obtain the internal database status information and current database parameters and set database parameters. The SSH connection used for logging in to the database environment is included on the database side. +- Algorithm: algorithm package used for optimization, including global search algorithms (such as Bayesian optimization and particle swarm optimization) and deep reinforcement learning (such as DDPG). +- X-Tuner main logic module: encapsulated by the environment module. Each step is an optimization process. The entire optimization process is iterated through multiple steps. +- benchmark: a user-specified benchmark performance test script, which is used to run benchmark jobs. The benchmark result reflects the performance of the database system. 
+ +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** Ensure that the larger the benchmark script score is, the better the performance is. For example, for the benchmark used to measure the overall execution duration of SQL statements, such as TPC-H, the inverse value of the overall execution duration can be used as the benchmark score. + +
+ +## Installing and Running X-Tuner + +You can run the X-Tuner in two ways. One is to run the X-Tuner directly through the source code. The other is to install the X-Tuner on the system through the Python setuptools, and then run the **gs_xtuner** command to call the X-Tuner. The following describes two methods of running the X-Tuner. + +Method 1: Run the source code directly. + +1. Switch to the **xtuner** source code directory. For the openGauss community code, the path is **openGauss-server/src/gausskernel/dbmind/tools/xtuner**. For an installed database system, the source code path is *$GAUSSHOME***/bin/dbmind/xtuner**. + +2. You can view the **requirements.txt** file in the current directory. Use the pip package management tool to install the dependency based on the **requirements.txt** file. + + ```bash + pip install -r requirements.txt + ``` + +3. After the installation is successful, add the environment variable PYTHONPATH, and then run **main.py**. For example, to obtain the help information, run the following command: + + ```bash + cd tuner # Switch to the directory where the main.py entry file is located. + export PYTHONPATH='..' # Add the upper-level directory to the path for searching for packages. + python main.py --help # Obtain help information. The methods of using other functions are similar. + ``` + +Method 2: Install the X-Tuner in the system. + +1. You can use the **setup.py** file to install the X-Tuner to the system and then run the **gs_xtuner** command. You need to switch to the root directory of **xtuner**. For details about the directory location, see the preceding description. + +2. Run the following command to install the tool in the Python environment using Python setuptools: + + ```bash + python setup.py install + ``` + + If the **bin** directory of Python is added to the *PATH* environment variable, the **gs_xtuner** command can be directly called anywhere. + +3. For example, to obtain the help information, run the following command: + + ```bash + gs_xtuner --help + ``` + +
+ +## Description of the X-Tuner Configuration File + +Before running the X-Tuner, you need to load the configuration file. The default path of the configuration file is tuner/xtuner.conf. You can run the **gs_xtuner -help** command to view the absolute path of the configuration file that is loaded by default. + +``` +... + -x TUNER_CONFIG_FILE, --tuner-config-file TUNER_CONFIG_FILE + This is the path of the core configuration file of the + X-Tuner. You can specify the path of the new + configuration file. The default path is /path/to/xtuner/xtuner.conf. + You can modify the configuration file to control the + tuning process. +... +``` + +You can modify the configuration items in the configuration file as required to instruct the X-Tuner to perform different actions. For details about the configuration items in the configuration file, see Table 2 in [Command Reference](3-5-command-reference). If you need to change the loading path of the configuration file, you can specify the path through the **-x** command line option. + +
+ +## Benchmark Selection and Configuration + +The benchmark drive script is stored in the benchmark subdirectory of the X-Tuner. X-Tuner provides common benchmark driver scripts, such as TPC-C and TPC-H. The X-Tuner invokes the **get_benchmark_instance()** command in the benchmark/__init__.py file to load different benchmark driver scripts and obtain benchmark driver instances. The format of the benchmark driver script is described as follows: + +- Name of the driver script: name of the benchmark. The name is used to uniquely identify the driver script. You can specify the benchmark driver script to be loaded by setting the **benchmark_script** configuration item in the configuration file of the X-Tuner. +- The driver script contains the *path* variable, *cmd* variable, and the **run** function. + +The following describes the three elements of the driver script: + +1. *path*: path for saving the benchmark script. You can modify the path in the driver script or specify the path by setting the **benchmark_path** configuration item in the configuration file. + +2. *cmd*: command for executing the benchmark script. You can modify the command in the driver script or specify the command by setting the **benchmark_cmd** configuration item in the configuration file. Placeholders can be used in the text of cmd to obtain necessary information for running cmd commands. For details, see the TPC-H driver script example. These placeholders include: + + - {host}: IP address of the database host machine + - {port}: listening port number of the database instance + - {user}: user name for logging in to the database + - {password}: password of the user who logs in to the database system + - {db}: name of the database that is being optimized + +3. **run** function: The signature of this function is as follows: + + ``` + def run(remote_server, local_host) -> float: + ``` + + The returned data type is float, indicating the evaluation score after the benchmark is executed. A larger value indicates better performance. For example, the TPC-C test result tpmC can be used as the returned value, the inverse number of the total execution time of all SQL statements in TPC-H can also be used as the return value. A larger return value indicates better performance. + + The *remote_server* variable is the shell command interface transferred by the X-Tuner program to the remote host (database host machine) used by the script. The *local_host* variable is the shell command interface of the local host (host where the X-Tuner script is executed) transferred by the X-Tuner program. Methods provided by the preceding shell command interface include: + + ``` + exec_command_sync(command, timeout) + Function: This method is used to run the shell command on the host. + Parameter list: + command: The data type can be str, and the element can be a list or tuple of the str type. This parameter is optional. + timeout: The timeout interval for command execution in seconds. This parameter is optional. + Return value: + Returns 2-tuple (stdout and stderr). stdout indicates the standard output stream result, and stderr indicates the standard error stream result. The data type is str. + ``` + + ``` + exit_status + Function: This attribute indicates the exit status code after the latest shell command is executed. + Note: Generally, if the exit status code is 0, the execution is normal. If the exit status code is not 0, an error occurs. + ``` + +Benchmark driver script example: + +1. 
TPC-C driver script
+
+   ```python
+   from tuner.exceptions import ExecutionError
+
+   # WARN: You need to download the benchmark-sql test tool to the system,
+   # replace the PostgreSQL JDBC driver with the openGauss driver,
+   # and configure the benchmark-sql configuration file.
+   # The program starts the test by running the following command:
+   path = '/path/to/benchmarksql/run'  # Path for storing the TPC-C test script benchmark-sql
+   cmd = "./runBenchmark.sh props.gs"  # Customize a benchmark-sql test configuration file named props.gs.
+
+   def run(remote_server, local_host):
+       # Switch to the TPC-C script directory, clear historical error logs, and run the test command.
+       # You are advised to wait for several seconds because the benchmark-sql test script generates the final test report through a shell script. The entire process may be delayed.
+       # To ensure that the final tpmC value report can be obtained, wait for 3 seconds.
+       stdout, stderr = remote_server.exec_command_sync(['cd %s' % path, 'rm -rf benchmarksql-error.log', cmd, 'sleep 3'])
+       # If there is data in the standard error stream, an exception is reported and the system exits abnormally.
+       if len(stderr) > 0:
+           raise ExecutionError(stderr)
+
+       # Find the final tpmC result.
+       tpmC = None
+       split_string = stdout.split()  # Split the standard output stream result.
+       for i, st in enumerate(split_string):
+           # In benchmark-sql 5.0, the tpmC value is the second field after the keyword (NewOrders). In normal cases, the tpmC value is returned after the keyword is found.
+           if "(NewOrders)" in st:
+               tpmC = split_string[i + 2]
+               break
+       stdout, stderr = remote_server.exec_command_sync(
+           "cat %s/benchmarksql-error.log" % path)
+       nb_err = stdout.count("ERROR:")  # Check whether errors occur during the benchmark run and record the number of errors.
+       return float(tpmC) - 10 * nb_err  # The number of errors is used as a penalty item, and the penalty coefficient is 10. A larger coefficient imposes a heavier penalty on errors.
+   ```
+
+2. TPC-H driver script
+
+   ```python
+   import time
+
+   from tuner.exceptions import ExecutionError
+
+   # WARN: You need to import data into the database. The SQL statements in the following path will be executed.
+   # The program automatically collects the total execution duration of these SQL statements.
+   path = '/path/to/tpch/queries'  # Directory for storing SQL scripts used for the TPC-H test
+   cmd = "gsql -U {user} -W {password} -d {db} -p {port} -f {file}"  # The command for running the TPC-H test script. Generally, gsql -f script_file is used.
+
+   def run(remote_server, local_host):
+       # Traverse all test case file names in the current directory.
+       find_file_cmd = "find . -type f -name '*.sql'"
+       stdout, stderr = remote_server.exec_command_sync(['cd %s' % path, find_file_cmd])
+       if len(stderr) > 0:
+           raise ExecutionError(stderr)
+       files = stdout.strip().split('\n')
+       time_start = time.time()
+       for file in files:
+           # Replace {file} with the file variable and run the command.
+           perform_cmd = cmd.format(file=file)
+           stdout, stderr = remote_server.exec_command_sync(['cd %s' % path, perform_cmd])
+           if len(stderr) > 0:
+               print(stderr)
+       # The cost is the total execution duration of all test cases.
+       cost = time.time() - time_start
+       # Use the inverse number to adapt to the definition of the run function. The larger the returned result is, the better the performance is.
+ return - cost + ``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-3-examples.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-3-examples.md new file mode 100644 index 0000000000000000000000000000000000000000..311ccc689907c8df58c134dce8f0bda0b2fcae8d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-3-examples.md @@ -0,0 +1,151 @@ +--- +title: Examples +summary: Examples +author: Guo Huan +date: 2021-05-19 +--- + +# Examples + +X-Tuner supports three modes: recommend mode for obtaining parameter diagnosis reports, train mode for training reinforcement learning models, and tune mode for using optimization algorithms. The preceding three modes are distinguished by command line parameters, and the details are specified in the configuration file. + +## Configuring the Database Connection Information + +Configuration items for connecting to a database in the three modes are the same. You can enter the detailed connection information in the command line or in the JSON configuration file. Both methods are described as follows: + +1. Entering the connection information in the command line + + Input the following options: **-db-name -db-user -port -host -host-user**. The **-host-ssh-port** is optional. The following is an example: + + ``` + gs_xtuner recommend --db-name postgres --db-user omm --port 5678 --host 192.168.1.100 --host-user omm + ``` + +2. Entering the connection information in the JSON configuration file + + Assume that the file name is **connection.json**. The following is an example of the JSON configuration file: + + ``` + { + "db_name": "postgres", # Database name + "db_user": "dba", # Username for logging in to the database + "host": "127.0.0.1", # IP address of the database host + "host_user": "dba", # Username for logging in to the database host + "port": 5432, # Listening port number of the database + "ssh_port": 22 # SSH listening port number of the database host + } + ``` + + Input **-f connection.json**. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** To prevent password leakage, the configuration file and command line parameters do not contain password information by default. After you enter the preceding connection information, the program prompts you to enter the database password and the OS login password in interactive mode. + +## Example of Using recommend Mode + +The configuration item **scenario** takes effect for recommend mode. If the value is **auto**, the workload type is automatically detected. + +Run the following command to obtain the diagnosis result: + +``` + +gs_xtuner recommend -f connection.json + +``` + +The diagnosis report is generated as follows: + +**Figure 1** Report generated in recommend mode + +![report-generated-in-recommend-mode](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/examples-1.png) + +In the preceding report, the database parameter configuration in the environment is recommended, and a risk warning is provided. The report also generates the current workload features. The following features are for reference: + +- **temp_file_size**: number of generated temporary files. If the value is greater than 0, the system uses temporary files. If too many temporary files are used, the performance is poor. If possible, increase the value of **work_mem**. 
+- **cache_hit_rate**: cache hit ratio of **shared_buffer**, indicating the cache efficiency of the current workload. +- **read_write_ratio**: read/write ratio of database jobs. +- **search_modify_ratio**: ratio of data query to data modification of a database job. +- **ap_index**: AP index of the current workload. The value ranges from 0 to 10. A larger value indicates a higher preference for data analysis and retrieval. +- **workload_type**: workload type, which can be AP, TP, or HTAP based on database statistics. +- **checkpoint_avg_sync_time**: average duration for refreshing data to the disk each time when the database is at the checkpoint, in milliseconds. +- **load_average**: average load of each CPU core in 1 minute, 5 minutes, and 15 minutes. Generally, if the value is about 1, the current hardware matches the workload. If the value is about 3, the current workload is heavy. If the value is greater than 5, the current workload is too heavy. In this case, you are advised to reduce the load or upgrade the hardware. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** In recommend mode, information in the **pg_stat_database** and **pg_stat_bgwriter**system catalogs in the database is read. Therefore, the database login user must have sufficient permissions. (You are advised to own the administrator permission which can be granted to username using **alter user username sysadmin**.) Some system catalogs keep recording statistics, which may affect load feature identification. Therefore, you are advised to clear the statistics of some system catalogs, run the workload for a period of time, and then use recommend mode for diagnosis to obtain more accurate results. To clear the statistics, run the following command: select pg_stat_reset_shared('bgwriter'); select pg_stat_reset(); + +## Example of Using train Mode + +This mode is used to train the deep reinforcement learning model. The configuration items related to this mode are as follows: + +- **rl_algorithm**: algorithm used to train the reinforcement learning model. Currently, this parameter can be set to **ddpg**. + +- **rl_model_path**: path for storing the reinforcement learning model generated after training. + +- **rl_steps**: maximum number of training steps in the training process. + +- **max_episode_steps**: maximum number of steps in each episode. + +- **scenario**: specifies the workload type. If the value is **auto**, the system automatically determines the workload type. The recommended parameter tuning list varies according to the mode. + +- **tuning_list**: specifies the parameters to be tuned. If this parameter is not specified, the list of parameters to be tuned is automatically recommended based on the workload type. If this parameter is specified, **tuning_list**indicates the path of the tuning list file. The following is an example of the content of a tuning list configuration file. 
+ + ``` + { + "work_mem": { + "default": 65536, + "min": 65536, + "max": 655360, + "type": "int", + "restart": false + }, + "shared_buffers": { + "default": 32000, + "min": 16000, + "max": 64000, + "type": "int", + "restart": true + }, + "random_page_cost": { + "default": 4.0, + "min": 1.0, + "max": 4.0, + "type": "float", + "restart": false + }, + "enable_nestloop": { + "default": true, + "type": "bool", + "restart": false + } + } + ``` + +After the preceding configuration items are configured, run the following command to start the training: + +``` + +gs_xtuner train -f connection.json + +``` + +After the training is complete, a model file is generated in the directory specified by the **rl_model_path**configuration item. + +## Example of Using tune Mode + +The tune mode supports a plurality of algorithms, including a DDPG algorithm based on reinforcement learning (RL), and a Bayesian optimization algorithm and a particle swarm algorithm (PSO) which are both based on a global optimization algorithm (GOP). + +The configuration items related to tune mode are as follows: + +- **tune_strategy**: specifies the algorithm to be used for optimization. The value can be **rl**(using the reinforcement learning model), **gop**(using the global optimization algorithm), or **auto**(automatic selection). If this parameter is set to **rl**, RL-related configuration items take effect. In addition to the preceding configuration items that take effect in train mode, the **test_episode**configuration item also takes effect. This configuration item indicates the maximum number of episodes in the tuning process. This parameter directly affects the execution time of the tuning process. Generally, a larger value indicates longer time consumption. +- **gop_algorithm**: specifies a global optimization algorithm. The value can be **bayes** or **pso**. +- **max_iterations**: specifies the maximum number of iterations. A larger value indicates a longer search time and better search effect. +- **particle_nums**: specifies the number of particles. This parameter is valid only for the PSO algorithm. +- For details about **scenario** and **tuning_list**, see the description of train mode. + +After the preceding items are configured, run the following command to start tuning: + +``` + +gs_xtuner tune -f connection.json + +``` + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-caution.gif) **CAUTION:** Before using tune and train modes, you need to import the data required by the benchmark, check whether the benchmark can run properly, and back up the current database parameters. 
To query the current database parameters, run the following command: select name, setting from pg_settings; diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-4-obtaining-help-information.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-4-obtaining-help-information.md new file mode 100644 index 0000000000000000000000000000000000000000..12f32ef275085145a39b65474fc5e50fb90cb69e --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-4-obtaining-help-information.md @@ -0,0 +1,51 @@ +--- +title: Obtaining Help Information +summary: Obtaining Help Information +author: Guo Huan +date: 2021-05-19 +--- + +# Obtaining Help Information + +Before starting the tuning program, run the following command to obtain help information: + +```bash +python main.py --help +``` + +The command output is as follows: + +```bash +usage: main.py [-h] [-m {train,tune}] [-f CONFIG_FILE] [--db-name DB_NAME] +[--db-user DB_USER] [--port PORT] [--host HOST] +[--host-user HOST_USER] [--host-ssh-port HOST_SSH_PORT] +[--scenario {ap,tp,htap}] [--benchmark BENCHMARK] +[--model-path MODEL_PATH] [-v] + +X-Tuner: a self-tuning toolkit for MogD. + +optional arguments: +-h, --help show this help message and exit +-m {train,tune}, --mode {train,tune} +train a reinforcement learning model or tune by your +trained model. +-f CONFIG_FILE, --config-file CONFIG_FILE +you can pass a config file path or you should manually +set database information. +--db-name DB_NAME database name. +--db-user DB_USER database user name. +--port PORT database connection port. +--host HOST where did your database install on? +--host-user HOST_USER +user name of the host where your database installed +on. +--host-ssh-port HOST_SSH_PORT +host ssh port. +--scenario {ap,tp,htap} +--benchmark BENCHMARK +--model-path MODEL_PATH +the place where you want to save model weights to or +load model weights from. +-v, --version +show version. +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-5-command-reference.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-5-command-reference.md new file mode 100644 index 0000000000000000000000000000000000000000..b91dcea4663c9e5f376972215258148d1f932fc4 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-5-command-reference.md @@ -0,0 +1,50 @@ +--- +title: Command Reference +summary: Command Reference +author: Guo Huan +date: 2021-05-19 +--- + +# Command Reference + +**Table 1** Command-line Parameter + +| Parameter | Description | Value Range | +| :--------------------- | :----------------------------------------------------------- | :--------------------- | +| mode | Specifies the running mode of the tuning program. | train, tune, recommend | +| -tuner-config-file, -x | Path of the core parameter configuration file of X-Tuner. The default path is **xtuner.conf** under the installation directory. | - | +| -db-config-file, -f | Path of the connection information configuration file used by the optimization program to log in to the database host. If the database connection information is configured in this file, the following database connection information can be omitted. | - | +| -db-name | Specifies the name of a database to be tuned. 
| - |
+| -db-user | Specifies the user account used to log in to the tuned database. | - |
+| -port | Specifies the database listening port. | - |
+| -host | Specifies the host IP address of the database instance. | - |
+| -host-user | Specifies the username for logging in to the host where the database instance is located. The database O&M tools, such as **gsql** and **gs_ctl**, can be found in the environment variables of the username. | - |
+| -host-ssh-port | Specifies the SSH port number of the host where the database instance is located. This parameter is optional. The default value is **22**. | - |
+| -help, -h | Returns the help information. | - |
+| -version, -v | Returns the current tool version. | - |
+
+**Table 2** Parameters in the configuration file
+
+| Parameter | Description | Value Range |
+| :-------------------- | :----------------- | :------------------- |
+| logfile | Path for storing generated logs. | - |
+| output_tuning_result | (Optional) Specifies the path for saving the tuning result. | - |
+| verbose | Whether to print details. | on, off |
+| recorder_file | Path for storing logs that record intermediate tuning information. | - |
+| tune_strategy | Specifies the strategy used in tune mode. | rl, gop, auto |
+| drop_cache | Whether to drop the cache in each iteration. Dropping the cache can make the benchmark score more stable. If this parameter is enabled, add the login system user to the **/etc/sudoers** list and grant the NOPASSWD permission to the user. (You are advised to enable the NOPASSWD permission temporarily and disable it after the tuning is complete.) | on, off |
+| used_mem_penalty_term | Penalty coefficient of the total memory used by the database. This parameter is used to prevent performance deterioration caused by unlimited memory usage. The greater the value is, the greater the penalty is. | Recommended value: 0 ~ 1 |
+| rl_algorithm | Specifies the RL algorithm. | ddpg |
+| rl_model_path | Path for saving or reading the RL model, including the save directory name and file name prefix. In train mode, this path is used to save the model. In tune mode, this path is used to read the model file. | - |
+| rl_steps | Number of training steps of the deep reinforcement learning algorithm. | - |
+| max_episode_steps | Maximum number of training steps in each episode. | - |
+| test_episode | Number of episodes when the RL algorithm is used for optimization. | - |
+| gop_algorithm | Specifies a global optimization algorithm. | bayes, pso, auto |
+| max_iterations | Maximum number of iterations of the global search algorithm. (The value is not fixed. Multiple rounds of iterations may be performed based on the actual requirements.) | - |
+| particle_nums | Number of particles when the PSO algorithm is used. | - |
+| benchmark_script | Benchmark driver script. This parameter specifies the file with the same name in the benchmark path to be loaded. Typical benchmarks, such as TPC-C and TPC-H, are supported by default. | tpcc, tpch, tpcds, sysbench … |
+| benchmark_path | Path for saving the benchmark script. If this parameter is not configured, the configuration in the benchmark driver script is used. | - |
+| benchmark_cmd | Command for starting the benchmark script. If this parameter is not configured, the configuration in the benchmark driver script is used. | - |
+| benchmark_period | This parameter is valid only for **period benchmark**. It indicates the test period of the entire benchmark, in seconds. | - |
+| scenario | Type of the workload specified by the user. | tp, ap, htap |
+| tuning_list | List of parameters to be tuned. For details, see the **share/knobs.json.template** file. | - |
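
The recommend mode listed in Table 1 is started in the same way as the train and tune modes shown earlier. The exact invocation below is an assumption based on those examples and on the **-db-config-file, -f** option; verify it against the help output of your installation:

```bash
# Assumed invocation of the recommend mode, analogous to "gs_xtuner train/tune -f connection.json".
# connection.json is the database connection configuration file described earlier.
gs_xtuner recommend -f connection.json
```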
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-6-Troubleshooting.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-6-Troubleshooting.md new file mode 100644 index 0000000000000000000000000000000000000000..df96e0c7367c2bb6ec000afe1d48bfb28d021f51 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/3-x-tuner-parameter-optimization-and-diagnosis/3-6-Troubleshooting.md @@ -0,0 +1,14 @@
+---
+title: Troubleshooting
+summary: Troubleshooting
+author: Guo Huan
+date: 2021-05-19
+---
+
+# Troubleshooting
+
+- Failure to connect to the database instance: Check whether the database instance is faulty or whether the security permission configuration items in the **pg_hba.conf** file are incorrectly configured.
+- Restart failure: Check the health status of the database instance and ensure that the database instance is running properly.
+- Dependency installation failure: Upgrade the pip package management tool by running the **python -m pip install --upgrade pip** command.
+- Poor performance of TPC-C jobs: In high-concurrency scenarios such as TPC-C, a large amount of data is modified during pressure tests. Each test is not idempotent: for example, the data volume in the TPC-C database increases, invalid tuples are not cleared using VACUUM FULL, checkpoints are not triggered in the database, and the cache is not dropped. Therefore, it is recommended that the data of benchmarks that write a large amount of data, such as TPC-C, be imported again at intervals (depending on the number of concurrent tasks and the execution duration). A simple method is to back up the $PGDATA directory.
+- When the TPC-C job is running, the TPC-C driver script reports the error "TypeError: float() argument must be a string or a number, not 'NoneType'" (**none** cannot be converted to the float type). This is because the TPC-C pressure test result is not obtained. There are many causes for this problem; manually check whether TPC-C can be successfully executed and whether the returned result can be obtained. If the preceding problem does not occur, you are advised to set the delay time of the **sleep** command in the command list of the TPC-C driver script to a larger value.
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/4-sqldiag-slow-sql-discovery/4-1-overview.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/4-sqldiag-slow-sql-discovery/4-1-overview.md new file mode 100644 index 0000000000000000000000000000000000000000..1fbef2fbfabf26873fa17e2558fbea4e8a7328e4 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/4-sqldiag-slow-sql-discovery/4-1-overview.md @@ -0,0 +1,16 @@
+---
+title: Overview
+summary: Overview
+author: Guo Huan
+date: 2021-05-19
+---
+
+# Overview
+
+SQLdiag is a framework for predicting the execution duration of SQL statements in MogDB. The existing prediction technologies are mainly based on models that work on execution plans. Such prediction solutions are applicable only to jobs whose execution plans can be obtained, typically in OLAP scenarios, and are not useful for quick queries in OLTP or HTAP scenarios. Different from the preceding solutions, SQLdiag focuses on the historical SQL statements of databases.
Because the execution duration of similar SQL statements does not vary greatly over a short period of time, SQLdiag can detect statements similar to the entered statements in the historical data and predict the SQL statement execution duration based on SQL vectorization and time series prediction algorithms. This framework has the following benefits:
+
+1. The framework does not require SQL execution plans, so it has no impact on database performance.
+2. The framework is widely applicable, unlike many narrowly targeted algorithms in the industry that may be applicable only to OLTP or OLAP.
+3. The framework is robust and easy to understand. Users can design their own prediction models by simply modifying the framework.
+
+SQLdiag is an SQL statement execution time prediction tool. It uses templates to predict the execution time of SQL statements based on statement logic similarity and historical execution records, without obtaining the SQL statement execution plan.
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/4-sqldiag-slow-sql-discovery/4-2-usage-guide.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/4-sqldiag-slow-sql-discovery/4-2-usage-guide.md new file mode 100644 index 0000000000000000000000000000000000000000..fdc1f70712e0ec2de44a10016d3f25c9df972430 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/4-sqldiag-slow-sql-discovery/4-2-usage-guide.md @@ -0,0 +1,107 @@
+---
+title: Usage Guide
+summary: Usage Guide
+author: Guo Huan
+date: 2021-11-24
+---
+
+# Usage Guide
+
+## Prerequisites
+
+- Ensure that users provide training data.
+- If the user collects training data through the provided tools, the WDR function needs to be enabled. The parameters involved are **track_stmt_stat_level** and **log_min_duration_statement**, as described in the following section.
+- To ensure the prediction accuracy, the historical statement logs provided by users should be as comprehensive and representative as possible.
+- The Python 3.6+ environment and dependencies have been configured as required.
+
+## Environment Configuration
+
+This function requires Python 3.6+ to run. The required third-party dependency packages are recorded in the **requirements.txt** file, and the dependencies can be installed via the **pip install** command, for example:
+
+```bash
+pip install -r requirements.txt
+```
+
+## Collecting SQL Statements
+
+This tool requires the user to prepare the data in advance. The training data is in the following format, with each sample separated by a newline character:
+
+```bash
+SQL,EXECUTION_TIME
+```
+
+The format of the data to be predicted is as follows:
+
+```bash
+SQL
+```
+
+**SQL** denotes the text of the SQL statement and **EXECUTION_TIME** denotes the execution time of the SQL statement. Sample data is shown in **train.csv** and **predict.csv** in **sample_data**.
+
+Users can collect training data by themselves in the required format. The tool also provides a script for automatic collection (load_sql_from_wdr), which obtains SQL information based on WDR reports and involves parameters such as **log\_min\_duration\_statement** and **track_stmt_stat_level**:
+
+- **log_min_duration_statement** indicates the slow SQL threshold, in milliseconds; if it is set to 0, all statements are collected.
+- **track_stmt_stat_level** indicates the level of information capture, it is recommended to set track_stmt_stat_level='L0,L0' + +If this parameter is enabled, a certain amount of system resources may be occupied but the usage is generally low. Continuous high-concurrency scenarios may generate less than 5% performance loss. If the database concurrency is low, the performance loss can be ignored. + +```bash +# Use the script to get the training data: +load_sql_from_wdr.py [-h] --port PORT --start_time START_TIME + --finish_time FINISH_TIME [--save_path SAVE_PATH] +# For example: + python load_sql_from_wdr.py --start_time "2021-04-25 00:00:00" --finish_time "2021-04-26 14:00:00" --port 5432 --save_path ./data.csv +``` + +## Procedure + +1. Provide historical logs for model training. + +2. Perform training and prediction. + + ```bash + # Training and prediction based on template: + python main.py [train, predict] -f FILE --model template --model-path template_model_path + # Training and prediction based on DNN: + python main.py [train, predict] -f FILE --model dnn --model-path dnn_model_path + ``` + +## Examples + +In the root directory of this tool, run the following commands to achieve the corresponding functions. + +Use the provided test data for template training: + +```bash +python main.py train -f ./sample_data/train.csv --model template --model-path ./template +``` + +Use the provided test data to make templated predictions: + +```bash +python main.py predict -f ./sample_data/predict.csv --model template --model-path ./template --predicted-file ./result/t_result +``` + +Use the provided test data to update the templated model: + +```bash +python main.py finetune -f ./sample_data/train.csv --model template --model-path ./template +``` + +Use the provided test data for DNN training: + +```bash +python main.py train -f ./sample_data/train.csv --model dnn --model-path ./dnn_model +``` + +Use the provided test data to make DNN predictions: + +```bash +python main.py predict -f ./sample_data/predict.csv --model dnn --model-path ./dnn_model --predicted-file +``` + +Use the provided test data to update the DNN model: + +```bash +python main.py finetune -f ./sample_data/train.csv --model dnn --model-path ./dnn_model +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/4-sqldiag-slow-sql-discovery/4-3-obtaining-help-information.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/4-sqldiag-slow-sql-discovery/4-3-obtaining-help-information.md new file mode 100644 index 0000000000000000000000000000000000000000..91ec920b59292ec46d0fa019964d1a7261f267a9 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/4-sqldiag-slow-sql-discovery/4-3-obtaining-help-information.md @@ -0,0 +1,49 @@ +--- +title: Obtaining Help Information +summary: Obtaining Help Information +author: Guo Huan +date: 2021-05-19 +--- + +# Obtaining Help Information + +Before using the SQLdiag tool, run the following command to obtain help information: + +``` +python main.py --help +``` + +The command output is as follows: + +``` +usage: main.py [-h] [-f CSV_FILE] [--predicted-file PREDICTED_FILE] + [--model {template,dnn}] --model-path MODEL_PATH + [--config-file CONFIG_FILE] + {train,predict,finetune} + +SQLdiag integrated by openGauss. + +positional arguments: + {train,predict,finetune} + The training mode is to perform feature extraction and + model training based on historical SQL statements. 
The + prediction mode is to predict the execution time of a + new SQL statement through the trained model. + +optional arguments: + -h, --help show this help message and exit + -f CSV_FILE, --csv-file CSV_FILE + The data set for training or prediction. The file + format is CSV. If it is two columns, the format is + (SQL statement, duration time). If it is three + columns, the format is (timestamp of SQL statement + execution time, SQL statement, duration time). + --predicted-file PREDICTED_FILE + The file path to save the predicted result. + --model {template,dnn} + Choose the model model to use. + --model-path MODEL_PATH + The storage path of the model file, used to read or + save the model file. + --config-file CONFIG_FILE +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/4-sqldiag-slow-sql-discovery/4-4-command-reference.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/4-sqldiag-slow-sql-discovery/4-4-command-reference.md new file mode 100644 index 0000000000000000000000000000000000000000..56c792569e2b7e04d90a87d1600012587e839ce0 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/4-sqldiag-slow-sql-discovery/4-4-command-reference.md @@ -0,0 +1,17 @@ +--- +title: Command Reference +summary: Command Reference +author: Guo Huan +date: 2021-05-19 +--- + +# Command Reference + +**Table 1** Command-line options + +| Parameter | Description | Value Range | +| :-------------- | :----------------------------------- | :------------ | +| -f | Training or prediction file location | | +| -predicted-file | Prediction result location | | +| -model | Model selection | template, dnn | +| -model-path | Location of the training model | | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/4-sqldiag-slow-sql-discovery/4-5-troubleshooting.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/4-sqldiag-slow-sql-discovery/4-5-troubleshooting.md new file mode 100644 index 0000000000000000000000000000000000000000..442f8d1d07449da92ea7ca5ac3d6c860a438ac6b --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/4-sqldiag-slow-sql-discovery/4-5-troubleshooting.md @@ -0,0 +1,11 @@ +--- +title: Troubleshooting +summary: Troubleshooting +author: Guo Huan +date: 2021-05-19 +--- + +# Troubleshooting + +- Failure in the training scenario: Check whether the file path of historical logs is correct and whether the file format meets the requirements. +- Failure in the prediction scenario: Check whether the model path is correct. Ensure that the format of the load file to be predicted is correct. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-1-overview.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-1-overview.md new file mode 100644 index 0000000000000000000000000000000000000000..ce372de191ffea93172541dc73f20e1fff2c43b8 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-1-overview.md @@ -0,0 +1,14 @@ +--- +title: Overview +summary: Overview +author: Guo Huan +date: 2021-05-19 +--- + +# Overview + +anomaly_detection is an AI tool integrated into MogDB. It can be used to collect database metrics, forecast metric trend changes, analyze root causes of slow SQL statements, and detect and diagnose exceptions. It is a component in the DBMind suite. The information that can be collected consists of **os_exporter**,**database_exporter**, and **wdr**. 
**os_exporter** includes I/O read, I/O write, I/O wait, CPU usage, memory usage, and the disk space occupied by the database data directories. **database_exporter** includes QPS, some key GUC parameters (**work_mem**, **shared_buffers**, and **max_connections**), temporary database files, external processes, and external connections. **wdr** includes the slow SQL text, SQL execution start time, and SQL execution end time. In terms of exception detection, anomaly_detection can forecast the change trend of multiple metrics such as **IO_Read**, **IO_Write**, **IO_Wait**, **CPU_Usage**, **Memory_Usage**, and **Disk_Space**. When detecting that a metric exceeds the manually set threshold in a certain period or at a certain moment in the future, the tool generates an alarm through logs. In terms of slow SQL root cause analysis, the tool periodically obtains slow SQL information from WDR reports, diagnoses the root causes of slow SQL statements, and saves the diagnosis result to log files. In addition, the tool supports interactive slow SQL diagnosis, that is, the tool analyzes the root causes of slow SQL statements entered by users and sends the result to the user.
+
+anomaly_detection consists of the agent and detector modules. The agent and the MogDB database are deployed on the same server. The agent module provides the following functions: 1. Periodically collect database metric data and store the collected data in the buffer queue. 2. Periodically send the data in the buffer queue to the collector submodule of the detector module.
+
+The detector module consists of the collector and monitor submodules. The collector submodule communicates with the agent module through HTTP or HTTPS, receives data pushed by the agent module, and stores the data locally. The monitor submodule forecasts the metric change trend and generates alarms based on the local data. In addition, the monitor submodule analyzes the root cause of slow SQL statements based on related information such as the system and WDR reports.
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-2-preparations.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-2-preparations.md new file mode 100644 index 0000000000000000000000000000000000000000..3f0a9a70ad13e42246089b6eb7fe0fa49104a088 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-2-preparations.md @@ -0,0 +1,173 @@
+---
+title: Preparations
+summary: Preparations
+author: Guo Huan
+date: 2021-05-19
+---
+
+# Preparations
+
+## Prerequisites and Precautions
+
+- The database is running properly.
+- During the running of the tool, if the system time is tampered with, the slow SQL data collection may fail.
+- The tool does not support data collection on the standby node.
+- If you log in to the database host as a Linux user, add **$GAUSSHOME/bin** to the **PATH** environment variable so that you can directly run database O&M tools, such as gsql, gs_guc, and gs_ctl.
+- The recommended Python version is Python 3.6 or later. The required dependencies have been installed in the operating environment, and the optimization program can be started properly.
+- This tool consists of the agent and detector. Data is transmitted between the agent and detector in HTTP or HTTPS mode. Therefore, ensure that the agent server can communicate with the detector server properly.
+- The detector module runs the collector and monitor services, which need to be started separately.
+- If HTTPS is used for communication, you need to prepare the CA certificate as well as the certificates and keys of the agent and detector, and save them to the **ca**, **agent**, and **collector** subdirectories in the **root** directory of the project, respectively. In addition, you need to save the key encryption password to the **pwf** file of the certificate and set its permission to **600** to prevent other users from performing read and write operations. You can also use the script in the **share** directory to generate certificates and keys.
+- You are advised to configure your own Python environment to avoid affecting other functions (for example, using miniconda).
+- To analyze the root cause of slow SQL statements, you need the WDR report. In this case, you need to set **track_stmt_stat_level** to **'OFF,L1'** and **log_min_duration_statement** to **3000** (slow SQL threshold, which can be set as required). The unit is ms. A sketch of how to apply these settings is shown after this list.
+- If the detector and database are deployed on the same server, the service port of the collector cannot be the same as the local port of the database. Otherwise, the process cannot be started.
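
For the WDR-related settings mentioned in the list above, the following is a minimal sketch of how they could be applied with gs_guc. The data directory variable and the use of reload (rather than a restart) are assumptions; adjust the threshold to your own definition of a slow SQL statement:

```bash
# Assumes gs_guc is in PATH and $PGDATA points to the data directory of the monitored instance.
gs_guc reload -D $PGDATA -c "track_stmt_stat_level = 'OFF,L1'"
gs_guc reload -D $PGDATA -c "log_min_duration_statement = 3000"   # slow SQL threshold in ms
```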
## Principles
+
+**Figure 1** anomaly_detection structure
+
+![anomaly_detection-structure](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/5-2-preparations-1.png)
+
+anomaly_detection is a tool independent of the database kernel. Figure 1 shows the anomaly_detection structure. The anomaly_detection tool consists of the agent and detector modules.
+
+- Agent: data agent module, which consists of the source, channel, and sink. It collects metrics in the database and sends the metrics to the remote detector in HTTP or HTTPS mode.
+
+- Detector: collects and stores data pushed by the agent, monitors and detects database metrics based on algorithms such as time series forecast and exception detection, and provides root cause analysis on slow SQL statements.
+
+## Running and Installation of anomaly_detection
+
+1. Switch to the **anomaly_detection** directory. For the openGauss community code, the path is **openGauss-server/src/gausskernel/dbmind/tools/anomaly_detection**. For an installed database system, the source code path is **$GAUSSHOME/bin/dbmind/anomaly_detection**.
+
+2. You can view the **requirements.txt** file in the current directory. Use the pip package management tool to install the dependencies based on the **requirements.txt** file.
+
+   ```
+   pip install -r requirements.txt
+   ```
+
+3. After the installation is successful, run **main.py**. For example, to obtain the help information, run the following command:
+
+   ```
+   python main.py --help # Obtain help information. The methods of using other functions are similar.
+   ```
+
+## Certificate Generation
+
+When HTTPS is used for communication, the user needs to provide a certificate. anomaly_detection also provides a certificate generation tool.
+
+1. To generate the CA root certificate, run the following command in the **share** directory of anomaly_detection:
+
+   ```
+   sh gen_ca_certificate.sh
+   ```
+
+The script will create a certificate directory under the root directory of anomaly_detection, which includes three subdirectories: **ca**, **server**, and **agent**. The root certificate **ca.crt** and the key file **ca.key** are stored in **ca**.
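
If you want to confirm the result of this step before issuing the server and agent certificates, the generated root certificate can be inspected with standard OpenSSL tooling. This is only an optional check and is not part of anomaly_detection; the relative path assumes that you are still in the **share** directory:

```bash
# Optional check: print the subject and validity period of the generated CA certificate.
openssl x509 -in ../certificate/ca/ca.crt -noout -subject -dates
```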
2. To generate the server-side certificate and key file, run the following command in the **share** directory of anomaly_detection:
+
+   ```
+   sh gen_certificate.sh
+
+   # please input the basename of ssl certificate: ../certificate/server
+
+   # please input the filename of ssl certificate: server
+
+   # please input the local host: 127.0.0.1
+
+   # please input the password of ca and ssl separated by space:
+   ```
+
+This script requires the user to input the storage directory of the generated certificate and key file, the name of the certificate and key file, the IP address of the detector server, the CA certificate password, and the current certificate password (separated by spaces). The script will finally generate **server.crt** and **server.key** in the **server** subdirectory of the certificate directory.
+
+3. To generate the agent certificate and key file, run the following command in the **share** directory of anomaly_detection:
+
+   ```
+   sh gen_certificate.sh
+
+   # please input the basename of ssl certificate: ../certificate/agent
+
+   # please input the filename of ssl certificate: agent
+
+   # please input the local host: 127.0.0.1
+
+   # please input the password of ca and ssl separated by space:
+   ```
+
+This script requires the user to input the directory where the generated certificate and key file are stored, the name of the certificate and key file, the agent server IP address, the CA certificate password, and the current certificate password (separated by spaces). The script will finally generate **agent.crt** and **agent.key** in the **agent** subdirectory of the certificate directory.
+
+## Description of the anomaly_detection Configuration File
+
+The **a-detection.conf** and **metric\_task.conf** configuration files need to be loaded before **anomaly_detection** is executed. You can run the **python main.py --help** command to view the configuration file path.
+
+**a-detection.conf** contains six sections: agent, server, database, security, forecast, and log. The parameters are described as follows:
+
+```
+[database]
+storage_duration = 12H # Data storage duration. The default value is 12 hours.
+database_dir = ./data # Data storage directory
+
+[security]
+tls = False
+ca = ./certificate/ca/ca.crt
+server_cert = ./certificate/server/server.crt
+server_key = ./certificate/server/server.key
+agent_cert = ./certificate/agent/agent.crt
+agent_key = ./certificate/agent/agent.key
+
+[server]
+host = 0.0.0.0 # IP address of the server
+listen_host = 0.0.0.0
+listen_port = 8080
+white_host = 0.0.0.0 # IP address whitelist
+white_port = 8000 # Port number whitelist
+
+[agent]
+source_timer_interval = 10S # Agent data collection frequency
+sink_timer_interval = 10S # Agent data sending frequency
+channel_capacity = 1000 # Maximum length of the buffer queue
+db_host = 0.0.0.0 # IP address of the agent node
+db_port = 8080 # Port number of the agent node
+db_type = single # Agent node type. The value can be single (single node), cn (CN), or dn (DN).
+
+[forecast]
+forecast_alg = auto_arima # Time series prediction algorithm. The value can be auto_arima or fbprophet (you need to install it yourself).
+
+[log]
+log_dir = ./log # Log file location
+```
+
+**metric\_task.conf**: This configuration file contains three sections: **detector_method**, **os_exporter**, and **common_parameter**.
The parameters are described as follows: + +``` +[detector_method] +trend = os_exporter # Name of the table used for time series prediction +slow_sql = wdr # Name of the table for slow SQL diagnosis + +[os_exporter] +cpu_usage_minimum = 1 # Lower limit of CPU usage +cpu_usage_maximum = 10 # Upper limit of CPU usage +memory_usage_minimum = 1 # Lower limit of memory usage +memory_usage_maximum = 10 # Upper limit of memory usage +io_read_minimum = 1 +io_read_maximum = 10 +io_write_minimum = 1 +io_write_maximum = 10 +io_wait_minimum = 1 +io_wait_maximum = 10 +disk_space_minimum = 1 +disk_space_maximum = 10 + +[common_parameter] +data_period = 1000S # Length of historical data used for time series forecast. The value can be an integer plus the time unit (for example, 100S, 2M, and 10D). +interval = 20S # Monitoring interval +freq = 3S # Trend forecast frequency +period = 2 # Trend forecast period +``` + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> +> - The following time units are supported: +> - **'S'**: second +> - **'M'**: minute +> - **'H'**: hour +> - **'D'**: day +> - **'W'**: week +> - At least one of **minimum** and **maximum** must be provided. +> - **freq** and **period** determine the time series forecast result. For example, if **freq** is set to **2S** and **period** is set to **5**, the values of future 2s, 4s, 6s, 8s, and 10s will be forecasted. +> - Ensure that the training data length is greater than the forecasting length. Otherwise, the forecasting effect will be affected. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-3-adding-monitoring-parameters.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-3-adding-monitoring-parameters.md new file mode 100644 index 0000000000000000000000000000000000000000..c947be3d39e390852e0ce719b09a804089888692 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-3-adding-monitoring-parameters.md @@ -0,0 +1,40 @@ +--- +title: Adding Monitoring Parameters +summary: Adding Monitoring Parameters +author: Guo Huan +date: 2021-05-19 +--- + +# Adding Monitoring Parameters + +The tool performs trend prediction and threshold exception detection only for metrics in **os_exporter**. You can add new monitoring parameters. The procedure is as follows: + +1. Compile a function for obtaining metrics in **os_exporter** of **task/os_exporter.py**, and add the function to the output result list. For example: + + ``` + @staticmethod + def new_metric(): + return metric_value + + def output(self): + result = [self.cpu_usage(), self.io_wait(), self.io_read(), + self.io_write(), self.memory_usage(), self.disk_space(), self.new_metric()] + return result + + ``` + +2. In **os_exporter** of **table.json**, add the **new_metric** field to **CREATE table** and add the field type information to **INSERT**. For example: + + ``` + "os_exporter": { + "create_table": "create table os_exporter(timestamp bigint, cpu_usage text, io_wait text, io_read text, io_write text, memory_usage text, disk_space text, new_metric text);", + "insert": "insert into os_exporter values(%d, \"%s\", \"%s\", \"%s\", \"%s\", \"%s\", \"%s\", \"%s\");", + ``` + +3. Add the upper or lower limit of the metric to the **task/metric_task.conf** file. 
For example: + + ``` + [os_exporter] + new_metric_minimum = 0 + new_metric_maximum = 10 + ``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-4-obtaining-help-information.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-4-obtaining-help-information.md new file mode 100644 index 0000000000000000000000000000000000000000..e51719cd4ea2ad1f1eb136fb9d0c25fa1ec572be --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-4-obtaining-help-information.md @@ -0,0 +1,78 @@ +--- +title: Obtaining Help Information +summary: Obtaining Help Information +author: Guo Huan +date: 2021-05-19 +--- + +# Obtaining Help Information + +Before starting the tuning program, run the following command to obtain help information: + +``` +Source code mode: python main.py --help +``` + +The command output is as follows: + +``` +usage: + python main.py start [--role {{agent,collector,monitor}}] # start local service. + python main.py stop [--role {{agent,collector,monitor}}] # stop local service. + python main.py start [--user USER] [--host HOST] [--project-path PROJECT_PATH] [--role {{agent,collector,monitor}}] + # start the remote service. + python main.py stop [--user USER] [--host HOST] [--project-path PROJECT_PATH] [--role {{agent,collector, + monitor}}] # stop the remote service. + python main.py deploy [--user USER] [--host HOST] [--project-path PROJECT_PATH] # deploy project in remote host. + python main.py diagnosis [--query] [--start_time] [--finish_time] # rca for slow SQL. + python main.py show_metrics # display all monitored metrics(can only be executed on 'detector' machine). + python main.py forecast [--metric-name METRIC_NAME] [--period] [--freq] + [--forecast-method {{auto_arima, fbprophet}}] [--save-path SAVE_PATH] # forecast future trend of + metric(can only be executed on 'detector' machine). + +Anomaly-detection: a time series forecast and anomaly detection tool. + +positional arguments: + {start,stop,deploy,show_metrics,forecast,diagnosis} + +optional arguments: + -h, --help show this help message and exit + --user USER User of remote server. + --host HOST IP of remote server. + --project-path PROJECT_PATH + Project location in remote server. + --role {agent,collector,monitor} + Run as 'agent', 'collector', 'monitor'. Notes: ensure + the normal operation of the openGauss in agent. + --metric-name METRIC_NAME + Metric name to be predicted, you must provide an specified metric name. +. + --query QUERY target sql for RCA. + Currently, the join operator is not supported, and the accuracy of the result + is not guaranteed for SQL syntax containing "not null and". + --start_time START_TIME + start time of query + --finish_time FINISH_TIME + finish time of query + --period PERIOD Forecast periods of metric, it should be integernotes: + the specific value should be determined to the + trainnig data.if this parameter is not provided, the + default value '100S' will be used. + --freq FREQ forecast gap, time unit: S: Second, M: Minute, H: + Hour, D: Day, W: Week. + --forecast-method FORECAST_METHOD + Forecast method, default method is 'auto_arima',if + want to use 'fbprophet', you should install fbprophet + first. + --save-path SAVE_PATH + Save the results to this path using csv format, if + this parameter is not provided,, the result wil not be + saved. 
+ -v, --version show program's version number and exit + +epilog: + the 'a-detection.conf' and 'metric_task.conf' will be read when the program is running, + the location of them is: + a-detection.conf: /openGauss-server/src/gausskernel/dbmind/tools/anomaly_detection/a-detection.conf. + metric_config: /openGauss-server/src/gausskernel/dbmind/tools/anomaly_detection/task/metric_task.conf. +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-5-command-reference.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-5-command-reference.md new file mode 100644 index 0000000000000000000000000000000000000000..c177845c2e2551e88a7ae95d57e4c104b1ff7e00 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-5-command-reference.md @@ -0,0 +1,26 @@ +--- +title: Command Reference +summary: Command Reference +author: Guo Huan +date: 2021-05-19 +--- + +# Command Reference + +**Table 1** Command-line parameters + +| Parameter | Description | Value Range | +| :---------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | +| mode | Specified running mode | **start**, **stop**, **forecast**, **show_metrics**, **deploy**, and **diagnosis** | +| -user | Remote server user | - | +| -host | IP address of the remote server | - | +| -project-path | Path of the **anomaly_detection** project on the remote server | - | +| -role | Start role | **agent**, **collector**, and **monitor** | +| -metric-name | Metric name | - | +| -query | Root cause analysis (RCA) target query | | +| -start_time | Time when the query starts | | +| -finish_time | Time when the query ends | | +| -forecast-periods | Forecast period | Integer + Time unit, for example, '100S', '3H'.
The time unit can be **S** (second), **M** (minute), **H** (hour), **D** (day), and **W** (week). | +| -forecast-method | Forecasting method | **auto_arima** and **fbprophet** | +| -save-path | Path for storing the forecasting result | - | +| -version, -v | Current tool version | - | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-6-examples.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-6-examples.md new file mode 100644 index 0000000000000000000000000000000000000000..a639ae6ca93add92e772347d62705385eb1f9a1f --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-6-examples.md @@ -0,0 +1,102 @@ +--- +title: Examples +summary: Examples +author: Guo Huan +date: 2021-05-19 +--- + +# Examples + +To help users understand the deployment process, assume that the current database node information is as follows: + +``` +IP: 10.90.110.130 +PORT: 8000 +type: single +``` + +The detector server information is as follows: + +``` +IP: 10.90.110.131 +listen_host = 0.0.0.0 +listen_port = 8080 +``` + +The deployment startup process is as follows. + +## Modifying the Configuration File + +Modify the **a-detection.conf** configuration file. The following two sessions are involved: + +``` +[database] +storage_duration = 12H # Data storage duration. The default value is 12 hours. +database_dir = ./data # Data storage directory + +[security] +tls = False +ca = ./certificate/ca/ca.crt +server_cert = ./certificate/server/server.crt +server_key = ./certificate/server/server.key +agent_cert = ./certificate/agent/agent.crt +agent_key = ./certificate/agent/agent.key + +[server] +host = 10.90.110.131 +listen_host = 0.0.0.0 +listen_port = 8080 +white_host = 10.90.110.130 +white_port = 8000 +[agent] +source_timer_interval = 10S +sink_timer_interval = 10S +channel_capacity = 1000 +db_host = 10.90.110.130 +db_port = 8080 +db_type = single + +[forecast] +forecast_alg = auto_arima + +[log] +log_dir = ./log +``` + +## Starting and Stopping Services + +Start the local agent service. + +``` +python main.py start --role agent +``` + +Stop the local agent service. + +``` +python main.py stop --role agent +``` + +Start the local collector service. + +``` +python main.py start --role collector +``` + +Stop the local collector service. + +``` +python main.py stop --role collector +``` + +Start the local monitor service. + +``` +python main.py start --role monitor +``` + +Stop the local monitor service. + +``` +python main.py stop --role monitor +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-7-ai-server.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-7-ai-server.md new file mode 100644 index 0000000000000000000000000000000000000000..80c282eb4411a2f3cf55eacbc54ffd2cbffbf10b --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-7-ai-server.md @@ -0,0 +1,72 @@ +--- +title: AI_SERVER +summary: AI_SERVER +author: Guo Huan +date: 2021-10-21 +--- + +# AI_SERVER + +AI_SERVER is an independent from the anomaly_detection feature. In addition to the data collection function of the anomaly_detection feature, the collection type, collection item, and data storage mode are added to the AI_SERVER feature. The AI_SERVER feature is used only for data collection and will be integrated into the anomaly_detection feature in the future. 
This feature includes the server and agent components. The agent must be deployed on the database node for data collection, while the server is deployed on an independent node for data collection and storage.
+
+The data storage modes include **sqlite**, **mongodb**, and **influxdb**.
+
+Table 1 describes the collection items.
+
+**Table 1** Collection items
+
| Collection Type | Collection Item | Description |
| :-------------- | :------------------ | :----------------------------------------------------------- |
| database | work_mem | GUC parameter related to the database memory. This parameter is used to check whether the allocated space is sufficient for SQL statements involving sorting tasks. |
| | shared_buffers | GUC parameter related to the database memory. Improper setting of **shared_buffers** will deteriorate the database performance. |
| | max_connections | Maximum number of database connections. |
| | current connections | Number of current database connections. |
| | qps | Database performance metrics. |
| os | cpu usage | CPU usage. |
| | memory usage | Memory usage. |
| | io wait | I/O wait event. |
| | io write | Data disk write throughput. |
| | io read | Data disk read throughput. |
| | disk used | Size of the used disk space. |
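
The database-related items in Table 1 can also be checked manually on the agent node, which is useful when validating the values that AI_SERVER reports. The following sketch assumes a local instance listening on port 26000 and that gsql is in PATH; adjust the database name and port to your environment:

```bash
# Spot-check a few "database" collection items from Table 1.
gsql -d postgres -p 26000 -c "show work_mem;"
gsql -d postgres -p 26000 -c "show shared_buffers;"
gsql -d postgres -p 26000 -c "select count(*) as current_connections from pg_stat_activity;"
```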
+
+For details about the deployment mode, see [AI_MANAGER](5-8-ai-manager).
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-8-ai-manager.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-8-ai-manager.md new file mode 100644 index 0000000000000000000000000000000000000000..654fc54fd27157d351c974e11411e03c4b84e225 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/5-a-detection-status-monitoring/5-8-ai-manager.md @@ -0,0 +1,82 @@
+---
+title: AI_MANAGER
+summary: AI_MANAGER
+author: Guo Huan
+date: 2021-10-22
+---
+
+# AI_MANAGER
+
+AI_MANAGER is an AI feature deployment tool. It aims to provide automatic, efficient, and convenient deployment and uninstallation of AI features. You can specify the module name, operation type, and parameter file to automatically deploy and uninstall AI features, implementing version management, operation logging, log management, and installation information recording. In addition, it supports feature-level horizontal expansion. Currently, this tool supports only the installation and uninstallation of AI_SERVER.
+
+## Preparations
+
+- The project deployment path is **/dbs/AI-tools**. Ensure that the path exists and that you have the read, write, and execute permissions on it. The content in the path will be cleared during installation or uninstallation. Do not save other files in this path.
+- You need to install the Python3 environment and the Python libraries required by the feature. For details about the dependent libraries, see the **requirements.txt** file in the package.
+- If HTTPS is enabled, you need to prepare the corresponding root certificate, key file, and password.
+- The MogDB database has been started on the agent node.
+- You need to perform the installation on the agent node as a cluster user.
+- If the **~/.bashrc** file of the cluster user on the agent node does not contain the correct **PGHOST** configuration, configure **PGHOST** in the **/dbs/AI-tools/ai_env** file.
+
+## Examples
+
+An example of the installation command is as follows:
+
+```bash
+python3 ai_manager --module anomaly_detection --action install --param-file opengauss.json
+```
+
+An example of the uninstallation command is as follows:
+
+```bash
+python3 ai_manager --module anomaly_detection --action uninstall --param-file opengauss.json
+```
+
+The following is an example of the parameter file:
+
+```json
+{
+  "scene": "mogdb", # Scenario. Both the server and agent modules are installed for mogdb, and only the server module is installed for HUAWEI CLOUD.
+  "module": "anomaly_detection", # Module (feature) name. Currently, only anomaly_detection is supported.
+  "action": "install", # Operation type. The value can be install or uninstall.
+ "ca_info": { + "ca_cert_path": "/home/Ruby/CA_AI/ca.crt", # Path of the root certificate + "ca_key_path": "/home/Ruby/CA_AI/ca.crt.key", # Path of the root certificate key + "ca_password": "GHJAyusa241~" # Root certificate password + }, + "agent_nodes": [ + { + "node_ip": "10.000.00.000", # IP address of the agent node + "username": "Ruby", # User of the agent node + "password": "password" # Password for logging in to the agent node + } + ], + "config_info": { + "server": { + "host": "10.000.00.000", # IP address of the node where server is deployed (execution node) + "listen_host": "0.0.0.0", # Server listening IP address + "listen_port": "20060", # Server listening port number + "pull_kafka": "False" # Specifies whether to pull Kafka data. Currently, data cannot be pulled. + }, + "database": { + "name": "sqlite", # Data storage mode. sqlite, mongodb, and influxdb are supported. + "host": "127.0.0.1", # Database IP address + "port": "2937", # Database port number + "user": "Ruby", # Database user + "size": "175000000", # Maximum storage capacity in mongodb mode + "max_rows": "1000000" # Maximum number of stored records in mongodb mode + }, + "agent": { + "cluster_name": "my_cluster", # Name of the collection database + "collection_type": "os", # Collection type. The value can be os, database, or all. + "collection_item": [["dn", "10.000.00.000", "33700"]], # Type of data collected by the agent node (DN/CN), IP address of the collection node, and port number + "channel_capacity": "1000", # Queue capacity + "source_timer_interval": "5S", # Collection Interval + "sink_timer_interval": "5S" # Sending interval + }, + "security": { + "tls": "True" # Specifies whether to enable HTTPS. + } + } +} +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/6-index-advisor-index-recommendation/6-1-single-query-index-recommendation.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/6-index-advisor-index-recommendation/6-1-single-query-index-recommendation.md new file mode 100644 index 0000000000000000000000000000000000000000..ea47de36d76b6a67f064224df74ff0668de2e638 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/6-index-advisor-index-recommendation/6-1-single-query-index-recommendation.md @@ -0,0 +1,59 @@ +--- +title: Single-query Index Recommendation +summary: Single-query Index Recommendation +author: Guo Huan +date: 2021-05-19 +--- + +# Single-query Index Recommendation + +The single-query index recommendation function allows users to directly perform operations in the database. This function generates recommended indexes for a single query statement entered by users based on the semantic information of the query statement and the statistics of the database. This function involves the following interfaces: + +**Table 1** Single-query index recommendation interfaces + +| Function Name | Parameter | Description | +| :-------------- | :------------------- | :----------------------------------------------------------- | +| gs_index_advise | SQL statement string | Generates a recommendation index for a single query statement. | + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> +> - This function supports only a single SELECT statement and does not support other types of SQL statements. +> - Partitioned tables, column-store tables, segment-paged tables, common views, materialized views, global temporary tables, and encrypted databases are not supported. 
+ +## Application Scenarios + +Use the preceding function to obtain the recommendation index generated for the query. The recommendation result consists of the table name and column name of the index. + +For example: + +```sql +mogdb=> select "table", "column" from gs_index_advise('SELECT c_discount from bmsql_customer where c_w_id = 10'); + table | column +----------------+---------- + bmsql_customer | (c_w_id) +(1 row) +``` + +The preceding information indicates that an index should be created on the **c_w_id** column of the **bmsql_customer** table. You can run the following SQL statement to create an index: + +```sql +CREATE INDEX idx on bmsql_customer(c_w_id); +``` + +Some SQL statements may also be recommended to create a join index, for example: + +```sql +mogdb=# select "table", "column" from gs_index_advise('select name, age, sex from t1 where age >= 18 and age < 35 and sex = ''f'';'); + table | column +-------+------------ + t1 | (age, sex) +(1 row) +``` + +The preceding statement indicates that a join index **(age, sex)** needs to be created in the **t1** table. You can run the following command to create a join index: + +```sql +CREATE INDEX idx1 on t1(age, sex); +``` + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** Parameters of the system function **gs_index_advise()** are of the text type. If the parameters contain special characters such as single quotation marks ('), you can use single quotation marks (') to escape the special characters. For details, see the preceding example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/6-index-advisor-index-recommendation/6-2-virtual-index.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/6-index-advisor-index-recommendation/6-2-virtual-index.md new file mode 100644 index 0000000000000000000000000000000000000000..4d6cd4168804fde4c3b17cedc26d968087243ba0 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/6-index-advisor-index-recommendation/6-2-virtual-index.md @@ -0,0 +1,123 @@ +--- +title: Virtual Index +summary: Virtual Index +author: Guo Huan +date: 2021-05-19 +--- + +# Virtual Index + +The virtual index function allows users to directly perform operations in the database. This function simulates the creation of a real index to avoid the time and space overhead required for creating a real index. Based on the virtual index, users can evaluate the impact of the index on the specified query statement by using the optimizer. + +This function involves the following interfaces: + +**Table 1** Virtual index function interfaces + +| Function Name | Parameter | Description | +| :------------------- | :------------------------------------------------------ | :----------------------------------------------------------- | +| hypopg_create_index | Character string of the statement for creating an index | Creates a virtual index. | +| hypopg_display_index | None | Displays information about all created virtual indexes. | +| hypopg_drop_index | OID of the index | Deletes a specified virtual index. | +| hypopg_reset_index | None | Clears all virtual indexes. | +| hypopg_estimate_size | OID of the index | Estimates the space required for creating a specified index. 
| + +This function involves the following GUC parameters: + +**Table 2** GUC parameters of the virtual index function + +| Parameter | Description | Default Value | +| :---------------- | :-------------------------------------------- | :------------ | +| enable_hypo_index | Whether to enable the virtual index function. | off | + +## Procedure + +1. Use the **hypopg_create_index** function to create a virtual index. For example: + + ```sql + mogdb=> select * from hypopg_create_index('create index on bmsql_customer(c_w_id)'); + indexrelid | indexname + ------------+------------------------------------- + 329726 | <329726>btree_bmsql_customer_c_w_id + (1 row) + ``` + +2. Enable the GUC parameter **enable_hypo_index**. This parameter controls whether the database optimizer considers the created virtual index when executing the EXPLAIN statement. By executing EXPLAIN on a specific query statement, you can evaluate whether the index can improve the execution efficiency of the query statement based on the execution plan provided by the optimizer. For example: + + ```sql + mogdb=> set enable_hypo_index = on; + SET + ``` + + Before enabling the GUC parameter, run **EXPLAIN** and the query statement. + + ```sql + mogdb=> explain SELECT c_discount from bmsql_customer where c_w_id = 10; + QUERY PLAN + ---------------------------------------------------------------------- + Seq Scan on bmsql_customer (cost=0.00..52963.06 rows=31224 width=4) + Filter: (c_w_id = 10) + (2 rows) + ``` + + After enabling the GUC parameter, run **EXPLAIN** and the query statement. + + ```sql + mogdb=> explain SELECT c_discount from bmsql_customer where c_w_id = 10; + QUERY PLAN + ------------------------------------------------------------------------------------------------------------------ + [Bypass] + Index Scan using <329726>btree_bmsql_customer_c_w_id on bmsql_customer (cost=0.00..39678.69 rows=31224 width=4) + Index Cond: (c_w_id = 10) + (3 rows) + ``` + + By comparing the two execution plans, you can find that the index may reduce the execution cost of the specified query statement. Then, you can consider creating a real index. + +3. (Optional) Use the **hypopg_display_index** function to display all created virtual indexes. For example: + + ```sql + mogdb=> select * from hypopg_display_index(); + indexname | indexrelid | table | column + --------------------------------------------+------------+----------------+------------------ + <329726>btree_bmsql_customer_c_w_id | 329726 | bmsql_customer | (c_w_id) + <329729>btree_bmsql_customer_c_d_id_c_w_id | 329729 | bmsql_customer | (c_d_id, c_w_id) + (2 rows) + ``` + +4. (Optional) Use the **hypopg_estimate_size** function to estimate the space (in bytes) required for creating a virtual index. For example: + + ```sql + mogdb=> select * from hypopg_estimate_size(329730); + hypopg_estimate_size + ---------------------- + 15687680 + (1 row) + ``` + +5. Delete the virtual index. + + Use the **hypopg_drop_index** function to delete the virtual index of a specified OID. For example: + + ```sql + mogdb=> select * from hypopg_drop_index(329726); + hypopg_drop_index + ------------------- + t + (1 row) + ``` + + Use the **hypopg_reset_index** function to clear all created virtual indexes at a time. 
For example: + + ```sql + mogdb=> select * from hypopg_reset_index(); + hypopg_reset_index + -------------------- + + (1 row) + ``` + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> +> - Running **EXPLAIN ANALYZE** does not involve the virtual index function. +> - The created virtual index is at the database instance level and can be shared by sessions. After a session is closed, the virtual index still exists. However, the virtual index will be cleared after the database is restarted. +> - This function does not support common views, materialized views, and column-store tables. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/6-index-advisor-index-recommendation/6-3-workload-level-index-recommendation.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/6-index-advisor-index-recommendation/6-3-workload-level-index-recommendation.md new file mode 100644 index 0000000000000000000000000000000000000000..409d1c1643699a00f5cef4b611b98d4d8c5e07cd --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/6-index-advisor-index-recommendation/6-3-workload-level-index-recommendation.md @@ -0,0 +1,105 @@ +--- +title: Workload-level Index Recommendation +summary: Workload-level Index Recommendation +author: Guo Huan +date: 2021-05-19 +--- + +# Workload-level Index Recommendation + +For workload-level indexes, you can run scripts outside the database to use this function. This function uses the workload of multiple DML statements as the input to generate a batch of indexes that can optimize the overall workload execution performance. In addition, it provides the function of extracting service data SQL statements from logs. + +## Prerequisites + +- The database is normal, and the client can be connected properly. + +- The **gsql** tool has been installed by the current user, and the tool path has been added to the **PATH** environment variable. + +- The Python 3.6+ environment is available. + +- To use the service data extraction function, you need to set the GUC parameters of the node whose data is to be collected as follows: + + - log_min_duration_statement = 0 + + - log_statement= 'all' + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** After service data extraction is complete, you are advised to restore the preceding GUC parameters. Otherwise, log files may be expanded. + +## Procedure for Using the Service Data Extraction Script + +1. Set the GUC parameters according to instructions in the prerequisites. + +2. Run the Python script **extract_log.py**: + + ``` + ython extract_log.py [l LOG_DIRECTORY] [f OUTPUT_FILE] [-d DATABASE] [-U USERNAME][--start_time] [--sql_amount] [--statement] [--json] + ``` + + The input parameters are as follows: + + - **LOG_DIRECTORY**: directory for storing **pg_log**. + - **OUTPUT_PATH**: path for storing the output SQL statements, that is, path for storing the extracted service data. + - **DATABASE** (optional): database name. If this parameter is not specified, all databases are selected by default. + - **USERNAME** (optional): username. If this parameter is not specified, all users are selected by default. + - **start_time** (optional): start time for log collection. If this parameter is not specified, all files are collected by default. + - **sql_amount** (optional): maximum number of SQL statements to be collected. If this parameter is not specified, all SQL statements are collected by default. 
+ - **statement** (optional): Collects the SQL statements starting with **statement** in **pg_log log**. If this parameter is not specified, the SQL statements are not collected by default. + - **json**: Specifies that the collected log files are stored in JSON format after SQL normalization. If the default format is not specified, each SQL statement occupies a line. + + An example is provided as follows. + + ``` + python extract_log.py $GAUSSLOG/pg_log/dn_6001 sql_log.txt -d postgres -U omm --start_time '2021-07-06 00:00:00' --statement + ``` + +3. Change the GUC parameter values set in step 1 to the values before the setting. + +## Procedure for Using the Index Recommendation Script + +1. Prepare a file that contains multiple DML statements as the input workload. Each statement in the file occupies a line. You can obtain historical service statements from the offline logs of the database. + +2. Run the Python script **index_advisor_workload.py**: + + ``` + python index_advisor_workload.py [p PORT] [d DATABASE] [f FILE] [--h HOST] [-U USERNAME] [-W PASSWORD][--schema SCHEMA][--max_index_num MAX_INDEX_NUM][--max_index_storage MAX_INDEX_STORAGE] [--multi_iter_mode] [--multi_node] [--json] [--driver] [--show_detail] + ``` + + The input parameters are as follows: + + - **PORT**: port number of the connected database. + - **DATABASE**: name of the connected database. + - **FILE**: file path that contains the workload statement. + - **HOST** (optional): ID of the host that connects to the database. + - **USERNAME** (optional): username for connecting to the database. + - **PASSWORD** (optional): password for connecting to the database. + - **SCHEMA**: schema name. + - **MAX_INDEX_NUM** (optional): maximum number of recommended indexes. + - **MAX_INDEX_STORAGE** (optional): maximum size of the index set space. + - **multi_node** (optional): specifies whether the current instance is a distributed database instance. + - **multi_iter_mode** (optional): algorithm mode. You can switch the algorithm mode by setting this parameter. + - **json** (optional): Specifies the file path format of the workload statement as JSON after SQL normalization. By default, each SQL statement occupies one line. + - **driver** (optional): Specifies whether to use the Python driver to connect to the database. By default, **gsql** is used for the connection. + - **show_detail** (optional): Specifies whether to display the detailed optimization information about the current recommended index set. + + Example: + + ``` + python index_advisor_workload.py 6001 postgres tpcc_log.txt --schema public --max_index_num 10 --multi_iter_mode + ``` + + The recommendation result is a batch of indexes, which are displayed on the screen in the format of multiple create index statements. The following is an example of the result. 
+ + ```sql + create index ind0 on public.bmsql_stock(s_i_id,s_w_id); + create index ind1 on public.bmsql_customer(c_w_id,c_id,c_d_id); + create index ind2 on public.bmsql_order_line(ol_w_id,ol_o_id,ol_d_id); + create index ind3 on public.bmsql_item(i_id); + create index ind4 on public.bmsql_oorder(o_w_id,o_id,o_d_id); + create index ind5 on public.bmsql_new_order(no_w_id,no_d_id,no_o_id); + create index ind6 on public.bmsql_customer(c_w_id,c_d_id,c_last,c_first); + create index ind7 on public.bmsql_new_order(no_w_id); + create index ind8 on public.bmsql_oorder(o_w_id,o_c_id,o_d_id); + create index ind9 on public.bmsql_district(d_w_id); + ``` + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The value of the **multi_node** parameter must be specified based on the current database architecture. Otherwise, the recommendation result is incomplete, or even no recommendation result is generated. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/7-deepsql/7-1-overview.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/7-deepsql/7-1-overview.md new file mode 100644 index 0000000000000000000000000000000000000000..8fafac892f6e80088328cd9b5d454ebdf7664cd0 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/7-deepsql/7-1-overview.md @@ -0,0 +1,108 @@ +--- +title: Overview +summary: Overview +author: Guo Huan +date: 2021-05-19 +--- + +# Overview + +The DeepSQL feature is compatible with the MADLib framework and can implement AI algorithms in the database. A complete set of SQL-based machine learning, data mining, and statistics algorithms is provided. Users can directly use SQL statements to perform machine learning. Deep SQL can abstract the end-to-end R&D process from data to models. With the bottom-layer engine and automatic optimization, technical personnel with basic SQL knowledge can complete most machine learning model training and prediction tasks. The entire analysis and processing are running in the database engine. Users can directly analyze and process data in the database without transferring data between the database and other platforms. This avoids unnecessary data movement between multiple environments. + +DeepSQL is an enhancement to MogDB DB4AI, allowing data analysts or developers who are familiar with MADLib to easily migrate data to MogDB. DeepSQL encapsulates common machine learning algorithms into SQL statements and supports more than 60 general algorithms, including regression algorithms (such as linear regression, logistic regression, and random forest), classification algorithms (such as KNN), and clustering algorithms (such as K-means). In addition to basic machine learning algorithms, graph-related algorithms are also included, such as algorithms about the shortest path and graph diameter. Also, it supports data processing (such as PCA), sparse vectors, common statistical algorithms (such as covariance and Pearson coefficient calculation), training set and test set segmentation, and cross validation. + +**Table 1** Supported machine learning algorithms: regression algorithms + +| Algorithm Name | Abbreviation | Application Scenario | +| :---------------------------------- | :----------- | :----------------------------------------------------------- | +| Logistic regression | - | For example, find the risk factors of a disease, or evaluate enterprises for financial and commercial institutions.
Prediction: Use a model to predict the occurrence probabilities of a disease or situation under different independent variables.
Judgment: Use a model to determine the probability that a person has certain diseases or be in certain situations. | +| Cox proportional hazards regression | - | The model takes the survival result and the survival time as dependent variables, can analyze the influence of many factors on the survival time simultaneously, and can analyze the data with the truncated survival time, without the need of estimating the distribution type of the data. Because of the preceding excellent properties, this model has been widely used in medical research since its inception, and is the most widely used multi-factor analysis method in survival analysis. | +| Elastic net regularization | - | Elastic regression is a hybrid technique of ridge regression and lasso regression, which uses L2 and L1 regularization at the same time. When there are multiple related features, the lasso regression is likely to randomly select one of them, while the elastic regression is likely to select all of them. | +| Generalized linear model | - | In some practical problems, the relationship between variables is not always linear. In this case, curves should be used for fitting. | +| Marginal effect | - | Calculation of marginal effects. | +| Multinomial regression | - | If there are more than two target categories, multinomial regression is required. For example, evaluate the curative effect with "ineffective", "effective", and "cured". | +| Ordinal regression | - | In statistics, ordinal regression is a regression analysis used to predict ordinal variables. That is, the values of variables are within any range, and the metric distances between different values are different. It can be considered as an issue between regression and classification. Examples include the severity of illness (levels 1, 2, 3, and 4), the pain scale (no pain, mild, moderate, and severe), and the drug dose-response effects (ineffective, less effective, effective, and very effective). The differences between levels are not necessarily equal, for example, the difference between no pain and mild is not necessarily equal to the difference between moderate and severe. | +| Clustered variance | - | The clustered variance module adjusts the standard error of clustering. For example, when a dataset is copied 100 times, precision of parameter estimation should not be increased, but execution of this process in compliance with an independent identically distributed (IID) assumption actually improves precision. | +| Robust variance | - | The functions in the robust variance module are used to compute the robust variance (Huber-White estimator) of linear regression, logistic regression, multinomial logistic regression, and Cox proportional hazard regression (Huber-White estimation). They can be used to compute differences of data in datasets with potential anomalous noises. | +| Support vector machine | SVM | Compared with traditional query optimization schemes, SVM can obtain higher query accuracy for text and hypertext classification and image classification. This also applies to image segmentation systems. | +| Linear regression | - | This is widely used in economics and finance. | + +**Table 2** Supported machine learning algorithms: other supervised learning + +| Algorithm Name | Abbreviation | Application Scenario | +| :----------------------- | :----------- | :----------------------------------------------------------- | +| Decision tree | - | It is one of the most widely used inductive inference algorithms. 
It handles the classification and prediction problems of category or continuous variables. The model can be represented by graphs and if-then rules, which is highly readable. | +| Random forest | RF | Random forest is a kind of combinatorial method specially designed for decision tree classifier. It combines multiple decision trees to make predictions. | +| Conditional random field | CRF | CRF is a discriminant graphic model with undirected probability. A linear chain CRF is a special type of CRF that assumes that the current state depends only on the previous state. Good results have been obtained in sequence annotation tasks such as word segmentation, part-of-speech tagging, and named entity recognition. | +| Naive Bayes | - | Classification by calculating probabilities can be used to deal with multi-classification issues, such as spam filters. | +| Neural network | - | It has a wide range of application scenarios, such as speech recognition, image recognition, and machine translation. It is a standard supervised learning algorithm in the domain of pattern recognition, and continues to be a research subject in the domain of computational neurology. MLP has been proved to be a general function approximation method, which can be used to fit complex functions or solve classification problems. | +| k-nearest neighbors | - | In the k-nearest neighbor classification method, a distance between each training sample and a to-be-classified sample is computed, and **K** training samples that are closest to the to-be-classified sample are selected. If training samples of a category in the **K** samples accounts for a majority, the to-be-classified tuple belongs to the category.
It can be used for text recognition, facial recognition, gene pattern recognition, customer churn prediction, and fraud detection. | + +**Table 3** Supported machine learning algorithms: data processing algorithms + +| Algorithm Name | Abbreviation | Application Scenario | +| :-------------------------------------------------------- | :----------- | :----------------------------------------------------------- | +| Array operation | - | Array and vector operations, including basic addition, subtraction, multiplication, and division, exponentiation, root extraction, cos, sin, absolute value, and variance. | +| Principal component analysis for dimensionality reduction | PCA | This is used to reduce dimensions and compute the principal component. | +| Encoding categorical variable | - | Currently, the one-hot and dummy encoding technologies are supported.
When a specific group of prediction variables need to be compared with another group of prediction variables, dummy coding is usually used, and a group of variables compared with the group of prediction variables is referred to as a reference group. One-hot encoding is similar to dummy encoding, and a difference lies in that the one-hot encoding establishes a numeric type 0/1 indication column for each classification value. In each row of data (corresponding to one data point), a value of only one classification code column can be **1**. | +| Matrix operation | - | Using matrix decomposition to decompose a large matrix into the product form of a simple matrix can greatly reduce the difficulty and volume of computation.
Matrix addition, subtraction, multiplication, and division, extremum, mean, rank calculation, inversion, matrix decomposition (QR, LU, Cholesky), and feature extraction. | +| Norms and distance functions | - | This is used to compute the norm, cosine similarity, and distance between vectors. | +| Sparse vector | - | This is used to implement the sparse vector type. If there are a large number of repeated values in the vector, the vector can be compressed to save space. | +| Pivot | - | Pivot tables are used to meet common row and column transposition requirements in OLAP or report systems. The pivot function can perform basic row-to-column conversion on data stored in a table and output the aggregation result to another table. It makes row and column conversion easier and more flexible. | +| Path | - | It performs regular pattern matching on a series of rows and extracts useful information about pattern matching. The useful information can be a simple match count or something more involved, such as an aggregate or window function. | +| Sessionize | - | The sessionize function performs time-oriented session rebuilding on a dataset that includes an event sequence. The defined inactive period indicates the end of a session and the start of the next session.
It can be used for network analysis, network security, manufacturing, finance, and operation analysis. | +| Conjugate gradient | - | A method for solving numerical solutions of linear equations whose coefficient matrices are symmetric positive definite matrices. | +| Stemming | - | Stemming is simply to find the stem of a word. It can be used to, for example, establish a topic-focused search engine.
The optimization effect is obvious on English websites, which can be a reference for websites in other languages. | +| Train-Test Split | - | It is used to split a dataset into a training set and a test set. The train set is used for training, and the test set is used for verification. | +| Cross validation | - | It is used to perform cross validation. | +| Prediction metric | - | It is used to evaluate the quality of model prediction, including the mean square error, AUC value, confusion matrix, and adjusted R-square. | +| Mini-batch preprocessor | - | It is used to pack the data into small parts for training. The advantage is that the performance is better than that of the stochastic gradient descent (the default MADlib optimizer), and the convergence is faster and smoother. | + +**Table 4** Supported machine learning algorithms: graph + +| Algorithm Name | Abbreviation | Application Scenario | +| :----------------------------- | :----------- | :----------------------------------------------------------- | +| All pairs shortest path | APSP | APSP finds the length (summed weight) of the shortest path between all pairs of vertices to minimize the sum of the path edge weights. | +| Breadth-first search | - | This algorithm traverses paths. | +| Hyperlink-induced topic search | HITS | HITS outputs the authority score and hub score of each vertex, where authority estimates the value of the content of the page and hub estimates the value of its links to other pages. | +| Average path length | - | This function computes the average value of the shortest paths between each pair of vertices. The average path length is based on the "reachable target vertexes", so it ignores infinite-length paths between unconnected vertices. | +| Closeness centrality | - | The closeness measures are the inverse of the sum, the inverse of the average, and the sum of inverses of the shortest distances to all reachable target vertices (excluding the source vertex). | +| Chart diameter | - | The diameter is defined as the longest of all the shortest paths in the graph. | +| In-Out degree | - | This algorithm computes the in-degree and out-degree of each node. The node in-degree is the number of edges pointing in to the node and node out-degree is the number of edges pointing out of the node. | +| PageRank | - | Given a graph, the PageRank algorithm outputs a probability distribution representing the likelihood that a person randomly traversing the graph will arrive at any particular vertex. | +| Single source shortest path | SSSP | Given a graph and a source vertex, the SSSP algorithm finds a path from the source vertex to every other vertex in the graph to minimize the sum of the weights of the path edges. | +| Weakly connected component | - | Given a directed graph, the WCC is a subgraph of the original graph, where all vertices are connected to each other through a path, ignoring the direction of the edges. In the case of an undirected graph, the WCC is also a strongly connected component. This module also includes many auxiliary functions that run on the WCC output. | + +**Table 5** Supported machine learning algorithms: time series + +| Algorithm Name | Abbreviation | Application Scenario | +| :--------------------------------------------- | :----------- | :----------------------------------------------------------- | +| Autoregressive integrated moving average model | ARIMA | Time series forecasting, which is used to understand and forecast future values in the series.
For example, international air traveler data can be used to forecast the number of passengers. | + +**Table 6** Supported machine learning algorithms - sampling + +| Algorithm Name | Abbreviation | Application Scenario | +| :------------------ | :----------- | :----------------------------------------------------------- | +| Sample | - | Sampling. | +| Stratified sampling | - | Stratified random sampling, also known as type random sampling, is used to divide the overall units into various types (or layers) according to a certain standard. Then, according to the ratio of the number of units of each type to the total number of units, the number of units extracted from each type is determined. Finally, samples are extracted from each type according to the random principle. | +| Balanced sampling | - | Some classification algorithms only perform optimally when the number of samples in each class is roughly the same. Highly skewed datasets are common in many domains (such as fraud detection), so resampling to offset this imbalance can produce better decision boundaries. | + +**Table 7** Supported machine learning algorithms: statistics + +| Algorithm Name | Abbreviation | Application Scenario | +| :------------------------------- | :----------- | :----------------------------------------------------------- | +| Summary | - | This algorithm generates summary statistics for any data table. | +| Correlation and covariance | - | Descriptive statistics, one of which computes the Pearson coefficient and correlation coefficient, and the other outputs covariance. This help us understand the characteristics of the data amount that is statistically reflected so that we can better understand the data to be mined. | +| CountMin (Cormode-Muthukrishnan) | - | This algorithm counts the occurrence frequency of an element in a real-time data stream, and is ready to response to the occurrence frequency of an element at any time. No accurate counting is required. | +| Flajolet-Martin | FM | This algorithm is used to obtain the number of different values in a specified column, and find the number of unique numbers in the number set. | +| Most frequent values | MFV | This algorithm is used to compute frequent values. | +| Hypothesis test | - | This algorithm includes F-test and chi2-test. | +| Probability functions | - | The probability functions module provides cumulative distribution, density/mass, and quantile functions for various probability distributions. | + +**Table 8** Supported machine learning algorithms: other algorithms + +| Algorithm Name | Abbreviation | Application Scenario | +| :-------------------------- | :----------- | :----------------------------------------------------------- | +| k-means clustering | - | This algorithm is used in the clustering scenario. | +| Latent Dirichlet allocation | LDA | LDA plays an important role in the topic model and is often used for text classification. | +| Apriori algorithm | - | Apriori algorithm is used to discover the association between data item sets, such as the typical "beer and diaper" association. 
| diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/7-deepsql/7-2-environment-deployment.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/7-deepsql/7-2-environment-deployment.md new file mode 100644 index 0000000000000000000000000000000000000000..bc3bc927cf35efb7759ecf9a8420b289d17c08c3 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/7-deepsql/7-2-environment-deployment.md @@ -0,0 +1,115 @@ +--- +title: Environment Deployment +summary: Environment Deployment +author: Guo Huan +date: 2021-05-19 +--- + +# Environment Deployment + +The DeepSQL environment consists of two parts: compiling the database and installing the algorithm library. + +## Prerequisites + +- Python 2.7.12 or later has been installed. +- The database supports the PL/Python stored procedure. +- You have the administrator permission to install the algorithm library. + +## Procedure + +1. Check the Python deployment environment. + + Before the installation, check the Python version installed in the system. Currently, DeepSQL requires Python 2.7.12 or later. + + - If the Python version is later than 2.7.12, you can directly install the **python-devel** package. + - If the version is too early or the **python-devel** package cannot be installed, download the latest Python 2 source code, manually configure and compile Python 2, and configure environment variables. + + In the algorithm library, some algorithms, such as NumPy and pandas, call Python packages. You can install the following Python libraries: + + ``` + pip install numpy + pip install pandas + pip install scipy + ``` + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:** + > + > - If you compile Python by yourself, you need to add the **-enable-shared** option when executing the **configure** script. + > - If Python 2 in the system uses the UCS4 code, you need to add the **-enable-unicode=ucs4** option when compiling Python 2. You can run the **import sys; print sys.maxunicode** command in the built-in Python 2 and view the result. If the result is **65535**, the system uses the UCS2 code by default. If the result is **1114111**, the system uses the UCS4 code. + > - If the built-in Python 2 uses UCS4, the gdb and gstack in the system also depend on UCS4. Therefore, when configuring Python 2, you need to add **-enable-unicode=ucs4**. Otherwise, an error is reported when gdb or gstack is used. + +2. Compile and deploy the database. + + The database supports the PL/Python stored procedure. The database is compiled by default, and this module is not included. Therefore, you need to add the **-with-python** option in the **configure** phase when compiling the database. + + Other compilation steps remain unchanged. + + After the compilation is complete, run the **gs_initdb** command again. + + By default, the PL/Python stored procedure module is not loaded. Run **CREATE EXTENSION plpythonu** to load the module. + +3. Compile and install the algorithm library. + + The algorithm library uses the open-source MADlib machine learning framework. The source code package and patch can be obtained from the code repository of the third-party library. 
The installation commands are as follows: + + ```bash + tar -zxf apache-madlib-1.17.0-src.tar.gz + cp madlib.patch apache-madlib-1.17.0-src + cd apache-madlib-1.17.0-src/ + patch -p1 < madlib.patch + ``` + + The compilation commands are as follows: + + ```bash + ./configure -DCMAKE_INSTALL_PREFIX={YOUR_MADLIB_INSTALL_FOLDER} + -DPOSTGRESQL_EXECUTABLE=$GAUSSHOME/bin/ + -DPOSTGRESQL_9_2_EXECUTABLE=$GAUSSHOME/bin/ + -DPOSTGRESQL_9_2_CLIENT_INCLUDE_DIR=$GAUSSHOME/bin/ + -DPOSTGRESQL_9_2_SERVER_INCLUDE_DIR=$GAUSSHOME/bin/ + # The preceding commands are all configure commands. + make + make install + ``` + + Replace **{YOUR_MADLIB_INSTALL_FOLDER}** with the actual installation path. + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:** When MADlib is compiled, the dependency software is downloaded online. If the network cannot be connected, you need to manually download the dependency packages **PyXB-1.2.6.tar.gz**, **eigen-branches-3.2.tar.gz**, and **boost_1_61_0.tar.gz** to the local host. The **configure** commands are as follows: + > + > ```bash + > ./configure -DCMAKE_INSTALL_PREFIX={YOUR_MADLIB_INSTALL_FOLDER} # your install folder + > -DPYXB_TAR_SOURCE={YOUR_DEPENDENCY_FOLDER}/PyXB-1.2.6.tar.gz # change to your local folder + > -DEIGEN_TAR_SOURCE={YOUR_DEPENDENCY_FOLDER}/eigen-branches-3.2.tar.gz # change to your local folder + > -DBOOST_TAR_SOURCE={YOUR_DEPENDENCY_FOLDER}/boost_1_61_0.tar.gz # change to your local folder + > -DPOSTGRESQL_EXECUTABLE=$GAUSSHOME/bin/ + > -DPOSTGRESQL_9_2_EXECUTABLE=$GAUSSHOME/bin/ + > -DPOSTGRESQL_9_2_CLIENT_INCLUDE_DIR=$GAUSSHOME/bin/ + > -DPOSTGRESQL_9_2_SERVER_INCLUDE_DIR=$GAUSSHOME/bin/ + > ``` + +4. Install the algorithm library in the database. + + a. Go to the **{YOUR_MADLIB_INSTALL_FOLDER}** directory. + + b. Go to the **bin** directory. + + c. Run the following command: + + ```bash + ./madpack -s -p opengauss -c @127.0.0.1:/ install + ``` + + The options in the command are described as follows: + + - **-s**: specifies the name of a schema. + - **-p**: database platform, which can be **mogdb**. + - **-c**: specifies parameters for connecting to the database, including the username, @, IP address, port number, and target database name. + + **install** is the installation command. Besides, you can run the **reinstall** or **uninstall** commands. + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > + > - The destination database must exist. + > - Use 127.0.0.1 instead of localhost as the IP address. + > - Operations such as installation and uninstallation of a large number of PL/Python stored procedures need to be performed by the database administrator. Common users do not have the permission to create or modify PL/Python stored procedures. They can only call the stored procedures. + > - The recommended database compatibility is B. The processing of empty and NULL values varies according to the database compatibility. B compatibility is recommended. Example: `CREATE DATABASE dbcompatibility='B'`. 
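+
+As an optional sanity check after the installation (a minimal sketch, assuming the schema passed to **-s** was **madlib** and that you connect to the target database with **gsql**), you can query the version information that MADlib registers in its schema:
+
+```sql
+-- Returns the MADlib version string if the algorithm library was installed successfully.
+SELECT madlib.version();
+```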
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/7-deepsql/7-3-usage-guide.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/7-deepsql/7-3-usage-guide.md new file mode 100644 index 0000000000000000000000000000000000000000..722755de90b4473939ff171b98897b12e778fa32 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/7-deepsql/7-3-usage-guide.md @@ -0,0 +1,226 @@ +--- +title: Usage Guide +summary: Usage Guide +author: Guo Huan +date: 2021-05-19 +--- + +# Usage Guide + +## PL/Python Stored Procedure + +Currently, the PL/Python stored procedure uses Python 2 by default. + +Functions in PL/Python are declared through the standard CREATE FUNCTION. + +```sql +CREATE FUNCTION funcname (argument-list) +RETURNS return-type +AS $$ +# PL/Python function body +$$ LANGUAGE plpythonu; +``` + +The function body is a simple Python script. When the function is called, its arguments are input as the elements of the list **args**. Named arguments are also input to the Python script as common variables. Named arguments are usually easier to read. The result is returned from Python code as usual using **return** or **yield** (in the case of a result set statement). If no return value is provided, Python returns **None** by default. PL/Python considers **None** in Python as the SQL NULL value. + +For example, the function that returns the larger of two integers is defined as follows: + +```sql +CREATE FUNCTION pymax(a integer, b integer) RETURNS integer AS $$ +if a > b: + return a +return b +$$ LANGUAGE plpythonu; +``` + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-caution.gif) **CAUTION:** +> +> - In the PL/Python function, the suffix is **plpythonu**. **u** indicates that the stored procedure is of the untrusted type. +> - **Trusted**: This language cannot be used to access unauthorized data. For example, files of the database server and internal database (including direct access to the shared memory). +> - **Untrusted**: This language has no restrictions and allows access to any data (including files, networks, and shared libraries). It is hazardous but has more powerful functions. +> - PL/Python is an untrusted stored procedure language. Currently, only administrators can create and modify PL/Python. Common users can only use PL/Python. +> - When defining PL/Python stored procedures, do not define the risky statement execution such as **import os; os.system("rm -rf /")**. Users with administrator permission need to create such PL/Python stored procedures with caution. + +## Processing of NULL, None, and Empty Strings in the Database + +If an SQL NULL value is input to the function, the parameter value is displayed as **None** in Python. In the database, empty strings are treated as NULL due to different compatibility. + +The performance of the same function varies depending on the compatibility. 
+ +```sql +CREATE FUNCTION quote(t text, how text) RETURNS text AS $$ +if how == "literal": + return plpy.quote_literal(t) +elif how == "nullable": + return plpy.quote_nullable(t) +elif how == "ident": + return plpy.quote_ident(t) +else: + raise plpy.Error("unrecognized quote type %s" % how) +$$ LANGUAGE plpythonu; +``` + +**Example 1:** + +```sql +SELECT quote(t, 'literal') FROM (VALUES ('abc'),('a''bc'),('''abc'''),(''),(''''),('xyzv')) AS v(t); +``` + +Results of different database compatibility are as follows: + +- If the compatibility is A, the returned result is as follows: + + ```sql + ERROR: TypeError: argument 1 must be string, not None + CONTEXT: Traceback (most recent call last): + PL/Python function "quote", line 3, in + return plpy.quote_literal(t) + referenced column: quote + ``` + +- If the compatibility is B, the returned result is as follows: + + ```sql + quote + ----------- + 'abc' + 'a''bc' + '''abc''' + '' + '''' + 'xyzv' + (6 rows) + ``` + +**Example 2:** + +```sql +SELECT quote(t, 'nullable') FROM (VALUES ('abc'),('a''bc'),('''abc'''),(''),(''''),(NULL)) AS v(t); +``` + +Results of different database compatibility are as follows: + +- If the compatibility is A, the returned result is as follows: + + ```sql + quote + ----------- + 'abc' + 'a''bc' + '''abc''' + NULL + '''' + NULL + (6 rows) + ``` + +- If the compatibility is B, the returned result is as follows: + + ```sql + quote + ----------- + 'abc' + 'a''bc' + '''abc''' + '' + '''' + NULL + (6 rows) + ``` + +In the preceding examples, the empty string is regarded as NULL when the compatibility is A. + +## Triggers + +Currently, the PL/Python stored procedure does not support triggers. + +## Anonymous Block of Code + +PL/Python also supports anonymous block of code declared by DO: + +```sql +DO $$ +# PL/Python code +$$ LANGUAGE plpythonu; +``` + +An anonymous block of code does not accept parameters and discards values they return. + +## Sharing Data + +Each function gets its own execution environment in the Python interpreter. + +The global dictionary SD is used to store data between function calls. These variables are private static data. Each function has its own SD space. The global data and parameters of function A cannot be used by function B. + +The global dictionary GD is public data. In a gsql session, all Python functions can be accessed and changed. Exercise caution when using the global dictionary. + +When gsql is disconnected or exited, the sharing data is released. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-caution.gif) **CAUTION:** +> +> - When running the DeepSQL or PL/Python stored procedure, you need to disable the parameters related to the thread pool. Otherwise, the functions such as sharing data (GD and SD) in the PL/Python stored procedure are invalid. +> - In the database, when the thread pool function is disabled, a new thread is started in the database for each connected gsql. In gsql, if the PL/Python stored procedure is called, the Python parser module is initialized in this thread, including initializing the sharing space such as GD and SD. +> - When the thread pool function is enabled, an idle thread executes the gsql command. Each execution may be allocated to a different thread. As a result, the sharing data is disordered. + +## Database Access + +The PL/Python language module automatically imports a Python module called plpy. + +The plpy module provides several functions to execute database commands, such as plpy.execute and plpy.prepare. 
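+
+For reference, the following is a minimal sketch (not one of the original examples; the function name, the table **public.houses**, and its **price** column are illustrative only) of combining plpy.prepare and plpy.execute inside a PL/Python function. The result of plpy.execute behaves like a list of dictionaries keyed by column name:
+
+```sql
+CREATE FUNCTION count_expensive(limit_price integer) RETURNS bigint AS $$
+# Prepare a parameterized plan once, then execute it with the function argument.
+plan = plpy.prepare("SELECT count(*) AS n FROM public.houses WHERE price > $1", ["integer"])
+rv = plpy.execute(plan, [limit_price])
+# rv[0] is a dictionary of the first result row; pick the aggregate value.
+return rv[0]["n"]
+$$ LANGUAGE plpythonu;
+
+SELECT count_expensive(100000);
+```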
+ +The plpy module also provides the following functions: plpy.debug(msg), plpy.log(msg), plpy.info(msg), plpy.notice(msg), plpy.warning(msg), plpy.error(msg), and plpy.fatal(msg). The plpy.error and plpy.fatal throw a Python exception, which causes the current transaction or sub-transaction to exit. + +Another set of useful functions is plpy.quote_literal(string), plpy.quote_nullable(string), and plpy.quote_ident(string). + +## Audit + +PL/Python stored procedures support the audit function. For details, see [Auditing](1-audit-switch). + +## Concurrent Execution + +Currently, PL/Python stored procedures are not friendly to concurrent execution. You are advised to execute them in serial mode. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** Due to the MogDB multi-thread architecture and the restriction of GlobalInterpreter Lock (GIL) in C-python, multiple threads can only be executed alternately in Python, and concurrent operations cannot be implemented. + +## Algorithms in the Library + +For details about algorithms in the library and how to use them, see the official [MADlib document](http://madlib.apache.org/docs/latest/index.html). + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:** +> +> - Currently, only the machine learning algorithm is supported. The deep learning module is not supported. +> - The current database does not support XML files. Therefore, the pmml module and related functions are not supported. +> - The database does not support the jsonb module, and the model export function in JSON format is not supported. + +## Other Supported Algorithms + +In addition to the algorithms provided by MADlib, MogDB provides the following three algorithms: + +**Table 1** Additional modules + +| Algorithm Name | Abbreviation | +| :-------------------------------- | :--------------- | +| Gradient boosted tree | gbdt | +| Gradient boosting | xgboost | +| Time series forecasting algorithm | facebook_prophet | + +You need to install Python libraries which the preceding algorithms depend on as follows: + +- If the prophet algorithm is used: + + ```bash + pip install pystan + pip install holidays==0.9.8 + pip install fbprophet==0.3.post2 + ``` + +- If the xgboost algorithm is used: + + ```bash + pip install xgboost + pip install scikit-learn + ``` + +- The gbdt algorithm does not require library installation. + +For details, see [Best Practices](7-4-best-practices). diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/7-deepsql/7-4-best-practices.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/7-deepsql/7-4-best-practices.md new file mode 100644 index 0000000000000000000000000000000000000000..d7aea6eea1d39e676ff2e0848d154fa3a1240509 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/7-deepsql/7-4-best-practices.md @@ -0,0 +1,888 @@ +--- +title: Best Practices +summary: Best Practices +author: Guo Huan +date: 2021-05-19 +--- + +# Best Practices + +This section describes how to use certain algorithms, including the classification, regression, clustering, gbdt, xgboost, and prohpet algorithms. + +Create a database and install an algorithm. + +``` +create database test1 dbcompatibility='B'; +./madpack -s madlib -p mogdb -c opg@127.0.0.1:7651/test1 install +``` + +## Classification Algorithm + +Take the housing price prediction with the support vector machine (SVM) as an example: + +1. Prepare a dataset. 
+ + ``` + DROP TABLE IF EXISTS houses; + CREATE TABLE houses (id INT, tax INT, bedroom INT, bath FLOAT, price INT, size INT, lot INT); + INSERT INTO houses VALUES + (1 , 590 , 2 , 1 , 50000 , 770 , 22100), + (2 , 1050 , 3 , 2 , 85000 , 1410 , 12000), + (3 , 20 , 3 , 1 , 22500 , 1060 , 3500), + (4 , 870 , 2 , 2 , 90000 , 1300 , 17500), + (5 , 1320 , 3 , 2 , 133000 , 1500 , 30000), + (6 , 1350 , 2 , 1 , 90500 , 820 , 25700), + (7 , 2790 , 3 , 2.5 , 260000 , 2130 , 25000), + (8 , 680 , 2 , 1 , 142500 , 1170 , 22000), + (9 , 1840 , 3 , 2 , 160000 , 1500 , 19000), + (10 , 3680 , 4 , 2 , 240000 , 2790 , 20000), + (11 , 1660 , 3 , 1 , 87000 , 1030 , 17500), + (12 , 1620 , 3 , 2 , 118600 , 1250 , 20000), + (13 , 3100 , 3 , 2 , 140000 , 1760 , 38000), + (14 , 2070 , 2 , 3 , 148000 , 1550 , 14000), + (15 , 650 , 3 , 1.5 , 65000 , 1450 , 12000); + ``` + +2. Train the model. + + Configure the corresponding schema and compatibility parameters before training. + + ``` + SET search_path="$user",public,madlib; + SET behavior_compat_options = 'bind_procedure_searchpath'; + ``` + + Use default parameters for training. The classification condition is 'price < 100000'. The SQL statement is as follows: + + ``` + DROP TABLE IF EXISTS houses_svm, houses_svm_summary; + SELECT madlib.svm_classification('public.houses','public.houses_svm','price < 100000','ARRAY[1, tax, bath, size]'); + ``` + +3. View the model. + + ``` + \x on + SELECT * FROM houses_svm; + \x off + ``` + + The results are displayed as follows: + + ``` + -[ RECORD 1 ]------+----------------------------------------------------------------- + coef | {.113989576847,-.00226133300602,-.0676303607996,.00179440841072} + loss | .614496714256667 + norm_of_gradient | 108.171180769224 + num_iterations | 100 + num_rows_processed | 15 + num_rows_skipped | 0 + dep_var_mapping | {f,t} + ``` + +4. Perform prediction. + + ``` + DROP TABLE IF EXISTS houses_pred; + SELECT madlib.svm_predict('public.houses_svm','public.houses','id','public.houses_pred'); + ``` + + - View the prediction result. + + ``` + SELECT *, price < 100000 AS actual FROM houses JOIN houses_pred USING (id) ORDER BY id; + ``` + + The results are displayed as follows: + + ``` + id | tax | bedroom | bath | price | size | lot | prediction | decision_function | actual + ----+------+---------+------+--------+------+-------+------------+-------------------+-------- + 1 | 590 | 2 | 1 | 50000 | 770 | 22100 | t | .09386721875 | t + 2 | 1050 | 3 | 2 | 85000 | 1410 | 12000 | t | .134445058042 | t + 3 | 20 | 3 | 1 | 22500 | 1060 | 3500 | t | 1.9032054712902 | t + 4 | 870 | 2 | 2 | 90000 | 1300 | 17500 | t | .3441000739464 | t + 5 | 1320 | 3 | 2 | 133000 | 1500 | 30000 | f | -.3146180966186 | f + 6 | 1350 | 2 | 1 | 90500 | 820 | 25700 | f | -1.5350254452892 | t + 7 | 2790 | 3 | 2.5 | 260000 | 2130 | 25000 | f | -2.5421154971142 | f + 8 | 680 | 2 | 1 | 142500 | 1170 | 22000 | t | .6081106124962 | f + 9 | 1840 | 3 | 2 | 160000 | 1500 | 19000 | f | -1.490511259749 | f + 10 | 3680 | 4 | 2 | 240000 | 2790 | 20000 | f | -3.336577140997 | f + 11 | 1660 | 3 | 1 | 87000 | 1030 | 17500 | f | -1.8592129109042 | t + 12 | 1620 | 3 | 2 | 118600 | 1250 | 20000 | f | -1.4416201011046 | f + 13 | 3100 | 3 | 2 | 140000 | 1760 | 38000 | f | -3.873244660547 | f + 14 | 2070 | 2 | 3 | 148000 | 1550 | 14000 | f | -1.9885277913972 | f + 15 | 650 | 3 | 1.5 | 65000 | 1450 | 12000 | t | 1.1445697772786 | t + (15 rows) + ``` + + - View the false positive rate. 
+ + ``` + SELECT COUNT(*) FROM houses_pred JOIN houses USING (id) WHERE houses_pred.prediction != (houses.price < 100000); + ``` + + The results are displayed as follows: + + ``` + count + ------- + 3 + (1 row) + ``` + +5. Use other SVM cores for training. + + ``` + DROP TABLE IF EXISTS houses_svm_gaussian, houses_svm_gaussian_summary, houses_svm_gaussian_random; + SELECT madlib.svm_classification( 'public.houses','public.houses_svm_gaussian','price < 100000','ARRAY[1, tax, bath, size]','gaussian','n_components=10', '', 'init_stepsize=1, max_iter=200' ); + ``` + + Perform prediction and view the training result. + + ``` + DROP TABLE IF EXISTS houses_pred_gaussian; + SELECT madlib.svm_predict('public.houses_svm_gaussian','public.houses','id', 'public.houses_pred_gaussian'); + SELECT COUNT(*) FROM houses_pred_gaussian JOIN houses USING (id) WHERE houses_pred_gaussian.prediction != (houses.price < 100000); + ``` + + The result is as follows: + + ``` + count + -------+ + 0 + (1 row) + ``` + +6. Use other parameters. + + In addition to specifying different kernel methods, you can specify the number of training steps and initial parameters, such as **init_stepsize**, **max_iter**, and **class_weight**. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:** +> +> ``` +> SET search_path="$user",public,madlib; +> SET behavior_compat_options = 'bind_procedure_searchpath'; +> ``` +> +> Before executing the algorithm, you need to set the schema in **search_path** and set the **bind_procedure_searchpath**. Otherwise, the table cannot be found. All machine learning methods are installed in a schema, and user tables are installed in user's schemas. In this example, the algorithm is installed in **madlib**, and user tables are stored in **public**. If no schema is set, the table may not be found when the algorithm is executed. When executing the algorithm, you are advised to add the scheme of the input table. + +## Regression Algorithm + +Use the Boston housing price prediction with the linear regression as an example: + +1. Prepare a dataset. + + The dataset is the same as that of the SVM. + +2. Train a model. + + ``` + SET search_path="$user",public,madlib; + SET behavior_compat_options = 'bind_procedure_searchpath'; + + DROP TABLE IF EXISTS houses_linregr, houses_linregr_summary; + SELECT madlib.linregr_train( 'public.houses', 'public.houses_linregr', 'price', 'ARRAY[1, tax, bath, size]'); + ``` + +3. View the model content. 
+ + ``` + \x ON + SELECT * FROM houses_linregr; + \x OFF + ``` + + The returned result is as follows: + + ``` + -[ RECORD 1 ]------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + coef | {-12849.4168959872,28.9613922651775,10181.6290712649,50.516894915353} + r2 | .768577580597462 + std_err | {33453.0344331377,15.8992104963991,19437.7710925915,32.9280231740856} + t_stats | {-.384103179688204,1.82156166004197,.523806408809163,1.53416118083608} + p_values | {.708223134615411,.0958005827189556,.610804093526516,.153235085548177} + condition_no | 9002.50457069858 + num_rows_processed | 15 + num_missing_rows_skipped | 0 + variance_covariance | {{1119105512.7847,217782.067878005,-283344228.394538,-616679.693190829},{217782.067878005,252.784894408806,-46373.1796964038,-369.864520095145},{-283344228.394538,-46373.1796964038,377826945.047986,-209088.217319699},{-616679.693190829,-369.864520095145,-209088.217319699,1084.25471015312}} + ``` + +4. Predict and compare the results. + + ``` + SELECT houses.*, + madlib.linregr_predict( m.coef, ARRAY[1,tax,bath,size]) as predict, + price - madlib.linregr_predict( m.coef, ARRAY[1,tax,bath,size]) as residual + FROM public.houses, public.houses_linregr AS m + ORDER BY id; + ``` + + The returned result is as follows: + + ``` + id | tax | bedroom | bath | price | size | lot | predict | residual + ----+------+---------+------+--------+------+-------+------------------+------------------- + 1 | 590 | 2 | 1 | 50000 | 770 | 22100 | 53317.4426965543 | -3317.44269655428 + 2 | 1050 | 3 | 2 | 85000 | 1410 | 12000 | 109152.124955627 | -24152.1249556268 + 3 | 20 | 3 | 1 | 22500 | 1060 | 3500 | 51459.3486308555 | -28959.3486308555 + 4 | 870 | 2 | 2 | 90000 | 1300 | 17500 | 98382.215907206 | -8382.21590720599 + 5 | 1320 | 3 | 2 | 133000 | 1500 | 30000 | 121518.221409606 | 11481.7785903935 + 6 | 1350 | 2 | 1 | 90500 | 820 | 25700 | 77853.9455638568 | 12646.0544361432 + 7 | 2790 | 3 | 2.5 | 260000 | 2130 | 25000 | 201007.926371722 | 58992.0736282778 + 8 | 680 | 2 | 1 | 142500 | 1170 | 22000 | 76130.7259665615 | 66369.2740334385 + 9 | 1840 | 3 | 2 | 160000 | 1500 | 19000 | 136578.145387499 | 23421.8546125013 + 10 | 3680 | 4 | 2 | 240000 | 2790 | 20000 | 255033.901596231 | -15033.9015962306 + 11 | 1660 | 3 | 1 | 87000 | 1030 | 17500 | 97440.5250982859 | -10440.5250982859 + 12 | 1620 | 3 | 2 | 118600 | 1250 | 20000 | 117577.415360321 | 1022.58463967856 + 13 | 3100 | 3 | 2 | 140000 | 1760 | 38000 | 186203.892319614 | -46203.8923196141 + 14 | 2070 | 2 | 3 | 148000 | 1550 | 14000 | 155946.739425522 | -7946.73942552213 + 15 | 650 | 3 | 1.5 | 65000 | 1450 | 12000 | 94497.4293105374 | -29497.4293105374 + (15 rows) + ``` + +## Clustering Algorithm + +Use kmeans as an example: + +1. Prepare the data. 
+ + ``` + DROP TABLE IF EXISTS km_sample; + CREATE TABLE km_sample(pid int, points double precision[]); + INSERT INTO km_sample VALUES + (1, '{14.23, 1.71, 2.43, 15.6, 127, 2.8, 3.0600, 0.2800, 2.29, 5.64, 1.04, 3.92, 1065}'), + (2, '{13.2, 1.78, 2.14, 11.2, 1, 2.65, 2.76, 0.26, 1.28, 4.38, 1.05, 3.49, 1050}'), + (3, '{13.16, 2.36, 2.67, 18.6, 101, 2.8, 3.24, 0.3, 2.81, 5.6799, 1.03, 3.17, 1185}'), + (4, '{14.37, 1.95, 2.5, 16.8, 113, 3.85, 3.49, 0.24, 2.18, 7.8, 0.86, 3.45, 1480}'), + (5, '{13.24, 2.59, 2.87, 21, 118, 2.8, 2.69, 0.39, 1.82, 4.32, 1.04, 2.93, 735}'), + (6, '{14.2, 1.76, 2.45, 15.2, 112, 3.27, 3.39, 0.34, 1.97, 6.75, 1.05, 2.85, 1450}'), + (7, '{14.39, 1.87, 2.45, 14.6, 96, 2.5, 2.52, 0.3, 1.98, 5.25, 1.02, 3.58, 1290}'), + (8, '{14.06, 2.15, 2.61, 17.6, 121, 2.6, 2.51, 0.31, 1.25, 5.05, 1.06, 3.58, 1295}'), + (9, '{14.83, 1.64, 2.17, 14, 97, 2.8, 2.98, 0.29, 1.98, 5.2, 1.08, 2.85, 1045}'), + (10, '{13.86, 1.35, 2.27, 16, 98, 2.98, 3.15, 0.22, 1.8500, 7.2199, 1.01, 3.55, 1045}'); + ``` + +2. Run the kmeans algorithm. + + Use kmeans++ and Euclidean distance function for calculation. + + ``` + SET search_path="$user",public,madlib; + SET behavior_compat_options = 'bind_procedure_searchpath'; + + DROP TABLE IF EXISTS km_result; + CREATE TABLE km_result AS SELECT * FROM madlib.kmeanspp( 'public.km_sample', -- Table of source data + 'points', -- Column containing point co-ordinates + 2, -- Number of centroids to calculate + 'madlib.squared_dist_norm2', -- Distance function + 'madlib.avg', -- Aggregate function + 20, -- Number of iterations + 0.001 -- Fraction of centroids reassigned to keep iterating + ); + ``` + + After kmeans is executed, no table is automatically created to save the content. Therefore, you need to create a table. + + ``` + \x on + select * from km_result; + \x off + ``` + + The returned result is as follows: + + ``` + -[ RECORD 1 ]----+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + centroids | {{14.0333333333333,1.84111111111111,2.41,15.5111111111111,96.2222222222222,2.91666666666667,3.01111111111111,.282222222222222,1.95444444444444,5.88553333333333,1.02222222222222,3.38222222222222,1211.66666666667},{13.24,2.59,2.87,21,118,2.8,2.69,.39,1.82,4.32,1.04,2.93,735}} + cluster_variance | {257041.999707571,0} + objective_fn | 257041.999707571 + frac_reassigned | 0 + num_iterations | 2 + ``` + +3. Apply the clustering result. 
+
+    Execute the following function to compute, for each point, its centroid, the nearest neighboring centroid, and the corresponding silhouette value:
+
+    ```
+    DROP TABLE IF EXISTS km_points_silh;
+    SELECT * FROM madlib.simple_silhouette_points('public.km_sample',          -- Input points table
+                                                  'public.km_points_silh',     -- Output table
+                                                  'pid',                       -- Point ID column in input table
+                                                  'points',                    -- Points column in input table
+                                                  'public.km_result',          -- Centroids table
+                                                  'centroids',                 -- Column in centroids table containing centroids
+                                                  'madlib.squared_dist_norm2'  -- Distance function
+                                                 );
+    SELECT * FROM km_points_silh ORDER BY pid;
+    ```
+
+    The returned result is as follows:
+
+    ```
+     pid | centroid_id | neighbor_centroid_id |       silh
+    -----+-------------+----------------------+------------------
+       1 |           0 |                    1 | .793983543638996
+       2 |           0 |                    1 | .688301735667703
+       3 |           0 |                    1 | .996324103148159
+       4 |           0 |                    1 | .869765755931474
+       5 |           1 |                    0 |                1
+       6 |           0 |                    1 | .888416176253661
+       7 |           0 |                    1 | .980107240092519
+       8 |           0 |                    1 | .975880363039906
+       9 |           0 |                    1 | .712384473959954
+      10 |           0 |                    1 | .712198411442872
+    (10 rows)
+    ```
+
+## gbdt Algorithm
+
+Although the gbdt base learner is a regression tree, the algorithm itself supports both classification and regression operations. The following describes the implementation of the two types of tasks. It is worth noting that this method does not support NULL values in the label column.
+
+**Classification**
+
+1. Prepare the data.
+
+    ```
+    DROP TABLE IF EXISTS dt_golf CASCADE;
+    DROP TABLE IF EXISTS train_output, train_output_summary;
+    CREATE TABLE dt_golf (
+        id integer NOT NULL,
+        "OUTLOOK" text,
+        temperature double precision,
+        humidity double precision,
+        "Cont_features" double precision[],
+        cat_features text[],
+        windy boolean,
+        class integer
+    );
+    INSERT INTO dt_golf (id, "OUTLOOK", temperature, humidity, "Cont_features", cat_features, windy, class) VALUES
+    (1, 'sunny', 85, 85, ARRAY[85, 85], ARRAY['a', 'b'], false, 0),
+    (2, 'sunny', 80, 90, ARRAY[80, 90], ARRAY['a', 'b'], true, 0),
+    (3, 'overcast', 83, 78, ARRAY[83, 78], ARRAY['a', 'b'], false, 1),
+    (4, 'rain', 70, NULL, ARRAY[70, 96], ARRAY['a', 'b'], false, 1),
+    (5, 'rain', 68, 80, ARRAY[68, 80], ARRAY['a', 'b'], false, 1),
+    (6, 'rain', NULL, 70, ARRAY[65, 70], ARRAY['a', 'b'], true, 0),
+    (7, 'overcast', 64, 65, ARRAY[64, 65], ARRAY['c', 'b'], NULL, 1),
+    (8, 'sunny', 72, 95, ARRAY[72, 95], ARRAY['a', 'b'], false, 0),
+    (9, 'sunny', 69, 70, ARRAY[69, 70], ARRAY['a', 'b'], false, 1),
+    (10, 'rain', 75, 80, ARRAY[75, 80], ARRAY['a', 'b'], false, 1),
+    (11, 'sunny', 75, 70, ARRAY[75, 70], ARRAY['a', 'd'], true, 1),
+    (12, 'overcast', 72, 90, ARRAY[72, 90], ARRAY['c', 'b'], NULL, 1),
+    (13, 'overcast', 81, 75, ARRAY[81, 75], ARRAY['a', 'b'], false, 1),
+    (15, NULL, 81, 75, ARRAY[81, 75], ARRAY['a', 'b'], false, 1),
+    (16, 'overcast', NULL, 75, ARRAY[81, 75], ARRAY['a', 'd'], false, 1),
+    (14, 'rain', 71, 80, ARRAY[71, 80], ARRAY['c', 'b'], true, 0);
+    ```
+
+2. Train a model.
+
+    ```
+    select madlib.gbdt_train('dt_golf',          -- source table
+                             'train_output',     -- output model table
+                             'id',               -- id column
+                             'class',            -- response
+                             '"OUTLOOK", temperature',   -- features
+                             NULL,               -- exclude columns
+                             1,                  -- weight
+                             10,                 -- num of trees
+                             NULL,               -- num of random features
+                             10,                 -- max depth
+                             1,                  -- min split
+                             1,                  -- min bucket
+                             8,                  -- number of bins per continuous variable
+                             'max_surrogates=0',
+                             TRUE
+    );
+    ```
+
+    After the model training is complete, two tables are generated.
The **train_output** table stores the regression tree model in gdbt, including the parameter records of each base learner. + + ``` + Column | Type | Modifiers | Storage | Stats target | Description + --------------------+---------------+-----------+----------+--------------+------------- + iteration | integer | | plain | | + tree | madlib.bytea8 | | external | | + cat_levels_in_text | text[] | | extended | | + cat_n_levels | integer[] | | extended | | + tree_depth | integer | | plain | | + ``` + + ``` + iteration | cat_levels_in_text | cat_n_levels | tree_depth + -----------+-----------------------+--------------+------------ + 0 | {sunny,rain,overcast} | {3} | 4 + 1 | {sunny,rain,overcast} | {3} | 5 + 2 | {sunny,rain,overcast} | {3} | 6 + 3 | {sunny,rain,overcast} | {3} | 4 + 4 | {sunny,rain,overcast} | {3} | 5 + 5 | {sunny,rain,overcast} | {3} | 5 + 6 | {sunny,rain,overcast} | {3} | 5 + 7 | {sunny,rain,overcast} | {3} | 5 + 8 | {sunny,rain,overcast} | {3} | 4 + 9 | {sunny,rain,overcast} | {3} | 3 + (10 rows) + ``` + + The **train_output_summary** table describes the gdbt training. + + ``` + select * from train_output_summary; + method | cat_features | con_features | source_table | model_table | null_proxy | learning_rate | is_classification | predict_dt_prob | num_trees + --------+--------------+--------------+--------------+--------------+------------+---------------+-------------------+-----------------+----------- + GBDT | "OUTLOOK" | temperature | dt_golf | train_output | | .01 | t | response | 10 + (1 row) + ``` + +3. Perform prediction. + + ``` + select madlib.gbdt_predict('dt_golf2','train_output','test_output','id'); + ``` + + View the prediction result. + + ``` + select test_output.id, test_prediction,class from test_output join dt_golf using (id); + id | test_prediction | class + ----+-----------------+------- + 1 | 1.0 | 0 + 2 | 1.0 | 0 + 3 | 1.0 | 1 + 4 | 1.0 | 1 + 5 | 1.0 | 1 + 6 | 1.0 | 0 + 7 | 1.0 | 1 + 8 | 0.0 | 0 + 9 | 1.0 | 1 + 10 | 1.0 | 1 + 11 | 1.0 | 1 + 12 | 1.0 | 1 + 13 | 1.0 | 1 + 15 | 0.0 | 1 + 16 | 1.0 | 1 + 14 | 0.0 | 0 + (16 rows) + ``` + +## **Regression** + +1. Prepare the data. 
+ + ``` + DROP TABLE IF EXISTS crime; + CREATE TABLE crime ( + id SERIAL NOT NULL, + CrimeRat DOUBLE PRECISION, + MaleTeen INTEGER, + South SMALLINT, + Educ DOUBLE PRECISION, + Police60 INTEGER, + Police59 INTEGER, + Labor INTEGER, + Males INTEGER, + Pop INTEGER, + NonWhite INTEGER, + Unemp1 INTEGER, + Unemp2 INTEGER, + Median INTEGER, + BelowMed INTEGER + ); + + INSERT INTO crime( + CrimeRat, MaleTeen, South, Educ, Police60, Police59, Labor, Males, Pop, + NonWhite, Unemp1, Unemp2, Median, BelowMed + ) VALUES + (79.1, 151, 1, 9.1, 58, 56, 510, 950, 33, 301, 108, 41, 394, 261), + (163.5, 143, 0, 11.3, 103, 95, 583, 1012, 13, 102, 96, 36, 557, 194), + (57.8, 142, 1, 8.9, 45, 44, 533, 969, 18, 219, 94, 33, 318, 250), + (196.9, 136, 0, 12.1, 149, 141, 577, 994, 157, 80, 102, 39, 673, 167), + (123.4, 141, 0, 12.1, 109, 101, 591, 985, 18, 30, 91, 20, 578, 174), + (68.2, 121, 0, 11.0, 118, 115, 547, 964, 25, 44, 84, 29, 689, 126), + (96.3, 127, 1, 11.1, 82, 79, 519, 982, 4, 139, 97, 38, 620, 168), + (155.5, 131, 1, 10.9, 115, 109, 542, 969, 50, 179, 79, 35, 472, 206), + (85.6, 157, 1, 9.0, 65, 62, 553, 955, 39, 286, 81, 28, 421, 239), + (70.5, 140, 0, 11.8, 71, 68, 632, 1029, 7, 15, 100, 24, 526, 174), + (167.4, 124, 0, 10.5, 121, 116, 580, 966, 101, 106, 77, 35, 657, 170), + (84.9, 134, 0, 10.8, 75, 71, 595, 972, 47, 59, 83, 31, 580, 172), + (51.1, 128, 0, 11.3, 67, 60, 624, 972, 28, 10, 77, 25, 507, 206), + (66.4, 135, 0, 11.7, 62, 61, 595, 986, 22, 46, 77, 27, 529, 190), + (79.8, 152, 1, 8.7, 57, 53, 530, 986, 30, 72, 92, 43, 405, 264), + (94.6, 142, 1, 8.8, 81, 77, 497, 956, 33, 321, 116, 47, 427, 247), + (53.9, 143, 0, 11.0, 66, 63, 537, 977, 10, 6, 114, 35, 487, 166), + (92.9, 135, 1, 10.4, 123, 115, 537, 978, 31, 170, 89, 34, 631, 165), + (75.0, 130, 0, 11.6, 128, 128, 536, 934, 51, 24, 78, 34, 627, 135), + (122.5, 125, 0, 10.8, 113, 105, 567, 985, 78, 94, 130, 58, 626, 166), + (74.2, 126, 0, 10.8, 74, 67, 602, 984, 34, 12, 102, 33, 557, 195), + (43.9, 157, 1, 8.9, 47, 44, 512, 962, 22, 423, 97, 34, 288, 276), + (121.6, 132, 0, 9.6, 87, 83, 564, 953, 43, 92, 83, 32, 513, 227), + (96.8, 131, 0, 11.6, 78, 73, 574, 1038, 7, 36, 142, 42, 540, 176), + (52.3, 130, 0, 11.6, 63, 57, 641, 984, 14, 26, 70, 21, 486, 196), + (199.3, 131, 0, 12.1, 160, 143, 631, 1071, 3, 77, 102, 41, 674, 152), + (34.2, 135, 0, 10.9, 69, 71, 540, 965, 6, 4, 80, 22, 564, 139), + (121.6, 152, 0, 11.2, 82, 76, 571, 1018, 10, 79, 103, 28, 537, 215), + (104.3, 119, 0, 10.7, 166, 157, 521, 938, 168, 89, 92, 36, 637, 154), + (69.6, 166, 1, 8.9, 58, 54, 521, 973, 46, 254, 72, 26, 396, 237), + (37.3, 140, 0, 9.3, 55, 54, 535, 1045, 6, 20, 135, 40, 453, 200), + (75.4, 125, 0, 10.9, 90, 81, 586, 964, 97, 82, 105, 43, 617, 163), + (107.2, 147, 1, 10.4, 63, 64, 560, 972, 23, 95, 76, 24, 462, 233), + (92.3, 126, 0, 11.8, 97, 97, 542, 990, 18, 21, 102, 35, 589, 166), + (65.3, 123, 0, 10.2, 97, 87, 526, 948, 113, 76, 124, 50, 572, 158), + (127.2, 150, 0, 10.0, 109, 98, 531, 964, 9, 24, 87, 38, 559, 153), + (83.1, 177, 1, 8.7, 58, 56, 638, 974, 24, 349, 76, 28, 382, 254), + (56.6, 133, 0, 10.4, 51, 47, 599, 1024, 7, 40, 99, 27, 425, 225), + (82.6, 149, 1, 8.8, 61, 54, 515, 953, 36, 165, 86, 35, 395, 251), + (115.1, 145, 1, 10.4, 82, 74, 560, 981, 96, 126, 88, 31, 488, 228), + (88.0, 148, 0, 12.2, 72, 66, 601, 998, 9, 19, 84, 20, 590, 144), + (54.2, 141, 0, 10.9, 56, 54, 523, 968, 4, 2, 107, 37, 489, 170), + (82.3, 162, 1, 9.9, 75, 70, 522, 996, 40, 208, 73, 27, 496, 224), + (103.0, 136, 0, 12.1, 95, 96, 574, 1012, 29, 36, 111, 37, 622, 
162), + (45.5, 139, 1, 8.8, 46, 41, 480, 968, 19, 49, 135, 53, 457, 249), + (50.8, 126, 0, 10.4, 106, 97, 599, 989, 40, 24, 78, 25, 593, 171), + (84.9, 130, 0, 12.1, 90, 91, 623, 1049, 3, 22, 113, 40, 588, 160); + ``` + +2. Train a model. + + ``` + select madlib.gbdt_train('crime', -- source table + 'train_output', -- output model table + 'id' , -- id column + 'CrimeRat', -- response + '*', -- features + NULL, -- exclude columns + 1, --weight + 20, -- num of trees + 4, -- num of random features + 10, -- max depth + 1, -- min split + 1, -- min bucket + 8, -- number of bins per continuous variable + 'max_surrogates=0', + FALSE + ); + ``` + + When **is_classification** is set to **FALSE**, the model is a regression task. By default, gbdt provides the regression calculation function. The method generates two tables. One table records the collective information of each tree and the binary of the model. The other table records the parameter information of the method. + +3. Perform prediction. + + ``` + select madlib.gbdt_predict('crime','train_output','test_output','id'); + ``` + + View the prediction result. + + ``` + select test_output.id, test_prediction,CrimeRat from test_output join crime using (id); + + id | test_prediction | crimerat + ----+--------------------+---------- + 1 | 79.1 | 79.1 + 2 | 163.5 | 163.5 + 3 | 57.8 | 57.8 + 4 | 196.9 | 196.9 + 5 | 123.4 | 123.4 + 6 | 68.2 | 68.2 + 7 | 96.2999999992251 | 96.3 + 8 | 155.49842087765936 | 155.5 + 9 | 84.35 | 85.6 + 10 | 70.50157912234037 | 70.5 + 11 | 167.4000000007749 | 167.4 + 12 | 84.9 | 84.9 + 13 | 51.1 | 51.1 + 14 | 66.4 | 66.4 + 15 | 79.8 | 79.8 + 16 | 94.6 | 94.6 + 17 | 53.9 | 53.9 + 18 | 92.9 | 92.9 + 19 | 75.0 | 75 + 20 | 122.5 | 122.5 + 21 | 74.2 | 74.2 + 22 | 43.9 | 43.9 + 23 | 121.6 | 121.6 + 24 | 96.8 | 96.8 + 25 | 52.3 | 52.3 + 26 | 199.3 | 199.3 + 27 | 34.2 | 34.2 + 28 | 121.6 | 121.6 + 29 | 104.3 | 104.3 + 30 | 69.6 | 69.6 + 31 | 37.3 | 37.3 + 32 | 75.4 | 75.4 + 33 | 107.2 | 107.2 + 34 | 92.3 | 92.3 + 35 | 65.2999999992251 | 65.3 + 36 | 127.19842087765963 | 127.2 + 37 | 84.35000000002215 | 83.1 + 38 | 56.60155638297881 | 56.6 + 39 | 82.45000000075257 | 82.6 + 40 | 115.10002273936168 | 115.1 + 41 | 88.0 | 88 + 42 | 54.19997726063828 | 54.2 + 43 | 82.44999999999999 | 82.3 + 44 | 103.00002273936173 | 103 + 45 | 45.500000000000156 | 45.5 + 46 | 50.8 | 50.8 + 47 | 84.9 | 84.9 + (47 rows) + ``` + +## xgboost Algorithm + +The new xgboost supports classification and regression. The following uses the iris flower classification as an example to describe the xgboost algorithm. + +The xgboost supports grid search mode and can train multiple groups of parameters at the same time. + +1. Prepare the data. 
+ + ``` + DROP TABLE IF EXISTS iris; + create table iris (id serial, a float, d float, c float, b float, label int); + + INSERT into iris (a, b, c, d, label) values + (5.1, 3.5, 1.4, 0.2, 0),(4.9, 3.0, 1.4, 0.2, 0),(4.7, 3.2, 1.3, 0.2, 0),(4.6, 3.1, 1.5, 0.2, 0), + (5.0, 3.6, 1.4, 0.2, 0),(5.4, 3.9, 1.7, 0.4, 0),(4.6, 3.4, 1.4, 0.3, 0),(5.0, 3.4, 1.5, 0.2, 0), + (4.4, 2.9, 1.4, 0.2, 0),(4.9, 3.1, 1.5, 0.1, 0),(5.4, 3.7, 1.5, 0.2, 0),(4.8, 3.4, 1.6, 0.2, 0), + (4.8, 3.0, 1.4, 0.1, 0),(4.3, 3.0, 1.1, 0.1, 0),(5.8, 4.0, 1.2, 0.2, 0),(5.7, 4.4, 1.5, 0.4, 0), + (5.4, 3.9, 1.3, 0.4, 0),(5.1, 3.5, 1.4, 0.3, 0),(5.7, 3.8, 1.7, 0.3, 0),(5.1, 3.8, 1.5, 0.3, 0), + (5.4, 3.4, 1.7, 0.2, 0),(5.1, 3.7, 1.5, 0.4, 0),(4.6, 3.6, 1.0, 0.2, 0),(5.1, 3.3, 1.7, 0.5, 0), + (4.8, 3.4, 1.9, 0.2, 0),(5.0, 3.0, 1.6, 0.2, 0),(5.0, 3.4, 1.6, 0.4, 0),(5.2, 3.5, 1.5, 0.2, 0), + (5.2, 3.4, 1.4, 0.2, 0),(4.7, 3.2, 1.6, 0.2, 0),(4.8, 3.1, 1.6, 0.2, 0),(5.4, 3.4, 1.5, 0.4, 0), + (5.2, 4.1, 1.5, 0.1, 0),(5.5, 4.2, 1.4, 0.2, 0),(4.9, 3.1, 1.5, 0.2, 0),(5.0, 3.2, 1.2, 0.2, 0), + (5.5, 3.5, 1.3, 0.2, 0),(4.9, 3.6, 1.4, 0.1, 0),(4.4, 3.0, 1.3, 0.2, 0),(5.1, 3.4, 1.5, 0.2, 0), + (5.0, 3.5, 1.3, 0.3, 0),(4.5, 2.3, 1.3, 0.3, 0),(4.4, 3.2, 1.3, 0.2, 0),(5.0, 3.5, 1.6, 0.6, 0), + (5.1, 3.8, 1.9, 0.4, 0),(4.8, 3.0, 1.4, 0.3, 0),(5.1, 3.8, 1.6, 0.2, 0),(4.6, 3.2, 1.4, 0.2, 0), + (5.3, 3.7, 1.5, 0.2, 0),(5.0, 3.3, 1.4, 0.2, 0),(7.0, 3.2, 4.7, 1.4, 1),(6.4, 3.2, 4.5, 1.5, 1), + (6.9, 3.1, 4.9, 1.5, 1),(5.5, 2.3, 4.0, 1.3, 1),(6.5, 2.8, 4.6, 1.5, 1),(5.7, 2.8, 4.5, 1.3, 1), + (6.3, 3.3, 4.7, 1.6, 1),(4.9, 2.4, 3.3, 1.0, 1),(6.6, 2.9, 4.6, 1.3, 1),(5.2, 2.7, 3.9, 1.4, 1), + (5.0, 2.0, 3.5, 1.0, 1),(5.9, 3.0, 4.2, 1.5, 1),(6.0, 2.2, 4.0, 1.0, 1),(6.1, 2.9, 4.7, 1.4, 1), + (5.6, 2.9, 3.6, 1.3, 1),(6.7, 3.1, 4.4, 1.4, 1),(5.6, 3.0, 4.5, 1.5, 1),(5.8, 2.7, 4.1, 1.0, 1), + (6.2, 2.2, 4.5, 1.5, 1),(5.6, 2.5, 3.9, 1.1, 1),(5.9, 3.2, 4.8, 1.8, 1),(6.1, 2.8, 4.0, 1.3, 1), + (6.3, 2.5, 4.9, 1.5, 1),(6.1, 2.8, 4.7, 1.2, 1),(6.4, 2.9, 4.3, 1.3, 1),(6.6, 3.0, 4.4, 1.4, 1), + (6.8, 2.8, 4.8, 1.4, 1),(6.7, 3.0, 5.0, 1.7, 1),(6.0, 2.9, 4.5, 1.5, 1),(5.7, 2.6, 3.5, 1.0, 1), + (5.5, 2.4, 3.8, 1.1, 1),(5.5, 2.4, 3.7, 1.0, 1),(5.8, 2.7, 3.9, 1.2, 1),(6.0, 2.7, 5.1, 1.6, 1), + (5.4, 3.0, 4.5, 1.5, 1),(6.0, 3.4, 4.5, 1.6, 1),(6.7, 3.1, 4.7, 1.5, 1),(6.3, 2.3, 4.4, 1.3, 1), + (5.6, 3.0, 4.1, 1.3, 1),(5.5, 2.5, 4.0, 1.3, 1),(5.5, 2.6, 4.4, 1.2, 1),(6.1, 3.0, 4.6, 1.4, 1), + (5.8, 2.6, 4.0, 1.2, 1),(5.0, 2.3, 3.3, 1.0, 1),(5.6, 2.7, 4.2, 1.3, 1),(5.7, 3.0, 4.2, 1.2, 1), + (5.7, 2.9, 4.2, 1.3, 1),(6.2, 2.9, 4.3, 1.3, 1),(5.1, 2.5, 3.0, 1.1, 1),(5.7, 2.8, 4.1, 1.3, 1), + (6.3, 3.3, 6.0, 2.5, 2),(5.8, 2.7, 5.1, 1.9, 2),(7.1, 3.0, 5.9, 2.1, 2),(6.3, 2.9, 5.6, 1.8, 2), + (6.5, 3.0, 5.8, 2.2, 2),(7.6, 3.0, 6.6, 2.1, 2),(4.9, 2.5, 4.5, 1.7, 2),(7.3, 2.9, 6.3, 1.8, 2), + (6.7, 2.5, 5.8, 1.8, 2),(7.2, 3.6, 6.1, 2.5, 2),(6.5, 3.2, 5.1, 2.0, 2),(6.4, 2.7, 5.3, 1.9, 2), + (6.8, 3.0, 5.5, 2.1, 2),(5.7, 2.5, 5.0, 2.0, 2),(5.8, 2.8, 5.1, 2.4, 2),(6.4, 3.2, 5.3, 2.3, 2), + (6.5, 3.0, 5.5, 1.8, 2),(7.7, 3.8, 6.7, 2.2, 2),(7.7, 2.6, 6.9, 2.3, 2),(6.0, 2.2, 5.0, 1.5, 2), + (6.9, 3.2, 5.7, 2.3, 2),(5.6, 2.8, 4.9, 2.0, 2),(7.7, 2.8, 6.7, 2.0, 2),(6.3, 2.7, 4.9, 1.8, 2), + (6.7, 3.3, 5.7, 2.1, 2),(7.2, 3.2, 6.0, 1.8, 2),(6.2, 2.8, 4.8, 1.8, 2),(6.1, 3.0, 4.9, 1.8, 2), + (6.4, 2.8, 5.6, 2.1, 2),(7.2, 3.0, 5.8, 1.6, 2),(7.4, 2.8, 6.1, 1.9, 2),(7.9, 3.8, 6.4, 2.0, 2), + (6.4, 2.8, 5.6, 2.2, 2),(6.3, 2.8, 5.1, 1.5, 2),(6.1, 2.6, 5.6, 1.4, 2),(7.7, 3.0, 6.1, 2.3, 2), + (6.3, 3.4, 5.6, 2.4, 
2),(6.4, 3.1, 5.5, 1.8, 2),(6.0, 3.0, 4.8, 1.8, 2),(6.9, 3.1, 5.4, 2.1, 2), + (6.7, 3.1, 5.6, 2.4, 2),(6.9, 3.1, 5.1, 2.3, 2),(5.8, 2.7, 5.1, 1.9, 2),(6.8, 3.2, 5.9, 2.3, 2), + (6.7, 3.3, 5.7, 2.5, 2),(6.7, 3.0, 5.2, 2.3, 2),(6.3, 2.5, 5.0, 1.9, 2),(6.5, 3.0, 5.2, 2.0, 2), + (6.2, 3.4, 5.4, 2.3, 2),(5.9, 3.0, 5.1, 1.8, 2); + ``` + +2. Perform the classification training. + + ``` + SET search_path="$user",public,madlib; + SET behavior_compat_options = 'bind_procedure_searchpath'; + select madlib.xgboost_sk_Classifier('public.iris', 'public.iris_model_xgbc', 'id', 'label', 'a,b,c,d', NULL, + $${'booster': ['gbtree'], 'eta': (0.1, 0.9), 'max_depth': (5,1), 'objective': ('multi:softmax',)}$$, -- Training parameter combination. If there are multiple parameters, input them in tuple or list mode. + TRUE); -- Whether to evaluate the model. The multi-classification evaluation metrics are the accuracy and kappa value. The binary classification evaluation metrics are precision, recall, fscore, and support. The regression evaluation metrics are mae, mse, R2squared, and rmse. + ``` + + The xgboost supports parallel training of multiple groups of parameters. For example, the **eta** values in the case are **0.1** and **0.9**, and the maximum depth is **5** or **1**. + + ``` + select id, train_timestamp, source_table, y_type, metrics, features, params from iris_model_xgbc; + ``` + + The following shows the model result. + + ``` + id | train_timestamp | source_table | y_type | metrics | features | params + ----+-------------------------------+--------------+---------+----------------------------+-----------+--------------------------------------------------------------------------------- + 1 | 2020-12-14 20:15:05.904184+08 | public.iris | integer | {'acc': 1.0, 'kappa': 1.0} | {a,b,c,d} | ('objective = multi:softmax', 'eta = 0.1', 'max_depth = 5', 'booster = gbtree') + 2 | 2020-12-14 20:15:05.904184+08 | public.iris | integer | {'acc': 1.0, 'kappa': 1.0} | {a,b,c,d} | ('objective = multi:softmax', 'eta = 0.1', 'max_depth = 1', 'booster = gbtree') + 3 | 2020-12-14 20:15:05.904184+08 | public.iris | integer | {'acc': 1.0, 'kappa': 1.0} | {a,b,c,d} | ('objective = multi:softmax', 'eta = 0.9', 'max_depth = 5', 'booster = gbtree') + 4 | 2020-12-14 20:15:05.904184+08 | public.iris | integer | {'acc': 1.0, 'kappa': 1.0} | {a,b,c,d} | ('objective = multi:softmax', 'eta = 0.9', 'max_depth = 1', 'booster = gbtree') + (4 rows) + ``` + + The result table records the training time, features, result types, and used parameters. + + In this example, two types of eta and two types of max_depth are selected. Therefore, there are a total of four parameter combinations displayed on four rows in the result. In the **metrics** column, the training evaluation results of the four parameter combinations are recorded. You can enter multiple parameter combinations and select a proper model after training. + +3. View the prediction result. + + ``` + select madlib.xgboost_sk_predict('public.iris', 'public.iris_model_xgbc', 'public.iris_xgbc_out', 'id'); + select t1.id, prediction, label from iris as t1, iris_xgbc_out as t2 where t1.id = t2.id and prediction <> label; + ``` + + The comparison between the prediction result and the training result shows that the number of unmatched rows is 0, indicating that the classification accuracy is high. + + ``` + id | prediction | label + ----+------------+------- + (0 rows) + ``` + +## prophet Algorithm + +The prophet time series forecasting algorithm for Facebook is added. 
The following uses time series data as an example to describe how to use the prophet algorithm. + +1. Prepare the data. + + ``` + DROP TABLE IF EXISTS ts_data; + CREATE TABLE ts_data(date date, value float); + + INSERT into ts_data (date, value) values + ('2016-11-29 21:20:00', 5.6),('2016-11-29 21:30:00', 5.2),('2016-11-29 21:40:00', 5.3),('2016-11-29 21:50:00', 5.3), + ('2016-11-29 22:00:00', 5.1),('2016-11-29 22:10:00', 5.8),('2016-11-29 22:20:00', 5.6),('2016-11-29 22:30:00', 5.4), + ('2016-11-29 22:40:00', 5.4),('2016-11-29 22:50:00', 5.1),('2016-11-29 23:00:00', 5.2),('2016-11-29 23:10:00', 5.9), + ('2016-11-29 23:20:00', 5.9),('2016-11-29 23:30:00', 5.1),('2016-11-29 23:40:00', 5.8),('2016-11-29 23:50:00', 6.0), + ('2016-11-30 00:00:00', 5.9),('2016-11-30 00:10:00', 5.3),('2016-11-30 00:20:00', 5.4),('2016-11-30 00:30:00', 5.1), + ('2016-11-30 00:40:00', 5.6),('2016-11-30 00:50:00', 5.7),('2016-11-30 01:00:00', 5.8),('2016-11-30 01:10:00', 5.4), + ('2016-11-30 01:20:00', 5.8),('2016-11-30 01:30:00', 5.1),('2016-11-30 01:40:00', 5.6),('2016-11-30 01:50:00', 5.6), + ('2016-11-30 02:00:00', 5.6),('2016-11-30 02:10:00', 5.9),('2016-11-30 02:20:00', 5.7),('2016-11-30 02:30:00', 5.4), + ('2016-11-30 02:40:00', 5.6),('2016-11-30 02:50:00', 5.4),('2016-11-30 03:00:00', 5.1),('2016-11-30 03:10:00', 5.0), + ('2016-11-30 03:20:00', 5.9),('2016-11-30 03:30:00', 5.8),('2016-11-30 03:40:00', 5.4),('2016-11-30 03:50:00', 5.7), + ('2016-11-30 04:00:00', 5.6),('2016-11-30 04:10:00', 5.9),('2016-11-30 04:20:00', 5.1),('2016-11-30 04:30:00', 5.8), + ('2016-11-30 04:40:00', 5.5),('2016-11-30 04:50:00', 5.1),('2016-11-30 05:00:00', 5.8),('2016-11-30 05:10:00', 5.5), + ('2016-11-30 05:20:00', 5.7),('2016-11-30 05:30:00', 5.2),('2016-11-30 05:40:00', 5.7),('2016-11-30 05:50:00', 6.0), + ('2016-11-30 06:00:00', 5.8),('2016-11-30 06:10:00', 5.6),('2016-11-30 06:20:00', 5.2),('2016-11-30 06:30:00', 5.8), + ('2016-11-30 06:40:00', 5.3),('2016-11-30 06:50:00', 5.4),('2016-11-30 07:00:00', 5.8),('2016-11-30 07:10:00', 5.2), + ('2016-11-30 07:20:00', 5.3),('2016-11-30 07:30:00', 5.3),('2016-11-30 07:40:00', 5.8),('2016-11-30 07:50:00', 5.9), + ('2016-11-30 08:00:00', 5.6),('2016-11-30 08:10:00', 5.2),('2016-11-30 08:20:00', 5.4),('2016-11-30 08:30:00', 5.6), + ('2016-11-30 08:40:00', 6.0),('2016-11-30 08:50:00', 5.4),('2016-11-30 09:00:00', 6.0),('2016-11-30 09:10:00', 5.1), + ('2016-11-30 09:20:00', 5.1),('2016-11-30 09:30:00', 5.5),('2016-11-30 09:40:00', 5.6),('2016-11-30 09:50:00', 5.0), + ('2016-11-30 10:00:00', 5.1),('2016-11-30 10:10:00', 5.7),('2016-11-30 10:20:00', 5.4),('2016-11-30 10:30:00', 5.4), + ('2016-11-30 10:40:00', 5.7),('2016-11-30 10:50:00', 5.2),('2016-11-30 11:00:00', 5.4),('2016-11-30 11:10:00', 5.3), + ('2016-11-30 11:20:00', 5.6),('2016-11-30 11:30:00', 5.0),('2016-11-30 11:40:00', 5.2),('2016-11-30 11:50:00', 5.2), + ('2016-11-30 12:00:00', 5.5),('2016-11-30 12:10:00', 5.1),('2016-11-30 12:20:00', 5.7),('2016-11-30 12:30:00', 5.4), + ('2016-11-30 12:40:00', 5.2),('2016-11-30 12:50:00', 5.5),('2016-11-30 13:00:00', 5.0),('2016-11-30 13:10:00', 5.5), + ('2016-11-30 13:20:00', 5.6),('2016-11-30 13:30:00', 5.3),('2016-11-30 13:40:00', 5.5),('2016-11-30 13:50:00', 5.9), + ('2016-11-30 14:00:00', 10.9),('2016-11-30 14:10:00', 10.6),('2016-11-30 14:20:00', 10.3),('2016-11-30 14:30:00', 11.0), + ('2016-11-30 14:40:00', 10.0),('2016-11-30 14:50:00', 10.1),('2016-11-30 15:00:00', 10.2),('2016-11-30 15:10:00', 10.2), + ('2016-11-30 15:20:00', 10.3),('2016-11-30 15:30:00', 10.1),('2016-11-30 
15:40:00', 10.9),('2016-11-30 15:50:00', 10.1), + ('2016-11-30 16:00:00', 11.0),('2016-11-30 16:10:00', 10.2),('2016-11-30 16:20:00', 10.7),('2016-11-30 16:30:00', 10.2), + ('2016-11-30 16:40:00', 10.2),('2016-11-30 16:50:00', 10.2),('2016-11-30 17:00:00', 10.8),('2016-11-30 17:10:00', 10.6), + ('2016-11-30 17:20:00', 10.5),('2016-11-30 17:30:00', 10.7),('2016-11-30 17:40:00', 10.9),('2016-11-30 17:50:00', 10.9), + ('2016-11-30 18:00:00', 10.1),('2016-11-30 18:10:00', 10.3),('2016-11-30 18:20:00', 10.1),('2016-11-30 18:30:00', 10.6), + ('2016-11-30 18:40:00', 10.3),('2016-11-30 18:50:00', 10.8),('2016-11-30 19:00:00', 10.9),('2016-11-30 19:10:00', 10.8), + ('2016-11-30 19:20:00', 10.6),('2016-11-30 19:30:00', 11.0),('2016-11-30 19:40:00', 10.3),('2016-11-30 19:50:00', 10.9), + ('2016-11-30 20:00:00', 10.6),('2016-11-30 20:10:00', 10.6),('2016-11-30 20:20:00', 10.5),('2016-11-30 20:30:00', 10.4), + ('2016-11-30 20:40:00', 10.9),('2016-11-30 20:50:00', 10.9),('2016-11-30 21:00:00', 10.7),('2016-11-30 21:10:00', 10.6), + ('2016-11-30 21:20:00', 10.5),('2016-11-30 21:30:00', 10.8),('2016-11-30 21:40:00', 10.4),('2016-11-30 21:50:00', 10.0), + ('2016-11-30 22:00:00', 10.6),('2016-11-30 22:10:00', 10.6),('2016-11-30 22:20:00', 10.6),('2016-11-30 22:30:00', 10.1), + ('2016-11-30 22:40:00', 10.4),('2016-11-30 22:50:00', 10.8),('2016-11-30 23:00:00', 10.4),('2016-11-30 23:10:00', 10.6), + ('2016-11-30 23:20:00', 10.1),('2016-11-30 23:30:00', 10.2),('2016-11-30 23:40:00', 10.6),('2016-11-30 23:50:00', 10.8), + ('2016-12-01 00:00:00', 10.6),('2016-12-01 00:10:00', 10.2),('2016-12-01 00:20:00', 10.9),('2016-12-01 00:30:00', 10.3), + ('2016-12-01 00:40:00', 10.3),('2016-12-01 00:50:00', 10.1),('2016-12-01 01:00:00', 10.7),('2016-12-01 01:10:00', 10.5), + ('2016-12-01 01:20:00', 10.4),('2016-12-01 01:30:00', 10.7),('2016-12-01 01:40:00', 10.5),('2016-12-01 01:50:00', 10.7), + ('2016-12-01 02:00:00', 10.8),('2016-12-01 02:10:00', 10.9),('2016-12-01 02:20:00', 10.9),('2016-12-01 02:30:00', 10.1), + ('2016-12-01 02:40:00', 10.4),('2016-12-01 02:50:00', 10.7),('2016-12-01 03:00:00', 10.7),('2016-12-01 03:10:00', 10.5), + ('2016-12-01 03:20:00', 10.2),('2016-12-01 03:30:00', 10.2),('2016-12-01 03:40:00', 10.8),('2016-12-01 03:50:00', 10.2), + ('2016-12-01 04:00:00', 10.9),('2016-12-01 04:10:00', 10.4),('2016-12-01 04:20:00', 10.6),('2016-12-01 04:30:00', 11.0), + ('2016-12-01 04:40:00', 10.4),('2016-12-01 04:50:00', 10.3),('2016-12-01 05:00:00', 10.7),('2016-12-01 05:10:00', 10.6), + ('2016-12-01 05:20:00', 10.9),('2016-12-01 05:30:00', 11.0),('2016-12-01 05:40:00', 10.9),('2016-12-01 05:50:00', 10.0), + ('2016-12-01 06:00:00', 10.8),('2016-12-01 06:10:00', 10.0),('2016-12-01 06:20:00', 10.1),('2016-12-01 06:30:00', 10.5), + ('2016-12-01 06:40:00', 15.5),('2016-12-01 06:50:00', 15.7),('2016-12-01 07:00:00', 15.1),('2016-12-01 07:10:00', 15.6), + ('2016-12-01 07:20:00', 15.5),('2016-12-01 07:30:00', 15.4),('2016-12-01 07:40:00', 15.7),('2016-12-01 07:50:00', 15.6), + ('2016-12-01 08:00:00', 15.3),('2016-12-01 08:10:00', 15.6),('2016-12-01 08:20:00', 15.1),('2016-12-01 08:30:00', 15.6), + ('2016-12-01 08:40:00', 15.9),('2016-12-01 08:50:00', 16.0),('2016-12-01 09:00:00', 15.4),('2016-12-01 09:10:00', 15.0), + ('2016-12-01 09:20:00', 15.0),('2016-12-01 09:30:00', 15.4),('2016-12-01 09:40:00', 15.9),('2016-12-01 09:50:00', 15.6), + ('2016-12-01 10:00:00', 15.7),('2016-12-01 10:10:00', 15.4),('2016-12-01 10:20:00', 15.2),('2016-12-01 10:30:00', 15.2), + ('2016-12-01 10:40:00', 15.8),('2016-12-01 10:50:00', 
15.4),('2016-12-01 11:00:00', 16.0),('2016-12-01 11:10:00', 15.9), + ('2016-12-01 11:20:00', 15.1),('2016-12-01 11:30:00', 15.0),('2016-12-01 11:40:00', 15.0),('2016-12-01 11:50:00', 15.4), + ('2016-12-01 12:00:00', 15.5),('2016-12-01 12:10:00', 15.3),('2016-12-01 12:20:00', 16.0),('2016-12-01 12:30:00', 15.1), + ('2016-12-01 12:40:00', 15.5),('2016-12-01 12:50:00', 16.0),('2016-12-01 13:00:00', 15.7),('2016-12-01 13:10:00', 15.9), + ('2016-12-01 13:20:00', 15.4),('2016-12-01 13:30:00', 15.3),('2016-12-01 13:40:00', 15.9),('2016-12-01 13:50:00', 15.8), + ('2016-12-01 14:00:00', 15.4),('2016-12-01 14:10:00', 15.9),('2016-12-01 14:20:00', 15.3),('2016-12-01 14:30:00', 16.0), + ('2016-12-01 14:40:00', 15.5),('2016-12-01 14:50:00', 15.0),('2016-12-01 15:00:00', 15.1),('2016-12-01 15:10:00', 16.0), + ('2016-12-01 15:20:00', 15.8),('2016-12-01 15:30:00', 15.9),('2016-12-01 15:40:00', 15.4),('2016-12-01 15:50:00', 15.1), + ('2016-12-01 16:00:00', 15.8),('2016-12-01 16:10:00', 15.2),('2016-12-01 16:20:00', 15.4),('2016-12-01 16:30:00', 15.8), + ('2016-12-01 16:40:00', 15.8),('2016-12-01 16:50:00', 15.1),('2016-12-01 17:00:00', 15.3),('2016-12-01 17:10:00', 15.6), + ('2016-12-01 17:20:00', 15.3),('2016-12-01 17:30:00', 15.8),('2016-12-01 17:40:00', 15.0),('2016-12-01 17:50:00', 15.3), + ('2016-12-01 18:00:00', 15.5),('2016-12-01 18:10:00', 15.4),('2016-12-01 18:20:00', 15.3),('2016-12-01 18:30:00', 15.8), + ('2016-12-01 18:40:00', 15.2),('2016-12-01 18:50:00', 15.9),('2016-12-01 19:00:00', 15.4),('2016-12-01 19:10:00', 15.3), + ('2016-12-01 19:20:00', 15.1),('2016-12-01 19:30:00', 15.3),('2016-12-01 19:40:00', 15.9),('2016-12-01 19:50:00', 15.3), + ('2016-12-01 20:00:00', 15.3),('2016-12-01 20:10:00', 15.2),('2016-12-01 20:20:00', 15.0),('2016-12-01 20:30:00', 15.3), + ('2016-12-01 20:40:00', 15.1),('2016-12-01 20:50:00', 15.1),('2016-12-01 21:00:00', 15.6),('2016-12-01 21:10:00', 15.8), + ('2016-12-01 21:20:00', 15.4),('2016-12-01 21:30:00', 15.2),('2016-12-01 21:40:00', 16.0),('2016-12-01 21:50:00', 15.5), + ('2016-12-01 22:00:00', 15.4),('2016-12-01 22:10:00', 15.7),('2016-12-01 22:20:00', 15.3),('2016-12-01 22:30:00', 15.9), + ('2016-12-01 22:40:00', 15.9),('2016-12-01 22:50:00', 15.2),('2016-12-01 23:00:00', 15.8),('2016-12-01 23:10:00', 15.9), + ('2016-12-01 23:20:00', 20.9),('2016-12-01 23:30:00', 20.4),('2016-12-01 23:40:00', 20.3),('2016-12-01 23:50:00', 20.1), + ('2016-12-02 00:00:00', 20.7),('2016-12-02 00:10:00', 20.7),('2016-12-02 00:20:00', 20.5),('2016-12-02 00:30:00', 20.4), + ('2016-12-02 00:40:00', 20.4),('2016-12-02 00:50:00', 20.1),('2016-12-02 01:00:00', 20.2),('2016-12-02 01:10:00', 20.9), + ('2016-12-02 01:20:00', 20.6),('2016-12-02 01:30:00', 20.0),('2016-12-02 01:40:00', 20.4),('2016-12-02 01:50:00', 20.2), + ('2016-12-02 02:00:00', 20.6),('2016-12-02 02:10:00', 20.4),('2016-12-02 02:20:00', 20.5),('2016-12-02 02:30:00', 20.4), + ('2016-12-02 02:40:00', 20.5),('2016-12-02 02:50:00', 20.7),('2016-12-02 03:00:00', 20.2),('2016-12-02 03:10:00', 20.2), + ('2016-12-02 03:20:00', 20.1),('2016-12-02 03:30:00', 20.5),('2016-12-02 03:40:00', 20.5),('2016-12-02 03:50:00', 20.0), + ('2016-12-02 04:00:00', 20.7),('2016-12-02 04:10:00', 20.8),('2016-12-02 04:20:00', 20.6),('2016-12-02 04:30:00', 20.4), + ('2016-12-02 04:40:00', 20.5),('2016-12-02 04:50:00', 20.8),('2016-12-02 05:00:00', 20.1),('2016-12-02 05:10:00', 20.9), + ('2016-12-02 05:20:00', 20.5),('2016-12-02 05:30:00', 20.4),('2016-12-02 05:40:00', 20.2),('2016-12-02 05:50:00', 20.4), + ('2016-12-02 06:00:00', 
20.8),('2016-12-02 06:10:00', 20.7),('2016-12-02 06:20:00', 20.9),('2016-12-02 06:30:00', 20.1), + ('2016-12-02 06:40:00', 20.3),('2016-12-02 06:50:00', 20.2),('2016-12-02 07:00:00', 20.4),('2016-12-02 07:10:00', 20.7), + ('2016-12-02 07:20:00', 20.4),('2016-12-02 07:30:00', 20.8),('2016-12-02 07:40:00', 20.8),('2016-12-02 07:50:00', 20.1), + ('2016-12-02 08:00:00', 20.3),('2016-12-02 08:10:00', 20.7),('2016-12-02 08:20:00', 20.9),('2016-12-02 08:30:00', 21.0), + ('2016-12-02 08:40:00', 20.2),('2016-12-02 08:50:00', 20.5),('2016-12-02 09:00:00', 20.2),('2016-12-02 09:10:00', 20.8), + ('2016-12-02 09:20:00', 20.9),('2016-12-02 09:30:00', 20.5),('2016-12-02 09:40:00', 20.9),('2016-12-02 09:50:00', 20.7), + ('2016-12-02 10:00:00', 20.3),('2016-12-02 10:10:00', 21.0),('2016-12-02 10:20:00', 20.5),('2016-12-02 10:30:00', 20.3), + ('2016-12-02 10:40:00', 20.2),('2016-12-02 10:50:00', 20.3),('2016-12-02 11:00:00', 20.4),('2016-12-02 11:10:00', 20.4), + ('2016-12-02 11:20:00', 21.0),('2016-12-02 11:30:00', 20.3),('2016-12-02 11:40:00', 20.3),('2016-12-02 11:50:00', 20.9), + ('2016-12-02 12:00:00', 20.8),('2016-12-02 12:10:00', 20.9),('2016-12-02 12:20:00', 20.7),('2016-12-02 12:30:00', 20.7); + ``` + +2. Perform training. + + ``` + SET search_path="$user",public,madlib; + SET behavior_compat_options = 'bind_procedure_searchpath'; + select madlib.prophet_fit('public.ts_data', 'public.prophet_model', + $${'ds': 'date', 'y': 'value'}$$, -- Column name mapping. The prophet requires that the time column name be 'ds' and the time series value column name be 'y'. + $${'growth': 'linear', 'changepoints': ['2016-11-30 05:40:00']}$$ -- Training parameter combination. If there are multiple parameters, input them in tuple mode. + ); + ``` + + Query the model table. + + ``` + select id, y_type, params from public.prophet_model; + + id | y_type | params + ----+------------------+--------------------------------------------------------------- + 1 | double precision | {'changepoints': ['2016-11-30 05:40:00'], 'growth': 'linear'} + ``` + + The model table records the training time, result types, and used parameters. + +3. Perform the prediction. + + ``` + select madlib.prophet_predict('public.prophet_model','public.prophet_output', 10, '10T'); + ``` + + View the prediction result. 
+
+   ```
+   select ds, yhat, yhat_lower, yhat_upper from public.prophet_output;
+
+        ds     |      yhat     |  yhat_lower   |  yhat_upper
+   ------------+---------------+---------------+---------------
+    2016-12-02 | 20.6943848045 | 17.7671496048 | 23.4160694837
+    2016-12-02 | 20.7408355633 | 17.9264413164 | 23.6426403933
+    2016-12-02 | 20.7872863221 | 17.9298207895 | 23.4548814727
+    2016-12-02 | 20.833737081  | 18.234443228  | 23.5317342873
+    2016-12-02 | 20.8801878398 | 18.2471709649 | 23.8345735574
+    2016-12-02 | 20.9266385986 | 18.1780101465 | 23.696087927
+    2016-12-02 | 20.9730893575 | 18.4292088648 | 23.7209823631
+    2016-12-02 | 21.0195401163 | 18.2623494126 | 23.7341427068
+    2016-12-02 | 21.0659908751 | 18.1173966769 | 23.7919478206
+    2016-12-02 | 21.112441634  | 18.5018042056 | 23.9508963879
+   (10 rows)
+   ```
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/7-deepsql/7-5-troubleshooting.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/7-deepsql/7-5-troubleshooting.md
new file mode 100644
index 0000000000000000000000000000000000000000..cc172f05f431612e1d6b19934ab70b9ae634ad70
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/7-deepsql/7-5-troubleshooting.md
@@ -0,0 +1,46 @@
+---
+title: Troubleshooting
+summary: Troubleshooting
+author: Guo Huan
+date: 2021-05-19
+---
+
+# Troubleshooting
+
+- **Problem description**: When you compile the database, the following messages about the Python module are displayed: "can not be used when making a shared object; recompile with -fPIC" or "libpython2.7.a: could not read symbols: Bad value".
+
+  **Solution**:
+
+  1. Check the Python version and environment variables.
+  2. Check whether python-devel is installed or whether **-enable-shared** was enabled during Python compilation.
+
+- **Problem description**: When the **gdb** or **gstack** command is executed, the error message "gdb: symbol lookup error: gdb: undefined symbol: PyUnicodeUCS4_FromEncodedObject" is displayed.
+
+  **Solution**: This problem generally occurs in an environment with a self-compiled Python 2. During Python 2 compilation and installation, you can use the **-enable-unicode=ucs2** or **-enable-unicode=ucs4** option to specify whether two or four bytes are used to represent a Unicode character. By default, Python 2 uses **-enable-unicode=ucs2**, whereas Python 3 always uses four bytes to represent a Unicode character.
+
+  You can run the **import sys; print sys.maxunicode** command in the built-in Python 2 and check the result. If the result is **65535**, the system uses UCS2 encoding by default. If the result is **1114111**, the system uses UCS4 encoding.
+
+  If the built-in Python 2 uses UCS4, the gdb tool in the system also depends on UCS4. Therefore, when you compile Python 2 yourself, you need to add **-enable-unicode=ucs4** when configuring it.
+
+- **Problem description**: The error message "Data table does not exist" is displayed when the kmeans algorithm is used.
+
+  **Solution**: If the algorithm and the input table are not in the same schema, you can run **SET behavior_compat_options = 'bind_procedure_searchpath'** to solve the problem.
+
+- **Problem description**: An error is reported during Python startup or import.
+
+  **Solution**:
+
+  1. Check the environment variables, such as **PYTHONHOME** and **PYTHONPATH**.
+  2. Install the required dependency packages.
+ +- **Problem description**: An error message "ERROR: spiexceptions.UndefinedFunction: operator does not exist: json ->> unknown." is displayed when algorithms such as regression are used. + + **Solution**: The database does not support the JSON export function. + +- **Problem description**: During compilation in MADlib, if **make -sj** is used, boost-related errors are reported. For example: **fatal error: boost/mpl/if.hpp: No such file or directory**. + + **Solution**: This is not a problem. During MADlib compilation, the installation packages are decompressed first. If the compilation is performed in parallel mode, the compilation and decompression may be performed at the same time. If this file used for compilation has not been decompressed yet, such an error is reported. You can run the **make -sj** command again to solve the problem. + +- **Problem description**: The error message "ERROR: Failed to connect to database" is displayed when you run the **./madpack** command to install the Madpack. + + **Solution**: Check whether the database is started, whether the target database exists, whether the database port is occupied, and whether the installation user has the administrator permission. When installing the Madpack, set the IP address to **127.0.0.1** instead of localhost. Otherwise, the connection fails. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/8-db4ai/8-1-overview.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/8-db4ai/8-1-overview.md new file mode 100644 index 0000000000000000000000000000000000000000..38d232b57e9d2fd30982cf52d345264b93ff493c --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/8-db4ai/8-1-overview.md @@ -0,0 +1,10 @@ +--- +title: Overview +summary: Overview +author: Guo Huan +date: 2021-10-20 +--- + +# Overview + +DB4AI uses database capabilities to drive AI tasks and implement data storage and technology stack isomorphism. By integrating AI algorithms into the database, MogDB supports the native AI computing engine, model management, AI operators, and native AI execution plan, providing users with inclusive AI technologies. Different from the traditional AI modeling process, DB4AI one-stop modeling eliminates repeated data flowing among different platforms, simplifies the development process, and plans the optimal execution path through the database, so that developers can focus on the tuning of specific services and models. It outcompetes similar products in ease-of-use and performance. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/8-db4ai/8-2-db4ai-snapshots-for-data-version-management.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/8-db4ai/8-2-db4ai-snapshots-for-data-version-management.md new file mode 100644 index 0000000000000000000000000000000000000000..497627218f1be43ba4ba8cb11e77cd65ecede75d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/8-db4ai/8-2-db4ai-snapshots-for-data-version-management.md @@ -0,0 +1,265 @@ +--- +title: DB4AI-Snapshots for Data Version Management +summary: DB4AI-Snapshots for Data Version Management +author: Guo Huan +date: 2021-10-20 +--- + +# DB4AI-Snapshots for Data Version Management + +DB4AI-Snapshots is used by the DB4AI module to manage dataset versions. With the DB4AI-Snapshots component, developers can easily and quickly perform data preprocessing operations such as feature filtering and type conversion. In addition, developers can perform version control on training datasets like Git. 
After a data table snapshot is created, it can be used as a view. However, once the data table snapshot is released, it is fixed as static data. To modify the content of this data table snapshot, you need to create a data table snapshot with a different version number. + +## Lifecycle of DB4AI-Snapshots + +DB4AI-Snapshots can be published, archived, or purged. After being published, DB4AI-Snapshots can be used. Archived DB4AI-Snapshots are in the archiving period and will not be used to train new models. Instead, old data is used to verify new models. Purged DB4AI-Snapshots will not be found in the database system. + +Note that the DB4AI-Snapshots function is used to provide unified training data for users. Team members can use the specified training data to retrain the machine learning models, facilitating collaboration between users. Therefore, the DB4AI-Snapshots feature is not supported in scenarios where user data conversion is not supported, such as **private users** and **separation of duty** (**enableSeparationOfDuty** set to **ON**). + +You can run the **CREATE SNAPSHOT** statement to create a data table snapshot. The created snapshot is in the published state by default. You can create a table snapshot in either **MSS** or **CSS** mode, which can be configured using the GUC parameter **db4ai_snapshot_mode**. For the MSS mode, it is realized by materialization algorithm where data entity of original datasets is stored. The CSS mode is implemented based on a relative calculation algorithm where incremental data information is stored. The metadata of the data table snapshot is stored in the system directory of DB4AI. You can view it in the **db4ai.snapshot** system catalog. + +You can run the **ARCHIVE SNAPSHOT** statement to mark a data table snapshot as archived, and run the **PUBLISH SNAPSHOT** statement to mark it as published again. The state of a data table snapshot is marked to help data scientists work together as a team. + +If a table snapshot is no longer useful, you can run the **PURGE SNAPSHOT** statement to permanently delete it and restore the storage space. + +## DB4AI-Snapshots Usage Guide + +1. Create a table and insert table data. + + If a data table exists in the database, you can create a data table snapshot based on the existing data table. To facilitate subsequent demonstration, create a data table named **t1** and insert test data into the table. + + ```sql + create table t1 (id int, name varchar); + insert into t1 values (1, 'zhangsan'); + insert into t1 values (2, 'lisi'); + insert into t1 values (3, 'wangwu'); + insert into t1 values (4, 'lisa'); + insert into t1 values (5, 'jack'); + ``` + + Run the following SQL statement to query the content of the collocation data table: + + ```sql + SELECT * FROM t1; + id | name + ----+---------- + 1 | zhangsan + 2 | lisi + 3 | wangwu + 4 | lisa + 5 | jack + (5 rows) + ``` + +2. Use DB4AI-Snapshots. + + - Create DB4AI-Snapshots. + + - Example 1: CREATE SNAPSHOT… AS + + In the following example, the default version separator is an at sign (@), and the default subversion separator is a period (.). You can set the separator by setting the **db4ai_snapshot_version_delimiter** and **db4ai_snapshot_version_separator** parameters. + + ```sql + create snapshot s1@1.0 comment is 'first version' as select * from t1; + schema | name + --------+-------- + public | s1@1.0 + (1 row) + ``` + + The command output indicates that a snapshot has been created for data table **s1** and the version number is **1.0**. 
A created data table snapshot can be queried in the same way as a common view, but cannot be updated using the **INSERT INTO** statement. For example, you can use any of the following statements to query the content of data table snapshot **s1** of version 1.0: + + ```sql + SELECT * FROM s1@1.0; + SELECT * FROM public.s1@1.0; + SELECT * FROM public . s1 @ 1.0; + id | name + ----+---------- + 1 | zhangsan + 2 | lisi + 3 | wangwu + 4 | lisa + 5 | jack + (5 rows) + ``` + + You can run the following SQL statement to modify the content of the **t1** data table: + + ```sql + UPDATE t1 SET name = 'tom' where id = 4; + insert into t1 values (6, 'john'); + insert into t1 values (7, 'tim'); + ``` + + When content of data table **t1** is retrieved again, it is found that although the content of data table **t1** has changed, the query result of data table snapshot **s1@1.0** does not change. Because data in data table **t1** has changed, to use content of the current data table as a version 2.0, you can create snapshot **s1@2.0** by using the following SQL statement: + + ```sql + create snapshot s1@2.0 as select * from t1; + ``` + + According to the foregoing example, it can be found that the data table snapshot can solidify content of a data table, to avoid instability during training of a machine learning model caused by data modification in the process, and can also avoid a lock conflict caused when multiple users access and modify the same table at the same time. + + - Example 2: CREATE SNAPSHOT… FROM + + You can run an SQL statement to inherit a created data table snapshot and generate a new data table snapshot based on the data modification. Example: + + ```sql + create snapshot s1@3.0 from @1.0 comment is 'inherits from @1.0' using (INSERT VALUES(6, 'john'), (7, 'tim'); DELETE WHERE id = 1); + schema | name + --------+-------- + public | s1@3.0 + (1 row) + ``` + + Where, @ is the data table snapshot version separator and the from clause is followed by the existing data table snapshot, in the format of @ + version number. You can add an operation keyword (such as **INSERT**, **UPDATE**, **DELETE**, or **ALTER**) after the **USING** keyword. In the **INSERT INTO** and **DELETE FROM** statements, clauses related to data table snapshot names, such as **INTO** and **FROM**, can be omitted. For details, see [AI Feature Functions](28-ai-feature-functions). + + In the example, based on the **s1@1.0** snapshot, insert two pieces of data and delete one piece of data to generate a new snapshot **s1@3.0**. Then, retrieve **s1@3.0**. + + ```sql + SELECT * FROM s1@3.0; + id | name + ----+---------- + 1 | zhangsan + 2 | lisi + 3 | wangwu + 4 | lisa + 5 | jack + 6 | john + 7 | tim + (7 rows) + ``` + + - Delete the data table snapshot **SNAPSHOT**. + + ```sql + purge snapshot s1@3.0; + schema | name + --------+-------- + public | s1@3.0 + (1 row) + ``` + + At this time, no data can be retrieved from **s1@3.0**, and the records of the data table snapshot in the **db4ai.snapshot** view are cleared. Deleting the data table snapshot of this version does not affect the data table snapshots of other versions. + + - Sample from a data table snapshot. + + Example: Use the sampling rate 0.5 to extract data from snapshot **s1**. + + ```sql + sample snapshot s1@2.0 stratify by name as nick at ratio .5; + schema | name + --------+------------ + public | s1nick@2.0 + (1 row) + ``` + + You can use this function to create a training set and a test set. 
For example: + + ```sql + SAMPLE SNAPSHOT s1@2.0 STRATIFY BY name AS _test AT RATIO .2, AS _train AT RATIO .8 COMMENT IS 'training'; + schema | name + --------+---------------- + public | s1_test@2.0 + public | s1_train@2.0 + (2 rows) + ``` + + - Publish a data table snapshot. + + Run the following SQL statement to mark the data table snapshot **s1@2.0** as published: + + ```sql + publish snapshot s1@2.0; + schema | name + --------+-------- + public | s1@2.0 + (1 row) + ``` + + - Archive a data table snapshot. + + Run the following statement to mark the data table snapshot as archived: + + ```sql + archive snapshot s1@2.0; + schema | name + --------+-------- + public | s1@2.0 + (1 row) + ``` + + You can use the views provided by DB4AI-Snapshots to view the status of the current data table snapshot and other information. + + ```sql + select * from db4ai.snapshot; + id | parent_id | matrix_id | root_id | schema | name | owner | commands | comment | published | archived | created | row_count + ----+-----------+-----------+---------+--------+------------+--------+------------------------------------------+---------+-----------+----------+----------------------------+----------- + 1 | | | 1 | public | s1@2.0 | omm | {"select *","from t1 where id > 3",NULL} | | t | f | 2021-04-17 09:24:11.139868 | 2 + 2 | 1 | | 1 | public | s1nick@2.0 | omm | {"SAMPLE nick .5 {name}"} | | f | f | 2021-04-17 10:02:31.73923 | 0 + ``` + +3. Perform troubleshooting in case of exceptions. + + - The data table or DB4AI snapshot does not exist. + + ```sql + purge snapshot s1nick@2.0; + publish snapshot s1nick@2.0; + --------- + ERROR: snapshot public."s1nick@2.0" does not exist + CONTEXT: PL/pgSQL function db4ai.publish_snapshot(name,name) line 11 at assignment + + archive snapshot s1nick@2.0; + ---------- + ERROR: snapshot public."s1nick@2.0" does not exist + CONTEXT: PL/pgSQL function db4ai.archive_snapshot(name,name) line 11 at assignment + ``` + + - Before deleting a snapshot, ensure that other snapshots that depend on it have been deleted. + + ```sql + purge snapshot s1@1.0; + ERROR: cannot purge root snapshot 'public."s1@1.0"' having dependent db4ai-snapshots + HINT: purge all dependent db4ai-snapshots first + CONTEXT: referenced column: purge_snapshot_internal + SQL statement "SELECT db4ai.purge_snapshot_internal(i_schema, i_name)" + PL/pgSQL function db4ai.purge_snapshot(name,name) line 62 at PERFORM + ``` + +4. Set GUC parameters. + + - db4ai_snapshot_mode: + + There are two snapshot modes: MSS (materialized mode, used to store data entities) and CSS (computing mode, used to store incremental information). The snapshot mode can be switched between MSS and CSS. The default snapshot mode is MSS. + + - db4ai_snapshot_version_delimiter: + + Used to set the data table snapshot version separator. The at sign (@) is the default data table snapshot version separator. + + - db4ai_snapshot_version_separator + + Used to set the data table snapshot subversion separator. The period (.) is the default data table snapshot subversion separator. + +5. View the snapshot details of a data table in the DB4AI schema by using **db4ai.snapshot**. 
+ + ```sql + mogdb=# \d db4ai.snapshot + Table "db4ai.snapshot" + Column | Type | Modifiers + -----------+-----------------------------+--------------------------- + id | bigint | + parent_id | bigint | + matrix_id | bigint | + root_id | bigint | + schema | name | not null + name | name | not null + owner | name | not null + commands | text[] | not null + comment | text | + published | boolean | not null default false + archived | boolean | not null default false + created | timestamp without time zone | default pg_systimestamp() + row_count | bigint | not null + Indexes: + "snapshot_pkey" PRIMARY KEY, btree (schema, name) TABLESPACE pg_default + "snapshot_id_key" UNIQUE CONSTRAINT, btree (id) TABLESPACE pg_default + ``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/8-db4ai/8-3-db4ai-query-for-model-training-and-prediction.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/8-db4ai/8-3-db4ai-query-for-model-training-and-prediction.md new file mode 100644 index 0000000000000000000000000000000000000000..d4b4153b8312314e68eedad240a52cb064d1eae3 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/8-db4ai/8-3-db4ai-query-for-model-training-and-prediction.md @@ -0,0 +1,420 @@ +--- +title: DB4AI-Query for Model Training and Prediction +summary: DB4AI-Query for Model Training and Prediction +author: Guo Huan +date: 2021-10-20 +--- + +# DB4AI-Query for Model Training and Prediction + +The current version of MogDB supports the native DB4AI capability. By introducing native AI operators, MogDB simplifies the operation process and fully utilizes the optimization and execution capabilities of the database optimizer and executor to obtain the high-performance model training capability in the database. With a simpler model training and prediction process and higher performance, developers can focus on model tuning and data analysis in a shorter period of time, avoiding fragmented technology stacks and redundant code implementation. + +## Keyword Parsing + +**Table 1** DB4AI syntax and keywords + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| Name |  | Description |
+| :--- | :--- | :--- |
+| Statement | CREATE MODEL | Creates a model, trains it, and saves the model. |
+|  | PREDICT BY | Uses an existing model for prediction. |
+| Keyword | TARGET | Target column name of a training or prediction task. |
+|  | FEATURES | Data feature column name of a training or prediction task. |
+|  | MODEL | Model name of a training task. |
+
+ +## Developer Guide + +1. Introduce the algorithms supported in this version. + + DB4AI of the current version supports logistic regression (binary classification tasks), linear regression, and vector machine algorithms (classification tasks) based on the SGD operator, as well as the K-Means clustering algorithm based on the K-Means operator. + +2. Learn about the model training syntax. + + - CREATE MODEL + + You can run the **CREATE MODEL** statement to create and train a model. Taking dataset **kmeans_2d** as an example, the data content of the table is as follows: + + ```sql + mogdb=# select * from kmeans_2d; + id | position + ----+------------------------------------- + 1 | {74.5268815685995,88.2141939294524} + 2 | {70.9565760521218,98.8114827475511} + 3 | {76.2756086327136,23.8387574302033} + 4 | {17.8495847294107,81.8449544720352} + 5 | {81.2175785354339,57.1677675866522} + 6 | {53.97752255667,49.3158342130482} + 7 | {93.2475341879763,86.934042100329} + 8 | {72.7659293473698,19.7020415100269} + 9 | {16.5800288529135,75.7475957670249} + 10 | {81.8520747194998,40.3476078575477} + 11 | {76.796671198681,86.3827232690528} + 12 | {59.9231450678781,90.9907738864422} + 13 | {70.161884885747,19.7427458665334} + 14 | {11.1269539105706,70.9988166182302} + 15 | {80.5005071521737,65.2822235273197} + 16 | {54.7030725912191,52.151339428965} + 17 | {103.059707058128,80.8419883321039} + 18 | {85.3574452036992,14.9910179991275} + 19 | {28.6501615960151,76.6922890325077} + 20 | {69.7285806713626,49.5416352967732} + (20 rows) + ``` + + The data type of the **position** field in this table is double precision[]. + + - The following uses K-Means as an example to describe how to train a model. Specify **position** as a feature column in the **kmeans_2d** training set, and use the K-Means algorithm to create and save the **point_kmeans** model. + + ```sql + mogdb=# CREATE MODEL point_kmeans USING kmeans FEATURES position FROM kmeans_2d WITH num_centroids=3; + NOTICE: Hyperparameter max_iterations takes value DEFAULT (10) + NOTICE: Hyperparameter num_centroids takes value 3 + NOTICE: Hyperparameter tolerance takes value DEFAULT (0.000010) + NOTICE: Hyperparameter batch_size takes value DEFAULT (10) + NOTICE: Hyperparameter num_features takes value DEFAULT (2) + NOTICE: Hyperparameter distance_function takes value DEFAULT (L2_Squared) + NOTICE: Hyperparameter seeding_function takes value DEFAULT (Random++) + NOTICE: Hyperparameter verbose takes value DEFAULT (0) + NOTICE: Hyperparameter seed takes value DEFAULT (0) + MODEL CREATED. PROCESSED 1 + ``` + + In the preceding command: + + - The **CREATE MODEL** statement is used to train and save a model. + + - **USING** specifies the algorithm name. + + - **FEATURES** specifies the features of the training model and needs to be added based on the column name of the training data table. + + - **TARGET** specifies the training target of the model. It can be the column name of the data table required for training or an expression, for example, **price > 10000**. + + - **WITH** specifies the hyperparameters used for model training. When the hyperparameter is not set by the user, the framework uses the default value. + + The framework supports various hyperparameter combinations for different operators. 
+ + **Table 2** Hyperparameters supported by operators + + | Operator | Hyperparameter | + | :----------------------------------------------------------- | :----------------------------------------------------------- | + | GD(logistic_regression, linear_regression, and svm_classification) | optimizer(char\*), verbose(bool), max_iterations(int), max_seconds(double), batch_size(int), learning_rate(double), decay(double), and tolerance(double)
SVM limits the hyperparameter lambda(double). | + | K-Means | max_iterations(int), num_centroids(int), tolerance(double), batch_size(int), num_features(int), distance_function(char), seeding_function(char*), verbose(int), and seed(int) | + + The default value and value range of each hyperparameter are as follows: + + **Table 3** Default values and value ranges of hyperparameters + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+     | Operator | Default Hyperparameter Value | Value Range | Hyperparameter Description |
+     | :--- | :--- | :--- | :--- |
+     | GD (logistic_regression, linear_regression, and svm_classification) | optimizer = gd (gradient descent) | gd or ngd (natural gradient descent) | Optimizer |
+     |  | verbose = false | T or F | Log display |
+     |  | max_iterations = 100 | (0, INT_MAX_VALUE] | Maximum iterations |
+     |  | max_seconds = 0 (The running duration is not limited.) | [0, INT_MAX_VALUE] | Running duration |
+     |  | batch_size = 1000 | (0, MAX_MEMORY_LIMIT] | Number of data records selected per training |
+     |  | learning_rate = 0.8 | (0, DOUBLE_MAX_VALUE] | Learning rate |
+     |  | decay = 0.95 | (0, DOUBLE_MAX_VALUE] | Weight decay rate |
+     |  | tolerance = 0.0005 | (0, DOUBLE_MAX_VALUE] | Tolerance |
+     |  | seed = 0 (random value of seed) | [0, INT_MAX_VALUE] | Seed |
+     |  | just for SVM: lambda = 0.01 | (0, DOUBLE_MAX_VALUE) | Regularization parameter |
+     | Kmeans | max_iterations = 10 | [1, INT_MAX_VALUE] | Maximum iterations |
+     |  | num_centroids = 10 | [1, MAX_MEMORY_LIMIT] | Number of clusters |
+     |  | tolerance = 0.00001 | (0,1) | Central point error |
+     |  | batch_size = 10 | [1, MAX_MEMORY_LIMIT] | Number of data records selected per training |
+     |  | num_features = 2 | [1, GS_MAX_COLS] | Number of sample features |
+     |  | distance_function = "L2_Squared" | L1, L2, L2_Squared, or Linf | Regularization method |
+     |  | seeding_function = "Random++" | "Random++" or "KMeans\|\|" | Method for initializing seed points |
+     |  | verbose = 0U | { 0, 1, 2 } | Verbose mode |
+     |  | seed = 0U | [0, INT_MAX_VALUE] | Seed |
+
+     MAX_MEMORY_LIMIT = Maximum number of tuples loaded in memory
+
+     GS_MAX_COLS = Maximum number of attributes in a database table
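+
+     These hyperparameters are passed to **CREATE MODEL** through the **WITH** clause. The following is an illustrative sketch only: it reuses the **patients** table and columns that appear in the EXPLAIN example later in this section, the model name is hypothetical, and the hyperparameter values are arbitrary choices from the ranges in Table 3.
+
+     ```sql
+     -- Illustrative sketch: train an SVM classifier while overriding several GD hyperparameters.
+     -- lambda is the SVM-specific regularization parameter listed in Table 3.
+     CREATE MODEL patient_svm_example USING svm_classification
+         FEATURES second_attack, treatment
+         TARGET trait_anxiety > 50
+         FROM patients
+         WITH learning_rate=0.5, max_iterations=500, tolerance=0.0005, lambda=0.01;
+     ```
+
+     Any hyperparameter omitted from the **WITH** clause falls back to the default value listed above.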
+ + - If the model is saved successfully, the following information is returned: + + ```sql + MODEL CREATED. PROCESSED x + ``` + +3. View the model information. + + After the training is complete, the model is stored in the **gs_model_warehouse** system catalog. You can view information about the model and training process in the **gs_model_warehouse** system catalog. + + You can view a model by viewing the system catalog. For example, run the following SQL statement to view the model named **point_kmeans**: + + ```sql + mogdb=# select * from gs_model_warehouse where modelname='point_kmeans'; + -[ RECORD 1 ]---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + modelname | point_kmeans + modelowner | 10 + createtime | 2021-04-30 17:30:39.59044 + processedtuples | 20 + discardedtuples | 0 + pre_process_time | 6.2001e-05 + exec_time | .000185272 + iterations | 5 + outputtype | 23 + modeltype | kmeans + query | CREATE MODEL point_kmeans USING kmeans FEATURES position FROM kmeans_2d WITH num_centroids=3; + modeldata | + weight | + hyperparametersnames | {max_iterations,num_centroids,tolerance,batch_size,num_features,distance_function,seeding_function,verbose,seed} + hyperparametersvalues | {10,3,1e-05,10,2,L2_Squared,Random++,0,0} + hyperparametersoids | {23,23,701,23,23,1043,1043,23,23} + coefnames | {original_num_centroids,actual_num_centroids,dimension,distance_function_id,seed,coordinates} + coefvalues | {3,3,2,2,572368998,"(77.282589,23.724434)(74.421616,73.239455)(18.551682,76.320914)"} + coefoids | + trainingscoresname | + trainingscoresvalue | + modeldescribe | {"id:1,objective_function:542.851169,avg_distance_to_centroid:108.570234,min_distance_to_centroid:1.027078,max_distance_to_centroid:297.210108,std_dev_distance_to_centroid:105.053257,cluster_size:5","id:2,objective_function:5825.982139,avg_distance_to_centroid:529.634740,min_distance_to_centroid:100.270449,max_distance_to_centroid:990.300588,std_dev_distance_to_centroid:285.915094,cluster_size:11","id:3,objective_function:220.792591,avg_distance_to_centroid:55.198148,min_distance_to_centroid:4.216111,max_distance_to_centroid:102.117204,std_dev_distance_to_centroid:39.319118,cluster_size:4"} + ``` + +4. Use an existing model to perform a prediction task. + + Use the **SELECT** and **PREDICT BY** keywords to complete the prediction task based on the existing model. + + Query syntax: SELECT… PREDICT BY… (FEATURES…)… FROM…; + + ```sql + mogdb=# SELECT id, PREDICT BY point_kmeans (FEATURES position) as pos FROM (select * from kmeans_2d limit 10); + id | pos + ----+----- + 1 | 2 + 2 | 2 + 3 | 1 + 4 | 3 + 5 | 2 + 6 | 2 + 7 | 2 + 8 | 1 + 9 | 3 + 10 | 1 + (10 rows) + ``` + + For the same prediction task, the results of the same model are stable. In addition, models trained based on the same hyperparameter and training set are stable. AI model training is random (random gradient descent of data distribution each batch). 
Therefore, the computing performance and results of different models can vary slightly. + +5. View the execution plan. + + You can use the **EXPLAIN** statement to analyze the execution plan in the model training or prediction process of **CREATE MODEL** and **PREDICT BY**. The keyword **EXPLAIN** can be followed by a **CREATE MODEL** or **PREDICT BY** clause or an optional parameter. The supported parameters are as follows: + + **Table 4** Parameters supported by EXPLAIN + + | Parameter | Description | + | :-------- | :----------------------------------------------------------- | + | ANALYZE | Boolean variable, which is used to add description information such as the running time and number of loop times | + | VERBOSE | Boolean variable, which determines whether to output the training running information to the client | + | COSTS | Boolean variable | + | CPU | Boolean variable | + | DETAIL | Boolean variable, which is available only in distributed mode | + | NODES | Boolean variable, which is available only in distributed mode | + | NUM_NODES | Boolean variable, which is available only in distributed mode | + | BUFFERS | Boolean variable | + | TIMING | Boolean variable | + | PLAN | Boolean variable | + | FORMAT | Optional format type: TEXT, XML, JSON, and YAML | + + Example: + + ```sql + mogdb=# Explain CREATE MODEL patient_logisitic_regression USING logistic_regression FEATURES second_attack, treatment TARGET trait_anxiety > 50 FROM patients WITH batch_size=10, learning_rate = 0.05; + NOTICE: Hyperparameter batch_size takes value 10 + NOTICE: Hyperparameter decay takes value DEFAULT (0.950000) + NOTICE: Hyperparameter learning_rate takes value 0.050000 + NOTICE: Hyperparameter max_iterations takes value DEFAULT (100) + NOTICE: Hyperparameter max_seconds takes value DEFAULT (0) + NOTICE: Hyperparameter optimizer takes value DEFAULT (gd) + NOTICE: Hyperparameter tolerance takes value DEFAULT (0.000500) + NOTICE: Hyperparameter seed takes value DEFAULT (0) + NOTICE: Hyperparameter verbose takes value DEFAULT (FALSE) + NOTICE: GD shuffle cache size 212369 + QUERY PLAN + ------------------------------------------------------------------- + Gradient Descent (cost=0.00..0.00 rows=0 width=0) + -> Seq Scan on patients (cost=0.00..32.20 rows=1776 width=12) + (2 rows) + ``` + +6. Perform troubleshooting in case of exceptions. + + - Training phase + + - Scenario 1: When the value of the hyperparameter exceeds the value range, the model training fails and an error message is returned. For example: + + ```sql + mogdb=# CREATE MODEL patient_linear_regression USING linear_regression FEATURES second_attack,treatment TARGET trait_anxiety FROM patients WITH optimizer='aa'; + NOTICE: Hyperparameter batch_size takes value DEFAULT (1000) + NOTICE: Hyperparameter decay takes value DEFAULT (0.950000) + NOTICE: Hyperparameter learning_rate takes value DEFAULT (0.800000) + NOTICE: Hyperparameter max_iterations takes value DEFAULT (100) + NOTICE: Hyperparameter max_seconds takes value DEFAULT (0) + NOTICE: Hyperparameter optimizer takes value aa + ERROR: Invalid hyperparameter value for optimizer. Valid values are: gd, ngd. 
(default is gd) + ``` + + - Scenario 2: If the model name already exists, the model fails to be saved, and an error message with the cause is displayed: + + ```sql + mogdb=# CREATE MODEL patient_linear_regression USING linear_regression FEATURES second_attack,treatment TARGET trait_anxiety FROM patients; + NOTICE: Hyperparameter batch_size takes value DEFAULT (1000) + NOTICE: Hyperparameter decay takes value DEFAULT (0.950000) + NOTICE: Hyperparameter learning_rate takes value DEFAULT (0.800000) + NOTICE: Hyperparameter max_iterations takes value DEFAULT (100) + NOTICE: Hyperparameter max_seconds takes value DEFAULT (0) + NOTICE: Hyperparameter optimizer takes value DEFAULT (gd) + NOTICE: Hyperparameter tolerance takes value DEFAULT (0.000500) + NOTICE: Hyperparameter seed takes value DEFAULT (0) + NOTICE: Hyperparameter verbose takes value DEFAULT (FALSE) + NOTICE: GD shuffle cache size 5502 + ERROR: The model name "patient_linear_regression" already exists in gs_model_warehouse. + ``` + + - Scenario 3: If the value in the **FEATURE** or **TARGETS** column is **\***, **ERROR** is returned with the error cause: + + ```sql + mogdb=# CREATE MODEL patient_linear_regression USING linear_regression FEATURES * TARGET trait_anxiety FROM + patients; + ERROR: FEATURES clause cannot be * + ----------------------------------------------------------------------------------------------------------------------- + mogdb=# CREATE MODEL patient_linear_regression USING linear_regression FEATURES second_attack,treatment TARGET * FROM patients; + ERROR: TARGET clause cannot be * + ``` + + - Scenario 4: If the keyword **TARGET** is used in the unsupervised learning method or is not applicable to the supervised learning method, **ERROR** is returned with the error cause: + + ```sql + mogdb=# CREATE MODEL patient_linear_regression USING linear_regression FEATURES second_attack,treatment FROM patients; + ERROR: Supervised ML algorithms require TARGET clause + ----------------------------------------------------------------------------------------------------------------------------- + CREATE MODEL patient_linear_regression USING linear_regression TARGET trait_anxiety FROM patients; ERROR: Supervised ML algorithms require FEATURES clause + ``` + + - Scenario 5: If the GUC parameter **statement_timeout** is set, the statement that is executed due to training timeout will be terminated. In this case, execute the **CREATE MODEL** statement. Parameters such as the size of the training set, number of training rounds (**iteration**), early termination conditions (**tolerance** and **max_seconds**), and number of parallel threads (**nthread**) affect the training duration. When the duration exceeds the database limit, the statement execution is terminated and model training fails. + + - Prediction phase + + - Scenario 6: If the model name cannot be found in the system catalog, the database reports **ERROR**: + + ```sql + mogdb=# select id, PREDICT BY patient_logistic_regression (FEATURES second_attack,treatment) FROM patients; + ERROR: There is no model called "patient_logistic_regression". + ``` + + - Scenario 7: If the data dimension and data type of the **FEATURES** task are inconsistent with those of the training set, **ERROR** is reported and the error cause is displayed. 
For example: + + ```sql + mogdb=# select id, PREDICT BY patient_linear_regression (FEATURES second_attack) FROM patients; + ERROR: Invalid number of features for prediction, provided 1, expected 2 + CONTEXT: referenced column: patient_linear_regression_pred + ------------------------------------------------------------------------------------------------------------------------------------- + mogdb=# select id, PREDICT BY patient_linear_regression (FEATURES 1,second_attack,treatment) FROM patients; + ERROR: Invalid number of features for prediction, provided 3, expected 2 + CONTEXT: referenced column: patient_linear_regression_pre + ``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/AI-features/8-db4ai/8-4-pl-python-fenced-mode.md b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/8-db4ai/8-4-pl-python-fenced-mode.md new file mode 100644 index 0000000000000000000000000000000000000000..f2bcefe30ed24207670c0ff0eb8c6a76bab3292c --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/AI-features/8-db4ai/8-4-pl-python-fenced-mode.md @@ -0,0 +1,135 @@ +--- +title: PL/Python Fenced Mode +summary: PL/Python Fenced Mode +author: Guo Huan +date: 2021-10-20 +--- + +# PL/Python Fenced Mode + +PL/Python is added to the fenced mode, which is insecure. During database compilation, to integrate Python into the database, you can add the **-with-python** option to **configure**, or specify the Python path for installing PL/Python and add the **-with-includes='/python-dir=path'** option. + +Before starting the database, set the GUC parameter **unix_socket_directory** to specify the file address for communication between unix_socket processes. You need to create a folder in **user-set-dir-path** in advance and grant read, write, and execute permissions on the folder. + +```bash +unix_socket_directory = '/user-set-dir-path' +``` + +After the configuration is complete, start the database. + +After PL/Python is added to the database compilation and the GUC parameter **unix_socket_directory** is set, the **fenced-Master** process is automatically created during database startup. If Python compilation is not performed for the database, you need to manually start the master process in fenced mode. After the GUC parameter is set, run the command to create the master process. + +Run the following command to start the **fenced-Master** process: + +```bash +gaussdb --fenced -k /user-set-dir-path -D /user-set-dir-path & +``` + +After the fenced mode is configured, the UDF calculation is performed in the **fenced-worker** process for the PL/Python-fenced UDF database. + +## User Guide + +- Create an extension. + + - When the compiled PL/Python is Python 2: + + ```sql + mogdb=# create extension plpythonu; + CREATE EXTENSION + ``` + + - When the compiled PL/Python is Python 3: + + ```sql + mogdb=# create extension plpython3u; + CREATE EXTENSION + ``` + + The following uses Python 2 as an example. + +- Create a PL/Python-fenced UDF database. + + ```sql + mogdb=# create or replace function pymax(a int, b int) + mogdb-# returns INT + mogdb-# language plpythonu fenced + mogdb-# as $$ + mogdb$# import numpy + mogdb$# if a > b: + mogdb$# return a; + mogdb$# else: + mogdb$# return b; + mogdb$# $$; + CREATE FUNCTION + ``` + +- View UDF information. 
+ + ```sql + mogdb=# select * from pg_proc where proname='pymax'; + -[ RECORD 1 ]----+-------------- + proname | pymax + pronamespace | 2200 + proowner | 10 + prolang | 16388 + procost | 100 + prorows | 0 + provariadic | 0 + protransform | - + proisagg | f + proiswindow | f + prosecdef | f + proleakproof | f + proisstrict | f + proretset | f + provolatile | v + pronargs | 2 + pronargdefaults | 0 + prorettype | 23 + proargtypes | 23 23 + proallargtypes | + proargmodes | + proargnames | {a,b} + proargdefaults | + prosrc | + | import numpy + | if a > b: + | return a; + | else: + | return b; + | + probin | + proconfig | + proacl | + prodefaultargpos | + fencedmode | t + proshippable | f + propackage | f + prokind | f + proargsrc | + ``` + +- Run the UDF. + + - Create a data table. + + ```sql + mogdb=# create table temp (a int ,b int) ; + CREATE TABLE + mogdb=# insert into temp values (1,2),(2,3),(3,4),(4,5),(5,6); + INSERT 0 5 + ``` + + - Run the UDF. + + ```sql + mogdb=# select pymax(a,b) from temp; + pymax + ------- + 2 + 3 + 4 + 5 + 6 + (5 rows) + ``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/autonomous-transaction/1-introduction-to-autonomous-transaction.md b/product/en/docs-mogdb/v3.0/developer-guide/autonomous-transaction/1-introduction-to-autonomous-transaction.md new file mode 100644 index 0000000000000000000000000000000000000000..604bb593648d89fd718f9a4728b7afbd1e85d924 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/autonomous-transaction/1-introduction-to-autonomous-transaction.md @@ -0,0 +1,12 @@ +--- +title: Introduction +summary: Introduction +author: Zhang Cuiping +date: 2021-05-10 +--- + +# Introduction + +An autonomous transaction is an independent transaction that is started during the execution of a primary transaction. Committing and rolling back an autonomous transaction does not affect the data that has been committed by the primary transaction. In addition, an autonomous transaction is not affected by the primary transaction. + +Autonomous transactions are defined in stored procedures, functions, and anonymous blocks, and are declared using the **PRAGMA AUTONOMOUS_TRANSACTION** keyword. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/autonomous-transaction/2-function-supporting-autonomous-transaction.md b/product/en/docs-mogdb/v3.0/developer-guide/autonomous-transaction/2-function-supporting-autonomous-transaction.md new file mode 100644 index 0000000000000000000000000000000000000000..50bc79d4dcb4f8479ea0a736b32be80311a3dedf --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/autonomous-transaction/2-function-supporting-autonomous-transaction.md @@ -0,0 +1,41 @@ +--- +title: Function Supporting Autonomous Transaction +summary: Function Supporting Autonomous Transaction +author: Zhang Cuiping +date: 2021-05-10 +--- + +# Function Supporting Autonomous Transaction + +An autonomous transaction can be defined in a function. The identifier of an autonomous transaction is **PRAGMA AUTONOMOUS_TRANSACTION**. The syntax of an autonomous transaction is the same as that of creating a function. The following is an example. 
+ +```sql +create table t4(a int, b int, c text); + +CREATE OR REPLACE function autonomous_32(a int ,b int ,c text) RETURN int AS +DECLARE + PRAGMA AUTONOMOUS_TRANSACTION; +BEGIN + insert into t4 values(a, b, c); + return 1; +END; +/ +CREATE OR REPLACE function autonomous_33(num1 int) RETURN int AS +DECLARE + num3 int := 220; + tmp int; + PRAGMA AUTONOMOUS_TRANSACTION; +BEGIN + num3 := num3/num1; + return num3; +EXCEPTION + WHEN division_by_zero THEN + select autonomous_32(num3, num1, sqlerrm) into tmp; + return 0; +END; +/ + +select autonomous_33(0); + +select * from t4; +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/autonomous-transaction/3-stored-procedure-supporting-autonomous-transaction.md b/product/en/docs-mogdb/v3.0/developer-guide/autonomous-transaction/3-stored-procedure-supporting-autonomous-transaction.md new file mode 100644 index 0000000000000000000000000000000000000000..65adcbc347df7a23f6afa890b621f08bce920aa2 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/autonomous-transaction/3-stored-procedure-supporting-autonomous-transaction.md @@ -0,0 +1,45 @@ +--- +title: Stored Procedure Supporting Autonomous Transaction +summary: Stored Procedure Supporting Autonomous Transaction +author: Zhang Cuiping +date: 2021-05-10 +--- + +# Stored Procedure Supporting Autonomous Transaction + +An autonomous transaction can be defined in a stored procedure. The identifier of an autonomous transaction is **PRAGMA AUTONOMOUS_TRANSACTION**. The syntax of an autonomous transaction is the same as that of creating a stored procedure. The following is an example. + +```sql +-- Create a table. +create table t2(a int, b int); +insert into t2 values(1,2); +select * from t2; + +-- Create a stored procedure that contains an autonomous transaction. +CREATE OR REPLACE PROCEDURE autonomous_4(a int, b int) AS +DECLARE + num3 int := a; + num4 int := b; + PRAGMA AUTONOMOUS_TRANSACTION; +BEGIN + insert into t2 values(num3, num4); + dbe_output.print_line('just use call.'); +END; +/ +-- Create a common stored procedure that invokes an autonomous transaction stored procedure. +CREATE OR REPLACE PROCEDURE autonomous_5(a int, b int) AS +DECLARE +BEGIN + dbe_output.print_line('just no use call.'); + insert into t2 values(666, 666); + autonomous_4(a,b); + rollback; +END; +/ +-- Invoke a common stored procedure. +select autonomous_5(11,22); +-- View the table result. +select * from t2 order by a; +``` + +In the preceding example, a stored procedure containing an autonomous transaction is finally executed in a transaction block to be rolled back, which directly illustrates a characteristic of the autonomous transaction, that is, rollback of the primary transaction does not affect content that has been committed by the autonomous transaction. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/autonomous-transaction/4-restrictions.md b/product/en/docs-mogdb/v3.0/developer-guide/autonomous-transaction/4-restrictions.md new file mode 100644 index 0000000000000000000000000000000000000000..e15ba978707d998d53f7ddfb0ce7c59797211bc7 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/autonomous-transaction/4-restrictions.md @@ -0,0 +1,73 @@ +--- +title: Restrictions +summary: Restrictions +author: Zhang Cuiping +date: 2021-05-10 +--- + +# Restrictions + +- A trigger function does not support autonomous transactions. + +- In the autonomous transaction block of a function or stored procedure, static SQL statements do not support variable transfer. 
+ + ```sql + -- Autonomous transactions do not support the execution of the following functions. The SQL statement contains the variable i. + CREATE OR REPLACE FUNCTION autonomous_easy_2(i int) RETURNS integer + LANGUAGE plpgsql + AS $$ + DECLARE + PRAGMA AUTONOMOUS_TRANSACTION; + BEGIN + START TRANSACTION; + INSERT INTO test1 VALUES (i, 'test'); + COMMIT; + RETURN 42; + END; + $$; + -- To use the parameter transfer, use the dynamic statement EXECUTE to replace variables. The following is an example: + CREATE OR REPLACE FUNCTION autonomous_easy(i int) RETURNS integer + LANGUAGE plpgsql + AS $$ + DECLARE + PRAGMA AUTONOMOUS_TRANSACTION; + BEGIN + START TRANSACTION; + EXECUTE 'INSERT INTO test1 VALUES (' || i::integer || ', ''test'')'; + COMMIT; + RETURN 42; + END; + $$; + ``` + +- Autonomous transactions do not support nesting. + + > **NOTICE:** In a function that contains an autonomous transaction, it is not allowed to explicitly execute another function or stored procedure that contains an autonomous transaction through **PERFORM**, **SELECT**, or **CALL**. However, another function or stored procedure that contains an autonomous transaction can be explicitly called in the last **RETURN**. + +- A function containing an autonomous transaction does not support the return value of parameter transfer. + + ```sql + -- In the following example, the return value ret is not transferred and only null is returned. + create or replace function at_test2(i int) returns text + LANGUAGE plpgsql + as $$ + declare + ret text; + pragma autonomous_transaction; + begin + START TRANSACTION; + insert into at_tb2 values(1, 'before s1'); + if i > 10 then + rollback; + else + commit; + end if; + select val into ret from at_tb2 where id=1; + return ret; + end; + $$; + ``` + +- A stored procedure or function that contains an autonomous transaction does not support exception handling. + +- A trigger function does not support autonomous transactions. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/autonomous-transaction/anonymous-block-supporting-autonomous-transaction.md b/product/en/docs-mogdb/v3.0/developer-guide/autonomous-transaction/anonymous-block-supporting-autonomous-transaction.md new file mode 100644 index 0000000000000000000000000000000000000000..93485a923a88aa2d935624ee75946740d7ae2224 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/autonomous-transaction/anonymous-block-supporting-autonomous-transaction.md @@ -0,0 +1,29 @@ +--- +title: Anonymous Block Supporting Autonomous Transaction +summary: Anonymous Block Supporting Autonomous Transaction +author: Guo Huan +date: 2021-10-15 +--- + +# Anonymous Block Supporting Autonomous Transaction + +An autonomous transaction can be defined in an anonymous block. The identifier of an autonomous transaction is **PRAGMA AUTONOMOUS_TRANSACTION**. The syntax of an autonomous transaction is the same as that of creating an anonymous block. The following is an example. 
+ +```sql +create table t1(a int ,b text); + +START TRANSACTION; +DECLARE + PRAGMA AUTONOMOUS_TRANSACTION; +BEGIN + dbe_output.print_line('just use call.'); + insert into t1 values(1,'you are so cute,will commit!'); +END; +/ +insert into t1 values(1,'you will rollback!'); +rollback; + +select * from t1; +``` + +In the preceding example, an anonymous block containing an autonomous transaction is finally executed before a transaction block to be rolled back, which directly illustrates a characteristic of the autonomous transaction, that is, rollback of the primary transaction does not affect content that has been committed by the autonomous transaction. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/1-development-specifications.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/1-development-specifications.md new file mode 100644 index 0000000000000000000000000000000000000000..a19bf3de377ecca7ada9703afcd67cdcd48a3006 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/1-development-specifications.md @@ -0,0 +1,752 @@ +--- +title: Development Specifications +summary: Development Specifications +author: Guo Huan +date: 2021-07-21 +--- + +# Development Specifications + +If the connection pool mechanism is used during application development, comply with the following specifications: + +- If GUC parameters are set in the connection, run **SET SESSION AUTHORIZATION DEFAULT;RESET ALL;** to clear the connection status before you return the connection to the connection pool. +- If a temporary table is used, delete the temporary table before you return the connection to the connection pool. + +If you do not do so, the connection in the connection pool will be stateful, which affects subsequent operations on the connection pool. + +## Overview + +### Introduction + +Although ISO has issued SQL-92, SQL:1999, SQL:2006, and other standards for SQL, due to the characteristics of different databases, the same functions are not the same in the implementation of their products, which also makes the relevant grammatical rules different. Therefore, when formulating specific development specifications, it is necessary to write corresponding specifications for different databases. + +This specification emphasizes practicability and operability. According to the common problems and mistakes easily made by developers in the coding process, detailed and clear specifications and constraints are carried out on all aspects of code writing. It mainly includes the following content: + +- Naming specification + +- Design specification + +- Syntax specification + +- Optimization-related specification + +- PG compatibility + +- Commonly used functions + +In addition, specific examples are given for each detailed rule of the specification. + +### Application Scope + +This specification applies to MogDB 1.1.0 and later versions. + +## Naming Specification + +### Unified Object Naming Specification + +The unified standards for naming database objects, such as database, schema, table, column, view, index, constraint, sequence, function, trigger, etc. are as follows: + +- It is advised to use a combination of lowercase letters, numbers, and underscores. + +- It is advised to use meaningful English vocabularies. + +- It is not advised to use double quotation marks (") unless it must contain special characters such as uppercase letters or spaces. + +- The length cannot exceed 63 characters. 
+ +- It is not advised to start with PG, GS (to avoid confusion with the system DB object), and it is not advised to start with a number. + +- It is forbidden to use reserved words. Refer to official documents for reserved keywords. + +- The number of columns that a table can contain varies from 250 to 1600 depending on the field type. + +### Temporary and Backup Object Naming + +- It is recommended to add a date to the names of temporary or backup database objects (such as table), for example, dba.trade_record_2020_12_08 (where dba is the DBA-specific schema, trade_record is the table name, and 2020_12_08 is the backup date). + +### Tablespace Naming + +- The user tablespace of the database is represented by **ts_\**, where the **tablespace name** contains the following two categories: + 1. Data space: For the user's default tablespace, it is represented by **default**. For other tablespaces, it is represented according to the category of the tables hosted on the tablespace. For example, the table that stores code is represented by **code**. The table that stores customer information is represented by **customer**. Try to use one tablespace to host the tables of that category. If a table is particularly large, consider using a separate tablespace. + 2. Index space: add **idx_** in front of the name of the corresponding data tablespace. For example, the index space for the user's default tablespace is represented by **ts_idx_default**. For index tablespace of code table, use **ts_idx_code**. +- The tablespace name is prohibited to start with **PG_**. + +### Index Naming + +- Index object naming rules: **table_column_idx**, such as **student_name_idx**, the index naming method is the default naming method when the index name is not explicitly specified when an index is created for the MogDB database. + + Therefore, it is advised to create indexes without naming them explicitly, but using DBMS defaults. + +```sql +create unique index on departments(department_id); + +CREATE INDEX + + \di + ++----------+-------------------------------+--------+---------+ + +| Schema | Name | Type | Owner | + +|----------+-------------------------------+--------+---------| + +| mogdb | departments_department_id_idx | index | mogdb | + ++----------+-------------------------------+--------+---------+ + +SELECT 1 +``` + +### Variables Naming + +- English words should be used for naming, and pinyin should be avoided, especially pinyin abbreviations should not be used. Chinese or special characters are not allowed in the naming. + +- If no complicated operations are involved, simple applications such as counting are always defined by number. + +### Partitioned Table Naming + +- The name of the partitioned table follows the naming rules of ordinary tables. + +- A table is partitioned by time range (one partition per month), and the partition name is **PART_YYYYMM**. + + For example, PART_201901 and PART_201902 + +### Function Naming + +- The name should be consistent with its actual function. A verb should be used as a prefix command to cause an action to take place. + +Example: The following naming conforms to the specification: + +``` +func_addgroups (Add multiple groups) +func_addgroup (Add one group) +``` + +## Design Specification + +### Database Design + +- The database is preferentially created using the PG compatibility type. + +- The database encoding can use only utf8. + +### Tablespace Design + +- Generally larger tables or indexes use a separate tablespace. 
+ +- The objects for which high frequency insert statements need to be run are divided into a group and stored in the corresponding tablespace. + +- The objects added, deleted, and modified are divided into groups and stored in the corresponding tablespace. + +- Tables and indexes are stored in separate tablespaces. + +- In principle, each schema corresponds to a tablespace and a corresponding index tablespace; each large table under a schema corresponds to a separate tablespace and index tablespace. + +### Table Design + +- When designing a table structure, you should plan well to avoid adding fields frequently, or modifying field types or lengths. + +- You must add comment information to the table, and make sure that the table name matches the comment information. + +- It is forbidden to use the **unlogged** keyword to create a new table. By default, a non-compressed row-based table is created. + +- When each table is created, you must specify the tablespace where it is located. Do not use the default tablespace to prevent the table from being built on the system tablespace and thereby causing performance problems. For data tables with busy transactions, they must be stored in a dedicated tablespace. + +- The data types of the fields used for the connection relationship between the tables must be strictly consistent to avoid the inability of the index to be used normally. + +- It is forbidden to use **VARCHAR** or other character types to store date values. If it is used, operations cannot be performed on this field, and it needs to be strictly defined in the data specification. + +- The field must be added with a comment that can clearly indicate its meaning, and the description of each state value must be clearly listed in the comment of the state field. + +- For frequently updated tables, it is advised to specify **fillfactor=85** during table creation, and reserve 15% of the space on each page for HOT updates. + +- The data type defined by the field in the table structure is consistent with that in the application, and the field collation rules between tables are consistent to avoid errors or inability to use indexes. + + Note: For example, the data type of the **user_id** field of table A is defined as **varchar**, but the SQL statement is **where user_id=1234;** + +### Partitioned Table Design + +- The partitioned tables supported by MogDB database are range partitioned tables. + +- The number of partitioned tables is not recommended to exceed 1000. + +- The primary key or unique index must contain the partition key. + +- For tables with a relatively large amount of data, they should be partitioned according to the properties of the table data to get a better performance. + +- To convert a normal table into a partitioned table, you need to create a new partitioned table, and then import the data from the normal table into the newly created partitioned table. Therefore, when you initially design the table, please plan in advance whether to use partitioned tables according to your business. + +- For businesses with regular historical data deletion needs, it is recommended to partition the tables by time and not use the **DELETE** operation when deleting, but **DROP** or **TRUNCATE** the corresponding table. + +- It is not recommended to use a global index in a partitioned table, because the partition maintenance operation may cause the global index to fail and make it difficult to maintain. + +#### Use of Partitioned Table + +Operate on the range partitioned table as follows. 
+ +- Create a tablespace + +```sql +mogdb=# CREATE TABLESPACE example1 RELATIVE LOCATION 'tablespace1/tablespace_1'; +mogdb=# CREATE TABLESPACE example2 RELATIVE LOCATION 'tablespace2/tablespace_2'; +mogdb=# CREATE TABLESPACE example3 RELATIVE LOCATION 'tablespace3/tablespace_3'; +mogdb=# CREATE TABLESPACE example4 RELATIVE LOCATION 'tablespace4/tablespace_4'; +``` + +When the following information is displayed, it means the creation is successful. + +```sql +CREATE TABLESPACE +``` + +- Create a partitioned table + +```sql +mogdb=# CREATE TABLE mogdb_usr.customer_address +( + ca_address_sk integer NOT NULL , + ca_address_id character(16) NOT NULL , + ca_street_number character(10) , + ca_street_name character varying(60) , + ca_street_type character(15) , + ca_suite_number character(10) , + ca_city character varying(60) , + ca_county character varying(30) , + ca_state character(2) , + ca_zip character(10) , + ca_country character varying(20) , + ca_gmt_offset numeric(5,2) , + ca_location_type character(20) +) +TABLESPACE example1 + +PARTITION BY RANGE (ca_address_sk) +( + PARTITION P1 VALUES LESS THAN(5000), + PARTITION P2 VALUES LESS THAN(10000), + PARTITION P3 VALUES LESS THAN(15000), + PARTITION P4 VALUES LESS THAN(20000), + PARTITION P5 VALUES LESS THAN(25000), + PARTITION P6 VALUES LESS THAN(30000), + PARTITION P7 VALUES LESS THAN(40000), + PARTITION P8 VALUES LESS THAN(MAXVALUE) TABLESPACE example2 +) +ENABLE ROW MOVEMENT; +``` + +When the following information is displayed, it means the creation is successful. + +```sql +CREATE TABLE +``` + + ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) It is recommended that the number of column-based partitioned tables does not exceed 1,000. + +- Insert data + +Insert data from table **mogdb_usr.customer_address** into table **mogdb_usr.customer_address_bak**. For example, you have created a backup table **mogdb_usr.customer_address_bak** of table **mogdb_usr.customer_address** in the database, and now you need to insert the data from table **mogdb_usr.customer_address** into table **mogdb_usr. customer_address_bak**, then you can run the following command. 
+ +```sql +mogdb=# CREATE TABLE mogdb_usr.customer_address_bak +( + ca_address_sk integer NOT NULL , + ca_address_id character(16) NOT NULL , + ca_street_number character(10) , + ca_street_name character varying(60) , + ca_street_type character(15) , + ca_suite_number character(10) , + ca_city character varying(60) , + ca_county character varying(30) , + ca_state character(2) , + ca_zip character(10) , + ca_country character varying(20) , + ca_gmt_offset numeric(5,2) , + ca_location_type character(20) +) +TABLESPACE example1 +PARTITION BY RANGE (ca_address_sk) +( + PARTITION P1 VALUES LESS THAN(5000), + PARTITION P2 VALUES LESS THAN(10000), + PARTITION P3 VALUES LESS THAN(15000), + PARTITION P4 VALUES LESS THAN(20000), + PARTITION P5 VALUES LESS THAN(25000), + PARTITION P6 VALUES LESS THAN(30000), + PARTITION P7 VALUES LESS THAN(40000), + PARTITION P8 VALUES LESS THAN(MAXVALUE) TABLESPACE example2 +) +ENABLE ROW MOVEMENT; +CREATE TABLE +mogdb=# INSERT INTO mogdb_usr.customer_address_bak SELECT * FROM mogdb_usr.customer_address; +INSERT 0 0 +``` + +- Alter partitioned table row movement properties + +```sql +mogdb=# ALTER TABLE mogdb_usr.customer_address_bak DISABLE ROW MOVEMENT; +ALTER TABLE +``` + +- Delete a partition + +Delete partition P8。 + +```sql +mogdb=# ALTER TABLE mogdb_usr.customer_address_bak DROP PARTITION P8; +ALTER TABLE +``` + +- Add a partition + +Add partition P8. The range is 40000<= P8<=MAXVALUE. + +```sql +mogdb=# ALTER TABLE mogdb_usr.customer_address_bak ADD PARTITION P8 VALUES LESS THAN (MAXVALUE); +ALTER TABLE +``` + +- Rename a partition + +Rename partition P8 as P_9. + +```sql +mogdb=# ALTER TABLE mogdb_usr.customer_address_bak RENAME PARTITION P8 TO P_9; +ALTER TABLE +``` + +Rename partition P_9 as P8. + +```sql +mogdb=# ALTER TABLE mogdb_usr.customer_address_bak RENAME PARTITION FOR (40000) TO P8; +ALTER TABLE +``` + +- Alter the tablespace of partition + +Alter the tablespace of partition P6 to example3. + +```sql +mogdb=# ALTER TABLE mogdb_usr.customer_address_bak MOVE PARTITION P6 TABLESPACE example3; +ALTER TABLE +``` + +Alter the tablespace of partition P4 to example4. + +```sql +mogdb=# ALTER TABLE mogdb_usr.customer_address_bak MOVE PARTITION P4 TABLESPACE example4; +ALTER TABLE +``` + +- Query a partition + +Query partition P6. + +```sql +mogdb=# SELECT * FROM mogdb_usr.customer_address_bak PARTITION (P6); +mogdb=# SELECT * FROM mogdb_usr.customer_address_bak PARTITION FOR (35888); +``` + +- Delete a partitioned table and tablespace + +```sql +mogdb=# DROP TABLE mogdb_usr.customer_address_bak; +DROP TABLE +mogdb=# DROP TABLESPACE example1; +mogdb=# DROP TABLESPACE example2; +mogdb=# DROP TABLESPACE example3; +mogdb=# DROP TABLESPACE example4; +DROP TABLESPACE +``` + +### Column Design + +- It is recommended to avoid using character types when numeric types can be used. + +- It is recommended to avoid using **char(N)** if you can use **varchar(N)**, and avoid using **text** and **varchar** if you can use **varchar(N)**. + +- Only **char(N)**, **varchar(N)** and **text** character types are allowed. + +- The newly created MogDB database is compatible with Oracle by default, and the **not null** constraint does not support empty strings. Empty strings will be converted to **null** by default. Databases compatible with the PG mode will not have this problem. + +- It is recommended to use **timestamp with time zone (timestamptz)** instead of **timestamp without time zone**. 
+ +- It is recommended to use **NUMERIC (precision, scale)** to store currency amounts and other values that require precise calculations, but not to use **real**, **double precision**. + +### Sequence Design + +- It is forbidden to manually add sequences related to the table. + +- A sequence is created by specifying the **serial** or **bigserial** type of the column when a table is created. + +- The sequence should be consistent with the variable definition type and range in the code to prevent data from being unable to be inserted. + +### Constraint Design + +#### Primary Key Constraint + +- Each table must include a primary key. +- It is not recommended that the name of the primary key has the service meaning, such as identification certificate or country name although the name is unique. +- It is recommended that a primary key is written as id serial primary key or id bigserial primary key. +- It is recommended that the primary key in a large-sized table can be written as follows, which is easy to maintain later. + +```sql +create table test(id serial not null ); +create unique index CONCURRENTLY ON test (id); +``` + +#### Unique Constraint + +Apart from the primary key, unique constraint is needed. You can create a unique index with uk_ as the prefix to create unique constraint. + +#### Foreign Key Constraint + +- You'd better create foreign key constraints for a table with foreign key relationship. +- When using the foreign key, you must set the action of the foreign key, such as cascade, set null, or set default. + +#### Non-Empty Column + +- All non-empty columns must be clearly marked as NOT NULL during database creation. After the database is used, no change can be performed. Additionally, you need to pay attention to the difference of the query results between NULL and "": null will be converted to NULL while "" does not display any character. + +#### Check Constraint + +- For fields with the check constraint, it is required to specify the check rules, such as the gender and status fields. + +### Index Design + +- MogDB provides the row-store and column-store tables. The row-store table supports the btree (default), gin, and gist index types. The column-store table supports the Psort (default), btree, and gin index types. +- It is recommended that the CONCURRENTLY parameter is added when you create or drop an index. This can achieve concurrency when data is written into a table. The column-store, partition, and temporary tables do not support index created CONCURRENTLY. +- It is recommended that "create index CONCURRENTLY" and "drop index CONCURRENTLY" are used to maintain the related indexes of a table whose columns included in the indexes are frequently updated and deleted. +- It is recommended that unique index is used to replace unique constraints, facilitating follow-up maintenance. +- It is recommended that a joint index of multiple fields are created based on data distribution for a high-frequency query in which there are multiple fields and conditions in the where statement. +- Each table can include five indexes at most. +- Deep analysis is required for creation of composite indexes. + - The first field in a composite index needs to be correctly chosen. Generally, it has good selectivity and is a common field in the where clause. + - If several fields in a composite index are usually presented in a where clause and linked with AND, and single-field query is less or even not involved, you can create a composite index. Otherwise, you can create a single-field index. 
+ - If several fields in a composite index are usually presented in a where clause individually, they can be divided into multiple single-field indexes. + - If both single-field index and composite index with the single field as its first column, the single-field index can be deleted. + - Typically, the first field in a composite index cannot be a time field because the time field is used to scan a range. However, when the former fields are scanned by range, the latter fields cannot be used for index filtration. + - A composite index can include four fields at most. +- For a table with the number of write times significantly greater than that of read times, you'd better not create too many indexes. +- Unused indexes and duplicated indexes should be deleted so that the execution plan and database performance are not affected. + +### View Design + +- You'd better use simple views and use less complex views. + + Simple view: Data comes from a single table and a simple view does not contain groups of data and functions. + + Complex view: Data comes from multiple tables, or a complex view contains groups of data or functions. A complex view can contain three tables at most. + +- You'd better not use nested views. If nested views have to be used, it is advised to have two levels of nesting at most. + +### Function Design + +- A function must retrieve database table records or other database objects, or even modify database information, such as Insert, Delete, Update, Drop, or Create. +- If a function does not relate to a database, it cannot be realized using a database function. +- It is not advised to use DML or DDL statements in a function. + +## Syntax Specification + +### About NULL + +- Note: Check whether it is null or is not null. +- Note: The values of the boolean type can be true, false, and NULL. +- Note: Pay attention to that the NOT IN set includes some NULL elements. + +```sql +mogdb=# SELECT * FROM (VALUES(1),(2)) v(a) ; a + +\--- + + 1 + + 2 + +(2 rows) + +mogdb=# select 1 NOT IN (1,NULL); + +?column? + +\--------- + +f + +(1 row) + +mogdb=# select 2 NOT IN (1,NULL); + +?column? + +\--------- + +(1 row) + +mogdb=# SELECT * FROM (VALUES(1),(2)) v(a) WHERE a NOT IN (1, NULL); a + +\--- + +(0 rows) +``` + +- Suggestion: It is recommended that count(1) or count(\*) is used to count the number of rows. count(col) is not used to count the number of rows because the NULL value is not counted. +- Rule: For count(names of multiple columns), the names of multiple columns must be enclosed in brackets, for example count((col1,col2,col3)). +- Note: For count (names of multiple columns), even if the values of all columns are null, the columns will also be counted. Therefore, the calculating result of count(names of multiple columns) is consistent with that of count(\*). +- Note: count(distinct col) is used to count the number of values that are distinct from each other and not null. + +count(distinct (col1,col2,...)) is used to calculate the unique value of those of all columns where NULL is counted. Additionally, two NULL values are considered the same. + +- Note: Distinction between count and sum of NULL + +```sql +select count(1), count(a), sum(a) from (SELECT * FROM (VALUES (NULL), (2) ) v(a)) as foo where a is NULL; + +count | count | sum + +-------+-------+----- + + 1 | 0 | + +(1 row) +``` + +- Check whether two values are the same (NULL is considered as the same value). + +```sql +select null is distinct from null; + +?column? 
+ +\--------- + +f + +(1 row) + +select null is distinct from 1; + +?column? + +\--------- + +t + +(1 row) + +select null is not distinct from null; + +?column? + +\--------- + +t + +(1 row) + +select null is not distinct from 1; + +?column? + +\--------- + +f + +(1 row) +``` + +### About Invalid Indexes + +- During SQL statement writing, functions and expressions are usually used in query operations. It is not recommended that functions and expressions are used in condition columns. Using a function or expression in a condition column will make indexes of the condition column unused, thereby affecting the SQL query efficiency. It is recommended that functions or expressions are used in condition values. For example, + + `select name from tab where id+100>1000;` + + This statement can be changed to the following: + + `select name from tab where id>1000-100;` + +- Do not use left fuzzy query. For example, + + `select id from tab where name like '%ly';` + +- Do not use the negative query, such as not in/like. For example, + + `select id from tab where name not in ('ly','ty');` + +### Ensuring That All Variables and Parameters Are Used + +- Declare-variable also generates certain system overhead and makes code look loose. If some variables are not used in compilation, they will report alarms. Make sure that no any alarm is reported. + +## Query Operations + +### DDL Operation + +- Database object, especially columns with comments added can facilitate service learning and maintenance. +- DDL sent to DBAs, which is attached with common SQLs, such as SELECT, INSERT, DELETE, and UPDATE, can assist DBAs providing optimization suggestions, including creating index CONCURRENTLY. +- When columns need to be added to a large-sized table, "alter table t add column col datatype not null default xxx" can be processed as follows. This can prevent the table from being locked due to long time for filling in the default values. + +```sql +alter table t add column col datatype ; + +alter table t alter column col set default xxx; + +update table t set column= DEFAULT where id in ( select id from t where column is null limit + +1000 ) ; \watch 3 + +alter table t alter column col set not null; +``` + +### DML Operation + +- When updating a table, the "<>" judgement is needed. For example, the statement "update table_a set column_b = c where column_b <> c" indicates that a table needs to be updated to make the value of column b equal to that of column c if the value of column b is not equal to that of column c. In the statement, it is prohibited that the value of column b is equal to that of column c in the where clause. +- A single DML statement can support a maximum of 100 thousand data records. +- When a table needs to be cleared, it is recommended that TRUNCATE is used rather than DELETE. + +### DQL Operation + +- Typically, it is prohibited to use select \*. Selecting only necessary fields can reduce the consumption of including but not limited to network bandwidth and prevent programs from being affected by table structure modification, such as some prepare queries. +- For report-based queries or basic data queries, materialized views can be used to periodically take data snapshots, so that multiple tables are not performed on the same query repeatedly, especially for tables with frequent write operations. +- Window functions can be used for complex statistics queries. +- Make sure that the data type of the associated fields are consistent. It is prohibited to use implicit type conversion. 
+- The or statements of different fields can be replaced with union. + +### Data Import + +- When a large amount of data needs to be stored in a table, it is recommended that COPY is used rather than INSERT. This can improve the data write speed. +- Before data is imported, delete related indexes. After the import is complete, recreate indexes. This can improve the data import speed. + +### Transaction Operation + +- Make sure that the SQL logic in a transaction is simple, the granularity of each transaction is small, less resources are locked, lock and deadlock can be avoided, and transaction can be committed in a timely manner after being executed. +- For DDL operations, especially multiple DDL operations, including CRAETE, DROP, and ALTER, do not explicitly start a transaction because the lock mode value is very high and deadlock easily occurs. +- If the state of the master node is idle in transaction, related resources will be locked, thereby leading to lock, even deadlock. If the state of the slave node is idle in transaction, synchronization between the master and slave nodes will be suspended. + +### Others + +- For instances running in SSDs, it is recommended that the value of **random_page_cost** (default value: **4**) is set to a value ranging from 1.0 to 2.0. This can make the query planner preferably use the index to perform scanning. +- In the scenario where EXPLAIN ANALYZE needs to be used to view the actual execution plan and time, if a write query is to be performed, it is strongly recommended that a transaction is started first and then rollback is performed. +- For tables frequently updated and with the data size largely increased, table reorganization should be performed in appropriate time to lower the high water mark. + +## PostgreSQL Compatibility + +### Database Creation Specifications + +During MogDB database creation, the following PG compatibility mode is used: + +create database dbnam DBCOMPATIBILITY='PG' encoding=’utf8’; + +### Data Type + +#### Value Type + +During development and usage, MogDB supports only the smallint, integer, bigint, numeric[(p[,s])], serial, and bigserial value types. 
+ +| Type | PostgreSQL | MogDB | Storage Length | Remarks | +| :--------------- | :--------- | :-------- | :------------- | :----------------------------------------------------------- | +| tinyint | / | Supported | 1 byte | 0 to 255 | +| smallint | Supported | Supported | 2 bytes | -32,768 to +32,767 | +| integer | Supported | Supported | 4 bytes | -2,147,483,648 to +2,147,483,647 | +| binary_integer | / | Supported | / | integer alias | +| bigint | Supported | Supported | 8 bytes | -9,223,372,036,854,775,808 to +9,223,372,036,854,775,807 | +| decimal[(p[,s])] | Supported | Supported | Variable byte | A maximum of 131072 before the decimal point and 16383 after the decimal point | +| numeric[(p[,s])] | Supported | Supported | Variable byte | A maximum of 131072 before the decimal point and 16383 after the decimal point | +| number[(p[,s])] | / | Supported | / | Numeric alias | +| real | Supported | Supported | 4 bytes | Accurate to six decimal digits | +| float4 | / | Supported | 4 bytes | Accurate to six decimal digits | +| double precision | Supported | Supported | 8 bytes | Accurate to fifteen decimal digits | +| binary_double | / | Supported | 8 bytes | Double precision alias | +| float8 | / | Supported | 8 bytes | Accurate to fifteen decimal digits | +| float[(p )] | / | Supported | 4 or 8 bytes | | +| dec[(p,[s])] | / | Supported | / | A maximum of 131072 before the decimal point and 16383 after the decimal point | +| integer[(p,[s])] | / | Supported | / | A maximum of 131072 before the decimal point and 16383 after the decimal point | +| smallserial | Supported | Supported | 2 bytes | 1 to 32,767 | +| serial | Supported | Supported | 4 bytes | 1 to 2,147,483,647 | +| bigserial | Supported | Supported | 8 bytes | 1 to 9,223,372,036,854,775,807 | +| tinyint | / | Supported | 1 byte | 0 to 255 | + +#### Character Type + +During the development, MogDB supports only the char(n), varchar(n), and text character types. + +| Type | PostgreSQL | MogDB | Storage Length | Remarks | +| :----------- | :--------- | :-------- | :----------------------------------------------------------- | :----------------------------------------------------------- | +| char(n) | Supported | Supported | A maximum of 1 GB in postgreSQL
A maximum of 10 MB in MogDB | In postgreSQL, *n* indicates the number of characters.
In MogDB, *n* indicates the number of bytes.
In the compatibility PG mode, *n* indicates the number of characters. | +| nchar(n) | / | Supported | A maximum of 10 MB | *n* indicates the number of bytes.
In the compatibility PG mode, *n* indicates the number of characters. | +| varchar(n) | Supported | Supported | A maximum of 1 GB in postgreSQL
A maximum of 10 MB in MogDB | In postgreSQL, *n* indicates the number of characters.
In MogDB, *n* indicates the number of bytes.
In the compatibility PG mode, *n* indicates the number of characters. | +| varchar2(n) | / | Supported | A maximum of 10 MB | varchar(n) alias | +| nvarchar2(n) | / | Supported | A maximum of 10 MB | *n* indicates the number of characters. | +| text | Supported | Supported | 1 GB - 1 | | +| clob | / | Supported | 1 GB - 1 | text alias | + +#### Time Type + +During the development, MogDB supports only the timestamp[(p )][with time zone] and date time types. + +| Type | PostgreSQL | MogDB | Storage Length | Remarks | +| :--------------------------------- | :--------- | :-------- | :------------- | :----------------------------------------------------------- | +| timestamp[(p )][without time zone] | Supported | Supported | 8 bytes | 4713 BC to 294276 AD | +| timestamp[(p )][with time zone] | Supported | Supported | 8 bytes | 4713 BC to 294276 AD | +| date | Supported | Supported | 4 bytes | 4713 BC to 5874897 AD (The actual storage size is 8 bytes in MogDB) | +| time[(p )][without time zone] | Supported | Supported | 8 bytes | 00:00:00 to 24:00:00 | +| time[(p )][with time zone] | Supported | Supported | 12 bytes | 00:00:00+1459 to 24:00:00-1459 | +| interval[fields][(p )] | Supported | Supported | 16 bytes | -178000000 to 178000000 years | +| smalldatetime | / | Supported | 8 bytes | Date and time without timezone, accurating to the minute, 30s equaling one minute | +| interval day(1) to second(p ) | / | Supported | 16 bytes | | +| reltime | / | Supported | 4 bytes | | + +#### JSON Type + +MogDB supports only the JSON type. + +| Type | PostgreSQL | MogDB | Storage Length | Remarks | +| :---- | :--------- | :-------- | :------------- | :------ | +| json | Supported | Supported | / | | +| jsonb | Supported | / | / | | + +### Keywords + +In the following table, **Reserved** indicates that keywords in a database are reserved and cannot be customized. **Non-reserved** or **N/A** indicates that keywords can be customized. 
+ +| Keyword | MogDB | PostgreSQL | +| :------------ | :----------------------------------------------- | :----------------------------------------------- | +| AUTHID | Reserved | N/A | +| BUCKETS | Reserved | N/A | +| COMPACT | Reserved | N/A | +| DATE | Non-reserved (function or type is not supported) | | +| DELTAMERGE | Reserved | N/A | +| EXCLUDED | Reserved | N/A | +| FENCED | Reserved | N/A | +| GROUPING | | Non-reserved (function or type is not supported) | +| HDFSDIRECTORY | Reserved | N/A | +| IS | Reserved | Reserved (function or type is supported) | +| ISNULL | Non-reserved | Reserved (function or type is supported) | +| LATERAL | | Reserved | +| LESS | Reserved | N/A | +| MAXVALUE | Reserved | Non-reserved | +| MINUS | Reserved | N/A | +| MODIFY | Reserved | N/A | +| NLSSORT | Reserved | N/A | +| NUMBER | Non-reserved (function or type is not supported) | | +| PERFORMANCE | Reserved | N/A | +| PROCEDURE | Reserved | Non-reserved | +| REJECT | Reserved | N/A | +| ROWNUM | Reserved | N/A | +| SYSDATE | Reserved | N/A | +| VERIFY | Reserved | N/A | + +### Implicit Conversion Comparison Table + +| Input Type | Target Type | MogDB | +| :---------- | :--------------------------------------------------------- | :-------- | +| bool | int2, int4, int8 | Supported | +| int2 | bool, text, varchar,interval | Supported | +| int4 | bool, int2, text, varchar, interval | Supported | +| int8 | bool, text, varchar | Supported | +| text | int8, int4, int2, float4, float8, date, timestamp, nemeric | Supported | +| float4 | int8, int4, int2, text, varchar | Supported | +| float8 | int8, int4, int2, text, float4, varchar, interval, numeric | Supported | +| date | text, varchar | Supported | +| timestamp | text, varchar | Supported | +| timestamptz | text | Supported | +| numeric | int8, int4, int2, text, varchar, interval | Supported | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/1-development-based-on-jdbc-overview.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/1-development-based-on-jdbc-overview.md new file mode 100644 index 0000000000000000000000000000000000000000..74f232a2f0df7b72d9bf6687e4b9f077ae9f84bd --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/1-development-based-on-jdbc-overview.md @@ -0,0 +1,10 @@ +--- +title: Overview +summary: Overview +author: Guo Huan +date: 2021-04-26 +--- + +# Overview + +Java Database Connectivity (JDBC) is a Java API for running SQL statements. It provides unified access interfaces for different relational databases, based on which applications process data. MogDB supports JDBC 4.0 and requires JDK 1.8 for code compiling. It does not support JDBC-ODBC bridge. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/10-example-common-operations.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/10-example-common-operations.md new file mode 100644 index 0000000000000000000000000000000000000000..2bd173e804ed6dc8e599f90781b7d7236004b8fd --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/10-example-common-operations.md @@ -0,0 +1,243 @@ +--- +title: Example Common Operations +summary: Example Common Operations +author: Guo Huan +date: 2021-04-26 +--- + +# Example: Common Operations + +**Example 1:** + +The following illustrates how to develop applications based on MogDB JDBC interfaces. 
+ +```java +//DBtest.java +// This example illustrates the main processes of JDBC-based development, covering database connection creation, table creation, and data insertion. + +import java.sql.Connection; +import java.sql.DriverManager; +import java.sql.PreparedStatement; +import java.sql.SQLException; +import java.sql.Statement; +import java.sql.CallableStatement; + +public class DBTest { + + // Create a database connection. + public static Connection GetConnection(String username, String passwd) { + String driver = "org.postgresql.Driver"; + String sourceURL = "jdbc:postgresql://localhost:8000/postgres"; + Connection conn = null; + try { + // Load the database driver. + Class.forName(driver).newInstance(); + } catch (Exception e) { + e.printStackTrace(); + return null; + } + + try { + // Create a database connection. + conn = DriverManager.getConnection(sourceURL, username, passwd); + System.out.println("Connection succeed!"); + } catch (Exception e) { + e.printStackTrace(); + return null; + } + + return conn; + }; + + // Run a common SQL statement to create table customer_t1. + public static void CreateTable(Connection conn) { + Statement stmt = null; + try { + stmt = conn.createStatement(); + + // Run a common SQL statement. + int rc = stmt + .executeUpdate("CREATE TABLE customer_t1(c_customer_sk INTEGER, c_customer_name VARCHAR(32));"); + + stmt.close(); + } catch (SQLException e) { + if (stmt != null) { + try { + stmt.close(); + } catch (SQLException e1) { + e1.printStackTrace(); + } + } + e.printStackTrace(); + } + } + + // Run a prepared statement to insert data in batches. + public static void BatchInsertData(Connection conn) { + PreparedStatement pst = null; + + try { + // Generate a prepared statement. + pst = conn.prepareStatement("INSERT INTO customer_t1 VALUES (?,?)"); + for (int i = 0; i < 3; i++) { + // Add parameters. + pst.setInt(1, i); + pst.setString(2, "data " + i); + pst.addBatch(); + } + // Perform batch processing. + pst.executeBatch(); + pst.close(); + } catch (SQLException e) { + if (pst != null) { + try { + pst.close(); + } catch (SQLException e1) { + e1.printStackTrace(); + } + } + e.printStackTrace(); + } + } + + // Run a prepared statement to update data. + public static void ExecPreparedSQL(Connection conn) { + PreparedStatement pstmt = null; + try { + pstmt = conn + .prepareStatement("UPDATE customer_t1 SET c_customer_name = ? WHERE c_customer_sk = 1"); + pstmt.setString(1, "new Data"); + int rowcount = pstmt.executeUpdate(); + pstmt.close(); + } catch (SQLException e) { + if (pstmt != null) { + try { + pstmt.close(); + } catch (SQLException e1) { + e1.printStackTrace(); + } + } + e.printStackTrace(); + } + } + + +// Run a stored procedure. + public static void ExecCallableSQL(Connection conn) { + CallableStatement cstmt = null; + try { + + cstmt=conn.prepareCall("{? = CALL TESTPROC(?,?,?)}"); + cstmt.setInt(2, 50); + cstmt.setInt(1, 20); + cstmt.setInt(3, 90); + cstmt.registerOutParameter(4, Types.INTEGER); // Register an OUT parameter of the integer type. + cstmt.execute(); + int out = cstmt.getInt(4); // Obtain the OUT parameter. + System.out.println("The CallableStatment TESTPROC returns:"+out); + cstmt.close(); + } catch (SQLException e) { + if (cstmt != null) { + try { + cstmt.close(); + } catch (SQLException e1) { + e1.printStackTrace(); + } + } + e.printStackTrace(); + } + } + + + /** + *Main process. Call static methods one by one. + * @param args + */ + public static void main(String[] args) { + // Create a database connection. 
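        // (Note: "tester"/"Password1234" below are placeholder credentials, and GetConnection()
        //  above hard-codes the URL jdbc:postgresql://localhost:8000/postgres; adjust the user,
        //  password, host, and port to match your own MogDB instance before running this example.)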
+ Connection conn = GetConnection("tester", "Password1234"); + + // Create a table. + CreateTable(conn); + + // Insert data in batches. + BatchInsertData(conn); + + // Run a prepared statement to update data. + ExecPreparedSQL(conn); + + // Run a stored procedure. + ExecCallableSQL(conn); + + // Close the connection to the database. + try { + conn.close(); + } catch (SQLException e) { + e.printStackTrace(); + } + + } + +} +``` + +**Example 2 High Client Memory Usage** + +In this example, **setFetchSize** adjusts the memory usage of the client by using the database cursor to obtain server data in batches. It may increase network interaction and deteriorate some performance. + +The cursor is valid within a transaction. Therefore, disable automatic commit and then manually commit the code. + +```java +// Disable automatic commit. +conn.setAutoCommit(false); +Statement st = conn.createStatement(); + +// Open the cursor and obtain 50 lines of data each time. +st.setFetchSize(50); +ResultSet rs = st.executeQuery("SELECT * FROM mytable"); +conn.commit(); +while (rs.next()) +{ + System.out.print("a row was returned."); +} +rs.close(); + +// Disable the server cursor. +st.setFetchSize(0); +rs = st.executeQuery("SELECT * FROM mytable"); +conn.commit(); +while (rs.next()) +{ + System.out.print("many rows were returned."); +} +rs.close(); + +// Close the statement. +st.close(); +conn.close(); +``` + +Run the following command to enable automatic commit: + +``` +conn.setAutoCommit(true); +``` + +**Example 3 Common Data Type** + +``` +//Use the bit type as an example. Note that the value of the bit type of data ranges from 0 to 1. +Statement st = conn.createStatement(); +String sqlstr = "create or replace function fun_1()\n" + + "returns bit AS $$\n" + + "select col_bit from t_bit limit 1;\n" + + "$$\n" + + "LANGUAGE SQL;"; +st.execute(sqlstr); +CallableStatement c = conn.prepareCall("{ ? = call fun_1() }"); +//Register the output type and string type. +c.registerOutParameter(1, Types.BIT); +c.execute(); +//Use the Boolean type to obtain the result. +System.out.println(c.getBoolean(1)); +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/11-example-retrying-sql-queries-for-applications.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/11-example-retrying-sql-queries-for-applications.md new file mode 100644 index 0000000000000000000000000000000000000000..7593d4a940ef353e065846a9eda4dd1ed4a8a2fe --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/11-example-retrying-sql-queries-for-applications.md @@ -0,0 +1,205 @@ +--- +title: Example Retrying SQL Queries for Applications +summary: Example Retrying SQL Queries for Applications +author: Guo Huan +date: 2021-04-26 +--- + +# Example Retrying SQL Queries for Applications + +If the primary database node is faulty and cannot be restored within 10s, the standby database node automatically switches to the active state to ensure the normal running of MogDB. During the switchover, jobs that are running will fail and those start running after the switchover are not affected. To prevent upper-layer services from being affected by the failover, refer to the following example to construct an SQL retry mechanism at the service layer. 
+ +```java +import java.sql.Connection; +import java.sql.DriverManager; +import java.sql.PreparedStatement; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.sql.Statement; + +class ExitHandler extends Thread { + private Statement cancel_stmt = null; + + public ExitHandler(Statement stmt) { + super("Exit Handler"); + this.cancel_stmt = stmt; + } + public void run() { + System.out.println("exit handle"); + try { + this.cancel_stmt.cancel(); + } catch (SQLException e) { + System.out.println("cancel query failed."); + e.printStackTrace(); + } + } +} + +public class SQLRetry { + // Create a database connection. + public static Connection GetConnection(String username, String passwd) { + String driver = "org.postgresql.Driver"; + String sourceURL = "jdbc:postgresql://10.131.72.136:8000/postgres"; + Connection conn = null; + try { + // Load the database driver. + Class.forName(driver).newInstance(); + } catch (Exception e) { + e.printStackTrace(); + return null; + } + + try { + // Create a database connection. + conn = DriverManager.getConnection(sourceURL, username, passwd); + System.out.println("Connection succeed!"); + } catch (Exception e) { + e.printStackTrace(); + return null; + } + + return conn; +} + + // Run a common SQL statement. Create the jdbc_test1 table. + public static void CreateTable(Connection conn) { + Statement stmt = null; + try { + stmt = conn.createStatement(); + + + Runtime.getRuntime().addShutdownHook(new ExitHandler(stmt)); + + // Run a common SQL statement. + int rc2 = stmt + .executeUpdate("DROP TABLE if exists jdbc_test1;"); + + int rc1 = stmt + .executeUpdate("CREATE TABLE jdbc_test1(col1 INTEGER, col2 VARCHAR(10));"); + + stmt.close(); + } catch (SQLException e) { + if (stmt != null) { + try { + stmt.close(); + } catch (SQLException e1) { + e1.printStackTrace(); + } + } + e.printStackTrace(); + } + } + + // Run a prepared statement to insert data in batches. + public static void BatchInsertData(Connection conn) { + PreparedStatement pst = null; + + try { + // Generate a prepared statement. + pst = conn.prepareStatement("INSERT INTO jdbc_test1 VALUES (?,?)"); + for (int i = 0; i < 100; i++) { + // Add parameters. + pst.setInt(1, i); + pst.setString(2, "data " + i); + pst.addBatch(); + } + // Perform batch processing. + pst.executeBatch(); + pst.close(); + } catch (SQLException e) { + if (pst != null) { + try { + pst.close(); + } catch (SQLException e1) { + e1.printStackTrace(); + } + } + e.printStackTrace(); + } + } + + // Run a prepared statement to update data. + private static boolean QueryRedo(Connection conn){ + PreparedStatement pstmt = null; + boolean retValue = false; + try { + pstmt = conn + .prepareStatement("SELECT col1 FROM jdbc_test1 WHERE col2 = ?"); + + pstmt.setString(1, "data 10"); + ResultSet rs = pstmt.executeQuery(); + + while (rs.next()) { + System.out.println("col1 = " + rs.getString("col1")); + } + rs.close(); + + pstmt.close(); + retValue = true; + } catch (SQLException e) { + System.out.println("catch...... retValue " + retValue); + if (pstmt != null) { + try { + pstmt.close(); + } catch (SQLException e1) { + e1.printStackTrace(); + } + } + e.printStackTrace(); + } + + System.out.println("finesh......"); + return retValue; + } + + // Configure the number of retry attempts for the retry of a query statement upon a failure. 
+ public static void ExecPreparedSQL(Connection conn) throws InterruptedException { + int maxRetryTime = 50; + int time = 0; + String result = null; + do { + time++; + try { + System.out.println("time:" + time); + boolean ret = QueryRedo(conn); + if(ret == false){ + System.out.println("retry, time:" + time); + Thread.sleep(10000); + QueryRedo(conn); + } + } catch (Exception e) { + e.printStackTrace(); + } + } while (null == result && time < maxRetryTime); + + } + + /** + *Main process. Call static methods one by one. + * @param args + * @throws InterruptedException + */ + public static void main(String[] args) throws InterruptedException { + // Create a database connection. + Connection conn = GetConnection("testuser", "test@123"); + + // Create a table. + CreateTable(conn); + + // Insert data in batches. + BatchInsertData(conn); + + // Run a prepared statement to update data. + ExecPreparedSQL(conn); + + // Close the connection to the database. + try { + conn.close(); + } catch (SQLException e) { + e.printStackTrace(); + } + + } + + } +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/12-example-importing-and-exporting-data-through-local-files.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/12-example-importing-and-exporting-data-through-local-files.md new file mode 100644 index 0000000000000000000000000000000000000000..ce8625c3bb24c768f297e382da0f5de10d0b9261 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/12-example-importing-and-exporting-data-through-local-files.md @@ -0,0 +1,119 @@ +--- +title: Example Importing and Exporting Data Through Local Files +summary: Example Importing and Exporting Data Through Local Files +author: Guo Huan +date: 2021-04-26 +--- + +# Example Importing and Exporting Data Through Local Files + +When Java is used for secondary development based on MogDB, you can use the CopyManager interface to export data from the database to a local file or import a local file to the database by streaming. The file can be in CSV or TEXT format. + +The sample program is as follows. Load the MogDB JDBC driver before running it. + +```java +import java.sql.Connection; +import java.sql.DriverManager; +import java.io.IOException; +import java.io.FileInputStream; +import java.io.FileOutputStream; +import java.sql.SQLException; +import org.postgresql.copy.CopyManager; +import org.postgresql.core.BaseConnection; + +public class Copy{ + + public static void main(String[] args) + { + String urls = new String("jdbc:postgresql://10.180.155.74:8000/postgres"); // Database URL + String username = new String("jack"); // Username + String password = new String("Gauss@123"); // Password + String tablename = new String("migration_table"); // Table information + String tablename1 = new String("migration_table_1"); // Table information + String driver = "org.postgresql.Driver"; + Connection conn = null; + + try { + Class.forName(driver); + conn = DriverManager.getConnection(urls, username, password); + } catch (ClassNotFoundException e) { + e.printStackTrace(System.out); + } catch (SQLException e) { + e.printStackTrace(System.out); + } + + // Export the query result of SELECT * FROM migration_table to the local file d:/data.txt. 
+        try {
+            copyToFile(conn, "d:/data.txt", "(SELECT * FROM migration_table)");
+        } catch (SQLException e) {
+            e.printStackTrace();
+        } catch (IOException e) {
+            e.printStackTrace();
+        }
+        // Import data from the d:/data.txt file to the migration_table_1 table.
+        try {
+            copyFromFile(conn, "d:/data.txt", tablename1);
+        } catch (SQLException e) {
+            e.printStackTrace();
+        } catch (IOException e) {
+            e.printStackTrace();
+        }
+
+        // Export the data from the migration_table_1 table to the d:/data1.txt file.
+        try {
+            copyToFile(conn, "d:/data1.txt", tablename1);
+        } catch (SQLException e) {
+            e.printStackTrace();
+        } catch (IOException e) {
+            e.printStackTrace();
+        }
+    }
+
+    // Use copyIn to import data from a file to the database.
+    public static void copyFromFile(Connection connection, String filePath, String tableName)
+            throws SQLException, IOException {
+
+        FileInputStream fileInputStream = null;
+
+        try {
+            CopyManager copyManager = new CopyManager((BaseConnection)connection);
+            fileInputStream = new FileInputStream(filePath);
+            copyManager.copyIn("COPY " + tableName + " FROM STDIN", fileInputStream);
+        } finally {
+            if (fileInputStream != null) {
+                try {
+                    fileInputStream.close();
+                } catch (IOException e) {
+                    e.printStackTrace();
+                }
+            }
+        }
+    }
+
+    // Use copyOut to export data from the database to a file.
+    public static void copyToFile(Connection connection, String filePath, String tableOrQuery)
+            throws SQLException, IOException {
+
+        FileOutputStream fileOutputStream = null;
+
+        try {
+            CopyManager copyManager = new CopyManager((BaseConnection)connection);
+            fileOutputStream = new FileOutputStream(filePath);
+            copyManager.copyOut("COPY " + tableOrQuery + " TO STDOUT", fileOutputStream);
+        } finally {
+            if (fileOutputStream != null) {
+                try {
+                    fileOutputStream.close();
+                } catch (IOException e) {
+                    e.printStackTrace();
+                }
+            }
+        }
+    }
+}
+```
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/13-example-2-migrating-data-from-a-my-database-to-mogdb.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/13-example-2-migrating-data-from-a-my-database-to-mogdb.md
new file mode 100644
index 0000000000000000000000000000000000000000..c17f3f6450469f96b9f741bed7ab90ca5f8d0466
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/13-example-2-migrating-data-from-a-my-database-to-mogdb.md
@@ -0,0 +1,97 @@
+---
+title: Example 2 Migrating Data from a MY Database to MogDB
+summary: Example 2 Migrating Data from a MY Database to MogDB
+author: Guo Huan
+date: 2021-04-26
+---
+
+# Example 2 Migrating Data from a MY Database to MogDB
+
+The following example shows how to use CopyManager to migrate data from MY to MogDB.
+ +```java +import java.io.StringReader; +import java.sql.Connection; +import java.sql.DriverManager; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.sql.Statement; + +import org.postgresql.copy.CopyManager; +import org.postgresql.core.BaseConnection; + +public class Migration{ + + public static void main(String[] args) { + String url = new String("jdbc:postgresql://10.180.155.74:8000/postgres"); // Database URL + String user = new String("jack"); // MogDB username + String pass = new String("Gauss@123"); // MogDB password + String tablename = new String("migration_table"); // Table information + String delimiter = new String("|"); // Delimiter + String encoding = new String("UTF8"); // Character set + String driver = "org.postgresql.Driver"; + StringBuffer buffer = new StringBuffer(); // Buffer to store formatted data + + try { + // Obtain the query result set of the source database. + ResultSet rs = getDataSet(); + + // Traverse the result set and obtain records row by row. + // The values of columns in each record are separated by the specified delimiter and end with a linefeed, forming strings. + // Add the strings to the buffer. + while (rs.next()) { + buffer.append(rs.getString(1) + delimiter + + rs.getString(2) + delimiter + + rs.getString(3) + delimiter + + rs.getString(4) + + "\n"); + } + rs.close(); + + try { + // Connect to the target database. + Class.forName(driver); + Connection conn = DriverManager.getConnection(url, user, pass); + BaseConnection baseConn = (BaseConnection) conn; + baseConn.setAutoCommit(false); + + // Initialize the table. + String sql = "Copy " + tablename + " from STDIN DELIMITER " + "'" + delimiter + "'" + " ENCODING " + "'" + encoding + "'"; + + // Commit data in the buffer. + CopyManager cp = new CopyManager(baseConn); + StringReader reader = new StringReader(buffer.toString()); + cp.copyIn(sql, reader); + baseConn.commit(); + reader.close(); + baseConn.close(); + } catch (ClassNotFoundException e) { + e.printStackTrace(System.out); + } catch (SQLException e) { + e.printStackTrace(System.out); + } + + } catch (Exception e) { + e.printStackTrace(); + } + } + + //******************************** + // Return the query result set from the source database. 
+ //********************************* + private static ResultSet getDataSet() { + ResultSet rs = null; + try { + Class.forName("com.MY.jdbc.Driver").newInstance(); + Connection conn = DriverManager.getConnection("jdbc:MY://10.119.179.227:3306/jack?useSSL=false&allowPublicKeyRetrieval=true", "jack", "Gauss@123"); + Statement stmt = conn.createStatement(); + rs = stmt.executeQuery("select * from migration_table"); + } catch (SQLException e) { + e.printStackTrace(); + } catch (Exception e) { + e.printStackTrace(); + } + return rs; + } +} +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/14-example-logic-replication-code.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/14-example-logic-replication-code.md new file mode 100644 index 0000000000000000000000000000000000000000..6d1d422f169b6ac1bbc7ff9debe9972514a29e15 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/14-example-logic-replication-code.md @@ -0,0 +1,111 @@ +--- +title: Example Logic Replication Code +summary: Example Logic Replication Code +author: Guo Huan +date: 2021-04-26 +--- + +# Example Logic Replication Code + +The following example demonstrates how to use the logical replication function through the JDBC interface. + +```java +//Logical replication function example: file name, LogicalReplicationDemo.java +//Prerequisite: Add the IP address of the JDBC user machine to the database whitelist. Add the following content to pg_hba.conf: +//Assume that the IP address of the JDBC user machine is 10.10.10.10. +//host all all 10.10.10.10/32 sha256 +//host replication all 10.10.10.10/32 sha256 + +import org.postgresql.PGProperty; +import org.postgresql.jdbc.PgConnection; +import org.postgresql.replication.LogSequenceNumber; +import org.postgresql.replication.PGReplicationStream; + +import java.nio.ByteBuffer; +import java.sql.DriverManager; +import java.util.Properties; +import java.util.concurrent.TimeUnit; + +public class LogicalReplicationDemo { + public static void main(String[] args) { + String driver = "org.postgresql.Driver"; + //Set the IP address and port number of the database. + String sourceURL = "jdbc:postgresql://$ip:$port/postgres"; + PgConnection conn = null; + //The default name of the logical replication slot is replication_slot. + //Test mode: Create a logical replication slot. + int TEST_MODE_CREATE_SLOT = 1; + //Test mode: Enable logical replication (The prerequisite is that the logical replication slot already exists). + int TEST_MODE_START_REPL = 2; + //Test mode: Delete a logical replication slot. + int TEST_MODE_DROP_SLOT = 3; + //Enable different test modes. + int testMode = TEST_MODE_START_REPL; + + try { + Class.forName(driver); + } catch (Exception e) { + e.printStackTrace(); + return; + } + + try { + Properties properties = new Properties(); + PGProperty.USER.set(properties, "user"); + PGProperty.PASSWORD.set(properties, "passwd"); + //For logical replication, the following three attributes are mandatory. 
+ PGProperty.ASSUME_MIN_SERVER_VERSION.set(properties, "9.4"); + PGProperty.REPLICATION.set(properties, "database"); + PGProperty.PREFER_QUERY_MODE.set(properties, "simple"); + conn = (PgConnection) DriverManager.getConnection(sourceURL, properties); + System.out.println("connection success!"); + + if(testMode == TEST_MODE_CREATE_SLOT){ + conn.getReplicationAPI() + .createReplicationSlot() + .logical() + .withSlotName("replication_slot") + .withOutputPlugin("test_decoding") + .make(); + }else if(testMode == TEST_MODE_START_REPL) { + //Create a replication slot before enabling this mode. + LogSequenceNumber waitLSN = LogSequenceNumber.valueOf("6F/E3C53568"); + PGReplicationStream stream = conn + .getReplicationAPI() + .replicationStream() + .logical() + .withSlotName("replication_slot") + .withSlotOption("include-xids", false) + .withSlotOption("skip-empty-xacts", true) + .withStartPosition(waitLSN) + .start(); + while (true) { + ByteBuffer byteBuffer = stream.readPending(); + + if (byteBuffer == null) { + TimeUnit.MILLISECONDS.sleep(10L); + continue; + } + + int offset = byteBuffer.arrayOffset(); + byte[] source = byteBuffer.array(); + int length = source.length - offset; + System.out.println(new String(source, offset, length)); + + //If the LSN needs to be flushed, call the following APIs based on the service requirements: + //LogSequenceNumber lastRecv = stream.getLastReceiveLSN(); + //stream.setFlushedLSN(lastRecv); + //stream.forceUpdateStatus(); + + } + }else if(testMode == TEST_MODE_DROP_SLOT){ + conn.getReplicationAPI() + .dropReplicationSlot("replication_slot"); + } + } catch (Exception e) { + e.printStackTrace(); + return; + } + } +} +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/14.1-example-parameters-for-connecting-to-the-database-in-different-scenarios.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/14.1-example-parameters-for-connecting-to-the-database-in-different-scenarios.md new file mode 100644 index 0000000000000000000000000000000000000000..c953f095aed64b393e85994611d3b23b5a23b098 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/14.1-example-parameters-for-connecting-to-the-database-in-different-scenarios.md @@ -0,0 +1,60 @@ +--- +title: Parameters for Connecting to the Database in Different Scenarios +summary: Parameters for Connecting to the Database in Different Scenarios +author: Zhang Cuiping +date: 2021-10-11 +--- + +# Example: Parameters for Connecting to the Database in Different Scenarios + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** In the following example, **host:port** represents a node, where **host** indicates the name or IP address of the server where the database resides, and **port** indicates the port number of the server where the database resides. + +## DR + +A customer has two database instances. Database instance A is the production database instance, and database instance B is the DR database instance. When the customer performs a DR switchover, database instance A is demoted to the DR database instance, and database instance B is promoted the production database instance. In this case, to avoid application restart or re-release caused by modifications on the configuration file, the customer can write database instances A and B to the connection string when initializing the configuration file. 
If the primary database instance cannot be connected, the driver attempts to connect to the DR database instance. For example, database instance A consists of *node1*, *node2*, and *node3*, and database instance B consists of *node4*, *node5*, and *node6*. + +The URL can be configured as follows: + +``` +jdbc:postgresql://node1,node2,node3,node4,node5,node6/database?priorityServers=3 +``` + +## Load Balancing + +A customer has a centralized database instance that consists of one primary node and two standby nodes, that is, *node1*, *node2*, and *node3*. *node1* is the primary node, and *node2* and *node3* are the standby nodes. + +If the customer wants to evenly distribute the connections established on the same application to three nodes, the URL can be configured as follows: + +``` +jdbc:postgresql://node1,node2,node3/database?loadBalanceHosts=true +``` + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-caution.gif) **CAUTION:** When **loadBalanceHosts** is used, if the connection is established on the standby DN, write operations cannot be performed. If read and write operations are required, do not set this parameter. + +## Log Diagnosis + +If a customer encounters slow data import or some errors that are difficult to analyze, the trace log function can be enabled for diagnosis. The URL can be configured as follows: + +``` +jdbc:postgresql://node1/database?loggerLevel=trace&loggerFile=jdbc.log +``` + +## High Performance + +A customer may execute the same SQL statement for multiple times with different input parameters. To improve the execution efficiency, the **prepareThreshold** parameter can be enabled to avoid repeatedly generating execution plans. The URL can be configured as follows: + +``` +jdbc:postgresql://node1/database?prepareThreshold=5 +``` + +A customer queries 10 million data records at a time. To prevent memory overflow caused by simultaneous return of the data records, the **defaultRowFetchSize** parameter can be used. The URL can be configured as follows: + +``` +jdbc:postgresql://node1/database?defaultRowFetchSize=50000 +``` + +A customer needs to insert 10 million data records in batches. To improve efficiency, the **batchMode** parameter can be used. The URL can be configured as follows: + +``` +jdbc:postgresql://node1/database?batchMode=true +``` \ No newline at end of file diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/1-java-sql-Connection.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/1-java-sql-Connection.md new file mode 100644 index 0000000000000000000000000000000000000000..0598e5efa995bda8f5c6978c869fcc29a70a63f8 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/1-java-sql-Connection.md @@ -0,0 +1,67 @@ +--- +title: java.sql.Connection +summary: java.sql.Connection +author: Guo Huan +date: 2021-05-17 +--- + +# java.sql.Connection + +This section describes **java.sql.Connection**, the interface for connecting to a database. 
+ +**Table 1** Support status for java.sql.Connection + +| Method Name | Return Type | JDBC 4 Is Supported Or Not | +| :----------------------------------------------------------- | :------------------------------- | :------------------------- | +| abort(Executor executor) | void | Yes | +| clearWarnings() | void | Yes | +| close() | void | Yes | +| commit() | void | Yes | +| createArrayOf(String typeName, Object[] elements) | Array | Yes | +| createBlob() | Blob | Yes | +| createClob() | Clob | Yes | +| createSQLXML() | SQLXML | Yes | +| createStatement() | Statement | Yes | +| createStatement(int resultSetType, int resultSetConcurrency) | Statement | Yes | +| createStatement(int resultSetType, int resultSetConcurrency, int resultSetHoldability) | Statement | Yes | +| getAutoCommit() | Boolean | Yes | +| getCatalog() | String | Yes | +| getClientInfo() | Properties | Yes | +| getClientInfo(String name) | String | Yes | +| getHoldability() | int | Yes | +| getMetaData() | DatabaseMetaData | Yes | +| getNetworkTimeout() | int | Yes | +| getSchema() | String | Yes | +| getTransactionIsolation() | int | Yes | +| getTypeMap() | Map<String,Class<?>> | Yes | +| getWarnings() | SQLWarning | Yes | +| isClosed() | Boolean | Yes | +| isReadOnly() | Boolean | Yes | +| isValid(int timeout) | boolean | Yes | +| nativeSQL(String sql) | String | Yes | +| prepareCall(String sql) | CallableStatement | Yes | +| prepareCall(String sql, int resultSetType, int resultSetConcurrency) | CallableStatement | Yes | +| prepareCall(String sql, int resultSetType, int resultSetConcurrency, int resultSetHoldability) | CallableStatement | Yes | +| prepareStatement(String sql) | PreparedStatement | Yes | +| prepareStatement(String sql, int autoGeneratedKeys) | PreparedStatement | Yes | +| prepareStatement(String sql, int[] columnIndexes) | PreparedStatement | Yes | +| prepareStatement(String sql, int resultSetType, int resultSetConcurrency) | PreparedStatement | Yes | +| prepareStatement(String sql, int resultSetType, int resultSetConcurrency, int resultSetHoldability) | PreparedStatement | Yes | +| prepareStatement(String sql, String[] columnNames) | PreparedStatement | Yes | +| releaseSavepoint(Savepoint savepoint) | void | Yes | +| rollback() | void | Yes | +| rollback(Savepoint savepoint) | void | Yes | +| setAutoCommit(boolean autoCommit) | void | Yes | +| setClientInfo(Properties properties) | void | Yes | +| setClientInfo(String name,String value) | void | Yes | +| setHoldability(int holdability) | void | Yes | +| setNetworkTimeout(Executor executor, int milliseconds) | void | Yes | +| setReadOnly(boolean readOnly) | void | Yes | +| setSavepoint() | Savepoint | Yes | +| setSavepoint(String name) | Savepoint | Yes | +| setSchema(String schema) | void | Yes | +| setTransactionIsolation(int level) | void | Yes | +| setTypeMap(Map<String,Class<?>> map) | void | Yes | + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:** +> The AutoCommit mode is used by default within the interface. If you disable it by running **setAutoCommit(false)**, all the statements executed later will be packaged in explicit transactions, and you cannot execute statements that cannot be executed within transactions. 
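+
+For example, a minimal sketch of the transaction pattern implied by this notice, assuming a table **t_demo(id INTEGER)** exists: after **setAutoCommit(false)** is called, statements join an explicit transaction that must be ended with **commit()** or **rollback()**.
+
+```java
+// Minimal sketch: explicit transaction control on an open connection.
+// The table t_demo(id INTEGER) is assumed to exist.
+public static void runInTransaction(Connection conn) throws SQLException {
+    conn.setAutoCommit(false);      // Disable autocommit; subsequent statements join one transaction.
+    try (Statement stmt = conn.createStatement()) {
+        stmt.executeUpdate("INSERT INTO t_demo VALUES (1)");
+        conn.commit();              // Persist the change.
+    } catch (SQLException e) {
+        conn.rollback();            // Undo uncommitted work on failure.
+        throw e;
+    } finally {
+        conn.setAutoCommit(true);   // Restore the default autocommit mode.
+    }
+}
+```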
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/10-javax-sql-DataSource.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/10-javax-sql-DataSource.md
new file mode 100644
index 0000000000000000000000000000000000000000..0aeccd036e7e7b833f90f25bc62ea06c3abed656
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/10-javax-sql-DataSource.md
@@ -0,0 +1,21 @@
+---
+title: javax.sql.DataSource
+summary: javax.sql.DataSource
+author: Guo Huan
+date: 2021-05-17
+---
+
+# javax.sql.DataSource
+
+This section describes **javax.sql.DataSource**, the interface for data sources.
+
+**Table 1** Support status for javax.sql.DataSource
+
+| Method Name                                     | Return Type | Support JDBC 4 |
+| :---------------------------------------------- | :---------- | :------------- |
+| getConnection()                                 | Connection  | Yes            |
+| getConnection(String username,String password)  | Connection  | Yes            |
+| getLoginTimeout()                               | int         | Yes            |
+| getLogWriter()                                  | PrintWriter | Yes            |
+| setLoginTimeout(int seconds)                    | void        | Yes            |
+| setLogWriter(PrintWriter out)                   | void        | Yes            |
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/11-javax-sql-PooledConnection.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/11-javax-sql-PooledConnection.md
new file mode 100644
index 0000000000000000000000000000000000000000..b1f4004dc0fcf426efcfc75bf4d586cac2d2ebb3
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/11-javax-sql-PooledConnection.md
@@ -0,0 +1,19 @@
+---
+title: javax.sql.PooledConnection
+summary: javax.sql.PooledConnection
+author: Guo Huan
+date: 2021-05-17
+---
+
+# javax.sql.PooledConnection
+
+This section describes **javax.sql.PooledConnection**, the connection interface created by a connection pool.
+
+**Table 1** Support status for javax.sql.PooledConnection
+
+| Method Name | Return Type | JDBC 4 Is Supported Or Not |
+| :----------------------------------------------------------- | :---------- | :------------------------- |
+| addConnectionEventListener (ConnectionEventListener listener) | void | Yes |
+| close() | void | Yes |
+| getConnection() | Connection | Yes |
+| removeConnectionEventListener (ConnectionEventListener listener) | void | Yes |
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/12-javax-naming-Context.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/12-javax-naming-Context.md
new file mode 100644
index 0000000000000000000000000000000000000000..26da1f5a0b5d258855b70ea3967d650844c19c6c
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/12-javax-naming-Context.md
@@ -0,0 +1,25 @@
+---
+title: javax.naming.Context
+summary: javax.naming.Context
+author: Guo Huan
+date: 2021-05-17
+---
+
+# javax.naming.Context
+
+This section describes **javax.naming.Context**, the context interface for connection configuration.
+ +**Table 1** Support status for javax.naming.Context + +| Method Name | Return Type | Support JDBC 4 | +| :------------------------------------- | :---------- | :------------- | +| bind(Name name, Object obj) | void | Yes | +| bind(String name, Object obj) | void | Yes | +| lookup(Name name) | Object | Yes | +| lookup(String name) | Object | Yes | +| rebind(Name name, Object obj) | void | Yes | +| rebind(String name, Object obj) | void | Yes | +| rename(Name oldName, Name newName) | void | Yes | +| rename(String oldName, String newName) | void | Yes | +| unbind(Name name) | void | Yes | +| unbind(String name) | void | Yes | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/13-javax-naming-spi-InitialContextFactory.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/13-javax-naming-spi-InitialContextFactory.md new file mode 100644 index 0000000000000000000000000000000000000000..ef15dfccd645fedeb9a5175528da0703ebf2a40a --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/13-javax-naming-spi-InitialContextFactory.md @@ -0,0 +1,16 @@ +--- +title: javax.naming.spi.InitialContextFactory +summary: javax.naming.spi.InitialContextFactory +author: Guo Huan +date: 2021-05-17 +--- + +# javax.naming.spi.InitialContextFactory + +This section describes **javax.naming.spi.InitialContextFactory**, the initial context factory interface. + +**Table 1** Support status for javax.naming.spi.InitialContextFactory + +| Method Name | Return Type | Support JDBC 4 | +| :-------------------------------------------------- | :---------- | :------------- | +| getInitialContext(Hashtable<?,?> environment) | Context | Yes | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/14-CopyManager.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/14-CopyManager.md new file mode 100644 index 0000000000000000000000000000000000000000..258d60683e2651f8cbbb5a8b13fbb47feb924e95 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/14-CopyManager.md @@ -0,0 +1,40 @@ +--- +title: CopyManager +summary: CopyManager +author: Guo Huan +date: 2021-05-17 +--- + +# CopyManager + +CopyManager is an API class provided by the JDBC driver in MogDB. It is used to import data to MogDB in batches. + +## Inheritance Relationship of CopyManager + +The CopyManager class is in the **org.postgresql.copy** package and inherits the java.lang.Object class. The declaration of the class is as follows: + +```java +public class CopyManager +extends Object +``` + +## Construction Method + +public CopyManager(BaseConnection connection) + +throws SQLException + +## Common Methods + +**Table 1** Common methods of CopyManager + +| Return Value | Method | Description | throws | +| :----------------------- | :------------------- | :------------------- | :------------------- | +| CopyIn | copyIn(String sql) | - | SQLException | +| long | copyIn(String sql, InputStream from) | Uses **COPY FROM STDIN** to quickly load data to tables in the database from InputStream. | SQLException,IOException | +| long | copyIn(String sql, InputStream from, int bufferSize) | Uses **COPY FROM STDIN** to quickly load data to tables in the database from InputStream. | SQLException,IOException | +| long | copyIn(String sql, Reader from) | Uses **COPY FROM STDIN** to quickly load data to tables in the database from Reader. 
| SQLException,IOException | +| long | copyIn(String sql, Reader from, int bufferSize) | Uses **COPY FROM STDIN** to quickly load data to tables in the database from Reader. | SQLException,IOException | +| CopyOut | copyOut(String sql) | - | SQLException | +| long | copyOut(String sql, OutputStream to) | Sends the result set of **COPY TO STDOUT** from the database to the OutputStream class. | SQLException,IOException | +| long | copyOut(String sql, Writer to) | Sends the result set of **COPY TO STDOUT** from the database to the Writer class. | SQLException,IOException | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/2-java-sql-CallableStatement.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/2-java-sql-CallableStatement.md new file mode 100644 index 0000000000000000000000000000000000000000..dc493d97a52c8b9ae60e13d2c77609c5e530a108 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/2-java-sql-CallableStatement.md @@ -0,0 +1,46 @@ +--- +title: java.sql.CallableStatement +summary: java.sql.CallableStatement +author: Guo Huan +date: 2021-05-17 +--- + +# java.sql.CallableStatement + +This section describes **java.sql.CallableStatement**, the interface for executing the stored procedure. + +**Table 1** Support status for java.sql.CallableStatement + +| Method Name | Return Type | JDBC 4 Is Supported Or Not | +| :------------------------------------------------- | :---------- | :------------------------- | +| getArray(int parameterIndex) | Array | Yes | +| getBigDecimal(int parameterIndex) | BigDecimal | Yes | +| getBlob(int parameterIndex) | Blob | Yes | +| getBoolean(int parameterIndex) | boolean | Yes | +| getByte(int parameterIndex) | byte | Yes | +| getBytes(int parameterIndex) | byte[] | Yes | +| getClob(int parameterIndex) | Clob | Yes | +| getDate(int parameterIndex) | Date | Yes | +| getDate(int parameterIndex, Calendar cal) | Date | Yes | +| getDouble(int parameterIndex) | double | Yes | +| getFloat(int parameterIndex) | float | Yes | +| getInt(int parameterIndex) | int | Yes | +| getLong(int parameterIndex) | long | Yes | +| getObject(int parameterIndex) | Object | Yes | +| getObject(int parameterIndex, Class<T> type) | Object | Yes | +| getShort(int parameterIndex) | short | Yes | +| getSQLXML(int parameterIndex) | SQLXML | Yes | +| getString(int parameterIndex) | String | Yes | +| getNString(int parameterIndex) | String | Yes | +| getTime(int parameterIndex) | Time | Yes | +| getTime(int parameterIndex, Calendar cal) | Time | Yes | +| getTimestamp(int parameterIndex) | Timestamp | Yes | +| getTimestamp(int parameterIndex, Calendar cal) | Timestamp | Yes | +| registerOutParameter(int parameterIndex, int type) | void | Yes | +| wasNull() | Boolean | Yes | + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> +> - The batch operation of statements containing OUT parameter is not allowed. +> - The following methods are inherited from java.sql.Statement: close, execute, executeQuery, executeUpdate, getConnection, getResultSet, getUpdateCount, isClosed, setMaxRows, and setFetchSize. +> - The following methods are inherited from java.sql.PreparedStatement: addBatch, clearParameters, execute, executeQuery, executeUpdate, getMetaData, setBigDecimal, setBoolean, setByte, setBytes, setDate, setDouble, setFloat, setInt, setLong, setNull, setObject, setString, setTime, and setTimestamp. 
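+
+For example, a minimal sketch of the typical call pattern, assuming a stored procedure **test_proc(IN a INTEGER, OUT b INTEGER)** exists:
+
+```java
+// Minimal sketch: call a stored procedure and read its OUT parameter.
+CallableStatement cstmt = conn.prepareCall("{CALL test_proc(?, ?)}");
+cstmt.setInt(1, 10);                          // Bind the IN parameter.
+cstmt.registerOutParameter(2, Types.INTEGER); // Register the OUT parameter as an integer.
+cstmt.execute();
+int result = cstmt.getInt(2);                 // Read the OUT parameter after execution.
+cstmt.close();
+```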
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/3-java-sql-DatabaseMetaData.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/3-java-sql-DatabaseMetaData.md new file mode 100644 index 0000000000000000000000000000000000000000..bb39de1129ff7d110c05c6941159a67c1e1ed353 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/3-java-sql-DatabaseMetaData.md @@ -0,0 +1,189 @@ +--- +title: java.sql.DatabaseMetaData +summary: java.sql.DatabaseMetaData +author: Guo Huan +date: 2021-05-17 +--- + +# java.sql.DatabaseMetaData + +This section describes **java.sql.DatabaseMetaData**, the interface for defining database objects. + +**Table 1** Support status for java.sql.DatabaseMetaData + +| Method Name | Return Type | JDBC 4 Is Supported Or Not | +| :----------------------------------------------------------- | :----------- | :------------------------- | +| allProceduresAreCallable() | boolean | Yes | +| allTablesAreSelectable() | boolean | Yes | +| autoCommitFailureClosesAllResultSets() | boolean | Yes | +| dataDefinitionCausesTransactionCommit() | boolean | Yes | +| dataDefinitionIgnoredInTransactions() | boolean | Yes | +| deletesAreDetected(int type) | boolean | Yes | +| doesMaxRowSizeIncludeBlobs() | boolean | Yes | +| generatedKeyAlwaysReturned() | boolean | Yes | +| getBestRowIdentifier(String catalog, String schema, String table, int scope, boolean nullable) | ResultSet | Yes | +| getCatalogs() | ResultSet | Yes | +| getCatalogSeparator() | String | Yes | +| getCatalogTerm() | String | Yes | +| getClientInfoProperties() | ResultSet | Yes | +| getColumnPrivileges(String catalog, String schema, String table, String columnNamePattern) | ResultSet | Yes | +| getConnection() | Connection | Yes | +| getCrossReference(String parentCatalog, String parentSchema, String parentTable, String foreignCatalog, String foreignSchema, String foreignTable) | ResultSet | Yes | +| getDefaultTransactionIsolation() | int | Yes | +| getExportedKeys(String catalog, String schema, String table) | ResultSet | Yes | +| getExtraNameCharacters() | String | Yes | +| getFunctionColumns(String catalog, String schemaPattern, String functionNamePattern, String columnNamePattern) | ResultSet | Yes | +| getFunctions(String catalog, String schemaPattern, String functionNamePattern) | ResultSet | Yes | +| getIdentifierQuoteString() | String | Yes | +| getImportedKeys(String catalog, String schema, String table) | ResultSet | Yes | +| getIndexInfo(String catalog, String schema, String table, boolean unique, boolean approximate) | ResultSet | Yes | +| getMaxBinaryLiteralLength() | int | Yes | +| getMaxCatalogNameLength() | int | Yes | +| getMaxCharLiteralLength() | int | Yes | +| getMaxColumnNameLength() | int | Yes | +| getMaxColumnsInGroupBy() | int | Yes | +| getMaxColumnsInIndex() | int | Yes | +| getMaxColumnsInOrderBy() | int | Yes | +| getMaxColumnsInSelect() | int | Yes | +| getMaxColumnsInTable() | int | Yes | +| getMaxConnections() | int | Yes | +| getMaxCursorNameLength() | int | Yes | +| getMaxIndexLength() | int | Yes | +| getMaxLogicalLobSize() | default long | Yes | +| getMaxProcedureNameLength() | int | Yes | +| getMaxRowSize() | int | Yes | +| getMaxSchemaNameLength() | int | Yes | +| getMaxStatementLength() | int | Yes | +| getMaxStatements() | int | Yes | +| getMaxTableNameLength() | int | Yes | +| getMaxTablesInSelect() | int | Yes | +| getMaxUserNameLength() | int | Yes | +| 
getNumericFunctions() | String | Yes | +| getPrimaryKeys(String catalog, String schema, String table) | ResultSet | Yes | +| getProcedureColumns(String catalog, String schemaPattern, String procedureNamePattern, String columnNamePattern) | ResultSet | Yes | +| getProcedures(String catalog, String schemaPattern, String procedureNamePattern) | ResultSet | Yes | +| getProcedureTerm() | String | Yes | +| getSchemas() | ResultSet | Yes | +| getSchemas(String catalog, String schemaPattern) | ResultSet | Yes | +| getSchemaTerm() | String | Yes | +| getSearchStringEscape() | String | Yes | +| getSQLKeywords() | String | Yes | +| getSQLStateType() | int | Yes | +| getStringFunctions() | String | Yes | +| getSystemFunctions() | String | Yes | +| getTablePrivileges(String catalog, String schemaPattern, String tableNamePattern) | ResultSet | Yes | +| getTimeDateFunctions() | String | Yes | +| getTypeInfo() | ResultSet | Yes | +| getUDTs(String catalog, String schemaPattern, String typeNamePattern, int[] types) | ResultSet | Yes | +| getURL() | String | Yes | +| getVersionColumns(String catalog, String schema, String table) | ResultSet | Yes | +| insertsAreDetected(int type) | boolean | Yes | +| locatorsUpdateCopy() | boolean | Yes | +| othersDeletesAreVisible(int type) | boolean | Yes | +| othersInsertsAreVisible(int type) | boolean | Yes | +| othersUpdatesAreVisible(int type) | boolean | Yes | +| ownDeletesAreVisible(int type) | boolean | Yes | +| ownInsertsAreVisible(int type) | boolean | Yes | +| ownUpdatesAreVisible(int type) | boolean | Yes | +| storesLowerCaseIdentifiers() | boolean | Yes | +| storesMixedCaseIdentifiers() | boolean | Yes | +| storesUpperCaseIdentifiers() | boolean | Yes | +| supportsBatchUpdates() | boolean | Yes | +| supportsCatalogsInDataManipulation() | boolean | Yes | +| supportsCatalogsInIndexDefinitions() | boolean | Yes | +| supportsCatalogsInPrivilegeDefinitions() | boolean | Yes | +| supportsCatalogsInProcedureCalls() | boolean | Yes | +| supportsCatalogsInTableDefinitions() | boolean | Yes | +| supportsCorrelatedSubqueries() | boolean | Yes | +| supportsDataDefinitionAndDataManipulationTransactions() | boolean | Yes | +| supportsDataManipulationTransactionsOnly() | boolean | Yes | +| supportsGetGeneratedKeys() | boolean | Yes | +| supportsMixedCaseIdentifiers() | boolean | Yes | +| supportsMultipleOpenResults() | boolean | Yes | +| supportsNamedParameters() | boolean | Yes | +| supportsOpenCursorsAcrossCommit() | boolean | Yes | +| supportsOpenCursorsAcrossRollback() | boolean | Yes | +| supportsOpenStatementsAcrossCommit() | boolean | Yes | +| supportsOpenStatementsAcrossRollback() | boolean | Yes | +| supportsPositionedDelete() | boolean | Yes | +| supportsPositionedUpdate() | boolean | Yes | +| supportsRefCursors() | boolean | Yes | +| supportsResultSetConcurrency(int type, int concurrency) | boolean | Yes | +| supportsResultSetType(int type) | boolean | Yes | +| supportsSchemasInIndexDefinitions() | boolean | Yes | +| supportsSchemasInPrivilegeDefinitions() | boolean | Yes | +| supportsSchemasInProcedureCalls() | boolean | Yes | +| supportsSchemasInTableDefinitions() | boolean | Yes | +| supportsSelectForUpdate() | boolean | Yes | +| supportsStatementPooling() | boolean | Yes | +| supportsStoredFunctionsUsingCallSyntax() | boolean | Yes | +| supportsStoredProcedures() | boolean | Yes | +| supportsSubqueriesInComparisons() | boolean | Yes | +| supportsSubqueriesInExists() | boolean | Yes | +| supportsSubqueriesInIns() | boolean | Yes | +| 
supportsSubqueriesInQuantifieds() | boolean | Yes | +| supportsTransactionIsolationLevel(int level) | boolean | Yes | +| supportsTransactions() | boolean | Yes | +| supportsUnion() | boolean | Yes | +| supportsUnionAll() | boolean | Yes | +| updatesAreDetected(int type) | boolean | Yes | +| getTables(String catalog, String schemaPattern, String tableNamePattern, String[] types) | ResultSet | Yes | +| getColumns(String catalog, String schemaPattern, String tableNamePattern, String columnNamePattern) | ResultSet | Yes | +| getTableTypes() | ResultSet | Yes | +| getUserName() | String | Yes | +| isReadOnly() | boolean | Yes | +| nullsAreSortedHigh() | boolean | Yes | +| nullsAreSortedLow() | boolean | Yes | +| nullsAreSortedAtStart() | boolean | Yes | +| nullsAreSortedAtEnd() | boolean | Yes | +| getDatabaseProductName() | String | Yes | +| getDatabaseProductVersion() | String | Yes | +| getDriverName() | String | Yes | +| getDriverVersion() | String | Yes | +| getDriverMajorVersion() | int | Yes | +| getDriverMinorVersion() | int | Yes | +| usesLocalFiles() | boolean | Yes | +| usesLocalFilePerTable() | boolean | Yes | +| supportsMixedCaseIdentifiers() | boolean | Yes | +| storesUpperCaseIdentifiers() | boolean | Yes | +| storesLowerCaseIdentifiers() | boolean | Yes | +| supportsMixedCaseQuotedIdentifiers() | boolean | Yes | +| storesUpperCaseQuotedIdentifiers() | boolean | Yes | +| storesLowerCaseQuotedIdentifiers() | boolean | Yes | +| storesMixedCaseQuotedIdentifiers() | boolean | Yes | +| supportsAlterTableWithAddColumn() | boolean | Yes | +| supportsAlterTableWithDropColumn() | boolean | Yes | +| supportsColumnAliasing() | boolean | Yes | +| nullPlusNonNullIsNull() | boolean | Yes | +| supportsConvert() | boolean | Yes | +| supportsConvert(int fromType, int toType) | boolean | Yes | +| supportsTableCorrelationNames() | boolean | Yes | +| supportsDifferentTableCorrelationNames() | boolean | Yes | +| supportsExpressionsInOrderBy() | boolean | Yes | +| supportsOrderByUnrelated() | boolean | Yes | +| supportsGroupBy() | boolean | Yes | +| supportsGroupByUnrelated() | boolean | Yes | +| supportsGroupByBeyondSelect() | boolean | Yes | +| supportsLikeEscapeClause() | boolean | Yes | +| supportsMultipleResultSets() | boolean | Yes | +| supportsMultipleTransactions() | boolean | Yes | +| supportsNonNullableColumns() | boolean | Yes | +| supportsMinimumSQLGrammar() | boolean | Yes | +| supportsCoreSQLGrammar() | boolean | Yes | +| supportsExtendedSQLGrammar() | boolean | Yes | +| supportsANSI92EntryLevelSQL() | boolean | Yes | +| supportsANSI92IntermediateSQL() | boolean | Yes | +| supportsANSI92FullSQL() | boolean | Yes | +| supportsIntegrityEnhancementFacility() | boolean | Yes | +| supportsOuterJoins() | boolean | Yes | +| supportsFullOuterJoins() | boolean | Yes | +| supportsLimitedOuterJoins() | boolean | Yes | +| isCatalogAtStart() | boolean | Yes | +| supportsSchemasInDataManipulation() | boolean | Yes | +| supportsSavepoints() | boolean | Yes | +| supportsResultSetHoldability(int holdability) | boolean | Yes | +| getResultSetHoldability() | int | Yes | +| getDatabaseMajorVersion() | int | Yes | +| getDatabaseMinorVersion() | int | Yes | +| getJDBCMajorVersion() | int | Yes | +| getJDBCMinorVersion() | int | Yes | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/4-java-sql-Driver.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/4-java-sql-Driver.md new file mode 100644 index 
0000000000000000000000000000000000000000..cb99815436e25f30ee33cab95079467f9aab0a14 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/4-java-sql-Driver.md @@ -0,0 +1,22 @@ +--- +title: java.sql.Driver +summary: java.sql.Driver +author: Guo Huan +date: 2021-05-17 +--- + +# java.sql.Driver + +This section describes **java.sql.Driver**, the database driver interface. + +**Table 1** Support status for java.sql.Driver + +| Method Name | Return Type | JDBC 4 Is Supported Or Not | +| :------------------------------------------- | :------------------- | :------------------------- | +| acceptsURL(String url) | Boolean | Yes | +| connect(String url, Properties info) | Connection | Yes | +| jdbcCompliant() | Boolean | Yes | +| getMajorVersion() | int | Yes | +| getMinorVersion() | int | Yes | +| getParentLogger() | Logger | Yes | +| getPropertyInfo(String url, Properties info) | DriverPropertyInfo[] | Yes | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/5-java-sql-PreparedStatement.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/5-java-sql-PreparedStatement.md new file mode 100644 index 0000000000000000000000000000000000000000..50091489a626e20b507a6220852ed0e968204686 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/5-java-sql-PreparedStatement.md @@ -0,0 +1,70 @@ +--- +title: java.sql.PreparedStatement +summary: java.sql.PreparedStatement +author: Guo Huan +date: 2021-05-17 +--- + +# java.sql.PreparedStatement + +This section describes **java.sql.PreparedStatement**, the interface for preparing statements. + +**Table 1** Support status for java.sql.PreparedStatement + +| Method Name | Return Type | JDBC 4 Is Supported Or Not | +| :----------------------------------------------------------- | :---------------- | :------------------------- | +| clearParameters() | void | Yes | +| execute() | Boolean | Yes | +| executeQuery() | ResultSet | Yes | +| excuteUpdate() | int | Yes | +| executeLargeUpdate() | long | No | +| getMetaData() | ResultSetMetaData | Yes | +| getParameterMetaData() | ParameterMetaData | Yes | +| setArray(int parameterIndex, Array x) | void | Yes | +| setAsciiStream(int parameterIndex, InputStream x, int length) | void | Yes | +| setBinaryStream(int parameterIndex, InputStream x) | void | Yes | +| setBinaryStream(int parameterIndex, InputStream x, int length) | void | Yes | +| setBinaryStream(int parameterIndex, InputStream x, long length) | void | Yes | +| setBlob(int parameterIndex, InputStream inputStream) | void | Yes | +| setBlob(int parameterIndex, InputStream inputStream, long length) | void | Yes | +| setBlob(int parameterIndex, Blob x) | void | Yes | +| setCharacterStream(int parameterIndex, Reader reader) | void | Yes | +| setCharacterStream(int parameterIndex, Reader reader, int length) | void | Yes | +| setClob(int parameterIndex, Reader reader) | void | Yes | +| setClob(int parameterIndex, Reader reader, long length) | void | Yes | +| setClob(int parameterIndex, Clob x) | void | Yes | +| setDate(int parameterIndex, Date x, Calendar cal) | void | Yes | +| setNull(int parameterIndex, int sqlType) | void | Yes | +| setNull(int parameterIndex, int sqlType, String typeName) | void | Yes | +| setObject(int parameterIndex, Object x) | void | Yes | +| setObject(int parameterIndex, Object x, int targetSqlType) | void | Yes | +| setObject(int parameterIndex, Object x, int targetSqlType, int 
scaleOrLength) | void | Yes | +| setSQLXML(int parameterIndex, SQLXML xmlObject) | void | Yes | +| setTime(int parameterIndex, Time x) | void | Yes | +| setTime(int parameterIndex, Time x, Calendar cal) | void | Yes | +| setTimestamp(int parameterIndex, Timestamp x) | void | Yes | +| setTimestamp(int parameterIndex, Timestamp x, Calendar cal) | void | Yes | +| setUnicodeStream(int parameterIndex, InputStream x, int length) | void | Yes | +| setURL(int parameterIndex, URL x) | void | Yes | +| setBoolean(int parameterIndex, boolean x) | void | Yes | +| setBigDecimal(int parameterIndex, BigDecimal x) | void | Yes | +| setByte(int parameterIndex, byte x) | void | Yes | +| setBytes(int parameterIndex, byte[] x) | void | Yes | +| setDate(int parameterIndex, Date x) | void | Yes | +| setDouble(int parameterIndex, double x) | void | Yes | +| setFloat(int parameterIndex, float x) | void | Yes | +| setInt(int parameterIndex, int x) | void | Yes | +| setLong(int parameterIndex, long x) | void | Yes | +| setShort(int parameterIndex, short x) | void | Yes | +| setString(int parameterIndex, String x) | void | Yes | +| setNString(int parameterIndex, String x) | void | Yes | +| addBatch() | void | Yes | +| executeBatch() | int[] | Yes | + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> +> - Execute addBatch() and execute() only after running clearBatch(). +> - Batch is not cleared by calling executeBatch(). Clear batch by explicitly calling clearBatch(). +> - After bounded variables of a batch are added, if you want to reuse these values (add a batch again), set*() is not necessary. +> - The following methods are inherited from java.sql.Statement: close, execute, executeQuery, executeUpdate, getConnection, getResultSet, getUpdateCount, isClosed, setMaxRows, and setFetchSize. +> - The **executeLargeUpdate()** method can only be used in JDBC 4.2 or later. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/6-java-sql-ResultSet.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/6-java-sql-ResultSet.md new file mode 100644 index 0000000000000000000000000000000000000000..390cb97bc47eca24ad83ed1a653e4f4f7602d0a1 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/6-java-sql-ResultSet.md @@ -0,0 +1,154 @@ +--- +title: java.sql.ResultSet +summary: java.sql.ResultSet +author: Guo Huan +date: 2021-05-17 +--- + +# java.sql.ResultSet + +This section describes **java.sql.ResultSet**, the interface for execution result sets. 
+ +**Table 1** Support status for java.sql.ResultSet + +| Method Name | Return Type | JDBC 4 Is Supported Or Not | +| :----------------------------------------------------------- | :---------------- | :------------------------- | +| absolute(int row) | Boolean | Yes | +| afterLast() | void | Yes | +| beforeFirst() | void | Yes | +| cancelRowUpdates() | void | Yes | +| clearWarnings() | void | Yes | +| close() | void | Yes | +| deleteRow() | void | Yes | +| findColumn(String columnLabel) | int | Yes | +| first() | Boolean | Yes | +| getArray(int columnIndex) | Array | Yes | +| getArray(String columnLabel) | Array | Yes | +| getAsciiStream(int columnIndex) | InputStream | Yes | +| getAsciiStream(String columnLabel) | InputStream | Yes | +| getBigDecimal(int columnIndex) | BigDecimal | Yes | +| getBigDecimal(String columnLabel) | BigDecimal | Yes | +| getBinaryStream(int columnIndex) | InputStream | Yes | +| getBinaryStream(String columnLabel) | InputStream | Yes | +| getBlob(int columnIndex) | Blob | Yes | +| getBlob(String columnLabel) | Blob | Yes | +| getBoolean(int columnIndex) | Boolean | Yes | +| getBoolean(String columnLabel) | Boolean | Yes | +| getByte(int columnIndex) | byte | Yes | +| getBytes(int columnIndex) | byte[] | Yes | +| getByte(String columnLabel) | byte | Yes | +| getBytes(String columnLabel) | byte[] | Yes | +| getCharacterStream(int columnIndex) | Reader | Yes | +| getCharacterStream(String columnLabel) | Reader | Yes | +| getClob(int columnIndex) | Clob | Yes | +| getClob(String columnLabel) | Clob | Yes | +| getConcurrency() | int | Yes | +| getCursorName() | String | Yes | +| getDate(int columnIndex) | Date | Yes | +| getDate(int columnIndex, Calendar cal) | Date | Yes | +| getDate(String columnLabel) | Date | Yes | +| getDate(String columnLabel, Calendar cal) | Date | Yes | +| getDouble(int columnIndex) | double | Yes | +| getDouble(String columnLabel) | double | Yes | +| getFetchDirection() | int | Yes | +| getFetchSize() | int | Yes | +| getFloat(int columnIndex) | float | Yes | +| getFloat(String columnLabel) | float | Yes | +| getInt(int columnIndex) | int | Yes | +| getInt(String columnLabel) | int | Yes | +| getLong(int columnIndex) | long | Yes | +| getLong(String columnLabel) | long | Yes | +| getMetaData() | ResultSetMetaData | Yes | +| getObject(int columnIndex) | Object | Yes | +| getObject(int columnIndex, Class<T> type) | <T> T | Yes | +| getObject(int columnIndex, Map<String,Class<?>> map) | Object | Yes | +| getObject(String columnLabel) | Object | Yes | +| getObject(String columnLabel, Class<T> type) | <T> T | Yes | +| getObject(String columnLabel, Map<String,Class<?>> map) | Object | Yes | +| getRow() | int | Yes | +| getShort(int columnIndex) | short | Yes | +| getShort(String columnLabel) | short | Yes | +| getSQLXML(int columnIndex) | SQLXML | Yes | +| getSQLXML(String columnLabel) | SQLXML | Yes | +| getStatement() | Statement | Yes | +| getString(int columnIndex) | String | Yes | +| getString(String columnLabel) | String | Yes | +| getNString(int columnIndex) | String | Yes | +| getNString(String columnLabel) | String | Yes | +| getTime(int columnIndex) | Time | Yes | +| getTime(int columnIndex, Calendar cal) | Time | Yes | +| getTime(String columnLabel) | Time | Yes | +| getTime(String columnLabel, Calendar cal) | Time | Yes | +| getTimestamp(int columnIndex) | Timestamp | Yes | +| getTimestamp(int columnIndex, Calendar cal) | Timestamp | Yes | +| getTimestamp(String columnLabel) | Timestamp | Yes | +| getTimestamp(String columnLabel, 
Calendar cal) | Timestamp | Yes | +| getType() | int | Yes | +| getWarnings() | SQLWarning | Yes | +| insertRow() | void | Yes | +| isAfterLast() | Boolean | Yes | +| isBeforeFirst() | Boolean | Yes | +| isClosed() | Boolean | Yes | +| isFirst() | Boolean | Yes | +| isLast() | Boolean | Yes | +| last() | Boolean | Yes | +| moveToCurrentRow() | void | Yes | +| moveToInsertRow() | void | Yes | +| next() | Boolean | Yes | +| previous() | Boolean | Yes | +| refreshRow() | void | Yes | +| relative(int rows) | Boolean | Yes | +| rowDeleted() | Boolean | Yes | +| rowInserted() | Boolean | Yes | +| rowUpdated() | Boolean | Yes | +| setFetchDirection(int direction) | void | Yes | +| setFetchSize(int rows) | void | Yes | +| updateArray(int columnIndex, Array x) | void | Yes | +| updateArray(String columnLabel, Array x) | void | Yes | +| updateAsciiStream(int columnIndex, InputStream x, int length) | void | Yes | +| updateAsciiStream(String columnLabel, InputStream x, int length) | void | Yes | +| updateBigDecimal(int columnIndex, BigDecimal x) | void | Yes | +| updateBigDecimal(String columnLabel, BigDecimal x) | void | Yes | +| updateBinaryStream(int columnIndex, InputStream x, int length) | void | Yes | +| updateBinaryStream(String columnLabel, InputStream x, int length) | void | Yes | +| updateBoolean(int columnIndex, boolean x) | void | Yes | +| updateBoolean(String columnLabel, boolean x) | void | Yes | +| updateByte(int columnIndex, byte x) | void | Yes | +| updateByte(String columnLabel, byte x) | void | Yes | +| updateBytes(int columnIndex, byte[] x) | void | Yes | +| updateBytes(String columnLabel, byte[] x) | void | Yes | +| updateCharacterStream(int columnIndex, Reader x, int length) | void | Yes | +| updateCharacterStream(String columnLabel, Reader reader, int length) | void | Yes | +| updateDate(int columnIndex, Date x) | void | Yes | +| updateDate(String columnLabel, Date x) | void | Yes | +| updateDouble(int columnIndex, double x) | void | Yes | +| updateDouble(String columnLabel, double x) | void | Yes | +| updateFloat(int columnIndex, float x) | void | Yes | +| updateFloat(String columnLabel, float x) | void | Yes | +| updateInt(int columnIndex, int x) | void | Yes | +| updateInt(String columnLabel, int x) | void | Yes | +| updateLong(int columnIndex, long x) | void | Yes | +| updateLong(String columnLabel, long x) | void | Yes | +| updateNull(int columnIndex) | void | Yes | +| updateNull(String columnLabel) | void | Yes | +| updateObject(int columnIndex, Object x) | void | Yes | +| updateObject(int columnIndex, Object x, int scaleOrLength) | void | Yes | +| updateObject(String columnLabel, Object x) | void | Yes | +| updateObject(String columnLabel, Object x, int scaleOrLength) | void | Yes | +| updateRow() | void | Yes | +| updateShort(int columnIndex, short x) | void | Yes | +| updateShort(String columnLabel, short x) | void | Yes | +| updateSQLXML(int columnIndex, SQLXML xmlObject) | void | Yes | +| updateSQLXML(String columnLabel, SQLXML xmlObject) | void | Yes | +| updateString(int columnIndex, String x) | void | Yes | +| updateString(String columnLabel, String x) | void | Yes | +| updateTime(int columnIndex, Time x) | void | Yes | +| updateTime(String columnLabel, Time x) | void | Yes | +| updateTimestamp(int columnIndex, Timestamp x) | void | Yes | +| updateTimestamp(String columnLabel, Timestamp x) | void | Yes | +| wasNull() | Boolean | Yes | + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> +> - One Statement cannot have multiple open 
ResultSets. +> - The cursor that is used for traversing the ResultSet cannot be open after being committed. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/7-java-sql-ResultSetMetaData.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/7-java-sql-ResultSetMetaData.md new file mode 100644 index 0000000000000000000000000000000000000000..e8d43f9a30edd55393bc1fbede5d13d6e4b781f7 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/7-java-sql-ResultSetMetaData.md @@ -0,0 +1,36 @@ +--- +title: java.sql.ResultSetMetaData +summary: java.sql.ResultSetMetaData +author: Guo Huan +date: 2021-05-17 +--- + +# java.sql.ResultSetMetaData + +This section describes **java.sql.ResultSetMetaData**, which provides details about ResultSet object information. + +**Table 1** Support status for java.sql.ResultSetMetaData + +| Method Name | Return Type | JDBC 4 Is Supported Or Not | +| :------------------------------- | :---------- | :------------------------- | +| getCatalogName(int column) | String | Yes | +| getColumnClassName(int column) | String | Yes | +| getColumnCount() | int | Yes | +| getColumnDisplaySize(int column) | int | Yes | +| getColumnLabel(int column) | String | Yes | +| getColumnName(int column) | String | Yes | +| getColumnType(int column) | int | Yes | +| getColumnTypeName(int column) | String | Yes | +| getPrecision(int column) | int | Yes | +| getScale(int column) | int | Yes | +| getSchemaName(int column) | String | Yes | +| getTableName(int column) | String | Yes | +| isAutoIncrement(int column) | boolean | Yes | +| isCaseSensitive(int column) | boolean | Yes | +| isCurrency(int column) | boolean | Yes | +| isDefinitelyWritable(int column) | boolean | Yes | +| isNullable(int column) | int | Yes | +| isReadOnly(int column) | boolean | Yes | +| isSearchable(int column) | boolean | Yes | +| isSigned(int column) | boolean | Yes | +| isWritable(int column) | boolean | Yes | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/8-java-sql-Statement.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/8-java-sql-Statement.md new file mode 100644 index 0000000000000000000000000000000000000000..d3333ee2e427a91f87c980033b042a552cfe339e --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/8-java-sql-Statement.md @@ -0,0 +1,69 @@ +--- +title: java.sql.Statement +summary: java.sql.Statement +author: Guo Huan +date: 2021-05-17 +--- + +# java.sql.Statement + +This section describes **java.sql.Statement**, the interface for executing SQL statements. 
+ +**Table 1** Support status for java.sql.Statement + +| Method Name | Return Type | JDBC 4 Is Supported Or Not | +| :---------------------------------------------------- | :---------- | :------------------------- | +| addBatch(String sql) | void | Yes | +| clearBatch() | void | Yes | +| clearWarnings() | void | Yes | +| close() | void | Yes | +| closeOnCompletion() | void | Yes | +| execute(String sql) | Boolean | Yes | +| execute(String sql, int autoGeneratedKeys) | Boolean | Yes | +| execute(String sql, int[] columnIndexes) | Boolean | Yes | +| execute(String sql, String[] columnNames) | Boolean | Yes | +| executeBatch() | Boolean | Yes | +| executeQuery(String sql) | ResultSet | Yes | +| executeUpdate(String sql) | int | Yes | +| executeUpdate(String sql, int autoGeneratedKeys) | int | Yes | +| executeUpdate(String sql, int[] columnIndexes) | int | Yes | +| executeUpdate(String sql, String[] columnNames) | int | Yes | +| getConnection() | Connection | Yes | +| getFetchDirection() | int | Yes | +| getFetchSize() | int | Yes | +| getGeneratedKeys() | ResultSet | Yes | +| getMaxFieldSize() | int | Yes | +| getMaxRows() | int | Yes | +| getMoreResults() | boolean | Yes | +| getMoreResults(int current) | boolean | Yes | +| getResultSet() | ResultSet | Yes | +| getResultSetConcurrency() | int | Yes | +| getResultSetHoldability() | int | Yes | +| getResultSetType() | int | Yes | +| getQueryTimeout() | int | Yes | +| getUpdateCount() | int | Yes | +| getWarnings() | SQLWarning | Yes | +| isClosed() | Boolean | Yes | +| isCloseOnCompletion() | Boolean | Yes | +| isPoolable() | Boolean | Yes | +| setCursorName(String name) | void | Yes | +| setEscapeProcessing(boolean enable) | void | Yes | +| setFetchDirection(int direction) | void | Yes | +| setMaxFieldSize(int max) | void | Yes | +| setMaxRows(int max) | void | Yes | +| setPoolable(boolean poolable) | void | Yes | +| setQueryTimeout(int seconds) | void | Yes | +| setFetchSize(int rows) | void | Yes | +| cancel() | void | Yes | +| executeLargeUpdate(String sql) | long | No | +| getLargeUpdateCount() | long | No | +| executeLargeBatch() | long | No | +| executeLargeUpdate(String sql, int autoGeneratedKeys) | long | No | +| executeLargeUpdate(String sql, int[] columnIndexes) | long | No | +| executeLargeUpdate(String sql, String[] columnNames) | long | No | + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> +> - Using setFetchSize can reduce the memory occupied by result sets on the client. Result sets are packaged into cursors and segmented for processing, which will increase the communication traffic between the database and the client, affecting performance. +> - Database cursors are valid only within their transactions. If **setFetchSize** is set, set **setAutoCommit(false)** and commit transactions on the connection to flush service data to a database. +> - **LargeUpdate** methods can only be used in JDBC 4.2 or later. 
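+
+**Example**
+
+The following is a minimal usage sketch, not part of the driver documentation itself: it only illustrates how a few of the supported **Statement** methods fit together. The connection URL, user name, password, and table name are placeholder values chosen for illustration and must be replaced with your own.
+
+```java
+import java.sql.Connection;
+import java.sql.DriverManager;
+import java.sql.ResultSet;
+import java.sql.SQLException;
+import java.sql.Statement;
+
+public class StatementExample {
+    public static void main(String[] args) throws SQLException {
+        // Placeholder URL and credentials; replace them with real values.
+        try (Connection conn = DriverManager.getConnection(
+                "jdbc:postgresql://127.0.0.1:5432/postgres", "user", "password");
+             Statement stmt = conn.createStatement()) {
+            // DDL and DML statements are run with executeUpdate and return an update count.
+            stmt.executeUpdate("CREATE TABLE IF NOT EXISTS statement_example_t(id INTEGER, name VARCHAR(32))");
+            stmt.executeUpdate("INSERT INTO statement_example_t VALUES (1, 'a'), (2, 'b')");
+            // Queries are run with executeQuery and return a ResultSet.
+            try (ResultSet rs = stmt.executeQuery("SELECT id, name FROM statement_example_t ORDER BY id")) {
+                while (rs.next()) {
+                    System.out.println(rs.getInt(1) + " " + rs.getString(2));
+                }
+            }
+        }
+    }
+}
+```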
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/9-javax-sql-ConnectionPoolDataSource.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/9-javax-sql-ConnectionPoolDataSource.md new file mode 100644 index 0000000000000000000000000000000000000000..73d1518fb6b9ee1849a7d1fba13d185e3d87b1d1 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/15-JDBC/9-javax-sql-ConnectionPoolDataSource.md @@ -0,0 +1,17 @@ +--- +title: javax.sql.ConnectionPoolDataSource +summary: javax.sql.ConnectionPoolDataSource +author: Guo Huan +date: 2021-05-17 +--- + +# javax.sql.ConnectionPoolDataSource + +This section describes **javax.sql.ConnectionPoolDataSource**, the interface for data source connection pools. + +**Table 1** Support status for javax.sql.ConnectionPoolDataSource + +| Method Name | Return Type | JDBC 4 Is Supported Or Not | +| :----------------------------------------------- | :--------------- | :------------------------- | +| getPooledConnection() | PooledConnection | Yes | +| getPooledConnection(String user,String password) | PooledConnection | Yes | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/2-jdbc-package-driver-class-and-environment-class.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/2-jdbc-package-driver-class-and-environment-class.md new file mode 100644 index 0000000000000000000000000000000000000000..b25547806847ed5d3ca1f71cee3cc0a8f1f09597 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/2-jdbc-package-driver-class-and-environment-class.md @@ -0,0 +1,51 @@ +--- +title: JDBC Package, Driver Class, and Environment Class +summary: JDBC Package, Driver Class, and Environment Class +author: Guo Huan +date: 2021-04-26 +--- + +# JDBC Package, Driver Class, and Environment Class + +**JDBC Package** + +Run **build.sh** in the source code directory on Linux OS to obtain the driver JAR package **postgresql.jar**, which is stored in the source code directory. Obtain the package from the release package named [**openGauss-x.x.x-JDBC.tar.gz**](https://opengauss.org/en/download.html). + +The driver package is compatible with PostgreSQL. The class name and structure in the driver are the same as those in the PostgreSQL driver. All applications running on PostgreSQL can be smoothly migrated to the current system. + +**Driver Class** + +Before establishing a database connection, load the **org.postgresql.Driver** database driver class. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> +> - MogDB is compatible with PostgreSQL in the use of JDBC. Therefore, when two JDBC drivers are used in the same process, class names may conflict. +> +> - Compared with the PostgreSQL driver, the openGauss JDBC driver has the following enhanced features: +> - The SHA256 encryption mode is supported for login. +> - The third-party log framework that implements the sf4j API can be connected. +> - DR failover is supported. + +**Environment Class** + +JDK 1.8 must be configured on the client. The configuration method is as follows: + +1. In the MS-DOS window, run **java -version** to check the JDK version. Ensure that the version is JDK 1.8. If JDK is not installed, download the installation package from the official website and install it. + +2. Configure system environment variables. + + 1. Right-click **My computer** and choose **Properties**. 
+ + 2. In the navigation pane, choose **Advanced system settings**. + + 3. In the **System Properties** dialog box, click **Environment Variables** on the **Advanced** tab page. + + 4. In the **System variables** area of the **Environment Variables** dialog box, click **New** or **Edit** to configure system variables. For details, see [Table 1](#Description). + + **Table 1** Description + + | Variable | Operation | Variable Value | + | :-------- | :----------------------------------------------------------- | :----------------------------------------------------------- | + | JAVA_HOME | - If the variable exists, click **Edit**.
- If the variable does not exist, click **New**. | Specifies the Java installation directory.
Example: C:\Program Files\Java\jdk1.8.0_131 | + | Path | Edit | - If JAVA_HOME is configured, add **%JAVA_HOME%\bin** before the variable value.
- If JAVA_HOME is not configured, add the full Java installation path before the variable value:
C:\Program Files\Java\jdk1.8.0_131\bin; | + | CLASSPATH | New | .;%JAVA_HOME%\lib;%JAVA_H | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/3-development-process.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/3-development-process.md new file mode 100644 index 0000000000000000000000000000000000000000..e73100d573bff2b2082b342988768e7879df88f2 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/3-development-process.md @@ -0,0 +1,12 @@ +--- +title: Development Process +summary: Development Process +author: Guo Huan +date: 2021-04-26 +--- + +# Development Process + +**Figure 1** Application development process based on JDBC + +![application-development-process-based-on-jdbc](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/development-process-2.png) diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/4-loading-the-driver.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/4-loading-the-driver.md new file mode 100644 index 0000000000000000000000000000000000000000..246146e23e08dc20044f575372e0d642866019b3 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/4-loading-the-driver.md @@ -0,0 +1,19 @@ +--- +title: Loading the Driver +summary: Loading the Driver +author: Guo Huan +date: 2021-04-26 +--- + +# Loading the Driver + +Load the database driver before creating a database connection. + +You can load the driver in the following ways: + +- Before creating a connection, implicitly load the driver in the code:**Class.forName("org.postgresql.Driver")** + +- During the JVM startup, transfer the driver as a parameter to the JVM:**java -Djdbc.drivers=org.postgresql.Driver jdbctest** + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > **jdbctest** is the name of a test application. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/5-connecting-to-a-database.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/5-connecting-to-a-database.md new file mode 100644 index 0000000000000000000000000000000000000000..13682723215a7b23be3444b6d8a47d54bcfd57dd --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/5-connecting-to-a-database.md @@ -0,0 +1,105 @@ +--- +title: Connecting to a Database +summary: Connecting to a Database +author: Guo Huan +date: 2021-04-26 +--- + +# Connecting to a Database + +After a database is connected, you can use JDBC to run SQL statements to operate data. + +**Function Prototype** + +JDBC provides the following three database connection methods: + +- DriverManager.getConnection(String url); +- DriverManager.getConnection(String url, Properties info); +- DriverManager.getConnection(String url, String user, String password); + +**Parameters** + +**Table 1** Database connection parameters + +| Parameter | Description | +| --------- | ------------------------------------------------------------ | +| url | **postgresql.jar** database connection descriptor. The format is as follows:
- jdbc:postgresql:database
- jdbc:postgresql://host/database
- jdbc:postgresql://host:port/database
- jdbc:postgresql://host:port/database?param1=value1&param2=value2
- jdbc:postgresql://host1:port1,host2:port2/database?param1=value1&param2=value2
NOTE:
- **database** indicates the name of the database to connect to.
- **host** indicates the name or IP address of the database server.
If a machine connected to MogDB is not in the same network segment as MogDB, the IP address specified by **host** should be the value of **coo.cooListenIp2** (application access IP address) set in Manager.
For security purposes, the primary database node forbids access from other nodes in MogDB without authentication. To access the primary database node from inside MogDB, deploy the JDBC program on the host where the primary database node is located and set **host** to **127.0.0.1**. Otherwise, the error message "FATAL: Forbid remote connection with trust method!" may be displayed.
It is recommended that the service system be deployed outside MogDB. If it is deployed inside, database performance may be affected.
By default, the local host is used to connect to the server.
- **port** indicates the port number of the database server.
By default, the database on port 5432 of the local host is connected.
- **param** indicates a database connection attribute.
The parameter can be configured in the URL. The parameter list starts after a question mark (?); each parameter is assigned a value with an equal sign (=), and parameters are separated by ampersands (&). You can also use the attributes of the **info** object for configuration. For details, see the example below.
- **value** indicates the database connection attribute values.
The **connectTimeout** and **socketTimeout** parameters must be set for connection. If they are not set, the default value **0** is used, indicating that the connection will not time out. When the network between the DN and client is faulty, the client does not receive the ACK packet from the DN. In this case, the client starts the timeout retransmission mechanism to continuously retransmit packets. A timeout error is reported only when the timeout interval reaches the default value **600s**. As a result, the RTO is high. | +| info | Database connection attributes (all attributes are case sensitive). Common attributes are described as follows:
- **PGDBNAME**: string type. This parameter specifies the database name. (This parameter does not need to be set in the URL. The system automatically parses the URL to obtain its value.)
- **PGHOST**: string type. This parameter specifies the host IP address. For details, see the example below.
- **PGPORT**: integer type. This parameter specifies the host port number. For details, see the example below.
- **user**: string type. This parameter specifies the database user who creates the connection.
- **password**: string type. This parameter specifies the password of the database user.
- **enable_ce**: string type. If **enable_ce** is set to **1**, JDBC supports encrypted equality query.
- **loggerLevel**: string type. The following log levels are supported: **OFF**, **DEBUG**, and **TRACE**. The value **OFF** indicates that the log function is disabled. **DEBUG** and **TRACE** logs record information of different levels.
- **loggerFile**: string type. This parameter specifies the name of a log file. You can specify a directory for storing logs. If no directory is specified, logs are stored in the directory where the client program is running.
- **allowEncodingChanges**: Boolean type. If this parameter is set to **true**, the character set type can be changed. This parameter is used together with **characterEncoding=CHARSET** to set the character set. The two parameters are separated by ampersands (&). The value of **characterEncoding** can be **UTF8**, **GBK**, or **LATIN1**.
- **currentSchema**: string type. This parameter specifies the schema to be set in **search_path**.
- **hostRecheckSeconds**: integer type. After JDBC attempts to connect to a host, the host status is saved: connection success or connection failure. This status is trusted within the duration specified by **hostRecheckSeconds**. After the duration expires, the status becomes invalid. The default value is 10 seconds.
- **ssl**: Boolean type. This parameter specifies a connection in SSL mode. When **ssl** is set to **true**, the NonValidatingFactory channel and certificate mode are supported.
1. For the NonValidatingFactory channel, configure the username and password and set **SSL** to **true**.
2. In certificate mode, configure the client certificate, key, and root certificate, and set **SSL** to **true**.
- **sslmode**: string type. This parameter specifies the SSL authentication mode. The value can be **require**, **verify-ca**, or **verify-full**.
- **require**: The system attempts to set up an SSL connection. If there is a CA file, the system performs verification as if the parameter was set to **verify-ca**.
- **verify-ca**: The system attempts to set up an SSL connection and checks whether the server certificate is issued by a trusted CA.
- **verify-full**: The system attempts to set up an SSL connection, checks whether the server certificate is issued by a trusted CA, and checks whether the host name of the server is the same as that in the certificate.
- **sslcert**: string type. This parameter specifies the complete path of the certificate file. The type of the client and server certificates is **End Entity**.
- **sslkey**: string type. This parameter specifies the complete path of the key file. You must run the following command to convert the client key to the DER format:
`openssl pkcs8 -topk8 -outform DER -in client.key -out client.key.pk8 -nocrypt`
- **sslrootcert**: string type. This parameter specifies the name of the SSL root certificate. The root certificate type is CA.
- **sslpassword**: string type. This parameter is provided for ConsoleCallbackHandler.
- **sslpasswordcallback**: string type. This parameter specifies the class name of the SSL password provider. The default value is **org.postgresql.ssl.jdbc4.LibPQFactory.ConsoleCallbackHandler**.
- **sslfactory**: string type. This parameter specifies the class name used by SSLSocketFactory to establish an SSL connection.
- **sslfactoryarg**: string type. The value is an optional parameter of the constructor function of the **sslfactory** class and is not recommended.
- **sslhostnameverifier**: string type. This parameter specifies the class name of the host name verifier. The interface must implement javax.net.ssl.HostnameVerifier. The default value is **org.postgresql.ssl.PGjdbcHostnameVerifier**.
- **loginTimeout**: integer type. This parameter specifies the waiting time for establishing the database connection, in seconds.
- **connectTimeout**: integer type. This parameter specifies the timeout duration for connecting to a server, in seconds. If the time taken to connect to a server exceeds the value specified, the connection is interrupted. If the value is **0**, the timeout mechanism is disabled.
- **socketTimeout**: integer type. This parameter specifies the timeout duration for a socket read operation, in seconds. If the time taken to read data from a server exceeds the value specified, the connection is closed. If the value is **0**, the timeout mechanism is disabled.
- **cancelSignalTimeout**: integer type. Sending a cancel message may itself block. This parameter controls the connect and socket timeouts used when sending cancel messages, in seconds. The default value is 10 seconds.
- **tcpKeepAlive**: Boolean type. This parameter is used to enable or disable TCP keepalive detection. The default value is **false**.
- **logUnclosedConnections**: Boolean type. The client may leak a connection object by failing to call its close() method. Such objects are eventually collected as garbage and finalized using the finalize() method. If the caller has not closed the connection, finalization closes it.
- **assumeMinServerVersion**: string type. This parameter specifies the minimum server version to assume for the connection, for example, **assumeMinServerVersion=9.0**. It allows the client to send some settings (such as the floating-point format) during connection setup, reducing the number of packets sent.
- **ApplicationName**: string type. This parameter specifies the name of the JDBC driver that is being connected. You can query the **pg_stat_activity** table on the primary database node to view information about the client that is being connected. The JDBC driver name is displayed in the **application_name** column. The default value is **PostgreSQL JDBC Driver**.
- **connectionExtraInfo**: Boolean type. This parameter specifies whether the JDBC driver reports the driver deployment path and process owner to the database. The value can be **true** or **false**. The default value is **false**. If **connectionExtraInfo** is set to **true**, the JDBC driver reports the driver deployment path, process owner, and URL connection configuration information to the database and displays the information in the **connection_info** parameter. In this case, you can query the information from **PG_STAT_ACTIVITY**.
- **autosave**: string type. The value can be **always**, **never**, or **conservative**. The default value is **never**. This parameter specifies the action that the driver should perform upon a query failure. If **autosave** is set to **always**, the JDBC driver sets a savepoint before each query and rolls back to the savepoint if the query fails. If **autosave** is set to **never**, there is no savepoint. If **autosave** is set to **conservative**, a savepoint is set for each query. However, the system rolls back and retries only when there is an invalid statement.
- **protocolVersion**: integer type. This parameter specifies the connection protocol version. Only version 3 is supported. Note: MD5 encryption is used when this parameter is set. You must use the following command to change the database encryption mode: **gs_guc set -N all -I all -c "password_encryption_type=1"**. After MogDB is restarted, create a user whose password is encrypted with MD5, change the client connection method to **md5** in the **pg_hba.conf** file, and log in as that user (this approach is not recommended).
NOTE:
The MD5 encryption algorithm has lower security and poses security risks. Therefore, you are advised to use a more secure encryption algorithm.
- **prepareThreshold**: integer type. This parameter specifies when the parse request is sent for a statement. The default value is **5**. Parsing an SQL statement takes a long time the first time, but subsequent parses are fast because the result is cached. If a session runs the same SQL statement consecutively and the number of executions exceeds the value of **prepareThreshold**, JDBC no longer sends the parse request for that statement.
- **preparedStatementCacheQueries**: integer type. This parameter specifies the number of queries cached in each connection. The default value is **256**. If more than 256 different queries are used in the prepareStatement() call, the least recently used query cache will be discarded. The value **0** indicates that the cache function is disabled.
- **preparedStatementCacheSizeMiB**: integer type. This parameter specifies the maximum cache size of each connection, in MB. The default value is **5**. If the size of the cached queries exceeds 5 MB, the least recently used query cache will be discarded. The value **0** indicates that the cache function is disabled.
- **databaseMetadataCacheFields**: integer type. The default value is **65536**. This parameter specifies the maximum number of fields cached per connection. The value **0** indicates that the cache function is disabled.
- **databaseMetadataCacheFieldsMiB**: integer type. The default value is **5**. This parameter specifies the maximum cache size of each connection, in MB. The value **0** indicates that the cache function is disabled.
- **stringtype**: string type. The value can be **false**, **unspecified**, or **varchar**. The default value is **varchar**. This parameter specifies the type of the **PreparedStatement** parameter used by the setString() method. If **stringtype** is set to **varchar**, these parameters are sent to the server as varchar parameters. If **stringtype** is set to **unspecified**, these parameters are sent to the server as an untyped value, and the server attempts to infer their appropriate type.
- **batchMode**: Boolean type. This parameter specifies whether to connect the database in batch mode. The default value is **on**, indicating that the batch mode is enabled.
- **fetchsize**: integer type. This parameter specifies the default fetchsize for statements in the created connection. The default value is **0**, indicating that all results are obtained at a time.
- **reWriteBatchedInserts**: Boolean type. During batch import, set this parameter to **true** to combine **N** insertion statements into one: insert into TABLE_NAME values(values1, …, valuesN), …, (values1, …, valuesN). To use this parameter, set **batchMode** to **off**.
- **unknownLength**: integer type. The default value is **Integer.MAX\_VALUE**. This parameter specifies the length of the unknown length type when the data of some postgresql types (such as TEXT) is returned by functions such as ResultSetMetaData.getColumnDisplaySize and ResultSetMetaData.getPrecision.
- **defaultRowFetchSize**: integer type. This parameter specifies the number of rows read by each fetch in a ResultSet. Limiting the number of rows read per database access request avoids unnecessary memory consumption and therefore out-of-memory exceptions. The default value is **0**, indicating that all rows are obtained at a time in the ResultSet. Negative values are not allowed.
- **binaryTransfer**: Boolean type. This parameter specifies whether data is sent and received in binary format. The default value is **false**.
- **binaryTransferEnable**: string type. This parameter specifies the type for which binary transmission is enabled. Every two types are separated by commas (,). You can select either the OID or name, for example, binaryTransferEnable=Integer4_ARRAY,Integer8_ARRAY.
For example, if the OID name is **BLOB** and the OID number is 88, you can configure the OID as follows:
**binaryTransferEnable=BLOB or binaryTransferEnable=88**
- **binaryTransferDisEnable**: string type. This parameter specifies the type for which binary transmission is disabled. Every two types are separated by commas (,). You can select either the OID or name. The value of this parameter overwrites the value of **binaryTransferEnable**.
- **blobMode**: string type. This parameter sets the setBinaryStream method to assign values to different types of data. The value **on** indicates that values are assigned to blob data. The value **off** indicates that values are assigned to bytea data. The default value is **on**.
- **socketFactory**: string type. This parameter specifies the name of the class used to create a socket connection with the server. This class must implement the **javax.net.SocketFactory** interface and define a constructor with no parameter or a single string parameter.
- **socketFactoryArg**: string type. The value is an optional parameter of the constructor function of the socketFactory class and is not recommended.
- **receiveBufferSize**: integer type. This parameter is used to set **SO\_RCVBUF** on the connection stream.
- **sendBufferSize**: integer type. This parameter is used to set **SO\_SNDBUF** on the connection stream.
- **preferQueryMode**: string type. The value can be **extended**, **extendedForPrepared**, **extendedCacheEverything**, or **simple**. This parameter specifies the query mode. In **simple** mode, the query is executed without parsing or binding. In **extended** mode, the query is executed and bound. The **extendedForPrepared** mode is used for prepared statement extension. In **extendedCacheEverything** mode, each statement is cached.
- **targetServerType**: string type. This parameter specifies which type of DN to connect to among the hosts in the URL connection string; whether a DN is primary or standby is determined by whether it allows write operations. The default value is **any**. The value can be **any**, **master**, **slave**, or **preferSlave**.
- **master**: attempts to connect to a primary DN in the URL connection string. If the primary DN cannot be found, an exception is thrown.
- **slave**: attempts to connect to a standby DN in the URL connection string. If no standby DN can be found, an exception is thrown.
- **preferSlave**: attempts to connect to a standby DN (if available) in the URL connection string. Otherwise, it connects to the primary DN.
- **any**: attempts to connect to any DN in the URL connection string.
- **priorityServers**: integer type. This value is used to specify the first **n** nodes configured in the URL as the primary database instance to be connected preferentially. The default value is **null**. The value is a number greater than 0 and less than the number of DNs configured in the URL.
For example, jdbc:postgresql://host1:port1,host2:port2,host3:port3,host4:port4/database?priorityServers=2. That is, **host1** and **host2** are nodes of the primary database instance, and **host3** and **host4** are nodes of the DR database instance.
- **forceTargetServerSlave**: Boolean type. This parameter specifies whether to enable the function of forcibly connecting to the standby node and forbid the existing connections to be used on the standby node that is promoted to primary during the primary/standby switchover of the database instance. The default value is **false**, indicating that the function of forcibly connecting to the standby node is disabled. **true**: The function of forcibly connecting to the standby node is enabled. | +| user | Database user. | +| password | Password of the database user. | + +**Examples** + +```java +// The following code encapsulates database connection operations into an interface. The database can then be connected using an authorized username and a password. +public static Connection getConnect(String username, String passwd) + { + // Driver class. + String driver = "org.postgresql.Driver"; + // Database connection descriptor. + String sourceURL = "jdbc:postgresql://10.10.0.13:8000/postgres"; + Connection conn = null; + + try + { + // Load the driver. + Class.forName(driver); + } + catch( Exception e ) + { + e.printStackTrace(); + return null; + } + + try + { + // Create a connection. + conn = DriverManager.getConnection(sourceURL, username, passwd); + System.out.println("Connection succeed!"); + } + catch(Exception e) + { + e.printStackTrace(); + return null; + } + + return conn; + }; +// The following code uses the Properties object as a parameter to establish a connection. +public static Connection getConnectUseProp(String username, String passwd) + { + // Driver class. + String driver = "org.postgresql.Driver"; + // Database connection descriptor. + String sourceURL = "jdbc:postgresql://10.10.0.13:8000/postgres?"; + Connection conn = null; + Properties info = new Properties(); + + try + { + // Load the driver. + Class.forName(driver); + } + catch( Exception e ) + { + e.printStackTrace(); + return null; + } + + try + { + info.setProperty("user", username); + info.setProperty("password", passwd); + // Create a connection. + conn = DriverManager.getConnection(sourceURL, info); + System.out.println("Connection succeed!"); + } + catch(Exception e) + { + e.printStackTrace(); + return null; + } + + return conn; + }; +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/6-connecting-to-a-database-using-ssl.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/6-connecting-to-a-database-using-ssl.md new file mode 100644 index 0000000000000000000000000000000000000000..9351b393ecb9d139cd4c5a7de9a73391d1fe27a4 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/6-connecting-to-a-database-using-ssl.md @@ -0,0 +1,147 @@ +--- +title: Connecting to the Database (Using SSL) +summary: Connecting to the Database (Using SSL) +author: Guo Huan +date: 2021-04-26 +--- + +# Connecting to the Database (Using SSL) + +When establishing connections to the MogDB server using JDBC, you can enable SSL connections to encrypt client and server communications for security of sensitive data transmission on the Internet. This section describes how applications establish an SSL connection to MogDB using JDBC. To start the SSL mode, you must have the server certificate, client certificate, and private key files. For details on how to obtain these files, see related documents and commands of OpenSSL. + +**Configuring the Server** + +The SSL mode requires a root certificate, a server certificate, and a private key. 
+ +Perform the following operations (assuming that the license files are saved in the data directory **/mogdb/data/datanode** and the default file names are used): + +1. Log in as the OS user **omm** to the primary node of the database. + +2. Generate and import a certificate. + + Generate an SSL certificate. For details, see **Generating Certificates**. Copy the generated **server.crt**, **server.key**, and **cacert.pem** files to the data directory on the server. + + Run the following command to query the data directory of the database node. The instance column indicates the data directory. + + ```bash + gs_om -t status --detail + ``` + + In the Unix OS, **server.crt** and **server.key** must deny the access from the external or any group. Run the following command to set this permission: + + ```bash + chmod 0600 server.key + ``` + +3. Enable the SSL authentication mode. + + ```bash + gs_guc set -D /mogdb/data/datanode -c "ssl=on" + ``` + +4. Set client access authentication parameters. The IP address is the IP address of the host to be connected. + + ```bash + gs_guc reload -D /mogdb/data/datanode -h "hostssl all all 127.0.0.1/32 cert" + gs_guc reload -D /mogdb/data/datanode -h "hostssl all all IP/32 cert" + ``` + + Clients on the **127.0.0.1⁄32** network segment can connect to MogDB servers in SSL mode. + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:** + > + > - If **METHOD** is set to **cert** in the **pg_hba.conf** file of the server, the client must use the username (common name) configured in the license file (**client.crt**) for the database connection. If **METHOD** is set to **md5**, **sm3** or **sha256**, there is no such a restriction. + > - The MD5 encryption algorithm has lower security and poses security risks. Therefore, you are advised to use a more secure encryption algorithm. + +5. Configure the digital certificate parameters related to SSL authentication. + + The information following each command indicates operation success. + + ```bash + gs_guc set -D /mogdb/data/datanode -c "ssl_cert_file='server.crt'" + gs_guc set: ssl_cert_file='server.crt' + ``` + + ```bash + gs_guc set -D /mogdb/data/datanode -c "ssl_key_file='server.key'" + gs_guc set: ssl_key_file='server.key' + ``` + + ```bash + gs_guc set -D /mogdb/data/datanode -c "ssl_ca_file='cacert.pem'" + gs_guc set: ssl_ca_file='cacert.pem' + ``` + +6. Restart the database. + + ```bash + gs_om -t stop && gs_om -t start + ``` + +7. Generate and upload a certificate file. + +**Configuring the Client** + +To configure the client, perform the following steps: + +Upload the certificate files **client.key.pk8**, **client.crt**, and **cacert.pem** generated in **Configuring the Server** to the client. + +**Example** + +Note: Choose one of example 1 and example 2. + +```java +import java.sql.Connection; +import java.util.Properties; +import java.sql.DriverManager; +import java.sql.Statement; +import java.sql.ResultSet; + +public class SSL{ + public static void main(String[] args) { + Properties urlProps = new Properties(); + String urls = "jdbc:postgresql://10.29.37.136:8000/postgres"; + + /** +* ================== Example 1: The NonValidatingFactory channel is used, and MTETHOD in the pg_hba.conf file is not cert. + */ +/* + urlProps.setProperty("sslfactory","org.postgresql.ssl.NonValidatingFactory"); + urlProps.setProperty("user", "world"); +//test@123 is the password specified when user CREATE USER world WITH PASSWORD 'test123@' is created. 
+ urlProps.setProperty("password", "test@123"); + urlProps.setProperty("ssl", "true"); +*/ + /** +* ================== Example 2 - 5: Use a certificate. In the pg_hba.conf file, MTETHOD is cert. + */ + urlProps.setProperty("sslcert", "client.crt"); +// Client key in DER format + urlProps.setProperty("sslkey", "client.key.pk8"); + urlProps.setProperty("sslrootcert", "cacert.pem"); + urlProps.setProperty("user", "world"); + /* ================== Example 2: Set ssl to true to use the certificate for authentication.*/ + urlProps.setProperty("ssl", "true"); + /* ================== Example 3: Set sslmode to require to use the certificate for authentication. */ +// urlProps.setProperty("sslmode", "require"); + /* ================== Example 4: Set sslmode to verify-ca to use the certificate for authentication. */ +// urlProps.setProperty("sslmode", "verify-ca"); + /* ================== Example 5: Set sslmode to verify-full to use the certificate (in the Linux OS) for authentication. */ +// urls = "jdbc:postgresql://world:8000/postgres"; +// urlProps.setProperty("sslmode", "verify-full"); + + try { + Class.forName("org.postgresql.Driver").newInstance(); + } catch (Exception e) { + e.printStackTrace(); + } + try { + Connection conn; + conn = DriverManager.getConnection(urls,urlProps); + conn.close(); + } catch (Exception e) { + e.printStackTrace(); + } + } +} +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/7-running-sql-statements.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/7-running-sql-statements.md new file mode 100644 index 0000000000000000000000000000000000000000..0a8b3f94ea82bc33dda7b00c878b03c3360d2eee --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/7-running-sql-statements.md @@ -0,0 +1,166 @@ +--- +title: Running SQL Statements +summary: Running SQL Statements +author: Guo Huan +date: 2021-04-26 +--- + +# Running SQL Statements + +**Running a Common SQL Statement** + +To enable an application to operate data in the database by running SQL statements (statements that do not need to transfer parameters), perform the following operations: + +1. Create a statement object by calling the **createStatement** method in **Connection**. + + ```bash + Connection conn = DriverManager.getConnection("url","user","password"); + Statement stmt = conn.createStatement(); + ``` + +2. Run the SQL statement by calling the **executeUpdate** method in **Statement**. + + ```bash + int rc = stmt.executeUpdate("CREATE TABLE customer_t1(c_customer_sk INTEGER, c_customer_name VARCHAR(32));"); + ``` + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > If an execution request (not in a transaction block) received in the database contains multiple statements, the request is packed into a transaction. **VACUUM** is not supported in a transaction block. If one of the statements fails, the entire request will be rolled back. + +3. Close the statement object. + + ``` + stmt.close(); + ``` + +**Running a Prepared SQL Statement** + +Prepared statements are complied and optimized once but can be used in different scenarios by assigning multiple values. Using prepared statements improves execution efficiency. If you want to run a statement for several times, use a precompiled statement. Perform the following operations: + +1. Create a prepared statement object by calling the prepareStatement method in Connection. 
+ + ```json + PreparedStatement pstmt = con.prepareStatement("UPDATE customer_t1 SET c_customer_name = ? WHERE c_customer_sk = 1"); + ``` + +2. Set parameters by calling the setShort method in PreparedStatement. + + ```json + pstmt.setShort(1, (short)2); + ``` + +3. Run the prepared statement by calling the executeUpdate method in PreparedStatement. + + ```json + int rowcount = pstmt.executeUpdate(); + ``` + +4. Close the prepared statement object by calling the close method in PreparedStatement. + + ```json + pstmt.close(); + ``` + +**Calling a Stored Procedure** + +To call an existing stored procedure through JDBC in MogDB, perform the following operations: + +1. Create a call statement object by calling the **prepareCall** method in **Connection**. + + ```bash + Connection myConn = DriverManager.getConnection("url","user","password"); + CallableStatement cstmt = myConn.prepareCall("{? = CALL TESTPROC(?,?,?)}"); + ``` + +2. Set parameters by calling the **setInt** method in **CallableStatement**. + + ``` + cstmt.setInt(2, 50); + cstmt.setInt(1, 20); + cstmt.setInt(3, 90); + ``` + +3. Register an output parameter by calling the **registerOutParameter** method in **CallableStatement**. + + ``` + cstmt.registerOutParameter(4, Types.INTEGER); // Register an OUT parameter of the integer type. + ``` + +4. Call the stored procedure by calling the **execute** method in **CallableStatement**. + + ``` + cstmt.execute(); + ``` + +5. Obtain the output parameter by calling the **getInt** method in **CallableStatement**. + + ``` + int out = cstmt.getInt(4); // Obtain the OUT parameter. + ``` + + Example: + + ``` + // The following stored procedure (containing the OUT parameter) has been created: + create or replace procedure testproc + ( + psv_in1 in integer, + psv_in2 in integer, + psv_inout in out integer + ) + as + begin + psv_inout := psv_in1 + psv_in2 + psv_inout; + end; + / + ``` + +6. Close the call statement by calling the **close** method in **CallableStatement**. + + ``` + cstmt.close(); + ``` + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > + > - Many database classes such as Connection, Statement, and ResultSet have a close() method. Close these classes after using their objects. Closing Connection will close all the related Statements, and closing a Statement will close its ResultSet. + > - Some JDBC drivers support named parameters, which can be used to set parameters by name rather than sequence. If a parameter has the default value, you do not need to specify any parameter value but can use the default value directly. Even though the parameter sequence changes during a stored procedure, the application does not need to be modified. Currently, the MogDB JDBC driver does not support this method. + > - MogDB does not support functions containing OUT parameters, or stored procedures and function parameters containing default values. + > + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE**: + > + > - If JDBC is used to call a stored procedure whose returned value is a cursor, the returned cursor cannot be used. + > - A stored procedure and an SQL statement must be run separately. + +**Batch Processing** + +When a prepared statement processes multiple pieces of similar data, the database creates only one execution plan. This improves compilation and optimization efficiency. Perform the following operations: + +1. Create a prepared statement object by calling the prepareStatement method in Connection. 
+ + ``` + Connection conn = DriverManager.getConnection("url","user","password"); + PreparedStatement pstmt = conn.prepareStatement("INSERT INTO customer_t1 VALUES (?)"); + ``` + +2. Call the setShort parameter for each piece of data, and call addBatch to confirm that the setting is complete. + + ``` + pstmt.setShort(1, (short)2); + pstmt.addBatch(); + ``` + +3. Perform batch processing by calling the executeBatch method in PreparedStatement. + + ``` + int[] rowcount = pstmt.executeBatch(); + ``` + +4. Close the prepared statement object by calling the close method in PreparedStatement. + + ``` + pstmt.close(); + ``` + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > Do not terminate a batch processing action when it is ongoing; otherwise, database performance will deteriorate. Therefore, disable automatic commit during batch processing. Manually commit several rows at a time. The statement for disabling automatic commit is **conn.setAutoCommit(false);**. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/8-processing-data-in-a-result-set.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/8-processing-data-in-a-result-set.md new file mode 100644 index 0000000000000000000000000000000000000000..d45f9df5ee3e1d155fa4e5f44a1b70f2c2a59cb2 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/8-processing-data-in-a-result-set.md @@ -0,0 +1,76 @@ +--- +title: Processing Data in a Result Set +summary: Processing Data in a Result Set +author: Guo Huan +date: 2021-04-26 +--- + +# Processing Data in a Result Set + +**Setting a Result Set Type** + +Different types of result sets apply to different application scenarios. Applications select proper types of result sets based on requirements. Before running an SQL statement, you must create a statement object. Some methods of creating statement objects can set the type of a result set. [Table 1](#Result set types) lists result set parameters. The related Connection methods are as follows: + +``` +// Create a Statement object. This object will generate a ResultSet object with a specified type and concurrency. +createStatement(int resultSetType, int resultSetConcurrency); + +// Create a PreparedStatement object. This object will generate a ResultSet object with a specified type and concurrency. +prepareStatement(String sql, int resultSetType, int resultSetConcurrency); + +// Create a CallableStatement object. This object will generate a ResultSet object with a specified type and concurrency. +prepareCall(String sql, int resultSetType, int resultSetConcurrency); +``` + +**Table 1** Result set types + +| Parameter | Description | +| :------------------- | :----------------------------------------------------------- | +| resultSetType | Type of a result set. There are three types of result sets:
- **ResultSet.TYPE_FORWARD_ONLY**: The ResultSet object can only be navigated forward. It is the default value.
- **ResultSet.TYPE_SCROLL_SENSITIVE**: The ResultSet object is sensitive to changes in the underlying data; you can view a modified result by scrolling back to the modified row.
- **ResultSet.TYPE_SCROLL_INSENSITIVE**: The ResultSet object is insensitive to changes in the underlying data source.
NOTE:
After a result set has obtained data from the database, the result set is insensitive to data changes made by other transactions, even if the result set type is **ResultSet.TYPE_SCROLL_SENSITIVE**. To obtain up-to-date data of the record pointed to by the cursor from the database, call the refreshRow() method in a ResultSet object. | +| resultSetConcurrency | Concurrency type of a result set. There are two types of concurrency.
- **ResultSet.CONCUR_READ_ONLY**: read-only concurrency. Data in the result set cannot be updated through the result set itself; it can only be changed with separately created update statements.
- **ResultSet.CONCUR_UPDATEABLE**: changeable result set. The concurrency type for a result set object can be updated if the result set is scrollable. | + +**Positioning a Cursor in a Result Set** + +ResultSet objects include a cursor pointing to the current data row. The cursor is initially positioned before the first row. The next method moves the cursor to the next row from its current position. When a ResultSet object does not have a next row, a call to the next method returns **false**. Therefore, this method is used in the while loop for result set iteration. However, the JDBC driver provides more cursor positioning methods for scrollable result sets, which allows positioning cursor in the specified row. [Table 2](#Methods for positioning) describes these methods. + +**Table 2** Methods for positioning a cursor in a result set + +| Method | Description | +| :------------ | :----------------------------------------------------------- | +| next() | Moves cursor to the next row from its current position. | +| previous() | Moves cursor to the previous row from its current position. | +| beforeFirst() | Places cursor before the first row. | +| afterLast() | Places cursor after the last row. | +| first() | Places cursor to the first row. | +| last() | Places cursor to the last row. | +| absolute(int) | Places cursor to a specified row. | +| relative(int) | Moves the row specified by the forward parameter (that is, the value of is 1, which is equivalent to next()) or backward (that is, the value of is -1, which is equivalent to previous()). | + +**Obtaining the Cursor Position from a Result Set** + +This cursor positioning method will be used to change the cursor position for a scrollable result set. The JDBC driver provides a method to obtain the cursor position in a result set. [Table 3](#Methods for obtaining) describes these methods. + +**Table 3** Methods for obtaining a cursor position in a result set + +| Method | Description | +| :-------------- | :------------------------------------------------- | +| isFirst() | Checks whether the cursor is in the first row. | +| isLast() | Checks whether the cursor is in the last row. | +| isBeforeFirst() | Checks whether the cursor is before the first row. | +| isAfterLast() | Checks whether the cursor is after the last row. | +| getRow() | Gets the current row number of the cursor. | + +**Obtaining Data from a Result Set** + +ResultSet objects provide a variety of methods to obtain data from a result set. [Table 4](#Common methods for obtaining) describes the common methods for obtaining data. If you want to know more about other methods, see JDK official documents. + +**Table 4** Common methods for obtaining data from a result set + +| Method | Description | +| :----------------------------------- | :----------------------------------------------------------- | +| int getInt(int columnIndex) | Retrieves the value of the column designated by a column index in the current row as an integer. | +| int getInt(String columnLabel) | Retrieves the value of the column designated by a column label in the current row as an integer. | +| String getString(int columnIndex) | Retrieves the value of the column designated by a column index in the current row as a string. | +| String getString(String columnLabel) | Retrieves the value of the column designated by a column label in the current row as a string. | +| Date getDate(int columnIndex) | Retrieves the value of the column designated by a column index in the current row as a date. 
| +| Date getDate(String columnLabel) | Retrieves the value of the column designated by a column name in the current row as a date. | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/8.1-log-management.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/8.1-log-management.md new file mode 100644 index 0000000000000000000000000000000000000000..478aa24053789831a28064b2620f81ca9e278f39 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/8.1-log-management.md @@ -0,0 +1,142 @@ +--- +title: Log Management +summary: Log Management +author: Zhang Cuiping +date: 2021-10-11 +--- + +# Log Management + +The MogDB JDBC driver uses log records to help solve problems when the MogDB JDBC driver is used in applications. MogDB JDBC supports the following log management methods: + +1. Use the connection properties to specify a temporary log file. +2. Use the SLF4J log framework for interconnecting with applications. +3. Use the JdkLogger log framework for interconnecting with applications. + +SLF4J and JdkLogger are mainstream frameworks for Java application log management in the industry. For details about how to use these frameworks, see the official documents (SLF4J: ; JdkLogger: ). + +Method 1: Use the connection attribute. + +Configure loggerLevel and loggerFile in the URL. + +This method is easy to configure and applies to the debug driver. However, the log level, file size, and file quantity cannot be controlled. If logs are not manually processed in a timely manner, the disk space will be used up. Therefore, this method is not recommended except for the debug driver. + +Example: + +``` +public static Connection GetConnection(String username, String passwd){ + + String sourceURL = "jdbc:postgresql://10.10.0.13:8000/postgres?loggerLevel=DEBUG&loggerFile=gsjdbc.log"; + Connection conn = null; + + try{ +// Create a connection. + conn = DriverManager.getConnection(sourceURL,username,passwd); + System.out.println("Connection succeed!"); + }catch (Exception e){ + e.printStackTrace(); + return null; + } + return conn; +} +``` + +Method 2: Use the SLF4J log framework for interconnecting with applications. + +When a connection is set up, **logger=Slf4JLogger** is configured in the URL. + +The SLF4J may be implemented by using Log4j or Log4j2. When the Log4j is used to implement the SLF4J, the following JAR packages need to be added: **log4j-\*.jar**, **slf4j-api-\*.jar**, and **slf4j-log4\*-\*.jar** (*varies according to versions), and configuration file **log4j.properties**. If the Log4j2 is used to implement the SLF4J, you need to add the following JAR packages: **log4j-api-\*.jar**, **log4j-core-\*.jar**, **log4j-slf4j18-impl-\*.jar**, and **slf4j-api-\*-alpha1.jar** (* varies according to versions), and configuration file **log4j2.xml**. + +This method supports log management and control. The SLF4J can implement powerful log management and control functions through related configurations in files. This method is recommended. + +Example: + +``` +public static Connection GetConnection(String username, String passwd){ + + String sourceURL = "jdbc:postgresql://10.10.0.13:8000/postgres?logger=Slf4JLogger"; + Connection conn = null; + + try{ +// Create a connection. 
+ conn = DriverManager.getConnection(sourceURL,username,passwd); + System.out.println("Connection succeed!"); + }catch (Exception e){ + e.printStackTrace(); + return null; + } + return conn; +} +``` + +The following is an example of the **log4j.properties** file: + +``` +log4j.logger.org.postgresql=ALL, log_gsjdbc + +# Default file output configuration +log4j.appender.log_gsjdbc=org.apache.log4j.RollingFileAppender +log4j.appender.log_gsjdbc.Append=true +log4j.appender.log_gsjdbc.File=gsjdbc.log +log4j.appender.log_gsjdbc.Threshold=TRACE +log4j.appender.log_gsjdbc.MaxFileSize=10MB +log4j.appender.log_gsjdbc.MaxBackupIndex=5 +log4j.appender.log_gsjdbc.layout=org.apache.log4j.PatternLayout +log4j.appender.log_gsjdbc.layout.ConversionPattern=%d %p %t %c - %m%n +log4j.appender.log_gsjdbc.File.Encoding = UTF-8 +``` + +The following is an example of the **log4j2.xml** file: + +``` + + + + + + + + + + + + + + + + + + + + + + + + + + + + +``` + +Method 3: Use the JdkLogger log framework for interconnecting with applications. + +The default Java logging framework stores its configurations in a file named **logging.properties**. Java installs the global configuration file in the folder in the Java installation directory. The **logging.properties** file can also be created and stored with a single project. + +Configuration example of **logging.properties**: + +``` +# Specify the processing program as a file. +handlers= java.util.logging.FileHandler + +# Specify the default global log level. +.level= ALL + +# Specify the log output control standard. +java.util.logging.FileHandler.level=ALL +java.util.logging.FileHandler.pattern = gsjdbc.log +java.util.logging.FileHandler.limit = 500000 +java.util.logging.FileHandler.count = 30 +java.util.logging.FileHandler.formatter = java.util.logging.SimpleFormatter +java.util.logging.FileHandler.append=false +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/9-closing-a-connection.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/9-closing-a-connection.md new file mode 100644 index 0000000000000000000000000000000000000000..d597422110816a06857459eed3ae85e666b26ad8 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/2-development-based-on-jdbc/9-closing-a-connection.md @@ -0,0 +1,12 @@ +--- +title: Closing a Connection +summary: Closing a Connection +author: Guo Huan +date: 2021-04-26 +--- + +# Closing a Connection + +After you complete required data operations in the database, close the database connection. + +Call the close method to close the connection, for example, **Connection conn = DriverManager.getConnection("url","user","password"); conn.close();** diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/1-development-based-on-odbc-overview.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/1-development-based-on-odbc-overview.md new file mode 100644 index 0000000000000000000000000000000000000000..f4a1710b8df4539903ca8fcf0220da9980ffcfdb --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/1-development-based-on-odbc-overview.md @@ -0,0 +1,34 @@ +--- +title: Overview +summary: Overview +author: Guo Huan +date: 2021-04-26 +--- + +# Overview + +Open Database Connectivity (ODBC) is a Microsoft API for accessing databases based on the X/OPEN CLI. 
Applications interact with the database through the APIs provided by ODBC, which enhances their portability, scalability, and maintainability. + +[Figure 1](#ODBC) shows the system structure of ODBC. + +**Figure 1** ODBC system structure + +![odbc-system-structure](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/development-based-on-odbc-overview-2.png) + +MogDB supports ODBC 3.5 in the following environments. + +**Table 1** OSs Supported by ODBC + +| OS | Platform | +| :------------------------------------------------- | :------- | +| CentOS 6.4/6.5/6.6/6.7/6.8/6.9/7.0/7.1/7.2/7.3/7.4 | x86_64 | +| CentOS 7.6 | ARM64 | +| EulerOS 2.0 SP2/SP3 | x86_64 | +| EulerOS 2.0 SP8 | ARM64 | + +The ODBC Driver Manager running on UNIX or Linux can be unixODBC or iODBC. unixODBC-2.3.0 is used as the component for connecting the database. + +Windows has a native ODBC Driver Manager. You can locate **Data Sources (ODBC)** by choosing **Control Panel** > **Administrative Tools**. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> The current database ODBC driver is based on an open-source version and may be incompatible with data types tinyint, smalldatetime, and nvarchar2. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/2-odbc-packages-dependent-libraries-and-header-files.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/2-odbc-packages-dependent-libraries-and-header-files.md new file mode 100644 index 0000000000000000000000000000000000000000..16764e9dddb25c43a26673eda71911a855ee0c1b --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/2-odbc-packages-dependent-libraries-and-header-files.md @@ -0,0 +1,12 @@ +--- +title: ODBC Packages, Dependent Libraries, and Header Files +summary: ODBC Packages, Dependent Libraries, and Header Files +author: Guo Huan +date: 2021-04-26 +--- + +# ODBC Packages, Dependent Libraries, and Header Files + +**ODBC Packages for the Linux OS** + +Obtain the [**openGauss-x.x.x-ODBC.tar.gz**](https://opengauss.org/en/download.html) package from the release package. In the Linux OS, header files (including **sql.h** and **sqlext.h**) and library (**libodbc.so**) are required in application development. These header files and library can be obtained from the **unixODBC-2.3.0** installation package. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/3-configuring-a-data-source-in-the-linux-os.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/3-configuring-a-data-source-in-the-linux-os.md new file mode 100644 index 0000000000000000000000000000000000000000..192d945106107cf9cb8530a6c4e99fad3c5c42d7 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/3-configuring-a-data-source-in-the-linux-os.md @@ -0,0 +1,324 @@ +--- +title: Configuring a Data Source in the Linux OS +summary: Configuring a Data Source in the Linux OS +author: Guo Huan +date: 2021-04-26 +--- + +# Configuring a Data Source in the Linux OS + +The ODBC driver (psqlodbcw.so) provided by MogDB can be used after it has been configured in a data source. To configure a data source, you must configure the **odbc.ini** and **odbcinst.ini** files on the server. The two files are generated during the unixODBC compilation and installation, and are saved in the **/usr/local/etc** directory by default. + +**Procedure** + +1. 
Obtain the source code package of unixODBC by following link: + + + + After the download, validate the integrity based on the integrity validation algorithm provided by the community. + +2. Install unixODBC. It does not matter if unixODBC of another version has been installed. + + Currently, unixODBC-2.2.1 is not supported. For example, to install unixODBC-2.3.0, run the commands below. unixODBC is installed in the **/usr/local** directory by default. The data source file is generated in the **/usr/local/etc** directory, and the library file is generated in the **/usr/local/lib** directory. + + ```bash + tar zxvf unixODBC-2.3.0.tar.gz + cd unixODBC-2.3.0 + #Modify the configure file. (If it does not exist, modify the configure.ac file.) Find LIB_VERSION. + #Change the value of LIB_VERSION to 1:0:0 to compile a *.so.1 dynamic library with the same dependency on psqlodbcw.so. + vim configure + + ./configure --enable-gui=no #To perform compilation on a Kunpeng server, add the configure parameter --build=aarch64-unknown-linux-gnu. + make + #The installation may require root permissions. + make install + ``` + +3. Replace the openGauss driver on the client. + + a. Decompress **openGauss-** **1.1.0**-**ODBC.tar.gz** to the **/usr/local/lib** directory. **psqlodbcw.la** and **psqlodbcw.so** files are obtained. + + b. Copy the library in the **lib** directory obtained after decompressing **openGauss**-**1.1.0**-**ODBC.tar.gz** to the **/usr/local/lib** directory. + +4. Configure a data source. + + a. Configure the ODBC driver file. + + Add the following content to the **/xxx/odbc/etc/odbcinst.ini** file: + + ``` + [GaussMPP] + Driver64=/xxx/odbc/lib/psqlodbcw.so + setup=/xxx/odbc/lib/psqlodbcw.so + ``` + + For descriptions of the parameters in the **odbcinst.ini** file, see [Table 1](#odbcinst.ini). + + **Table 1** odbcinst.ini configuration parameters + + | **Parameter** | **Description** | **Example** | + | ------------- | ------------------------------------------------------------ | ----------------------------------- | + | [DriverName] | Driver name, corresponding to the driver in DSN. | [DRIVER_N] | + | Driver64 | Path of the dynamic driver library. | Driver64=/xxx/odbc/lib/psqlodbcw.so | + | setup | Driver installation path, which is the same as the dynamic library path in Driver64. | setup=/xxx/odbc/lib/psqlodbcw.so | + + b. Configure the data source file. + + Add the following content to the **/usr/local/etc/odbc.ini** file: + + ```bash + [MPPODBC] + Driver=GaussMPP + Servername=10.145.130.26 (IP address of the server where the database resides) + Database=postgres (database name) + Username=omm (database username) + Password= (user password of the database) + Port=8000 (listening port of the database) + Sslmode=allow + ``` + + For descriptions of the parameters in the **odbc.ini** file, see [Table 2](#odbc.ini). + + **Table 2** odbc.ini configuration parameters + + | **Parameter** | **Description** | **Example** | + | ----------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | + | [DSN] | Data source name | [MPPODBC] | + | Driver | Driver name, corresponding to DriverName in **odbcinst.ini** | Driver=DRIVER_N | + | Servername | Server IP address | Servername=10.145.130.26 | + | Database | Name of the database to connect to | Database=postgres | + | Username | Database username | Username=omm | + | Password | Database user password | Password=
NOTE:
After a user establishes a connection, the ODBC driver automatically clears the password stored in memory.
However, if this parameter is configured, unixODBC caches the data source file, which may keep the password in memory for a long time.
When an application connects to the database, you are advised to pass the password through the connection API instead of writing it in the data source configuration file. After the connection has been established, immediately clear the memory segment where the password is stored. | + | Port | Port number of the server | Port=8000 | + | Sslmode | Whether to enable SSL | Sslmode=allow | + | Debug | If this parameter is set to **1**, the **mylog** file of the PostgreSQL ODBC driver is generated in the **/tmp/** directory. If it is set to **0**, no log directory is generated. | Debug=1 | + | UseServerSidePrepare | Whether to enable the extended query protocol for the database.
The value can be **0** or **1**. The default value is **1**, indicating that the extended query protocol is enabled. | UseServerSidePrepare=1 | + | UseBatchProtocol | Whether to enable the batch query protocol. If it is enabled, DML performance can be improved. The value can be **0** or **1**. The default value is **1**.
If this parameter is set to **0**, the batch query protocol is disabled (mainly for communication with earlier database versions).
If this parameter is set to **1** and **support_batch_bind** is set to **on**, the batch query protocol is enabled. | UseBatchProtocol=1 | + | ForExtensionConnector | This parameter specifies whether the savepoint is sent. | ForExtensionConnector=1 | + | UnamedPrepStmtThreshold | Each time **SQLFreeHandle** is invoked to release statements, ODBC sends a **Deallocate plan_name** statement to the server. A large number of such a statement exist in the service. To reduce the number of the statements to be sent, **stmt->plan_name** is left empty so that the database can identify them as unnamed statements. This parameter is added to control the threshold for unnamed statements. | UnamedPrepStmtThreshold=100 | + | ConnectionExtraInfo | Whether to display the driver deployment path and process owner in the **connection_info** parameter mentioned in **connection_info**. | ConnectionExtraInfo=1NOTE:The default value is **0**. If this parameter is set to **1**, the ODBC driver reports the driver deployment path and process owner to the database and displays the information in the **connection_info** parameter (see **BEGIN**). In this case, you can query the information from PG_STAT_ACTIVITY. | + | BoolAsChar | If this parameter is set to **Yes**, the Boolean value is mapped to the SQL_CHAR type. If this parameter is not set, the value is mapped to the SQL_BIT type. | BoolsAsChar = Yes | + | RowVersioning | When an attempt is made to update a row of data, setting this parameter to **Yes** allows the application to detect whether the data has been modified by other users. | RowVersioning = Yes | + | ShowSystemTables | By default, the driver regards the system table as a common SQL table. | ShowSystemTables = Yes | + + The valid values of **Sslmode** are as follows: + + **Table 3** Sslmode options + + | sslmode | Whether SSL Encryption Is Enabled | Description | + | ----------- | --------------------------------- | ------------------------------------------------------------ | + | disable | No | SSL connection is not enabled. | + | allow | Possible | If the database server requires SSL connection, SSL connection can be enabled. However, authenticity of the database server will not be verified. | + | prefer | Possible | If the database supports SSL connection, SSL connection is recommended. However, authenticity of the database server will not be verified. | + | require | Yes | SSL connection is required and data is encrypted. However, authenticity of the database server will not be verified. | + | verify-ca | Yes | SSL connection is required and whether the database has a trusted certificate will be verified. | + | verify-full | Yes | SSL connection is required. In addition to the check scope specified by **verify-ca**, the system checks whether the name of the host where the database resides is the same as that in the certificate. MogDB does not support this mode. | + +5. (Optional) Generate an SSL certificate. For details, see **Generating Certificates**.This step and step 6 need to be performed when the server and the client are connected via ssl. It can be skipped in case of non-ssl connection. + +6. (Optional) Replace an SSL certificate. For details, see **Replacing Certificates**. + +7. SSL mode: + + ``` + Return to the root directory, create the .postgresql directory, and put root.crt, client.crt, client.key, client.key.cipher, client.key.rand, client.req, server.crt, server.key, server.key.cipher, server.key.rand, and server.req in the directory. 
+ Copy root.crt and related certificate files starting with "server" to the install/data directory of the database (the postgresql.conf file is also stored in the install/data directory.) + Modify the postgresql.conf file. + ssl = on + ssl_cert_file = 'server.crt' + ssl_key_file = 'server.key' + ssl_ca_file = 'root.crt' + Restart the database after the modification is complete. + Modify the sslmode parameter in the odbc.ini configuration file (require or verify-ca). + ``` + +8. Configure the database server. + + a. Log in as the OS user **omm** to the primary node of the database. + + b. Run the following command to add NIC IP addresses or host names, with values separated by commas (,). The NICs and hosts are used to provide external services. In the following command, *NodeName* specifies the name of the current node. + + ``` + gs_guc reload -N NodeName -I all -c "listen_addresses='localhost,192.168.0.100,10.11.12.13'" + ``` + + If direct routing of LVS is used, add the virtual IP address (10.11.12.13) of LVS to the server listening list. + + You can also set **listen_addresses** to **\*** or **0.0.0.0** to listen to all NICs, but this incurs security risks and is not recommended. + + c. Run the following command to add an authentication rule to the configuration file of the primary database node. In this example, the IP address (10.11.12.13) of the client is the remote host IP address. + + ``` + gs_guc reload -N all -I all -h "host all jack 10.11.12.13/32 sha256" + ``` + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > + > - **-N all** indicates all hosts in MogDB. + > - **-I all** indicates all instances of the host. + > - **-h** specifies statements that need to be added in the **pg_hba.conf** file. + > - **all** indicates that a client can connect to any database. + > - **jack** indicates the user that accesses the database. + > - **10.11.12.13/32** indicates hosts whose IP address is 10.11.12.13 can be connected. Configure the parameter based on your network conditions. **32** indicates that there are 32 bits whose value is 1 in the subnet mask. That is, the subnet mask is 255.255.255.255. + > - **sha256** indicates that the password of user **jack** is encrypted using the SHA-256 algorithm. + + If the ODBC client and the primary database node to connect are deployed on the same machine, you can use the local trust authentication mode. Run the following command: + + ``` + local all all trust + ``` + + If the ODBC client and the primary database node to connect are deployed on different machines, use the SHA-256 authentication mode. Run the following command: + + ``` + host all all xxx.xxx.xxx.xxx/32 sha256 + ``` + + d. Restart MogDB. + + ``` + gs_om -t stop + gs_om -t start + ``` + +9. Configure environment variables on the client. + + ``` + vim ~/.bashrc + ``` + + Add the following information to the configuration file: + + ```bash + export LD_LIBRARY_PATH=/usr/local/lib/:$LD_LIBRARY_PATH + export ODBCSYSINI=/usr/local/etc + export ODBCINI=/usr/local/etc/odbc.ini + ``` + +10. Run the following command to validate the addition: + + ``` + source ~/.bashrc + ``` + +**Verifying the Data Source Configuration** + +Run the **./isql-v** **MPPODBC** command (**MPPODBC** is the data source name). + +- If the following information is displayed, the configuration is correct and the connection succeeds. + + ``` + +---------------------------------------+ + | Connected! 
| + | | + | sql-statement | + | help [tablename] | + | quit | + | | + +---------------------------------------+ + SQL> + ``` + +- If error information is displayed, the configuration is incorrect. Check the configuration. + +**FAQs** + +- [UnixODBC]Can't open lib 'xxx/xxx/psqlodbcw.so' : file not found. + + Possible causes: + + - The path configured in the **odbcinst.ini** file is incorrect. + + Run **ls** to check the path in the error information, and ensure that the **psqlodbcw.so** file exists and you have execute permissions on it. + + - The dependent library of **psqlodbcw.so** does not exist or is not in system environment variables. + + Run **ldd** to check the path in the error information. If **libodbc.so.1** or other UnixODBC libraries do not exist, configure UnixODBC again following the procedure provided in this section, and add the **lib** directory under its installation directory to **LD_LIBRARY_PATH**. If other libraries do not exist, add the **lib** directory under the ODBC driver package to **LD_LIBRARY_PATH**. + +- [UnixODBC]connect to server failed: no such file or directory + + Possible causes: + + - An incorrect or unreachable database IP address or port number was configured. + + Check the **Servername** and **Port** configuration items in data sources. + + - Server monitoring is improper. + + If **Servername** and **Port** are correctly configured, ensure the proper network adapter and port are monitored by following the database server configurations in the procedure in this section. + + - Firewall and network gatekeeper settings are improper. + + Check firewall settings, and ensure that the database communication port is trusted. + + Check to ensure network gatekeeper settings are proper (if any). + +- [unixODBC]The password-stored method is not supported. + + Possible causes: + + The **sslmode** configuration item is not configured in the data sources. + + Solution: + + Set the configuration item to **allow** or a higher level. For details, see [Table 3](#sslmode). + +- Server common name "xxxx" does not match host name "xxxxx" + + Possible causes: + + When **verify-full** is used for SSL encryption, the driver checks whether the host name in certificates is the same as the actual one. + + Solution: + + To solve this problem, use **verify-ca** to stop checking host names, or generate a set of CA certificates containing the actual host names. + +- Driver's SQLAllocHandle on SQL_HANDLE_DBC failed + + Possible causes: + + The executable file (such as the **isql** tool of unixODBC) and the database driver (**psqlodbcw.so**) depend on different library versions of ODBC, such as **libodbc.so.1** and **libodbc.so.2**. You can verify this problem by using the following method: + + ``` + ldd `which isql` | grep odbc + ldd psqlodbcw.so | grep odbc + ``` + + If the suffix digits of the outputs **libodbc.so** are different or indicate different physical disk files, this problem exists. Both **isql** and **psqlodbcw.so** load **libodbc.so**. If different physical files are loaded, different ODBC libraries with the same function list conflict with each other in a visible domain. As a result, the database driver cannot be loaded. + + Solution: + + Uninstall the unnecessary unixODBC, such as libodbc.so.2, and create a soft link with the same name and the .so.2 suffix for the remaining libodbc.so.1 library. + +- FATAL: Forbid remote connection with trust method! + + For security purposes, the primary database node forbids access from other nodes in MogDB without authentication. 
+ + To access the primary database node from inside MogDB, deploy the ODBC program on the host where the primary database node is located and set the server address to **127.0.0.1**. It is recommended that the service system be deployed outside MogDB. If it is deployed inside, database performance may be affected. + +- [unixODBC]Invalid attribute value + + This problem occurs when you use SQL on other MogDB. The possible cause is that the unixODBC version is not the recommended one. You are advised to run the **odbcinst -version** command to check the unixODBC version. + +- authentication method 10 not supported. + + If this error occurs on an open-source client, the cause may be: + + The database stores only the SHA-256 hash of the password, but the open-source client supports only MD5 hashes. + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > + > - The database stores the hashes of user passwords instead of actual passwords. + > - If a password is updated or a user is created, both types of hashes will be stored, compatible with open-source authentication protocols. + > - An MD5 hash can only be generated using the original password, but the password cannot be obtained by reversing its SHA-256 hash. Passwords in the old version will only have SHA-256 hashes and not support MD5 authentication. + > - The MD5 encryption algorithm has lower security and poses security risks. Therefore, you are advised to use a more secure encryption algorithm. + + To solve this problem, you can update the user password (see **ALTER USER**) or create a user (see **CREATE USER**) having the same permissions as the faulty user. + +- unsupported frontend protocol 3.51: server supports 1.0 to 3.0 + + The database version is too early or the database is an open-source database. Use the driver of the required version to connect to the database. + +- FATAL: GSS authentication method is not allowed because XXXX user password is not disabled. + + In **pg_hba.conf** of the target primary database node, the authentication mode is set to **gss** for authenticating the IP address of the current client. However, this authentication algorithm cannot authenticate clients. Change the authentication algorithm to **sha256** and try again. For details, see [8](#8). diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/4-development-process.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/4-development-process.md new file mode 100644 index 0000000000000000000000000000000000000000..838f216c786839f3594bc9bc40711a29e775164a --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/4-development-process.md @@ -0,0 +1,38 @@ +--- +title: Development Process +summary: Development Process +author: Guo Huan +date: 2021-04-26 +--- + +# Development Process + +**Figure 1** ODBC-based application development process + +![odbc-based-application-development-process](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/development-process-4.png) + +**APIs Involved in the Development Process** + +**Table 1** API description + +| **Function** | **API** | +| :-------------------------------------------------------- | :----------------------------------------------------------- | +| Allocate a handle | SQLAllocHandle is a generic function for allocating a handle. It can replace the following functions:
- SQLAllocEnv: allocate an environment handle
- SQLAllocConnect: allocate a connection handle
- SQLAllocStmt: allocate a statement handle | +| Set environment attributes | SQLSetEnvAttr | +| Set connection attributes | SQLSetConnectAttr | +| Set statement attributes | SQLSetStmtAttr | +| Connect to a data source | SQLConnect | +| Bind a buffer to a column in the result set | SQLBindCol | +| Bind the parameter marker of an SQL statement to a buffer | SQLBindParameter | +| Return the error message of the last operation | SQLGetDiagRec | +| Prepare an SQL statement for execution | SQLPrepare | +| Run a prepared SQL statement | SQLExecute | +| Run an SQL statement directly | SQLExecDirect | +| Fetch the next row (or rows) from the result set | SQLFetch | +| Return data in a column of the result set | SQLGetData | +| Get the column information from a result set | SQLColAttribute | +| Disconnect from a data source | SQLDisconnect | +| Release a handle | SQLFreeHandle is a generic function for releasing a handle. It can replace the following functions:
- SQLFreeEnv: release an environment handle
- SQLFreeConnect: release a connection handle
- SQLFreeStmt: release a statement handle | + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> If an execution request (not in a transaction block) received in the database contains multiple statements, the request is packed into a transaction. If one of the statements fails, the entire request will be rolled back. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/5-example-common-functions-and-batch-binding.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/5-example-common-functions-and-batch-binding.md new file mode 100644 index 0000000000000000000000000000000000000000..eed5abea290f6be68eb9ff7db1734bf531b146e5 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/5-example-common-functions-and-batch-binding.md @@ -0,0 +1,440 @@ +--- +title: Common Functions and Batch Binding +summary: Common Functions and Batch Binding +author: Guo Huan +date: 2021-04-26 +--- + +# Example: Common Functions and Batch Binding + +## Code for Common Functions + +``` +//The following example shows how to obtain data from openGauss through the ODBC interface. +// DBtest.c (compile with: libodbc.so) +#include +#include +#include +#ifdef WIN32 +#include +#endif +SQLHENV V_OD_Env; // Handle ODBC environment +SQLHSTMT V_OD_hstmt; // Handle statement +SQLHDBC V_OD_hdbc; // Handle connection +char typename[100]; +SQLINTEGER value = 100; +SQLINTEGER V_OD_erg,V_OD_buffer,V_OD_err,V_OD_id; +int main(int argc,char *argv[]) +{ + // 1. Allocate an environment handle. + V_OD_erg = SQLAllocHandle(SQL_HANDLE_ENV,SQL_NULL_HANDLE,&V_OD_Env); + if ((V_OD_erg != SQL_SUCCESS) && (V_OD_erg != SQL_SUCCESS_WITH_INFO)) + { + printf("Error AllocHandle\n"); + exit(0); + } + // 2. Set environment attributes (version information). + SQLSetEnvAttr(V_OD_Env, SQL_ATTR_ODBC_VERSION, (void*)SQL_OV_ODBC3, 0); + // 3. Allocate a connection handle. + V_OD_erg = SQLAllocHandle(SQL_HANDLE_DBC, V_OD_Env, &V_OD_hdbc); + if ((V_OD_erg != SQL_SUCCESS) && (V_OD_erg != SQL_SUCCESS_WITH_INFO)) + { + SQLFreeHandle(SQL_HANDLE_ENV, V_OD_Env); + exit(0); + } + // 4. Set connection attributes. + SQLSetConnectAttr(V_OD_hdbc, SQL_ATTR_AUTOCOMMIT, SQL_AUTOCOMMIT_ON, 0); + // 5. Connect to the data source. userName and password indicate the username and password for connecting to the database. Set them as needed. + // If the username and password have been set in the odbc.ini file, you do not need to set userName or password here, retaining "" for them. However, you are not advised to do so because the username and password will be disclosed if the permission for odbc.ini is abused. + V_OD_erg = SQLConnect(V_OD_hdbc, (SQLCHAR*) "gaussdb", SQL_NTS, + (SQLCHAR*) "userName", SQL_NTS, (SQLCHAR*) "password", SQL_NTS); + if ((V_OD_erg != SQL_SUCCESS) && (V_OD_erg != SQL_SUCCESS_WITH_INFO)) + { + printf("Error SQLConnect %d\n",V_OD_erg); + SQLFreeHandle(SQL_HANDLE_ENV, V_OD_Env); + exit(0); + } + printf("Connected !\n"); + // 6. Set statement attributes. + SQLSetStmtAttr(V_OD_hstmt,SQL_ATTR_QUERY_TIMEOUT,(SQLPOINTER *)3,0); + // 7. Allocate a statement handle. + SQLAllocHandle(SQL_HANDLE_STMT, V_OD_hdbc, &V_OD_hstmt); + // 8. Run SQL statements. + SQLExecDirect(V_OD_hstmt,"drop table IF EXISTS customer_t1",SQL_NTS); + SQLExecDirect(V_OD_hstmt,"CREATE TABLE customer_t1(c_customer_sk INTEGER, c_customer_name VARCHAR(32));",SQL_NTS); + SQLExecDirect(V_OD_hstmt,"insert into customer_t1 values(25,li)",SQL_NTS); + // 9. 
Prepare for execution. + SQLPrepare(V_OD_hstmt,"insert into customer_t1 values(?)",SQL_NTS); + // 10. Bind parameters. + SQLBindParameter(V_OD_hstmt,1,SQL_PARAM_INPUT,SQL_C_SLONG,SQL_INTEGER,0,0, + &value,0,NULL); + // 11. Run prepared statements. + SQLExecute(V_OD_hstmt); + SQLExecDirect(V_OD_hstmt,"select id from testtable",SQL_NTS); + // 12. Obtain attributes of a specific column in the result set. + SQLColAttribute(V_OD_hstmt,1,SQL_DESC_TYPE,typename,100,NULL,NULL); + printf("SQLColAtrribute %s\n",typename); + // 13. Bind the result set. + SQLBindCol(V_OD_hstmt,1,SQL_C_SLONG, (SQLPOINTER)&V_OD_buffer,150, + (SQLLEN *)&V_OD_err); + // 14. Obtain data in the result set by executing SQLFetch. + V_OD_erg=SQLFetch(V_OD_hstmt); + // 15. Obtain and return data by executing SQLGetData. + while(V_OD_erg != SQL_NO_DATA) + { + SQLGetData(V_OD_hstmt,1,SQL_C_SLONG,(SQLPOINTER)&V_OD_id,0,NULL); + printf("SQLGetData ----ID = %d\n",V_OD_id); + V_OD_erg=SQLFetch(V_OD_hstmt); + }; + printf("Done !\n"); + // 16. Disconnect data source connections and release handles. + SQLFreeHandle(SQL_HANDLE_STMT,V_OD_hstmt); + SQLDisconnect(V_OD_hdbc); + SQLFreeHandle(SQL_HANDLE_DBC,V_OD_hdbc); + SQLFreeHandle(SQL_HANDLE_ENV, V_OD_Env); + return(0); + } +``` + +## Code for Batch Processing + +``` +/********************************************************************** +* Enable UseBatchProtocol in the data source and set the database parameter support_batch_bind +* to on. +* The CHECK_ERROR command is used to check and print error information. +* This example is used to interactively obtain the DSN, data volume to be processed, and volume of ignored data from users, and insert required data into the test_odbc_batch_insert table. +***********************************************************************/ +#include +#include +#include +#include +#include + +void Exec(SQLHDBC hdbc, SQLCHAR* sql) +{ + SQLRETURN retcode; // Return status + SQLHSTMT hstmt = SQL_NULL_HSTMT; // Statement handle + SQLCHAR loginfo[2048]; + + // Allocate Statement Handle + retcode = SQLAllocHandle(SQL_HANDLE_STMT, hdbc, &hstmt); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLAllocHandle(SQL_HANDLE_STMT) failed"); + return; + } + + // Prepare Statement + retcode = SQLPrepare(hstmt, (SQLCHAR*) sql, SQL_NTS); + sprintf((char*)loginfo, "SQLPrepare log: %s", (char*)sql); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLPrepare(hstmt, (SQLCHAR*) sql, SQL_NTS) failed"); + return; + } + + // Execute Statement + retcode = SQLExecute(hstmt); + sprintf((char*)loginfo, "SQLExecute stmt log: %s", (char*)sql); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLExecute(hstmt) failed"); + return; + } + // Free Handle + retcode = SQLFreeHandle(SQL_HANDLE_STMT, hstmt); + sprintf((char*)loginfo, "SQLFreeHandle stmt log: %s", (char*)sql); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLFreeHandle(SQL_HANDLE_STMT, hstmt) failed"); + return; + } +} + +int main () +{ + SQLHENV henv = SQL_NULL_HENV; + SQLHDBC hdbc = SQL_NULL_HDBC; + int batchCount = 1000; // Amount of data that is bound in batches + SQLLEN rowsCount = 0; + int ignoreCount = 0; // Amount of data that is not imported to the database among the data that is bound in batches + + SQLRETURN retcode; + SQLCHAR dsn[1024] = {'\0'}; + SQLCHAR loginfo[2048]; + + do + { + if (ignoreCount > batchCount) + { + printf("ignoreCount(%d) should be less than batchCount(%d)\n", ignoreCount, batchCount); + } + }while(ignoreCount > batchCount); + + retcode = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &henv); + + if 
(!SQL_SUCCEEDED(retcode)) { + printf("SQLAllocHandle failed"); + goto exit; + } + + // Set ODBC Verion + retcode = SQLSetEnvAttr(henv, SQL_ATTR_ODBC_VERSION, + (SQLPOINTER*)SQL_OV_ODBC3, 0); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLSetEnvAttr failed"); + goto exit; + } + + // Allocate Connection + retcode = SQLAllocHandle(SQL_HANDLE_DBC, henv, &hdbc); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLAllocHandle failed"); + goto exit; + } + + + // Set Login Timeout + retcode = SQLSetConnectAttr(hdbc, SQL_LOGIN_TIMEOUT, (SQLPOINTER)5, 0); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLSetConnectAttr failed"); + goto exit; + } + + + // Set Auto Commit + retcode = SQLSetConnectAttr(hdbc, SQL_ATTR_AUTOCOMMIT, + (SQLPOINTER)(1), 0); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLSetConnectAttr failed"); + goto exit; + } + + + // Connect to DSN + // gaussdb indicates the name of the data source used by users. + sprintf(loginfo, "SQLConnect(DSN:%s)", dsn); + retcode = SQLConnect(hdbc, (SQLCHAR*) "gaussdb", SQL_NTS, + (SQLCHAR*) NULL, 0, NULL, 0); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLConnect failed"); + goto exit; + } + + // init table info. + Exec(hdbc, "drop table if exists test_odbc_batch_insert"); + Exec(hdbc, "create table test_odbc_batch_insert(id int primary key, col varchar2(50))"); + + // The following code constructs the data to be inserted based on the data volume entered by users: + { + SQLRETURN retcode; + SQLHSTMT hstmtinesrt = SQL_NULL_HSTMT; + int i; + SQLCHAR *sql = NULL; + SQLINTEGER *ids = NULL; + SQLCHAR *cols = NULL; + SQLLEN *bufLenIds = NULL; + SQLLEN *bufLenCols = NULL; + SQLUSMALLINT *operptr = NULL; + SQLUSMALLINT *statusptr = NULL; + SQLULEN process = 0; + + // Data is constructed by column. Each column is stored continuously. + ids = (SQLINTEGER*)malloc(sizeof(ids[0]) * batchCount); + cols = (SQLCHAR*)malloc(sizeof(cols[0]) * batchCount * 50); + // Data size in each row for a column + bufLenIds = (SQLLEN*)malloc(sizeof(bufLenIds[0]) * batchCount); + bufLenCols = (SQLLEN*)malloc(sizeof(bufLenCols[0]) * batchCount); + // Whether this row needs to be processed. The value is SQL_PARAM_IGNORE or SQL_PARAM_PROCEED. + operptr = (SQLUSMALLINT*)malloc(sizeof(operptr[0]) * batchCount); + memset(operptr, 0, sizeof(operptr[0]) * batchCount); + // Processing result of the row + // Note: In the database, a statement belongs to one transaction. Therefore, data is processed as a unit. Either all data is inserted successfully or all data fails to be inserted. + statusptr = (SQLUSMALLINT*)malloc(sizeof(statusptr[0]) * batchCount); + memset(statusptr, 88, sizeof(statusptr[0]) * batchCount); + + if (NULL == ids || NULL == cols || NULL == bufLenCols || NULL == bufLenIds) + { + fprintf(stderr, "FAILED:\tmalloc data memory failed\n"); + goto exit; + } + + for (int i = 0; i < batchCount; i++) + { + ids[i] = i; + sprintf(cols + 50 * i, "column test value %d", i); + bufLenIds[i] = sizeof(ids[i]); + bufLenCols[i] = strlen(cols + 50 * i); + operptr[i] = (i < ignoreCount) ? 
SQL_PARAM_IGNORE : SQL_PARAM_PROCEED; + } + + // Allocate Statement Handle + retcode = SQLAllocHandle(SQL_HANDLE_STMT, hdbc, &hstmtinesrt); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLAllocHandle failed"); + goto exit; + } + + // Prepare Statement + sql = (SQLCHAR*)"insert into test_odbc_batch_insert values(?, ?)"; + retcode = SQLPrepare(hstmtinesrt, (SQLCHAR*) sql, SQL_NTS); + sprintf((char*)loginfo, "SQLPrepare log: %s", (char*)sql); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLPrepare failed"); + goto exit; + } + + retcode = SQLSetStmtAttr(hstmtinesrt, SQL_ATTR_PARAMSET_SIZE, (SQLPOINTER)batchCount, sizeof(batchCount)); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLSetStmtAttr failed"); + goto exit; + } + + retcode = SQLBindParameter(hstmtinesrt, 1, SQL_PARAM_INPUT, SQL_C_SLONG, SQL_INTEGER, sizeof(ids[0]), 0,&(ids[0]), 0, bufLenIds); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLBindParameter failed"); + goto exit; + } + + retcode = SQLBindParameter(hstmtinesrt, 2, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_CHAR, 50, 50, cols, 50, bufLenCols); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLBindParameter failed"); + goto exit; + } + + retcode = SQLSetStmtAttr(hstmtinesrt, SQL_ATTR_PARAMS_PROCESSED_PTR, (SQLPOINTER)&process, sizeof(process)); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLSetStmtAttr failed"); + goto exit; + } + + retcode = SQLSetStmtAttr(hstmtinesrt, SQL_ATTR_PARAM_STATUS_PTR, (SQLPOINTER)statusptr, sizeof(statusptr[0]) * batchCount); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLSetStmtAttr failed"); + goto exit; + } + + retcode = SQLSetStmtAttr(hstmtinesrt, SQL_ATTR_PARAM_OPERATION_PTR, (SQLPOINTER)operptr, sizeof(operptr[0]) * batchCount); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLSetStmtAttr failed"); + goto exit; + } + + retcode = SQLExecute(hstmtinesrt); + sprintf((char*)loginfo, "SQLExecute stmt log: %s", (char*)sql); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLExecute(hstmtinesrt) failed"); + goto exit; + + retcode = SQLRowCount(hstmtinesrt, &rowsCount); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLRowCount failed"); + goto exit; + } + + if (rowsCount != (batchCount - ignoreCount)) + { + sprintf(loginfo, "(batchCount - ignoreCount)(%d) != rowsCount(%d)", (batchCount - ignoreCount), rowsCount); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLExecute failed"); + goto exit; + } + } + else + { + sprintf(loginfo, "(batchCount - ignoreCount)(%d) == rowsCount(%d)", (batchCount - ignoreCount), rowsCount); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLExecute failed"); + goto exit; + } + } + + // check row number returned + if (rowsCount != process) + { + sprintf(loginfo, "process(%d) != rowsCount(%d)", process, rowsCount); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLExecute failed"); + goto exit; + } + } + else + { + sprintf(loginfo, "process(%d) == rowsCount(%d)", process, rowsCount); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLExecute failed"); + goto exit; + } + } + + for (int i = 0; i < batchCount; i++) + { + if (i < ignoreCount) + { + if (statusptr[i] != SQL_PARAM_UNUSED) + { + sprintf(loginfo, "statusptr[%d](%d) != SQL_PARAM_UNUSED", i, statusptr[i]); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLExecute failed"); + goto exit; + } + } + } + else if (statusptr[i] != SQL_PARAM_SUCCESS) + { + sprintf(loginfo, "statusptr[%d](%d) != SQL_PARAM_SUCCESS", i, statusptr[i]); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLExecute failed"); + goto exit; + } + } + } + + retcode = SQLFreeHandle(SQL_HANDLE_STMT, hstmtinesrt); + 
sprintf((char*)loginfo, "SQLFreeHandle hstmtinesrt"); + + if (!SQL_SUCCEEDED(retcode)) { + printf("SQLFreeHandle failed"); + goto exit; + } + } + + +exit: + (void) printf ("\nComplete.\n"); + + // Connection + if (hdbc != SQL_NULL_HDBC) { + SQLDisconnect(hdbc); + SQLFreeHandle(SQL_HANDLE_DBC, hdbc); + } + + // Environment + if (henv != SQL_NULL_HENV) + SQLFreeHandle(SQL_HANDLE_ENV, henv); + + return 0; +} +``` \ No newline at end of file diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/5.1-typical-application-scenarios-and-configurations.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/5.1-typical-application-scenarios-and-configurations.md new file mode 100644 index 0000000000000000000000000000000000000000..7017deb1846b5e1fab85f9a0981ded031b1d75cd --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/5.1-typical-application-scenarios-and-configurations.md @@ -0,0 +1,496 @@ +--- +title: Typical Application Scenarios and Configurations +summary: Typical Application Scenarios and Configurations +author: Zhang Cuiping +date: 2021-10-11 +--- + +# Typical Application Scenarios and Configurations + +## Log Diagnosis Scenario + +ODBC logs are classified into unixODBC driver manager logs and psqlODBC driver logs. The former is used to trace whether the application API is successfully executed, and the latter is used to locate problems based on DFX logs generated during underlying implementation. + +The unixODBC log needs to be configured in the **odbcinst.ini** file: + +```bash +[ODBC] +Trace=Yes +TraceFile=/path/to/odbctrace.log + +[GaussMPP] +Driver64=/usr/local/lib/psqlodbcw.so +setup=/usr/local/lib/psqlodbcw.so +``` + +You only need to add the following information to the **odbc.ini** file: + +```bash +[gaussdb] +Driver=GaussMPP +Servername=10.10.0.13 (database server IP address) +... +Debug=1 (Enable the debug log function of the driver.) +``` + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** The unixODBC logs are generated in the path configured by **TraceFile**. The psqlODBC generates the **mylog_***xxx***.log** file in the **/tmp/** directory. + +## High Performance + +If a large amount of data needs to be inserted, you are advised to perform the following operations: + +- You need to set **UseBatchProtocol** to **1** in the **odbc.ini** file and **support_batch_bind** to **on** in the database. +- The ODBC program binding type must be the same as that in the database. +- The character set of the client is the same as that of the database. +- The transaction is committed manually. + +**odbc.ini** configuration file: + +```bash +[gaussdb] +Driver=GaussMPP +Servername=10.10.0.13 (database server IP address) +... +UseBatchProtocol=1 (enabled by default) +ConnSettings=set client_encoding=UTF8 (Set the character code on the client to be the same as that on the server.) 
+``` + +Binding type case: + +```c +#include +#include +#include +#include +#include +#include + +#define MESSAGE_BUFFER_LEN 128 +SQLHANDLE h_env = NULL; +SQLHANDLE h_conn = NULL; +SQLHANDLE h_stmt = NULL; +void print_error() +{ + SQLCHAR Sqlstate[SQL_SQLSTATE_SIZE+1]; + SQLINTEGER NativeError; + SQLCHAR MessageText[MESSAGE_BUFFER_LEN]; + SQLSMALLINT TextLength; + SQLRETURN ret = SQL_ERROR; + + ret = SQLGetDiagRec(SQL_HANDLE_STMT, h_stmt, 1, Sqlstate, &NativeError, MessageText, MESSAGE_BUFFER_LEN, &TextLength); + if ( SQL_SUCCESS == ret) + { + printf("\n STMT ERROR-%05d %s", NativeError, MessageText); + return; + } + + ret = SQLGetDiagRec(SQL_HANDLE_DBC, h_conn, 1, Sqlstate, &NativeError, MessageText, MESSAGE_BUFFER_LEN, &TextLength); + if ( SQL_SUCCESS == ret) + { + printf("\n CONN ERROR-%05d %s", NativeError, MessageText); + return; + } + + ret = SQLGetDiagRec(SQL_HANDLE_ENV, h_env, 1, Sqlstate, &NativeError, MessageText, MESSAGE_BUFFER_LEN, &TextLength); + if ( SQL_SUCCESS == ret) + { + printf("\n ENV ERROR-%05d %s", NativeError, MessageText); + return; + } + + return; +} + +/* Expect the function to return SQL_SUCCESS. */ +#define RETURN_IF_NOT_SUCCESS(func) \ +{\ + SQLRETURN ret_value = (func);\ + if (SQL_SUCCESS != ret_value)\ + {\ + print_error();\ + printf("\n failed line = %u: expect SQL_SUCCESS, but ret = %d", __LINE__, ret_value);\ + return SQL_ERROR; \ + }\ +} + +/* Expect the function to return SQL_SUCCESS. */ +#define RETURN_IF_NOT_SUCCESS_I(i, func) \ +{\ + SQLRETURN ret_value = (func);\ + if (SQL_SUCCESS != ret_value)\ + {\ + print_error();\ + printf("\n failed line = %u (i=%d): : expect SQL_SUCCESS, but ret = %d", __LINE__, (i), ret_value);\ + return SQL_ERROR; \ + }\ +} + +/* Expect the function to return SQL_SUCCESS_WITH_INFO. */ +#define RETURN_IF_NOT_SUCCESS_INFO(func) \ +{\ + SQLRETURN ret_value = (func);\ + if (SQL_SUCCESS_WITH_INFO != ret_value)\ + {\ + print_error();\ + printf("\n failed line = %u: expect SQL_SUCCESS_WITH_INFO, but ret = %d", __LINE__, ret_value);\ + return SQL_ERROR; \ + }\ +} + +/* Expect the values are the same. */ +#define RETURN_IF_NOT(expect, value) \ +if ((expect) != (value))\ +{\ + printf("\n failed line = %u: expect = %u, but value = %u", __LINE__, (expect), (value)); \ + return SQL_ERROR;\ +} + +/* Expect the character strings are the same. */ +#define RETURN_IF_NOT_STRCMP_I(i, expect, value) \ +if (( NULL == (expect) ) || (NULL == (value)))\ +{\ + printf("\n failed line = %u (i=%u): input NULL pointer !", __LINE__, (i)); \ + return SQL_ERROR; \ +}\ +else if (0 != strcmp((expect), (value)))\ +{\ + printf("\n failed line = %u (i=%u): expect = %s, but value = %s", __LINE__, (i), (expect), (value)); \ + return SQL_ERROR;\ +} + + +// prepare + execute SQL statement +int execute_cmd(SQLCHAR *sql) +{ + if ( NULL == sql ) + { + return SQL_ERROR; + } + + if ( SQL_SUCCESS != SQLPrepare(h_stmt, sql, SQL_NTS)) + { + return SQL_ERROR; + } + + if ( SQL_SUCCESS != SQLExecute(h_stmt)) + { + return SQL_ERROR; + } + + return SQL_SUCCESS; +} +// execute + commit handle +int commit_exec() +{ + if ( SQL_SUCCESS != SQLExecute(h_stmt)) + { + return SQL_ERROR; + } + + // Manual committing + if ( SQL_SUCCESS != SQLEndTran(SQL_HANDLE_DBC, h_conn, SQL_COMMIT)) + { + return SQL_ERROR; + } + + return SQL_SUCCESS; +} + +int begin_unit_test() +{ + SQLINTEGER ret; + + /* Allocate an environment handle. 
*/ + ret = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &h_env); + if ((SQL_SUCCESS != ret) && (SQL_SUCCESS_WITH_INFO != ret)) + { + printf("\n begin_unit_test::SQLAllocHandle SQL_HANDLE_ENV failed ! ret = %d", ret); + return SQL_ERROR; + } + + /* Set the version number before connection. */ + if (SQL_SUCCESS != SQLSetEnvAttr(h_env, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, 0)) + { + print_error(); + printf("\n begin_unit_test::SQLSetEnvAttr SQL_ATTR_ODBC_VERSION failed ! ret = %d", ret); + SQLFreeHandle(SQL_HANDLE_ENV, h_env); + return SQL_ERROR; + } + + /* Allocate a connection handle. */ + ret = SQLAllocHandle(SQL_HANDLE_DBC, h_env, &h_conn); + if (SQL_SUCCESS != ret) + { + print_error(); + printf("\n begin_unit_test::SQLAllocHandle SQL_HANDLE_DBC failed ! ret = %d", ret); + SQLFreeHandle(SQL_HANDLE_ENV, h_env); + return SQL_ERROR; + } + + /* Establish a connection. */ + ret = SQLConnect(h_conn, (SQLCHAR*) "gaussdb", SQL_NTS, + (SQLCHAR*) NULL, 0, NULL, 0); + if (SQL_SUCCESS != ret) + { + print_error(); + printf("\n begin_unit_test::SQLConnect failed ! ret = %d", ret); + SQLFreeHandle(SQL_HANDLE_DBC, h_conn); + SQLFreeHandle(SQL_HANDLE_ENV, h_env); + return SQL_ERROR; + } + + /* Allocate a statement handle. */ + ret = SQLAllocHandle(SQL_HANDLE_STMT, h_conn, &h_stmt); + if (SQL_SUCCESS != ret) + { + print_error(); + printf("\n begin_unit_test::SQLAllocHandle SQL_HANDLE_STMT failed ! ret = %d", ret); + SQLFreeHandle(SQL_HANDLE_DBC, h_conn); + SQLFreeHandle(SQL_HANDLE_ENV, h_env); + return SQL_ERROR; + } + + return SQL_SUCCESS; +} + +void end_unit_test() +{ + /* Release a statement handle. */ + if (NULL != h_stmt) + { + SQLFreeHandle(SQL_HANDLE_STMT, h_stmt); + } + + /* Release a connection handle. */ + if (NULL != h_conn) + { + SQLDisconnect(h_conn); + SQLFreeHandle(SQL_HANDLE_DBC, h_conn); + } + + /* Release an environment handle. */ + if (NULL != h_env) + { + SQLFreeHandle(SQL_HANDLE_ENV, h_env); + } + + return; +} + +int main() +{ + // begin test + if (begin_unit_test() != SQL_SUCCESS) + { + printf("\n begin_test_unit failed."); + return SQL_ERROR; + } + // The handle configuration is the same as that in the preceding case + int i = 0; + SQLCHAR* sql_drop = "drop table if exists test_bindnumber_001"; + SQLCHAR* sql_create = "create table test_bindnumber_001(" + "f4 number, f5 number(10, 2)" + ")"; + SQLCHAR* sql_insert = "insert into test_bindnumber_001 values(?, ?)"; + SQLCHAR* sql_select = "select * from test_bindnumber_001"; + SQLLEN RowCount; + SQL_NUMERIC_STRUCT st_number; + SQLCHAR getValue[2][MESSAGE_BUFFER_LEN]; + + /* Step 1. Create a table. */ + RETURN_IF_NOT_SUCCESS(execute_cmd(sql_drop)); + RETURN_IF_NOT_SUCCESS(execute_cmd(sql_create)); + + /* Step 2.1 Bind parameters using the SQL_NUMERIC_STRUCT structure. */ + RETURN_IF_NOT_SUCCESS(SQLPrepare(h_stmt, sql_insert, SQL_NTS)); + + // First line: 1234.5678 + memset(st_number.val, 0, SQL_MAX_NUMERIC_LEN); + st_number.precision = 8; + st_number.scale = 4; + st_number.sign = 1; + st_number.val[0] = 0x4E; + st_number.val[1] = 0x61; + st_number.val[2] = 0xBC; + + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 1, SQL_PARAM_INPUT, SQL_C_NUMERIC, SQL_NUMERIC, sizeof(SQL_NUMERIC_STRUCT), 4, &st_number, 0, NULL)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 2, SQL_PARAM_INPUT, SQL_C_NUMERIC, SQL_NUMERIC, sizeof(SQL_NUMERIC_STRUCT), 4, &st_number, 0, NULL)); + + // Disable the automatic commit function. 
+ SQLSetConnectAttr(h_conn, SQL_ATTR_AUTOCOMMIT, (SQLPOINTER)SQL_AUTOCOMMIT_OFF, 0); + + RETURN_IF_NOT_SUCCESS(commit_exec()); + RETURN_IF_NOT_SUCCESS(SQLRowCount(h_stmt, &RowCount)); + RETURN_IF_NOT(1, RowCount); + + // Second line: 12345678 + memset(st_number.val, 0, SQL_MAX_NUMERIC_LEN); + st_number.precision = 8; + st_number.scale = 0; + st_number.sign = 1; + st_number.val[0] = 0x4E; + st_number.val[1] = 0x61; + st_number.val[2] = 0xBC; + + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 1, SQL_PARAM_INPUT, SQL_C_NUMERIC, SQL_NUMERIC, sizeof(SQL_NUMERIC_STRUCT), 0, &st_number, 0, NULL)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 2, SQL_PARAM_INPUT, SQL_C_NUMERIC, SQL_NUMERIC, sizeof(SQL_NUMERIC_STRUCT), 0, &st_number, 0, NULL)); + RETURN_IF_NOT_SUCCESS(commit_exec()); + RETURN_IF_NOT_SUCCESS(SQLRowCount(h_stmt, &RowCount)); + RETURN_IF_NOT(1, RowCount); + + // Third line: 12345678 + memset(st_number.val, 0, SQL_MAX_NUMERIC_LEN); + st_number.precision = 0; + st_number.scale = 4; + st_number.sign = 1; + st_number.val[0] = 0x4E; + st_number.val[1] = 0x61; + st_number.val[2] = 0xBC; + + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 1, SQL_PARAM_INPUT, SQL_C_NUMERIC, SQL_NUMERIC, sizeof(SQL_NUMERIC_STRUCT), 4, &st_number, 0, NULL)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 2, SQL_PARAM_INPUT, SQL_C_NUMERIC, SQL_NUMERIC, sizeof(SQL_NUMERIC_STRUCT), 4, &st_number, 0, NULL)); + RETURN_IF_NOT_SUCCESS(commit_exec()); + RETURN_IF_NOT_SUCCESS(SQLRowCount(h_stmt, &RowCount)); + RETURN_IF_NOT(1, RowCount); + + + /* Step 2.2 Bind parameters by using the SQL_C_CHAR character string in the fourth line. */ + RETURN_IF_NOT_SUCCESS(SQLPrepare(h_stmt, sql_insert, SQL_NTS)); + SQLCHAR* szNumber = "1234.5678"; + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 1, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_NUMERIC, strlen(szNumber), 0, szNumber, 0, NULL)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 2, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_NUMERIC, strlen(szNumber), 0, szNumber, 0, NULL)); + RETURN_IF_NOT_SUCCESS(commit_exec()); + RETURN_IF_NOT_SUCCESS(SQLRowCount(h_stmt, &RowCount)); + RETURN_IF_NOT(1, RowCount); + + /* Step 2.3 Bind parameters by using SQL_C_FLOAT in the fifth line. */ + RETURN_IF_NOT_SUCCESS(SQLPrepare(h_stmt, sql_insert, SQL_NTS)); + SQLREAL fNumber = 1234.5678; + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 1, SQL_PARAM_INPUT, SQL_C_FLOAT, SQL_NUMERIC, sizeof(fNumber), 4, &fNumber, 0, NULL)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 2, SQL_PARAM_INPUT, SQL_C_FLOAT, SQL_NUMERIC, sizeof(fNumber), 4, &fNumber, 0, NULL)); + RETURN_IF_NOT_SUCCESS(commit_exec()); + RETURN_IF_NOT_SUCCESS(SQLRowCount(h_stmt, &RowCount)); + RETURN_IF_NOT(1, RowCount); + + /* Step 2.4 Bind parameters by using SQL_C_DOUBLE in the sixth line. */ + RETURN_IF_NOT_SUCCESS(SQLPrepare(h_stmt, sql_insert, SQL_NTS)); + SQLDOUBLE dNumber = 1234.5678; + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 1, SQL_PARAM_INPUT, SQL_C_DOUBLE, SQL_NUMERIC, sizeof(dNumber), 4, &dNumber, 0, NULL)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 2, SQL_PARAM_INPUT, SQL_C_DOUBLE, SQL_NUMERIC, sizeof(dNumber), 4, &dNumber, 0, NULL)); + RETURN_IF_NOT_SUCCESS(commit_exec()); + RETURN_IF_NOT_SUCCESS(SQLRowCount(h_stmt, &RowCount)); + RETURN_IF_NOT(1, RowCount); + + SQLBIGINT bNumber1 = 0xFFFFFFFFFFFFFFFF; + SQLBIGINT bNumber2 = 12345; + + /* Step 2.5 Bind parameters by using SQL_C_SBIGINT in the seventh line. 
*/ + RETURN_IF_NOT_SUCCESS(SQLPrepare(h_stmt, sql_insert, SQL_NTS)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 1, SQL_PARAM_INPUT, SQL_C_SBIGINT, SQL_NUMERIC, sizeof(bNumber1), 4, &bNumber1, 0, NULL)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 2, SQL_PARAM_INPUT, SQL_C_SBIGINT, SQL_NUMERIC, sizeof(bNumber2), 4, &bNumber2, 0, NULL)); + RETURN_IF_NOT_SUCCESS(commit_exec()); + RETURN_IF_NOT_SUCCESS(SQLRowCount(h_stmt, &RowCount)); + RETURN_IF_NOT(1, RowCount); + + /* Step 2.6 Bind parameters by using SQL_C_UBIGINT in the eighth line. */ + RETURN_IF_NOT_SUCCESS(SQLPrepare(h_stmt, sql_insert, SQL_NTS)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 1, SQL_PARAM_INPUT, SQL_C_UBIGINT, SQL_NUMERIC, sizeof(bNumber1), 4, &bNumber1, 0, NULL)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 2, SQL_PARAM_INPUT, SQL_C_UBIGINT, SQL_NUMERIC, sizeof(bNumber2), 4, &bNumber2, 0, NULL)); + RETURN_IF_NOT_SUCCESS(commit_exec()); + RETURN_IF_NOT_SUCCESS(SQLRowCount(h_stmt, &RowCount)); + RETURN_IF_NOT(1, RowCount); + + SQLLEN lNumber1 = 0xFFFFFFFFFFFFFFFF; + SQLLEN lNumber2 = 12345; + + /* Step 2.7 Bind parameters by using SQL_C_LONG in the ninth line. */ + RETURN_IF_NOT_SUCCESS(SQLPrepare(h_stmt, sql_insert, SQL_NTS)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 1, SQL_PARAM_INPUT, SQL_C_LONG, SQL_NUMERIC, sizeof(lNumber1), 0, &lNumber1, 0, NULL)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 2, SQL_PARAM_INPUT, SQL_C_LONG, SQL_NUMERIC, sizeof(lNumber2), 0, &lNumber2, 0, NULL)); + RETURN_IF_NOT_SUCCESS(commit_exec()); + RETURN_IF_NOT_SUCCESS(SQLRowCount(h_stmt, &RowCount)); + RETURN_IF_NOT(1, RowCount); + + /* Step 2.8 Bind parameters by using SQL_C_ULONG in the tenth line. */ + RETURN_IF_NOT_SUCCESS(SQLPrepare(h_stmt, sql_insert, SQL_NTS)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 1, SQL_PARAM_INPUT, SQL_C_ULONG, SQL_NUMERIC, sizeof(lNumber1), 0, &lNumber1, 0, NULL)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 2, SQL_PARAM_INPUT, SQL_C_ULONG, SQL_NUMERIC, sizeof(lNumber2), 0, &lNumber2, 0, NULL)); + RETURN_IF_NOT_SUCCESS(commit_exec()); + RETURN_IF_NOT_SUCCESS(SQLRowCount(h_stmt, &RowCount)); + RETURN_IF_NOT(1, RowCount); + + SQLSMALLINT sNumber = 0xFFFF; + + /* Step 2.9 Bind parameters by using SQL_C_SHORT in the eleventh line. */ + RETURN_IF_NOT_SUCCESS(SQLPrepare(h_stmt, sql_insert, SQL_NTS)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 1, SQL_PARAM_INPUT, SQL_C_SHORT, SQL_NUMERIC, sizeof(sNumber), 0, &sNumber, 0, NULL)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 2, SQL_PARAM_INPUT, SQL_C_SHORT, SQL_NUMERIC, sizeof(sNumber), 0, &sNumber, 0, NULL)); + RETURN_IF_NOT_SUCCESS(commit_exec()); + RETURN_IF_NOT_SUCCESS(SQLRowCount(h_stmt, &RowCount)); + RETURN_IF_NOT(1, RowCount); + + /* Step 2.10 Bind parameters by using SQL_C_USHORT in the twelfth line. */ + RETURN_IF_NOT_SUCCESS(SQLPrepare(h_stmt, sql_insert, SQL_NTS)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 1, SQL_PARAM_INPUT, SQL_C_USHORT, SQL_NUMERIC, sizeof(sNumber), 0, &sNumber, 0, NULL)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 2, SQL_PARAM_INPUT, SQL_C_USHORT, SQL_NUMERIC, sizeof(sNumber), 0, &sNumber, 0, NULL)); + RETURN_IF_NOT_SUCCESS(commit_exec()); + RETURN_IF_NOT_SUCCESS(SQLRowCount(h_stmt, &RowCount)); + RETURN_IF_NOT(1, RowCount); + + SQLCHAR cNumber = 0xFF; + + /* Step 2.11 Bind parameters by using SQL_C_TINYINT in the thirteenth line. 
*/ + RETURN_IF_NOT_SUCCESS(SQLPrepare(h_stmt, sql_insert, SQL_NTS)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 1, SQL_PARAM_INPUT, SQL_C_TINYINT, SQL_NUMERIC, sizeof(cNumber), 0, &cNumber, 0, NULL)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 2, SQL_PARAM_INPUT, SQL_C_TINYINT, SQL_NUMERIC, sizeof(cNumber), 0, &cNumber, 0, NULL)); + RETURN_IF_NOT_SUCCESS(commit_exec()); + RETURN_IF_NOT_SUCCESS(SQLRowCount(h_stmt, &RowCount)); + RETURN_IF_NOT(1, RowCount); + + /* Step 2.12 Bind parameters by using SQL_C_UTINYINT in the fourteenth line.*/ + RETURN_IF_NOT_SUCCESS(SQLPrepare(h_stmt, sql_insert, SQL_NTS)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 1, SQL_PARAM_INPUT, SQL_C_UTINYINT, SQL_NUMERIC, sizeof(cNumber), 0, &cNumber, 0, NULL)); + RETURN_IF_NOT_SUCCESS(SQLBindParameter(h_stmt, 2, SQL_PARAM_INPUT, SQL_C_UTINYINT, SQL_NUMERIC, sizeof(cNumber), 0, &cNumber, 0, NULL)); + RETURN_IF_NOT_SUCCESS(commit_exec()); + RETURN_IF_NOT_SUCCESS(SQLRowCount(h_stmt, &RowCount)); + RETURN_IF_NOT(1, RowCount); + + /* Use the character string type to unify the expectation. */ + SQLCHAR* expectValue[14][2] = {{"1234.5678", "1234.57"}, + {"12345678", "12345678"}, + {"0", "0"}, + {"1234.5678", "1234.57"}, + {"1234.5677", "1234.57"}, + {"1234.5678", "1234.57"}, + {"-1", "12345"}, + {"18446744073709551615", "12345"}, + {"-1", "12345"}, + {"4294967295", "12345"}, + {"-1", "-1"}, + {"65535", "65535"}, + {"-1", "-1"}, + {"255", "255"}, + }; + + RETURN_IF_NOT_SUCCESS(execute_cmd(sql_select)); + while ( SQL_NO_DATA != SQLFetch(h_stmt)) + { + RETURN_IF_NOT_SUCCESS_I(i, SQLGetData(h_stmt, 1, SQL_C_CHAR, &getValue[0], MESSAGE_BUFFER_LEN, NULL)); + RETURN_IF_NOT_SUCCESS_I(i, SQLGetData(h_stmt, 2, SQL_C_CHAR, &getValue[1], MESSAGE_BUFFER_LEN, NULL)); + + //RETURN_IF_NOT_STRCMP_I(i, expectValue[i][0], getValue[0]); + //RETURN_IF_NOT_STRCMP_I(i, expectValue[i][1], getValue[1]); + i++; + } + + RETURN_IF_NOT_SUCCESS(SQLRowCount(h_stmt, &RowCount)); + RETURN_IF_NOT(i, RowCount); + SQLCloseCursor(h_stmt); + /* Final step. Delete the table and restore the environment. */ + RETURN_IF_NOT_SUCCESS(execute_cmd(sql_drop)); + + end_unit_test(); +} +``` + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** In the preceding example, the number column is defined. When the **SQLBindParameter** API is called, the performance of binding SQL_NUMERIC is higher than that of SQL_LONG. If char is used, the data type needs to be converted when data is inserted to the database server, causing a performance bottleneck. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-0-odbc-overview.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-0-odbc-overview.md new file mode 100644 index 0000000000000000000000000000000000000000..2afef8791be5a7e9a2cdb15fe82800702d437a08 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-0-odbc-overview.md @@ -0,0 +1,10 @@ +--- +title: Description +summary: Description +author: Guo Huan +date: 2021-05-17 +--- + +# Description + +The ODBC interface is a set of API functions provided to users. This chapter describes its common interfaces. For details on other interfaces, see "ODBC Programmer's Reference" at MSDN (). 
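+
+For orientation, the following minimal sketch strings the common interfaces together in their typical calling order: allocate the environment, connection, and statement handles, connect to a data source, execute a statement, fetch the result, and release the handles. It is an illustration only, not part of the interface reference; the DSN name **gaussdb** and the omission of error checking are simplifying assumptions. See the per-function pages and the ODBC examples for complete usage.
+
+```
+#include <sql.h>
+#include <sqlext.h>
+
+/* Minimal calling-order sketch; error checking is omitted for brevity. */
+int main(void)
+{
+    SQLHENV env;
+    SQLHDBC dbc;
+    SQLHSTMT stmt;
+
+    SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env);            /* environment handle */
+    SQLSetEnvAttr(env, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, 0);
+    SQLAllocHandle(SQL_HANDLE_DBC, env, &dbc);                        /* connection handle */
+    SQLConnect(dbc, (SQLCHAR *)"gaussdb", SQL_NTS, NULL, 0, NULL, 0); /* DSN defined in odbc.ini */
+    SQLAllocHandle(SQL_HANDLE_STMT, dbc, &stmt);                      /* statement handle */
+
+    SQLExecDirect(stmt, (SQLCHAR *)"SELECT 1", SQL_NTS);
+    SQLFetch(stmt);
+
+    SQLFreeHandle(SQL_HANDLE_STMT, stmt);
+    SQLDisconnect(dbc);
+    SQLFreeHandle(SQL_HANDLE_DBC, dbc);
+    SQLFreeHandle(SQL_HANDLE_ENV, env);
+    return 0;
+}
+```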
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-1-SQLAllocEnv.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-1-SQLAllocEnv.md new file mode 100644 index 0000000000000000000000000000000000000000..64e6ed59f8d132e23dedd889f1a96377233495bc --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-1-SQLAllocEnv.md @@ -0,0 +1,10 @@ +--- +title: SQLAllocEnv +summary: SQLAllocEnv +author: Guo Huan +date: 2021-05-17 +--- + +# SQLAllocEnv + +In ODBC 3.x, SQLAllocEnv (an ODBC 2.x function) was deprecated and replaced by SQLAllocHandle. For details, see SQLAllocHandle. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-10-SQLExecDirect.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-10-SQLExecDirect.md new file mode 100644 index 0000000000000000000000000000000000000000..02cc87ecd76605c6e240d6a5eb3af5b46baf736c --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-10-SQLExecDirect.md @@ -0,0 +1,48 @@ +--- +title: SQLExecDirect +summary: SQLExecDirect +author: Guo Huan +date: 2021-05-17 +--- + +# SQLExecDirect + +## Function + +SQLExecDirect is used to execute a prepared SQL statement specified in this parameter. This is the fastest method for executing only one SQL statement at a time. + +## Prototype + +``` +SQLRETURN SQLExecDirect(SQLHSTMT StatementHandle, + SQLCHAR *StatementText, + SQLINTEGER TextLength); +``` + +## Parameter + +**Table 1** SQLExecDirect parameters + +| **Keyword** | **Parameter Description** | +| :-------------- | :----------------------------------------------------------- | +| StatementHandle | Statement handle, obtained from SQLAllocHandle. | +| StatementText | SQL statement to be executed. One SQL statement can be executed at a time. | +| TextLength | Length of **StatementText**. | + +## Return Value + +- **SQL_SUCCESS** indicates that the call succeeded. +- **SQL_SUCCESS_WITH_INFO** indicates that some warning information is displayed. +- **SQL_NEED_DATA** indicates that parameters provided before executing the SQL statement are insufficient. +- **SQL_ERROR** indicates major errors, such as memory allocation and connection failures. +- **SQL_INVALID_HANDLE** indicates that invalid handles were called. This value may also be returned by other APIs. +- **SQL_STILL_EXECUTING** indicates that the statement is being executed. +- **SQL_NO_DATA** indicates that the SQL statement does not return a result set. + +## Precautions + +If SQLExecDirect returns **SQL_ERROR** or **SQL_SUCCESS_WITH_INFO**, the application can call SQLGetDiagRec, with **HandleType** and **Handle** set to **SQL_HANDLE_STMT** and **StatementHandle**, respectively, to obtain the **SQLSTATE** value. The **SQLSTATE** value provides the detailed function calling information. + +## Example + +See ODBC - Examples. 
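In addition, a minimal usage sketch is shown below. It assumes that **hstmt** is a statement handle already allocated on an open connection, and it reuses the **customer_t1** table from the Examples section; the function name `exec_direct_demo` is illustrative only.

```c
/*
 * Minimal sketch: execute one statement directly on an existing statement
 * handle. Assumption: "hstmt" was allocated with SQLAllocHandle on an open
 * connection; the table and column names are illustrative.
 */
#include <stdio.h>
#include <sql.h>
#include <sqlext.h>

int exec_direct_demo(SQLHSTMT hstmt)
{
    SQLCHAR *sql = (SQLCHAR *)"UPDATE customer_t1 SET c_customer_name = 'grace' WHERE c_customer_sk = 25";
    SQLRETURN rc = SQLExecDirect(hstmt, sql, SQL_NTS);

    if (!SQL_SUCCEEDED(rc)) {
        SQLCHAR state[6], msg[256];
        SQLINTEGER nativeErr = 0;
        SQLSMALLINT msgLen = 0;
        /* On SQL_ERROR or SQL_SUCCESS_WITH_INFO, read the first diagnostic record. */
        SQLGetDiagRec(SQL_HANDLE_STMT, hstmt, 1, state, &nativeErr, msg, sizeof(msg), &msgLen);
        fprintf(stderr, "SQLExecDirect failed: SQLSTATE=%s, %s\n", (char *)state, (char *)msg);
        return -1;
    }
    return 0;
}
```

Because SQLExecDirect prepares and executes in a single round trip, it suits statements that run only once; statements that are re-executed with different parameters are usually better served by SQLPrepare and SQLExecute.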
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-11-SQLExecute.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-11-SQLExecute.md new file mode 100644 index 0000000000000000000000000000000000000000..41b4d385be31528f0ea803ed73afe3ad3bccec5d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-11-SQLExecute.md @@ -0,0 +1,44 @@ +--- +title: SQLExecute +summary: SQLExecute +author: Guo Huan +date: 2021-05-17 +--- + +# SQLExecute + +## Function + +SQLExecute is used to execute a prepared SQL statement using SQLPrepare. The statement is executed using the current value of any application variables that were bound to parameter markers by SQLBindParameter. + +## Prototype + +``` +SQLRETURN SQLExecute(SQLHSTMT StatementHandle); +``` + +## Parameter + +**Table 1** SQLExecute parameters + +| **Keyword** | **Parameter Description** | +| :-------------- | :------------------------------- | +| StatementHandle | Statement handle to be executed. | + +## Return Value + +- **SQL_SUCCESS** indicates that the call succeeded. +- **SQL_SUCCESS_WITH_INFO** indicates that some warning information is displayed. +- **SQL_NEED_DATA** indicates that parameters provided before executing the SQL statement are insufficient. +- **SQL_ERROR** indicates major errors, such as memory allocation and connection failures. +- **SQL_NO_DATA** indicates that the SQL statement does not return a result set. +- **SQL_INVALID_HANDLE** indicates that invalid handles were called. This value may also be returned by other APIs. +- **SQL_STILL_EXECUTING** indicates that the statement is being executed. + +## Precautions + +If SQLExecute returns **SQL_ERROR** or **SQL_SUCCESS_WITH_INFO**, the application can call SQLGetDiagRec, with **HandleType** and **Handle** set to **SQL_HANDLE_STMT** and **StatementHandle**, respectively, to obtain the **SQLSTATE** value. The **SQLSTATE** value provides the detailed function calling information. + +## Example + +See ODBC - Examples. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-12-SQLFetch.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-12-SQLFetch.md new file mode 100644 index 0000000000000000000000000000000000000000..519f9946a7af1fd6ec83970e2d00899934373689 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-12-SQLFetch.md @@ -0,0 +1,43 @@ +--- +title: SQLFetch +summary: SQLFetch +author: Guo Huan +date: 2021-05-17 +--- + +# SQLFetch + +## Function + +SQLFetch is used to advance the cursor to the next row of the result set and retrieve any bound columns. + +## Prototype + +``` +CSQLRETURN SQLFetch(SQLHSTMT StatementHandle); +``` + +## Parameter + +**Table 1** SQLFetch parameters + +| **Keyword** | **Parameter Description** | +| :-------------- | :---------------------------------------------- | +| StatementHandle | Statement handle, obtained from SQLAllocHandle. | + +## Return Value + +- **SQL_SUCCESS** indicates that the call succeeded. +- **SQL_SUCCESS_WITH_INFO** indicates that some warning information is displayed. +- **SQL_ERROR** indicates major errors, such as memory allocation and connection failures. +- **SQL_NO_DATA** indicates that the SQL statement does not return a result set. +- **SQL_INVALID_HANDLE** indicates that invalid handles were called. This value may also be returned by other APIs. 
+- **SQL_STILL_EXECUTING** indicates that the statement is being executed. + +## Precautions + +If SQLFetch returns **SQL_ERROR** or **SQL_SUCCESS_WITH_INFO**, the application can call SQLGetDiagRec, with **HandleType** and **Handle** set to **SQL_HANDLE_STMT** and **StatementHandle**, respectively, to obtain the **SQLSTATE** value. The **SQLSTATE** value provides the detailed function calling information. + +## Example + +See ODBC - Examples. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-13-SQLFreeStmt.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-13-SQLFreeStmt.md new file mode 100644 index 0000000000000000000000000000000000000000..7c2d3e6705c1e38511614fc63b7803781c5950d3 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-13-SQLFreeStmt.md @@ -0,0 +1,10 @@ +--- +title: SQLFreeStmt +summary: SQLFreeStmt +author: Guo Huan +date: 2021-05-17 +--- + +# SQLFreeStmt + +In ODBC 3.x, SQLFreeStmt (an ODBC 2.x function) was deprecated and replaced by SQLFreeHandle. For details, see SQLFreeHandle. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-14-SQLFreeConnect.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-14-SQLFreeConnect.md new file mode 100644 index 0000000000000000000000000000000000000000..1c097d143263533e1fc5771a5d123c2cf52556ae --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-14-SQLFreeConnect.md @@ -0,0 +1,10 @@ +--- +title: SQLFreeConnect +summary: SQLFreeConnect +author: Guo Huan +date: 2021-05-17 +--- + +# SQLFreeConnect + +In ODBC 3.x, SQLFreeConnect (an ODBC 2.x function) was deprecated and replaced by SQLFreeHandle. For details, see SQLFreeHandle. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-15-SQLFreeHandle.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-15-SQLFreeHandle.md new file mode 100644 index 0000000000000000000000000000000000000000..131a10cf2e7a29a0900056aa820c5a19bd67ec14 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-15-SQLFreeHandle.md @@ -0,0 +1,43 @@ +--- +title: SQLFreeHandle +summary: SQLFreeHandle +author: Guo Huan +date: 2021-05-17 +--- + +# SQLFreeHandle + +## Function + +SQLFreeHandle is used to release resources associated with a specific environment, connection, or statement handle. It replaces the ODBC 2.x functions: SQLFreeEnv, SQLFreeConnect, and SQLFreeStmt. + +## Prototype + +``` +SQLRETURN SQLFreeHandle(SQLSMALLINT HandleType, + SQLHANDLE Handle); +``` + +## Parameter + +**Table 1** SQLFreeHandle parameters + +| **Keyword** | **Parameter Description** | +| :---------- | :----------------------------------------------------------- | +| HandleType | Type of handle to be freed by SQLFreeHandle. The value must be one of the following:
- SQL_HANDLE_ENV
- SQL_HANDLE_DBC
- SQL_HANDLE_STMT
- SQL_HANDLE_DESC
If **HandleType** is not one of the preceding values, SQLFreeHandle returns **SQL_INVALID_HANDLE**. | +| Handle | Name of the handle to be freed. | + +## Return Value + +- **SQL_SUCCESS** indicates that the call succeeded. +- **SQL_SUCCESS_WITH_INFO** indicates that some warning information is displayed. +- **SQL_ERROR** indicates major errors, such as memory allocation and connection failures. +- **SQL_INVALID_HANDLE** indicates that invalid handles were called. This value may also be returned by other APIs. + +## Precautions + +If SQLFreeHandle returns **SQL_ERROR**, the handle is still valid. + +## Example + +See ODBC - Examples. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-16-SQLFreeEnv.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-16-SQLFreeEnv.md new file mode 100644 index 0000000000000000000000000000000000000000..3ff15ad8e72e67aef4df7d7b4fbadd22b31a2175 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-16-SQLFreeEnv.md @@ -0,0 +1,10 @@ +--- +title: SQLFreeEnv +summary: SQLFreeEnv +author: Guo Huan +date: 2021-05-17 +--- + +# SQLFreeEnv + +In ODBC 3.x, SQLFreeEnv (an ODBC 2.x function) was deprecated and replaced by SQLFreeHandle. For details, see SQLFreeHandle. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-17-SQLPrepare.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-17-SQLPrepare.md new file mode 100644 index 0000000000000000000000000000000000000000..e385634421da0b0f76f7984aad5c9fcb9f03ac12 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-17-SQLPrepare.md @@ -0,0 +1,46 @@ +--- +title: SQLPrepare +summary: SQLPrepare +author: Guo Huan +date: 2021-05-17 +--- + +# SQLPrepare + +## Function + +SQLPrepare is used to prepare an SQL statement to be executed. + +## Prototype + +``` +SQLRETURN SQLPrepare(SQLHSTMT StatementHandle, + SQLCHAR *StatementText, + SQLINTEGER TextLength); +``` + +## Parameter + +**Table 1** SQLPrepare parameters + +| **Keyword** | **Parameter Description** | +| :-------------- | :--------------------------- | +| StatementHandle | Statement handle. | +| StatementText | SQL text string. | +| TextLength | Length of **StatementText**. | + +## Return Value + +- **SQL_SUCCESS** indicates that the call succeeded. +- **SQL_SUCCESS_WITH_INFO** indicates that some warning information is displayed. +- **SQL_ERROR** indicates major errors, such as memory allocation and connection failures. +- **SQL_INVALID_HANDLE** indicates that invalid handles were called. This value may also be returned by other APIs. +- **SQL_STILL_EXECUTING** indicates that the statement is being executed. + +## Precautions + +If SQLPrepare returns **SQL_ERROR** or **SQL_SUCCESS_WITH_INFO**, the application can call SQLGetDiagRec, with **HandleType** and **Handle** set to **SQL_HANDLE_STMT** and **StatementHandle**, respectively, to obtain the **SQLSTATE** value. The **SQLSTATE** value provides the detailed function calling information. + +## Example + +See ODBC - Examples. 
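A minimal sketch of the prepare, bind, and execute pattern is shown below. It assumes that **hstmt** is a valid statement handle on an open connection and reuses the **customer_t1** table from the Examples section; `prepared_insert_demo` is an illustrative name.

```c
/*
 * Minimal sketch: prepare one INSERT and re-execute it with different
 * parameter values, which is the usual reason to choose SQLPrepare over
 * SQLExecDirect. Assumption: "hstmt" is valid and the table exists.
 */
#include <sql.h>
#include <sqlext.h>

SQLRETURN prepared_insert_demo(SQLHSTMT hstmt)
{
    SQLINTEGER id = 0;
    SQLRETURN rc = SQLPrepare(hstmt,
        (SQLCHAR *)"INSERT INTO customer_t1(c_customer_sk) VALUES(?)", SQL_NTS);
    if (!SQL_SUCCEEDED(rc)) {
        return rc;
    }

    /* Bind the single parameter marker to the local variable "id". */
    rc = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_SLONG, SQL_INTEGER,
                          0, 0, &id, 0, NULL);
    if (!SQL_SUCCEEDED(rc)) {
        return rc;
    }

    /* Execute the prepared statement three times with different values. */
    for (id = 1; id <= 3; id++) {
        rc = SQLExecute(hstmt);
        if (!SQL_SUCCEEDED(rc)) {
            return rc;
        }
    }
    return SQL_SUCCESS;
}
```

Preparing once and executing repeatedly avoids re-parsing the statement for every row, which is also the pattern used by the Examples section for parameterized inserts.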
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-18-SQLGetData.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-18-SQLGetData.md new file mode 100644 index 0000000000000000000000000000000000000000..2a88178b775fa03b9523be3a546edeb489d87326 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-18-SQLGetData.md @@ -0,0 +1,53 @@ +--- +title: SQLGetData +summary: SQLGetData +author: Guo Huan +date: 2021-05-17 +--- + +# SQLGetData + +## Function + +SQLGetData is used to retrieve data for a single column in the result set. It can be called for many times to retrieve data of variable lengths. + +## Prototype + +``` +SQLRETURN SQLGetData(SQLHSTMT StatementHandle, + SQLUSMALLINT Col_or_Param_Num, + SQLSMALLINT TargetType, + SQLPOINTER TargetValuePtr, + SQLLEN BufferLength, + SQLLEN *StrLen_or_IndPtr); +``` + +## Parameter + +**Table 1** SQLGetData parameters + +| **Keyword** | **Parameter Description** | +| :--------------- | :----------------------------------------------------------- | +| StatementHandle | Statement handle, obtained from SQLAllocHandle. | +| Col_or_Param_Num | Column number for which the data retrieval is requested. The column number starts with 1 and increases in ascending order. The number of the bookmark column is 0. | +| TargetType | C data type in the TargetValuePtr buffer. If **TargetType** is **SQL_ARD_TYPE**, the driver uses the data type of the **SQL_DESC_CONCISE_TYPE** field in ARD. If **TargetType** is **SQL_C_DEFAULT**, the driver selects a default data type according to the source SQL data type. | +| TargetValuePtr | **Output parameter**: pointer to the pointer that points to the buffer where the data is located. | +| BufferLength | Size of the buffer pointed to by **TargetValuePtr**. | +| StrLen_or_IndPtr | **Output parameter**: pointer to the buffer where the length or identifier value is returned. | + +## Return Value + +- **SQL_SUCCESS** indicates that the call succeeded. +- **SQL_SUCCESS_WITH_INFO** indicates that some warning information is displayed. +- **SQL_ERROR** indicates major errors, such as memory allocation and connection failures. +- **SQL_NO_DATA** indicates that the SQL statement does not return a result set. +- **SQL_INVALID_HANDLE** indicates that invalid handles were called. This value may also be returned by other APIs. +- **SQL_STILL_EXECUTING** indicates that the statement is being executed. + +## Precautions + +If SQLGetData returns **SQL_ERROR** or **SQL_SUCCESS_WITH_INFO**, the application can call SQLGetDiagRec, with **HandleType** and **Handle** set to **SQL_HANDLE_STMT** and **StatementHandle**, respectively, to obtain the **SQLSTATE** value. The **SQLSTATE** value provides the detailed function calling information. + +## Example + +See ODBC - Examples. 
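A minimal sketch of retrieving unbound column data is shown below. It assumes that **hstmt** already holds an open result set (for example, after SQLExecDirect on a SELECT statement); the function name and buffer size are illustrative.

```c
/*
 * Minimal sketch: walk a result set with SQLFetch and read column 1 as a
 * string with SQLGetData. Assumption: a SELECT has already been executed
 * on "hstmt".
 */
#include <stdio.h>
#include <sql.h>
#include <sqlext.h>

void fetch_and_get_data_demo(SQLHSTMT hstmt)
{
    SQLCHAR value[256];
    SQLLEN indicator = 0;
    SQLRETURN rc;

    while ((rc = SQLFetch(hstmt)) != SQL_NO_DATA) {
        if (!SQL_SUCCEEDED(rc)) {
            break;                              /* stop on fetch errors */
        }
        rc = SQLGetData(hstmt, 1, SQL_C_CHAR, value, sizeof(value), &indicator);
        if (!SQL_SUCCEEDED(rc)) {
            break;
        }
        if (indicator == SQL_NULL_DATA) {
            printf("column 1 = NULL\n");
        } else {
            printf("column 1 = %s\n", (char *)value);
        }
    }
    SQLCloseCursor(hstmt);   /* release the result set so the handle can be reused */
}
```

SQLGetData is convenient for long or variable-length values that are not bound in advance; for a fixed set of columns read on every row, SQLBindCol is usually simpler.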
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-19-SQLGetDiagRec.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-19-SQLGetDiagRec.md new file mode 100644 index 0000000000000000000000000000000000000000..328f677b0e64fa0290cda0b4e4ad9571f5f3b998 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-19-SQLGetDiagRec.md @@ -0,0 +1,74 @@ +--- +title: SQLGetDiagRec +summary: SQLGetDiagRec +author: Guo Huan +date: 2021-05-17 +--- + +# SQLGetDiagRec + +## Function + +SQLGetDiagRec is used to return the current values of multiple fields in a diagnostic record that contains error, warning, and status information. + +## Prototype + +``` +SQLRETURN SQLGetDiagRec(SQLSMALLINT HandleType + SQLHANDLE Handle, + SQLSMALLINT RecNumber, + SQLCHAR *SQLState, + SQLINTEGER *NativeErrorPtr, + SQLCHAR *MessageText, + SQLSMALLINT BufferLength + SQLSMALLINT *TextLengthPtr); +``` + +## Parameter + +**Table 1** SQLGetDiagRec parameters + +| **Keyword** | **Parameter Description** | +| :------------- | :----------------------------------------------------------- | +| HandleType | A handle-type identifier that describes the type of handle for which diagnostics are desired. The value must be one of the following:
- SQL_HANDLE_ENV
- SQL_HANDLE_DBC
- SQL_HANDLE_STMT
- SQL_HANDLE_DESC | +| Handle | A handle for the diagnostic data structure. Its type is indicated by **HandleType**. If **HandleType** is **SQL_HANDLE_ENV**, **Handle** may be a shared or non-shared environment handle. | +| RecNumber | Status record from which the application seeks information. **RecNumber** starts with 1. | +| SQLState | **Output parameter**: pointer to a buffer that saves the 5-character **SQLSTATE** code pertaining to **RecNumber**. | +| NativeErrorPtr | **Output parameter**: pointer to a buffer that saves the native error code. | +| MessageText | Pointer to a buffer that saves text strings of diagnostic information. | +| BufferLength | Length of **MessageText**. | +| TextLengthPtr | **Output parameter**: pointer to the buffer, the total number of bytes in the returned **MessageText**. If the number of bytes available to return is greater than **BufferLength**, then the diagnostics information text in **MessageText** is truncated to **BufferLength** minus the length of the null termination character. | + +## Return Value + +- **SQL_SUCCESS** indicates that the call succeeded. +- **SQL_SUCCESS_WITH_INFO** indicates that some warning information is displayed. +- **SQL_ERROR** indicates major errors, such as memory allocation and connection failures. +- **SQL_INVALID_HANDLE** indicates that invalid handles were called. This value may also be returned by other APIs. + +## Precautions + +SQLGetDiagRec does not release diagnostic records for itself. It uses the following return values to report execution results: + +- **SQL_SUCCESS** indicates that the function successfully returns diagnostic information. +- **SQL_SUCCESS_WITH_INFO** indicates that the **MessageText** buffer is too small to hold the requested diagnostic information. No diagnostic records are generated. +- **SQL_INVALID_HANDLE** indicates that the handle indicated by **HandType** and **Handle** is an invalid handle. +- **SQL_ERROR** indicates that **RecNumber** is less than or equal to 0 or that **BufferLength** is smaller than 0. + +If an ODBC function returns **SQL_ERROR** or **SQL_SUCCESS_WITH_INFO**, the application can call SQLGetDiagRec to obtain the **SQLSTATE** value. The possible **SQLSTATE** values are listed as follows: + +**Table 2** SQLSTATE values + +| SQLSATATE | Error | Description | +| :-------- | :------------------------------------ | :----------------------------------------------------------- | +| HY000 | General error. | An error occurred for which there is no specific SQLSTATE. | +| HY001 | Memory allocation error. | The driver is unable to allocate memory required to support execution or completion of the function. | +| HY008 | Operation canceled. | SQLCancel is called to terminate the statement execution, but the StatementHandle function is still called. | +| HY010 | Function sequence error. | The function is called prior to sending data to data parameters or columns being executed. | +| HY013 | Memory management error. | The function fails to be called. The error may be caused by low memory conditions. | +| HYT01 | Connection timeout. | The timeout period expired before the application was able to connect to the data source. | +| IM001 | Function not supported by the driver. | The called function is not supported by the StatementHandle driver. | + +## Example + +See ODBC - Examples. 
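A minimal error-reporting helper built on SQLGetDiagRec is sketched below. It is intended to be called immediately after another ODBC function returns **SQL_ERROR** or **SQL_SUCCESS_WITH_INFO**; it is an illustrative sketch, not the `CHECK_ERROR` helper referenced by the Examples section.

```c
/*
 * Minimal sketch: print every diagnostic record attached to a handle.
 * Records are numbered from 1; SQLGetDiagRec stops returning success once
 * RecNumber exceeds the number of available records.
 */
#include <stdio.h>
#include <sql.h>
#include <sqlext.h>

void print_diagnostics(SQLSMALLINT handleType, SQLHANDLE handle)
{
    SQLCHAR sqlState[6];
    SQLCHAR message[SQL_MAX_MESSAGE_LENGTH];
    SQLINTEGER nativeError = 0;
    SQLSMALLINT textLength = 0;
    SQLSMALLINT recNumber = 1;

    while (SQL_SUCCEEDED(SQLGetDiagRec(handleType, handle, recNumber, sqlState,
                                       &nativeError, message, sizeof(message),
                                       &textLength))) {
        fprintf(stderr, "[%d] SQLSTATE=%s native=%d: %s\n",
                recNumber, (char *)sqlState, (int)nativeError, (char *)message);
        recNumber++;
    }
}
```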
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-2-SQLAllocConnect.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-2-SQLAllocConnect.md new file mode 100644 index 0000000000000000000000000000000000000000..3e9dd4f93a6c3df229c0dcbf5609f704bb1ba09c --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-2-SQLAllocConnect.md @@ -0,0 +1,10 @@ +--- +title: SQLAllocConnect +summary: SQLAllocConnect +author: Guo Huan +date: 2021-05-17 +--- + +# SQLAllocConnect + +In ODBC 3.x, SQLAllocConnect (an ODBC 2.x function) was deprecated and replaced by SQLAllocHandle. For details, see SQLAllocHandle. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-20-SQLSetConnectAttr.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-20-SQLSetConnectAttr.md new file mode 100644 index 0000000000000000000000000000000000000000..a2d60a9112fd788b6c6b8896d09edff6dc1fab42 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-20-SQLSetConnectAttr.md @@ -0,0 +1,47 @@ +--- +title: SQLSetConnectAttr +summary: SQLSetConnectAttr +author: Guo Huan +date: 2021-05-17 +--- + +# SQLSetConnectAttr + +## Function + +SQLSetConnectAttr is used to set connection attributes. + +## Prototype + +``` +SQLRETURN SQLSetConnectAttr(SQLHDBC ConnectionHandle + SQLINTEGER Attribute, + SQLPOINTER ValuePtr, + SQLINTEGER StringLength); +``` + +## Parameter + +**Table 1** SQLSetConnectAttr parameters + +| **Keyword** | **Parameter Description** | +| :--------------- | :----------------------------------------------------------- | +| ConnectionHandle | Connection handle. | +| Attribute | Attribute to set. | +| ValuePtr | Pointer to the **Attribute** value. **ValuePtr** depends on the **Attribute** value, and can be a 32-bit unsigned integer value or a null-terminated string. If the **ValuePtr** parameter is a driver-specific value, it may be a signed integer. | +| StringLength | If **ValuePtr** points to a string or a binary buffer, **StringLength** is the length of ***ValuePtr**. If **ValuePtr** points to an integer, **StringLength** is ignored. | + +## Return Value + +- **SQL_SUCCESS** indicates that the call succeeded. +- **SQL_SUCCESS_WITH_INFO** indicates that some warning information is displayed. +- **SQL_ERROR** indicates major errors, such as memory allocation and connection failures. +- **SQL_INVALID_HANDLE** indicates that invalid handles were called. This value may also be returned by other APIs. + +## Precautions + +If SQLSetConnectAttr returns **SQL_ERROR** or **SQL_SUCCESS_WITH_INFO**, the application can call SQLGetDiagRec, with **HandleType** and **Handle** set to **SQL_HANDLE_DBC** and **ConnectionHandle**, respectively, to obtain the **SQLSTATE** value. The **SQLSTATE** value provides the detailed function calling information. + +## Example + +See ODBC - Examples. 
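A minimal sketch is shown below. It assumes that **hdbc** was allocated with SQLAllocHandle(SQL_HANDLE_DBC, …) and is not yet connected; the attribute values (a 5-second login timeout and autocommit enabled) mirror the settings used in the Examples section, and the function name is illustrative.

```c
/*
 * Minimal sketch: set two integer-valued connection attributes. For integer
 * attributes the value itself is cast to SQLPOINTER, so StringLength is
 * ignored. Assumption: "hdbc" is allocated but not yet connected.
 */
#include <sql.h>
#include <sqlext.h>

SQLRETURN configure_connection_demo(SQLHDBC hdbc)
{
    /* Fail the connection attempt if it cannot be established within 5 seconds. */
    SQLRETURN rc = SQLSetConnectAttr(hdbc, SQL_ATTR_LOGIN_TIMEOUT, (SQLPOINTER)(SQLULEN)5, 0);
    if (!SQL_SUCCEEDED(rc)) {
        return rc;
    }
    /* Let every statement commit automatically. */
    return SQLSetConnectAttr(hdbc, SQL_ATTR_AUTOCOMMIT, (SQLPOINTER)SQL_AUTOCOMMIT_ON, 0);
}
```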
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-21-SQLSetEnvAttr.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-21-SQLSetEnvAttr.md new file mode 100644 index 0000000000000000000000000000000000000000..6e5494f5055ed4e0633227a2fcdc037fb36c1cf0 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-21-SQLSetEnvAttr.md @@ -0,0 +1,47 @@ +--- +title: SQLSetEnvAttr +summary: SQLSetEnvAttr +author: Guo Huan +date: 2021-05-17 +--- + +# SQLSetEnvAttr + +## Function + +SQLSetEnvAttr is used to set environment attributes. + +## Prototype + +``` +SQLRETURN SQLSetEnvAttr(SQLHENV EnvironmentHandle + SQLINTEGER Attribute, + SQLPOINTER ValuePtr, + SQLINTEGER StringLength); +``` + +## Parameter + +**Table 1** SQLSetEnvAttr parameters + +| **Keyword** | **Parameter Description** | +| :---------------- | :----------------------------------------------------------- | +| EnvironmentHandle | Environment handle. | +| Attribute | Environment attribute to be set. The value must be one of the following:
- **SQL_ATTR_ODBC_VERSION**: ODBC version
- **SQL_ATTR_CONNECTION_POOLING**: connection pooling attribute
- **SQL_OUTPUT_NTS**: string type returned by the driver | +| ValuePtr | Pointer to the **Attribute** value. **ValuePtr** depends on the **Attribute** value, and can be a 32-bit integer value or a null-terminated string. | +| StringLength | If **ValuePtr** points to a string or a binary buffer, **StringLength** is the length of ***ValuePtr**. If **ValuePtr** points to an integer, **StringLength** is ignored. | + +## Return Value + +- **SQL_SUCCESS** indicates that the call succeeded. +- **SQL_SUCCESS_WITH_INFO** indicates that some warning information is displayed. +- **SQL_ERROR** indicates major errors, such as memory allocation and connection failures. +- **SQL_INVALID_HANDLE** indicates that invalid handles were called. This value may also be returned by other APIs. + +## Precautions + +If SQLSetEnvAttr returns **SQL_ERROR** or **SQL_SUCCESS_WITH_INFO**, the application can call SQLGetDiagRec, set **HandleType** and **Handle** to **SQL_HANDLE_ENV** and **EnvironmentHandle**, and obtain the **SQLSTATE** value. The **SQLSTATE** value provides the detailed function calling information. + +## Example + +See ODBC - Examples. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-22-SQLSetStmtAttr.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-22-SQLSetStmtAttr.md new file mode 100644 index 0000000000000000000000000000000000000000..1b1d2d6cf2f2583a7b7912999c3cb5f617e3c833 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-22-SQLSetStmtAttr.md @@ -0,0 +1,47 @@ +--- +title: SQLSetStmtAttr +summary: SQLSetStmtAttr +author: Guo Huan +date: 2021-05-17 +--- + +# SQLSetStmtAttr + +## Function + +SQLSetStmtAttr is used to set attributes related to a statement. + +## Prototype + +``` +SQLRETURN SQLSetStmtAttr(SQLHSTMT StatementHandle + SQLINTEGER Attribute, + SQLPOINTER ValuePtr, + SQLINTEGER StringLength); +``` + +## Parameter + +**Table 1** SQLSetStmtAttr parameters + +| **Keyword** | **Parameter Description** | +| :-------------- | :----------------------------------------------------------- | +| StatementHandle | Statement handle. | +| Attribute | Attribute to set. | +| ValuePtr | Pointer to the **Attribute** value. **ValuePtr** depends on the **Attribute** value, and can be a 32-bit unsigned integer value or a pointer to a null-terminated string, a binary buffer, or a driver-specified value. If the **ValuePtr** parameter is a driver-specific value, it may be a signed integer. | +| StringLength | If **ValuePtr** points to a string or a binary buffer, **StringLength** is the length of ***ValuePtr**. If **ValuePtr** points to an integer, **StringLength** is ignored. | + +## Return Value + +- **SQL_SUCCESS** indicates that the call succeeded. +- **SQL_SUCCESS_WITH_INFO** indicates that some warning information is displayed. +- **SQL_ERROR** indicates major errors, such as memory allocation and connection failures. +- **SQL_INVALID_HANDLE** indicates that invalid handles were called. This value may also be returned by other APIs. + +## Precautions + +If SQLSetStmtAttr returns **SQL_ERROR** or **SQL_SUCCESS_WITH_INFO**, the application can call SQLGetDiagRec, with **HandleType** and **Handle** set to **SQL_HANDLE_STMT** and **StatementHandle**, respectively, to obtain the **SQLSTATE** value. The **SQLSTATE** value provides the detailed function calling information. + +## Example + +See ODBC - Examples. 
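The sketch below sets one environment attribute and one statement attribute. It assumes a freshly allocated environment handle and a valid statement handle; `set_attr_demo` is an illustrative name. For integer-valued attributes, the value itself is cast to SQLPOINTER and **StringLength** is ignored.

```c
/*
 * Minimal sketch: declare ODBC 3.x behavior on the environment handle and
 * set a query timeout on a statement handle. Assumption: "henv" has just
 * been allocated and "hstmt" is valid.
 */
#include <sql.h>
#include <sqlext.h>

SQLRETURN set_attr_demo(SQLHENV henv, SQLHSTMT hstmt)
{
    /* Declare ODBC 3.x behavior before allocating any connection handle. */
    SQLRETURN rc = SQLSetEnvAttr(henv, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, 0);
    if (!SQL_SUCCEEDED(rc)) {
        return rc;
    }
    /* Ask the driver to time out queries on this handle after 10 seconds. */
    return SQLSetStmtAttr(hstmt, SQL_ATTR_QUERY_TIMEOUT, (SQLPOINTER)(SQLULEN)10, 0);
}
```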
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-23-Examples.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-23-Examples.md new file mode 100644 index 0000000000000000000000000000000000000000..ef42c9cba275467f59dce375514878ec989f1ad3 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-23-Examples.md @@ -0,0 +1,345 @@ +--- +title: Examples +summary: Examples +author: Guo Huan +date: 2021-05-17 +--- + +# Examples + +## Code for Common Functions + +```c +// The following example shows how to obtain data from MogDB through the ODBC interface. +// DBtest.c (compile with: libodbc.so) +#include +#include +#include +#ifdef WIN32 +#include +#endif +SQLHENV V_OD_Env; // Handle ODBC environment +SQLHSTMT V_OD_hstmt; // Handle statement +SQLHDBC V_OD_hdbc; // Handle connection +char typename[100]; +SQLINTEGER value = 100; +SQLINTEGER V_OD_erg,V_OD_buffer,V_OD_err,V_OD_id; +int main(int argc,char *argv[]) +{ + // 1. Allocate an environment handle. + V_OD_erg = SQLAllocHandle(SQL_HANDLE_ENV,SQL_NULL_HANDLE,&V_OD_Env); + if ((V_OD_erg != SQL_SUCCESS) && (V_OD_erg != SQL_SUCCESS_WITH_INFO)) + { + printf("Error AllocHandle\n"); + exit(0); + } + // 2. Set environment attributes (version information). + SQLSetEnvAttr(V_OD_Env, SQL_ATTR_ODBC_VERSION, (void*)SQL_OV_ODBC3, 0); + // 3. Allocate a connection handle. + V_OD_erg = SQLAllocHandle(SQL_HANDLE_DBC, V_OD_Env, &V_OD_hdbc); + if ((V_OD_erg != SQL_SUCCESS) && (V_OD_erg != SQL_SUCCESS_WITH_INFO)) + { + SQLFreeHandle(SQL_HANDLE_ENV, V_OD_Env); + exit(0); + } + // 4. Set connection attributes. + SQLSetConnectAttr(V_OD_hdbc, SQL_ATTR_AUTOCOMMIT, SQL_AUTOCOMMIT_ON, 0); + // 5. Connect to the data source. userName and password indicate the username and password for connecting to the database. Set them as needed. + // If the username and password have been set in the odbc.ini file, you do not need to set userName or password here, retaining "" for them. However, you are not advised to do so because the username and password will be disclosed if the permission for odbc.ini is abused. + V_OD_erg = SQLConnect(V_OD_hdbc, (SQLCHAR*) "gaussdb", SQL_NTS, + (SQLCHAR*) "userName", SQL_NTS, (SQLCHAR*) "password", SQL_NTS); + if ((V_OD_erg != SQL_SUCCESS) && (V_OD_erg != SQL_SUCCESS_WITH_INFO)) + { + printf("Error SQLConnect %d\n",V_OD_erg); + SQLFreeHandle(SQL_HANDLE_ENV, V_OD_Env); + exit(0); + } + printf("Connected !\n"); + // 6. Set statement attributes. + SQLSetStmtAttr(V_OD_hstmt,SQL_ATTR_QUERY_TIMEOUT,(SQLPOINTER *)3,0); + // 7. Allocate a statement handle. + SQLAllocHandle(SQL_HANDLE_STMT, V_OD_hdbc, &V_OD_hstmt); + // 8. Run SQL statements. + SQLExecDirect(V_OD_hstmt,"drop table IF EXISTS customer_t1",SQL_NTS); + SQLExecDirect(V_OD_hstmt,"CREATE TABLE customer_t1(c_customer_sk INTEGER, c_customer_name VARCHAR(32));",SQL_NTS); + SQLExecDirect(V_OD_hstmt,"insert into customer_t1 values(25,li)",SQL_NTS); + // 9. Prepare for execution. + SQLPrepare(V_OD_hstmt,"insert into customer_t1 values(?)",SQL_NTS); + // 10. Bind parameters. + SQLBindParameter(V_OD_hstmt,1,SQL_PARAM_INPUT,SQL_C_SLONG,SQL_INTEGER,0,0, + &value,0,NULL); + // 11. Run prepared statements. + SQLExecute(V_OD_hstmt); + SQLExecDirect(V_OD_hstmt,"select id from testtable",SQL_NTS); + // 12. Obtain attributes of a specific column in the result set. + SQLColAttribute(V_OD_hstmt,1,SQL_DESC_TYPE,typename,100,NULL,NULL); + printf("SQLColAtrribute %s\n",typename); + // 13. 
Bind the result set. + SQLBindCol(V_OD_hstmt,1,SQL_C_SLONG, (SQLPOINTER)&V_OD_buffer,150, + (SQLLEN *)&V_OD_err); + // 14. Obtain data in the result set by executing SQLFetch. + V_OD_erg=SQLFetch(V_OD_hstmt); + // 15. Obtain and return data by executing SQLGetData. + while(V_OD_erg != SQL_NO_DATA) + { + SQLGetData(V_OD_hstmt,1,SQL_C_SLONG,(SQLPOINTER)&V_OD_id,0,NULL); + printf("SQLGetData ----ID = %d\n",V_OD_id); + V_OD_erg=SQLFetch(V_OD_hstmt); + }; + printf("Done !\n"); + // 16. Disconnect data source connections and release handles. + SQLFreeHandle(SQL_HANDLE_STMT,V_OD_hstmt); + SQLDisconnect(V_OD_hdbc); + SQLFreeHandle(SQL_HANDLE_DBC,V_OD_hdbc); + SQLFreeHandle(SQL_HANDLE_ENV, V_OD_Env); + return(0); + } +``` + +## Code for Batch Processing + +```c +/********************************************************************** +*Set UseBatchProtocol to 1 in the data source and set the database parameter support_batch_bind +*to on. +*The CHECK_ERROR command is used to check and print error information. +*This example is used to interactively obtain the DSN, data volume to be processed, and volume of ignored data from users, and insert required data into the test_odbc_batch_insert table. +***********************************************************************/ +#include +#include +#include +#include +#include + +#include "util.c" + +void Exec(SQLHDBC hdbc, SQLCHAR* sql) +{ + SQLRETURN retcode; // Return status + SQLHSTMT hstmt = SQL_NULL_HSTMT; // Statement handle + SQLCHAR loginfo[2048]; + + // Allocate Statement Handle + retcode = SQLAllocHandle(SQL_HANDLE_STMT, hdbc, &hstmt); + CHECK_ERROR(retcode, "SQLAllocHandle(SQL_HANDLE_STMT)", + hstmt, SQL_HANDLE_STMT); + + // Prepare Statement + retcode = SQLPrepare(hstmt, (SQLCHAR*) sql, SQL_NTS); + sprintf((char*)loginfo, "SQLPrepare log: %s", (char*)sql); + CHECK_ERROR(retcode, loginfo, hstmt, SQL_HANDLE_STMT); + + // Execute Statement + retcode = SQLExecute(hstmt); + sprintf((char*)loginfo, "SQLExecute stmt log: %s", (char*)sql); + CHECK_ERROR(retcode, loginfo, hstmt, SQL_HANDLE_STMT); + + // Free Handle + retcode = SQLFreeHandle(SQL_HANDLE_STMT, hstmt); + sprintf((char*)loginfo, "SQLFreeHandle stmt log: %s", (char*)sql); + CHECK_ERROR(retcode, loginfo, hstmt, SQL_HANDLE_STMT); +} + +int main () +{ + SQLHENV henv = SQL_NULL_HENV; + SQLHDBC hdbc = SQL_NULL_HDBC; + int batchCount = 1000; + SQLLEN rowsCount = 0; + int ignoreCount = 0; + + SQLRETURN retcode; + SQLCHAR dsn[1024] = {'\0'}; + SQLCHAR loginfo[2048]; + + // Interactively obtain data source names. + getStr("Please input your DSN", (char*)dsn, sizeof(dsn), 'N'); + // Interactively obtain the volume of data to be batch processed. + getInt("batchCount", &batchCount, 'N', 1); + do + { + // Interactively obtain the volume of batch processing data that is not inserted into the database. 
+ getInt("ignoreCount", &ignoreCount, 'N', 1); + if (ignoreCount > batchCount) + { + printf("ignoreCount(%d) should be less than batchCount(%d)\n", ignoreCount, batchCount); + } + }while(ignoreCount > batchCount); + + retcode = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &henv); + CHECK_ERROR(retcode, "SQLAllocHandle(SQL_HANDLE_ENV)", + henv, SQL_HANDLE_ENV); + + // Set ODBC Verion + retcode = SQLSetEnvAttr(henv, SQL_ATTR_ODBC_VERSION, + (SQLPOINTER*)SQL_OV_ODBC3, 0); + CHECK_ERROR(retcode, "SQLSetEnvAttr(SQL_ATTR_ODBC_VERSION)", + henv, SQL_HANDLE_ENV); + + // Allocate Connection + retcode = SQLAllocHandle(SQL_HANDLE_DBC, henv, &hdbc); + CHECK_ERROR(retcode, "SQLAllocHandle(SQL_HANDLE_DBC)", + henv, SQL_HANDLE_DBC); + + // Set Login Timeout + retcode = SQLSetConnectAttr(hdbc, SQL_LOGIN_TIMEOUT, (SQLPOINTER)5, 0); + CHECK_ERROR(retcode, "SQLSetConnectAttr(SQL_LOGIN_TIMEOUT)", + hdbc, SQL_HANDLE_DBC); + + // Set Auto Commit + retcode = SQLSetConnectAttr(hdbc, SQL_ATTR_AUTOCOMMIT, + (SQLPOINTER)(1), 0); + CHECK_ERROR(retcode, "SQLSetConnectAttr(SQL_ATTR_AUTOCOMMIT)", + hdbc, SQL_HANDLE_DBC); + + // Connect to DSN + sprintf(loginfo, "SQLConnect(DSN:%s)", dsn); + retcode = SQLConnect(hdbc, (SQLCHAR*) dsn, SQL_NTS, + (SQLCHAR*) NULL, 0, NULL, 0); + CHECK_ERROR(retcode, loginfo, hdbc, SQL_HANDLE_DBC); + + // init table info. + Exec(hdbc, "drop table if exists test_odbc_batch_insert"); + Exec(hdbc, "create table test_odbc_batch_insert(id int primary key, col varchar2(50))"); + + // The following code constructs the data to be inserted based on the data volume entered by users: + { + SQLRETURN retcode; + SQLHSTMT hstmtinesrt = SQL_NULL_HSTMT; + int i; + SQLCHAR *sql = NULL; + SQLINTEGER *ids = NULL; + SQLCHAR *cols = NULL; + SQLLEN *bufLenIds = NULL; + SQLLEN *bufLenCols = NULL; + SQLUSMALLINT *operptr = NULL; + SQLUSMALLINT *statusptr = NULL; + SQLULEN process = 0; + + // Data is constructed by column. Each column is stored continuously. + ids = (SQLINTEGER*)malloc(sizeof(ids[0]) * batchCount); + cols = (SQLCHAR*)malloc(sizeof(cols[0]) * batchCount * 50); + // Data size in each row for a column + bufLenIds = (SQLLEN*)malloc(sizeof(bufLenIds[0]) * batchCount); + bufLenCols = (SQLLEN*)malloc(sizeof(bufLenCols[0]) * batchCount); + // Whether this row needs to be processed. The value is SQL_PARAM_IGNORE or SQL_PARAM_PROCEED. + operptr = (SQLUSMALLINT*)malloc(sizeof(operptr[0]) * batchCount); + memset(operptr, 0, sizeof(operptr[0]) * batchCount); + // Processing result of the row + // Note: In the database, a statement belongs to one transaction. Therefore, data is processed as a unit. Either all data is inserted successfully or all data fails to be inserted. + statusptr = (SQLUSMALLINT*)malloc(sizeof(statusptr[0]) * batchCount); + memset(statusptr, 88, sizeof(statusptr[0]) * batchCount); + + if (NULL == ids || NULL == cols || NULL == bufLenCols || NULL == bufLenIds) + { + fprintf(stderr, "FAILED:\tmalloc data memory failed\n"); + goto exit; + } + + for (int i = 0; i < batchCount; i++) + { + ids[i] = i; + sprintf(cols + 50 * i, "column test value %d", i); + bufLenIds[i] = sizeof(ids[i]); + bufLenCols[i] = strlen(cols + 50 * i); + operptr[i] = (i < ignoreCount) ? 
SQL_PARAM_IGNORE : SQL_PARAM_PROCEED; + } + + // Allocate Statement Handle + retcode = SQLAllocHandle(SQL_HANDLE_STMT, hdbc, &hstmtinesrt); + CHECK_ERROR(retcode, "SQLAllocHandle(SQL_HANDLE_STMT)", + hstmtinesrt, SQL_HANDLE_STMT); + + // Prepare Statement + sql = (SQLCHAR*)"insert into test_odbc_batch_insert values(?, ?)"; + retcode = SQLPrepare(hstmtinesrt, (SQLCHAR*) sql, SQL_NTS); + sprintf((char*)loginfo, "SQLPrepare log: %s", (char*)sql); + CHECK_ERROR(retcode, loginfo, hstmtinesrt, SQL_HANDLE_STMT); + + retcode = SQLSetStmtAttr(hstmtinesrt, SQL_ATTR_PARAMSET_SIZE, (SQLPOINTER)batchCount, sizeof(batchCount)); + CHECK_ERROR(retcode, "SQLSetStmtAttr", hstmtinesrt, SQL_HANDLE_STMT); + + retcode = SQLBindParameter(hstmtinesrt, 1, SQL_PARAM_INPUT, SQL_C_SLONG, SQL_INTEGER, sizeof(ids[0]), 0,&(ids[0]), 0, bufLenIds); + CHECK_ERROR(retcode, "SQLBindParameter for id", hstmtinesrt, SQL_HANDLE_STMT); + + retcode = SQLBindParameter(hstmtinesrt, 2, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_CHAR, 50, 50, cols, 50, bufLenCols); + CHECK_ERROR(retcode, "SQLBindParameter for cols", hstmtinesrt, SQL_HANDLE_STMT); + + retcode = SQLSetStmtAttr(hstmtinesrt, SQL_ATTR_PARAMS_PROCESSED_PTR, (SQLPOINTER)&process, sizeof(process)); + CHECK_ERROR(retcode, "SQLSetStmtAttr for SQL_ATTR_PARAMS_PROCESSED_PTR", hstmtinesrt, SQL_HANDLE_STMT); + + retcode = SQLSetStmtAttr(hstmtinesrt, SQL_ATTR_PARAM_STATUS_PTR, (SQLPOINTER)statusptr, sizeof(statusptr[0]) * batchCount); + CHECK_ERROR(retcode, "SQLSetStmtAttr for SQL_ATTR_PARAM_STATUS_PTR", hstmtinesrt, SQL_HANDLE_STMT); + + retcode = SQLSetStmtAttr(hstmtinesrt, SQL_ATTR_PARAM_OPERATION_PTR, (SQLPOINTER)operptr, sizeof(operptr[0]) * batchCount); + CHECK_ERROR(retcode, "SQLSetStmtAttr for SQL_ATTR_PARAM_OPERATION_PTR", hstmtinesrt, SQL_HANDLE_STMT); + + retcode = SQLExecute(hstmtinesrt); + sprintf((char*)loginfo, "SQLExecute stmt log: %s", (char*)sql); + CHECK_ERROR(retcode, loginfo, hstmtinesrt, SQL_HANDLE_STMT); + + retcode = SQLRowCount(hstmtinesrt, &rowsCount); + CHECK_ERROR(retcode, "SQLRowCount execution", hstmtinesrt, SQL_HANDLE_STMT); + + if (rowsCount != (batchCount - ignoreCount)) + { + sprintf(loginfo, "(batchCount - ignoreCount)(%d) != rowsCount(%d)", (batchCount - ignoreCount), rowsCount); + CHECK_ERROR(SQL_ERROR, loginfo, NULL, SQL_HANDLE_STMT); + } + else + { + sprintf(loginfo, "(batchCount - ignoreCount)(%d) == rowsCount(%d)", (batchCount - ignoreCount), rowsCount); + CHECK_ERROR(SQL_SUCCESS, loginfo, NULL, SQL_HANDLE_STMT); + } + + // check row number returned + if (rowsCount != process) + { + sprintf(loginfo, "process(%d) != rowsCount(%d)", process, rowsCount); + CHECK_ERROR(SQL_ERROR, loginfo, NULL, SQL_HANDLE_STMT); + } + else + { + sprintf(loginfo, "process(%d) == rowsCount(%d)", process, rowsCount); + CHECK_ERROR(SQL_SUCCESS, loginfo, NULL, SQL_HANDLE_STMT); + } + + for (int i = 0; i < batchCount; i++) + { + if (i < ignoreCount) + { + if (statusptr[i] != SQL_PARAM_UNUSED) + { + sprintf(loginfo, "statusptr[%d](%d) != SQL_PARAM_UNUSED", i, statusptr[i]); + CHECK_ERROR(SQL_ERROR, loginfo, NULL, SQL_HANDLE_STMT); + } + } + else if (statusptr[i] != SQL_PARAM_SUCCESS) + { + sprintf(loginfo, "statusptr[%d](%d) != SQL_PARAM_SUCCESS", i, statusptr[i]); + CHECK_ERROR(SQL_ERROR, loginfo, NULL, SQL_HANDLE_STMT); + } + } + + retcode = SQLFreeHandle(SQL_HANDLE_STMT, hstmtinesrt); + sprintf((char*)loginfo, "SQLFreeHandle hstmtinesrt"); + CHECK_ERROR(retcode, loginfo, hstmtinesrt, SQL_HANDLE_STMT); + } + + +exit: + printf ("\nComplete.\n"); + + // Connection + if (hdbc != 
SQL_NULL_HDBC) { + SQLDisconnect(hdbc); + SQLFreeHandle(SQL_HANDLE_DBC, hdbc); + } + + // Environment + if (henv != SQL_NULL_HENV) + SQLFreeHandle(SQL_HANDLE_ENV, henv); + + return 0; +} +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-3-SQLAllocHandle.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-3-SQLAllocHandle.md new file mode 100644 index 0000000000000000000000000000000000000000..7f5e24d7016f62f43891e936990a396a2fa3c2c0 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-3-SQLAllocHandle.md @@ -0,0 +1,45 @@ +--- +title: SQLAllocHandle +summary: SQLAllocHandle +author: Guo Huan +date: 2021-05-17 +--- + +# SQLAllocHandle + +## Function + +SQLAllocHandle is used to allocate environment, connection, statement, or descriptor handles. This function replaces the deprecated ODBC 2.x functions SQLAllocEnv, SQLAllocConnect, and SQLAllocStmt. + +## Prototype + +``` +SQLRETURN SQLAllocHandle(SQLSMALLINT HandleType, + SQLHANDLE InputHandle, + SQLHANDLE *OutputHandlePtr); +``` + +## Parameter + +**Table 1** SQLAllocHandle parameters + +| **Keyword** | **Parameter Description** | +| :-------------- | :----------------------------------------------------------- | +| HandleType | Type of handle to be allocated by SQLAllocHandle. The value must be one of the following:
- SQL_HANDLE_ENV (environment handle)
- SQL_HANDLE_DBC (connection handle)
- SQL_HANDLE_STMT (statement handle)
- SQL_HANDLE_DESC (descriptor handle)
Handles must be allocated in the following order: **SQL_HANDLE_ENV** > **SQL_HANDLE_DBC** > **SQL_HANDLE_STMT**. Each handle depends on the handle allocated before it. | +| InputHandle | Existing handle to use as a context for the new handle being allocated.
- If **HandleType** is **SQL_HANDLE_ENV**, this parameter must be set to **SQL_NULL_HANDLE**.
- If **HandleType** is **SQL_HANDLE_DBC**, this parameter value must be an environment handle.
- If **HandleType** is **SQL_HANDLE_STMT** or **SQL_HANDLE_DESC**, this parameter value must be a connection handle. | +| OutputHandlePtr | **Output parameter**: Pointer to a buffer that stores the returned handle in the newly allocated data structure. | + +## Return Value + +- **SQL_SUCCESS** indicates that the call succeeded. +- **SQL_SUCCESS_WITH_INFO** indicates that some warning information is displayed. +- **SQL_ERROR** indicates major errors, such as memory allocation and connection failures. +- **SQL_INVALID_HANDLE** indicates that invalid handles were called. This value may also be returned by other APIs. + +## Precautions + +If SQLAllocHandle returns **SQL_ERROR** when it is used to allocate a non-environment handle, it sets **OutputHandlePtr** to **SQL_NULL_HDBC**, **SQL_NULL_HSTMT**, or **SQL_NULL_HDESC**. The application can then call SQLGetDiagRec, with **HandleType** and **Handle** set to the value of **IntputHandle**, to obtain the **SQLSTATE** value. The **SQLSTATE** value provides the detailed function calling information. + +## Example + +See ODBC - Examples. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-4-SQLAllocStmt.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-4-SQLAllocStmt.md new file mode 100644 index 0000000000000000000000000000000000000000..b36225e751ec8cfe330b76cdc829b19217aef31b --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-4-SQLAllocStmt.md @@ -0,0 +1,10 @@ +--- +title: SQLAllocStmt +summary: SQLAllocStmt +author: Guo Huan +date: 2021-05-17 +--- + +# SQLAllocStmt + +In ODBC 3.x, SQLAllocStmt was deprecated and replaced by SQLAllocHandle. For details, see SQLAllocHandle. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-5-SQLBindCol.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-5-SQLBindCol.md new file mode 100644 index 0000000000000000000000000000000000000000..2e7ab5ce1c4e61b74145386300a9fc53ddf6b077 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-5-SQLBindCol.md @@ -0,0 +1,51 @@ +--- +title: SQLBindCol +summary: SQLBindCol +author: Guo Huan +date: 2021-05-17 +--- + +# SQLBindCol + +## Function + +SQLBindCol is used to bind columns in a result set to an application data buffer. + +## Prototype + +``` +SQLRETURN SQLBindCol(SQLHSTMT StatementHandle, + SQLUSMALLINT ColumnNumber, + SQLSMALLINT TargetType, + SQLPOINTER TargetValuePtr, + SQLLEN BufferLength, + SQLLEN *StrLen_or_IndPtr); +``` + +## Parameters + +**Table 1** SQLBindCol parameters + +| **Keyword** | **Parameter Description** | +| :--------------- | :----------------------------------------------------------- | +| StatementHandle | Statement handle. | +| ColumnNumber | Number of the column to be bound. The column number starts with 0 and increases in ascending order. Column 0 is the bookmark column. If no bookmark column is set, column numbers start with 1. | +| TargetType | C data type in the buffer. | +| TargetValuePtr | **Output parameter**: pointer to the buffer bound with the column. The SQLFetch function returns data in the buffer. If **TargetValuePtr** is null, **StrLen_or_IndPtr** is a valid value. | +| BufferLength | Length of the **TargetValuePtr** buffer in bytes. | +| StrLen_or_IndPtr | **Output parameter**: pointer to the length or indicator of the buffer. 
If **StrLen_or_IndPtr** is null, no length or indicator is used. | + +## Return Value + +- **SQL_SUCCESS** indicates that the call succeeded. +- **SQL_SUCCESS_WITH_INFO** indicates that some warning information is displayed. +- **SQL_ERROR** indicates major errors, such as memory allocation and connection failures. +- **SQL_INVALID_HANDLE** indicates that invalid handles were called. This value may also be returned by other APIs. + +## Precautions + +If SQLBindCol returns **SQL_ERROR** or **SQL_SUCCESS_WITH_INFO**, the application can call SQLGetDiagRec, with **HandleType** and **Handle** set to **SQL_HANDLE_STMT** and **StatementHandle**, respectively, to obtain the **SQLSTATE** value. The **SQLSTATE** value provides the detailed function calling information. + +## Example + +See ODBC - Examples. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-6-SQLBindParameter.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-6-SQLBindParameter.md new file mode 100644 index 0000000000000000000000000000000000000000..0278e069a0c00f4948b8721c6c685cc3fde788a6 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-6-SQLBindParameter.md @@ -0,0 +1,59 @@ +--- +title: SQLBindParameter +summary: SQLBindParameter +author: Guo Huan +date: 2021-05-17 +--- + +# SQLBindParameter + +## Function + +SQLBindParameter is used to bind parameter markers in an SQL statement to a buffer. + +## Prototype + +``` +SQLRETURN SQLBindParameter(SQLHSTMT StatementHandle, + SQLUSMALLINT ParameterNumber, + SQLSMALLINT InputOutputType, + SQLSMALLINT ValuetType, + SQLSMALLINT ParameterType, + SQLULEN ColumnSize, + SQLSMALLINT DecimalDigits, + SQLPOINTER ParameterValuePtr, + SQLLEN BufferLength, + SQLLEN *StrLen_or_IndPtr); +``` + +## Parameters + +**Table 1** SQLBindParameter + +| **Keyword** | **Parameter Description** | +| :---------------- | :----------------------------------------------------------- | +| StatementHandle | Statement handle. | +| ParameterNumber | Parameter marker number, starting with 1 and increasing in ascending order. | +| InputOutputType | Input/output type of the parameter. | +| ValueType | C data type of the parameter. | +| ParameterType | SQL data type of the parameter. | +| ColumnSize | Size of the column or expression of the corresponding parameter marker. | +| DecimalDigits | Decimal digit of the column or the expression of the corresponding parameter marker. | +| ParameterValuePtr | Pointer to the storage parameter buffer. | +| BufferLength | Length of the **ParameterValuePtr** buffer in bytes. | +| StrLen_or_IndPtr | Pointer to the length or indicator of the buffer. If **StrLen_or_IndPtr** is null, no length or indicator is used. | + +## Return Value + +- **SQL_SUCCESS** indicates that the call succeeded. +- **SQL_SUCCESS_WITH_INFO** indicates that some warning information is displayed. +- **SQL_ERROR** indicates major errors, such as memory allocation and connection failures. +- **SQL_INVALID_HANDLE** indicates that invalid handles were called. This value may also be returned by other APIs. + +## Precautions + +If SQLBindParameter returns **SQL_ERROR** or **SQL_SUCCESS_WITH_INFO**, the application can call SQLGetDiagRec, with **HandleType** and **Handle** set to **SQL_HANDLE_STMT** and **StatementHandle**, respectively, to obtain the **SQLSTATE** value. The **SQLSTATE** value provides the detailed function calling information. + +## Example + +See ODBC - Examples. 
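A minimal sketch of column binding is shown below. It assumes that **hstmt** is a valid statement handle on an open connection and reuses the **customer_t1** table from the Examples section; the function name is illustrative. Once columns are bound, each SQLFetch call refreshes the bound buffers directly, so no SQLGetData calls are needed.

```c
/*
 * Minimal sketch: bind two result-set columns with SQLBindCol and read them
 * row by row with SQLFetch. Assumption: "hstmt" is valid and the queried
 * table exists with the columns shown.
 */
#include <stdio.h>
#include <sql.h>
#include <sqlext.h>

void bind_col_demo(SQLHSTMT hstmt)
{
    SQLINTEGER id = 0;
    SQLCHAR name[33];               /* VARCHAR(32) plus the terminating null */
    SQLLEN idInd = 0, nameInd = 0;
    SQLRETURN rc;

    rc = SQLExecDirect(hstmt,
        (SQLCHAR *)"SELECT c_customer_sk, c_customer_name FROM customer_t1", SQL_NTS);
    if (!SQL_SUCCEEDED(rc)) {
        return;
    }

    /* Bind column 1 to an integer buffer and column 2 to a character buffer. */
    SQLBindCol(hstmt, 1, SQL_C_SLONG, &id, 0, &idInd);
    SQLBindCol(hstmt, 2, SQL_C_CHAR, name, sizeof(name), &nameInd);

    while ((rc = SQLFetch(hstmt)) != SQL_NO_DATA && SQL_SUCCEEDED(rc)) {
        printf("id=%d name=%s\n", (int)id,
               nameInd == SQL_NULL_DATA ? "NULL" : (char *)name);
    }
    SQLCloseCursor(hstmt);
}
```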
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-7-SQLColAttribute.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-7-SQLColAttribute.md new file mode 100644 index 0000000000000000000000000000000000000000..8290a824e1de6ce54c6ade7d5becd047a417717d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-7-SQLColAttribute.md @@ -0,0 +1,53 @@ +--- +title: SQLColAttribute +summary: SQLColAttribute +author: Guo Huan +date: 2021-05-17 +--- + +# SQLColAttribute + +## Function + +SQLColAttribute is used to return the descriptor information about a column in the result set. + +## Prototype + +``` +SQLRETURN SQLColAttibute(SQLHSTMT StatementHandle, + SQLUSMALLINT ColumnNumber, + SQLUSMALLINT FieldIdentifier, + SQLPOINTER CharacterAtrriburePtr, + SQLSMALLINT BufferLength, + SQLSMALLINT *StringLengthPtr, + SQLLEN *NumericAttributePtr); +``` + +## Parameters + +**Table 1** SQLColAttribute parameters + +| **Keyword** | **Parameter Description** | +| :-------------------- | :----------------------------------------------------------- | +| StatementHandle | Statement handle. | +| ColumnNumber | Column number of the field to be queried, starting with 1 and increasing in ascending order. | +| FieldIdentifier | Field identifier of **ColumnNumber** in IRD. | +| CharacterAttributePtr | **Output parameter**: pointer to the buffer that returns the **FieldIdentifier** value. | +| BufferLength | - **BufferLength** indicates the length of the buffer if **FieldIdentifier** is an ODBC-defined field and **CharacterAttributePtr** points to a string or a binary buffer.
- Ignore this parameter if **FieldIdentifier** is an ODBC-defined field and **CharacterAttributePtr** points to an integer. | +| StringLengthPtr | **Output parameter**: pointer to a buffer in which the total number of valid bytes (for string data) is stored in ***CharacterAttributePtr**. Ignore the value of **BufferLength** if the data is not a string. | +| NumericAttributePtr | **Output parameter**: pointer to an integer buffer in which the value of **FieldIdentifier** in the **ColumnNumber** row of the IRD is returned. | + +## Return Value + +- **SQL_SUCCESS** indicates that the call succeeded. +- **SQL_SUCCESS_WITH_INFO** indicates that some warning information is displayed. +- **SQL_ERROR** indicates major errors, such as memory allocation and connection failures. +- **SQL_INVALID_HANDLE** indicates that invalid handles were called. This value may also be returned by other APIs. + +## Precautions + +If SQLColAttribute returns **SQL_ERROR** or **SQL_SUCCESS_WITH_INFO**, the application can call SQLGetDiagRec, with **HandleType** and **Handle** set to **SQL_HANDLE_STMT** and **StatementHandle**, respectively, to obtain the **SQLSTATE** value. The **SQLSTATE** value provides the detailed function calling information. + +## Example + +See ODBC - Examples. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-8-SQLConnect.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-8-SQLConnect.md new file mode 100644 index 0000000000000000000000000000000000000000..52c70d22388626eb0abd90d6c2f135c185504dac --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-8-SQLConnect.md @@ -0,0 +1,54 @@ +--- +title: SQLConnect +summary: SQLConnect +author: Guo Huan +date: 2021-05-17 +--- + +# SQLConnect + +## Function + +SQLConnect is used to establish a connection between a driver and a data source. After the connection is established, the connection handle can be used to access all information about the data source, including its application operating status, transaction processing status, and error information. + +## Prototype + +``` +SQLRETURN SQLConnect(SQLHDBC ConnectionHandle, + SQLCHAR *ServerName, + SQLSMALLINT NameLength1, + SQLCHAR *UserName, + SQLSMALLINT NameLength2, + SQLCHAR *Authentication, + SQLSMALLINT NameLength3); +``` + +## Parameter + +**Table 1** SQLConnect parameters + +| **Keyword** | **Parameter Description** | +| :--------------- | :------------------------------------------------ | +| ConnectionHandle | Connection handle, obtained from SQLAllocHandle. | +| ServerName | Name of the data source to connect. | +| NameLength1 | Length of **ServerName**. | +| UserName | Username of the database in the data source. | +| NameLength2 | Length of **UserName**. | +| Authentication | User password of the database in the data source. | +| NameLength3 | Length of **Authentication**. | + +## Return Value + +- **SQL_SUCCESS** indicates that the call succeeded. +- **SQL_SUCCESS_WITH_INFO** indicates that some warning information is displayed. +- **SQL_ERROR** indicates major errors, such as memory allocation and connection failures. +- **SQL_INVALID_HANDLE** indicates that invalid handles were called. This value may also be returned by other APIs. +- **SQL_STILL_EXECUTING** indicates that the statement is being executed. 
+ +## Precautions + +If SQLConnect returns **SQL_ERROR** or **SQL_SUCCESS_WITH_INFO**, the application can call SQLGetDiagRec, with **HandleType** and **Handle** set to **SQL_HANDLE_DBC** and **ConnectionHandle**, respectively, to obtain the **SQLSTATE** value. The **SQLSTATE** value provides the detailed function calling information. + +## Example + +See ODBC - Examples. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-9-SQLDisconnect.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-9-SQLDisconnect.md new file mode 100644 index 0000000000000000000000000000000000000000..9e44b6e4031cf8f23efae55464f52ff0f6f40f71 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/3-development-based-on-odbc/6-ODBC/2-9-SQLDisconnect.md @@ -0,0 +1,41 @@ +--- +title: SQLDisconnect +summary: SQLDisconnect +author: Guo Huan +date: 2021-05-17 +--- + +# SQLDisconnect + +## Function + +SQLDisconnect is used to close the connection associated with a database connection handle. + +## Prototype + +``` +SQLRETURN SQLDisconnect(SQLHDBC ConnectionHandle); +``` + +## Parameter + +**Table 1** SQLDisconnect parameters + +| **Keyword** | **Parameter Description** | +| :--------------- | :----------------------------------------------- | +| ConnectionHandle | Connection handle, obtained from SQLAllocHandle. | + +## Return Value + +- **SQL_SUCCESS** indicates that the call succeeded. +- **SQL_SUCCESS_WITH_INFO** indicates that some warning information is displayed. +- **SQL_ERROR** indicates major errors, such as memory allocation and connection failures. +- **SQL_INVALID_HANDLE** indicates that invalid handles were called. This value may also be returned by other APIs. + +## Precautions + +If SQLDisconnect returns **SQL_ERROR** or **SQL_SUCCESS_WITH_INFO**, the application can call SQLGetDiagRec, with **HandleType** and **Handle** set to **SQL_HANDLE_DBC** and **ConnectionHandle**, respectively, to obtain the **SQLSTATE** value. The **SQLSTATE** value provides the detailed function calling information. + +## Example + +See ODBC - Examples. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/1-development-based-on-libpq.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/1-development-based-on-libpq.md new file mode 100644 index 0000000000000000000000000000000000000000..0677ed340682c3d6be3d22b4538cec6d7f0a49b5 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/1-development-based-on-libpq.md @@ -0,0 +1,10 @@ +--- +title: Development Based on libpq +summary: Development Based on libpq +author: Guo Huan +date: 2021-04-27 +--- + +# Development Based on libpq + +MogDB does not verify the use of libpq interfaces in application development. You are not advised to use this set of interfaces for application development, because underlying risks probably exist. You can use the ODBC or JDBC interface instead. 
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/1-database-connection-control-functions-overview.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/1-database-connection-control-functions-overview.md new file mode 100644 index 0000000000000000000000000000000000000000..67d1a774f4281eca8e67a29e54ac58145c29c7ac --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/1-database-connection-control-functions-overview.md @@ -0,0 +1,10 @@ +--- +title: Description +summary: Description +author: Guo Huan +date: 2021-05-17 +--- + +# Description + +Database connection control functions control the connections to MogDB servers. An application can connect to multiple servers at a time. For example, a client connects to multiple databases. Each connection is represented by a PGconn object, which is obtained from the function PQconnectdb, PQconnectdbParams, or PQsetdbLogin. Note that these functions will always return a non-null object pointer, unless there is too little memory to allocate the PGconn object. The interface for establishing a connection is stored in the PGconn object. The PQstatus function can be called to check the return value for a successful connection. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/10-PQstatus.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/10-PQstatus.md new file mode 100644 index 0000000000000000000000000000000000000000..5d15ea90e8b3c21d4b7e50dc8f801659a4f46be2 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/10-PQstatus.md @@ -0,0 +1,64 @@ +--- +title: PQstatus +summary: PQstatus +author: Guo Huan +date: 2021-05-17 +--- + +# PQstatus + +## Function + +PQstatus is used to return the connection status. + +## Prototype + +``` +ConnStatusType PQstatus(const PGconn *conn); +``` + +## Parameter + +**Table 1** PQ status parameter + +| **Keyword** | **Parameter Description** | +| :---------- | :----------------------------------------------------------- | +| conn | Points to the object pointer that contains the connection information. | + +## Return Value + +**ConnStatusType** indicates the connection status. The enumerated values are as follows: + +``` +CONNECTION_STARTED +Waiting for the connection to be established. + +CONNECTION_MADE +Connection succeeded; waiting to send + +CONNECTION_AWAITING_RESPONSE +Waiting for a response from the server. + +CONNECTION_AUTH_OK +Authentication received; waiting for backend startup to complete. + +CONNECTION_SSL_STARTUP +Negotiating SSL encryption. + +CONNECTION_SETENV +Negotiating environment-driven parameter settings. + +CONNECTION_OK +Normal connection. + +CONNECTION_BAD +Failed connection. +``` + +## Precautions + +The connection status can be one of the preceding values. After the asynchronous connection procedure is complete, only two of them, **CONNECTION_OK** and **CONNECTION_BAD**, can return. **CONNECTION_OK** indicates that the connection to the database is normal. **CONNECTION_BAD** indicates that the connection attempt fails. Generally, the **CONNECTION_OK** state remains until PQfinish is called. 
However, a communication failure may cause the connection status to become to **CONNECTION_BAD** before the connection procedure is complete. In this case, the application can attempt to call PQreset to restore the communication. + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/2-PQconnectdbParams.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/2-PQconnectdbParams.md new file mode 100644 index 0000000000000000000000000000000000000000..d1ed1479f3ebb2cbb78f8ca3a93661f990d0f30a --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/2-PQconnectdbParams.md @@ -0,0 +1,42 @@ +--- +title: PQconnectdbParams +summary: PQconnectdbParams +author: Guo Huan +date: 2021-05-17 +--- + +# PQconnectdbParams + +## Function + +PQconnectdbParams is used to establish a new connection with the database server. + +## Prototype + +``` +PGconn *PQconnectdbParams(const char * const *keywords, + const char * const *values, + int expand_dbname); +``` + +## Parameter + +**Table 1** PQconnectdbParams parameters + +| **Keyword** | **Parameter Description** | +| :------------ | :----------------------------------------------------------- | +| keywords | An array of strings, each of which is a keyword. | +| values | Value assigned to each keyword. | +| expand_dbname | When **expand\_dbname** is non-zero, the **dbname** keyword value can be recognized as a connection string. Only **dbname** that first appears is expanded in this way, and any subsequent **dbname** value is treated as a database name. | + +## Return Value + +**PGconn \*** points to the object pointer that contains a connection. The memory is applied for by the function internally. + +## Precautions + +This function establishes a new database connection using the parameters taken from two NULL-terminated arrays. Unlike PQsetdbLogin, the parameter set can be extended without changing the function signature. Therefore, use of this function (or its non-blocking analogs PQconnectStartParams and PQconnectPoll) is preferred for new application programming. + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/3-PQconnectdb.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/3-PQconnectdb.md new file mode 100644 index 0000000000000000000000000000000000000000..cf7b3e8aa86e7b5b2695bcf4550e6cb1e05fc3db --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/3-PQconnectdb.md @@ -0,0 +1,39 @@ +--- +title: PQconnectdb +summary: PQconnectdb +author: Guo Huan +date: 2021-05-17 +--- + +# PQconnectdb + +## Function + +PQconnectdb is used to establish a new connection with the database server. + +## Prototype + +``` +PGconn *PQconnectdb(const char *conninfo); +``` + +## Parameter + +**Table 1** PQconnectdb parameter + +| **Keyword** | **Parameter Description** | +| :---------- | :----------------------------------------------------------- | +| conninfo | Connection string. For details about the fields in the string, see Connection Characters. 
| + +## Return Value + +**PGconn \*** points to the object pointer that contains a connection. The memory is applied for by the function internally. + +## Precautions + +- This function establishes a new database connection using the parameters taken from the string **conninfo**. +- The input parameter can be empty, indicating that all default parameters can be used. It can contain one or more values separated by spaces or contain a URL. + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/4-PQconninfoParse.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/4-PQconninfoParse.md new file mode 100644 index 0000000000000000000000000000000000000000..12ba279198b8437b40f3c448ab83d04285651e0f --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/4-PQconninfoParse.md @@ -0,0 +1,31 @@ +--- +title: PQconninfoParse +summary: PQconninfoParse +author: Guo Huan +date: 2021-05-17 +--- + +# PQconninfoParse + +## Function + +PQconninfoParse is used to return parsed connection options based on the connection. + +## Prototype + +``` +PQconninfoOption* PQconninfoParse(const char* conninfo, char** errmsg); +``` + +## Parameters + +**Table 1** + +| **Keyword** | **Parameter Description** | +| :---------- | :----------------------------------------------------------- | +| conninfo | Passed string. This parameter can be left empty. In this case, the default value is used. It can contain one or more values separated by spaces or contain a URL. | +| errmsg | Error information. | + +## Return Value + +PQconninfoOption pointers diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/5-PQconnectStart.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/5-PQconnectStart.md new file mode 100644 index 0000000000000000000000000000000000000000..d4214eb50a08c4cc8db762d582a2863b870e1063 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/5-PQconnectStart.md @@ -0,0 +1,30 @@ +--- +title: PQconnectStart +summary: PQconnectStart +author: Guo Huan +date: 2021-05-17 +--- + +# PQconnectStart + +## Function + +PQconnectStart is used to establish a non-blocking connection with the database server. + +## Prototype + +``` +PGconn* PQconnectStart(const char* conninfo); +``` + +## Parameters + +**Table 1** + +| **Keyword** | **Parameter Description** | +| :---------- | :----------------------------------------------------------- | +| conninfo | String of connection information. This parameter can be left empty. In this case, the default value is used. It can contain one or more values separated by spaces or contain a URL. 
| + +## Return Value + +PGconn pointers diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/6-PQerrorMessage.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/6-PQerrorMessage.md new file mode 100644 index 0000000000000000000000000000000000000000..7d03c6dedd55cc8d9d063da43d05d425dc91fa9f --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/6-PQerrorMessage.md @@ -0,0 +1,34 @@ +--- +title: PQerrorMessage +summary: PQerrorMessage +author: Guo Huan +date: 2021-05-17 +--- + +# PQerrorMessage + +## Function + +PQerrorMessage is used to return error information on a connection. + +## Prototype + +``` +char* PQerrorMessage(const PGconn* conn); +``` + +## Parameter + +**Table 1** + +| **Keyword** | **Parameter Description** | +| :---------- | :------------------------ | +| conn | Connection handle. | + +## Return Value + +char pointers + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/7-PQsetdbLogin.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/7-PQsetdbLogin.md new file mode 100644 index 0000000000000000000000000000000000000000..ee22108accd397c26240096359f9af765ebc9a0c --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/7-PQsetdbLogin.md @@ -0,0 +1,51 @@ +--- +title: PQsetdbLogin +summary: PQsetdbLogin +author: Guo Huan +date: 2021-05-17 +--- + +# PQsetdbLogin + +## Function + +PQsetdbLogin is used to establish a new connection with the database server. + +## Prototype + +``` +PGconn *PQsetdbLogin(const char *pghost, + const char *pgport, + const char *pgoptions, + const char *pgtty, + const char *dbName, + const char *login, + const char *pwd); +``` + +## Parameter + +**Table 1** PQsetdbLogin parameters + +| **Keyword** | **Parameter Description** | +| :---------- | :----------------------------------------------------------- | +| pghost | Name of the host to be connected. For details, see the **host** field described in Connection Characters. | +| pgport | Port number of the host server. For details, see the **port** field described in Connection Characters. | +| pgoptions | Command-line options to be sent to the server during running. For details, see the **options** field described in Connection Characters. | +| pgtty | This field can be ignored. (Previously, this field declares the output direction of server logs.) | +| dbName | Name of the database to be connected. For details, see the **dbname** field described in Connection Characters. | +| login | Username for connection. For details, see the **user** field described in Connection Characters. | +| pwd | Password used for authentication during connection. For details, see the **password** field described in Connection Characters. | + +## Return Value + +**PGconn \*** points to the object pointer that contains a connection. The memory is applied for by the function internally. + +## Precautions + +- This function is the predecessor of PQconnectdb with a fixed set of parameters. When an undefined parameter is called, its default value is used. 
Write NULL or an empty string for any one of the fixed parameters that is to be defaulted. +- If the **dbName** value contains an = sign or a valid prefix in the connection URL, it is taken as a conninfo string and passed to PQconnectdb, and the remaining parameters are consistent with PQconnectdbParams parameters. + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/8-PQfinish.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/8-PQfinish.md new file mode 100644 index 0000000000000000000000000000000000000000..d6abc3b2d5e40ab6a84958ad11b8109630fd7f67 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/8-PQfinish.md @@ -0,0 +1,34 @@ +--- +title: PQfinish +summary: PQfinish +author: Guo Huan +date: 2021-05-17 +--- + +# PQfinish + +## Function + +PQfinish is used to close the connection to the server and release the memory used by the PGconn object. + +## Prototype + +``` +void PQfinish(PGconn *conn); +``` + +## Parameter + +**Table 1** PQfinish parameter + +| **Keyword** | **Parameter Description** | +| :---------- | :----------------------------------------------------------- | +| conn | Points to the object pointer that contains the connection information. | + +## Precautions + +If the server connection attempt fails (as indicated by PQstatus), the application should call PQfinish to release the memory used by the PGconn object. The PGconn pointer must not be used again after PQfinish has been called. + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/9-PQreset.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/9-PQreset.md new file mode 100644 index 0000000000000000000000000000000000000000..fd8f48fe7eec67e337d088c8ac0a616cf8b28629 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/1-database-connection-control-functions/9-PQreset.md @@ -0,0 +1,34 @@ +--- +title: PQreset +summary: PQreset +author: Guo Huan +date: 2021-05-17 +--- + +# PQreset + +## Function + +PQreset is used to reset the communication port to the server. + +## Prototype + +``` +void PQreset(PGconn *conn); +``` + +## Parameter + +**Table 1** PQreset parameter + +| **Keyword** | **Parameter Description** | +| :---------- | :----------------------------------------------------------- | +| conn | Points to the object pointer that contains the connection information. | + +## Precautions + +This function will close the connection to the server and attempt to establish a new connection to the same server by using all the parameters previously used. This function is applicable to fault recovery after a connection exception occurs. + +## Example + +For details, see Example. 
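+
+As a quick illustration of how the connection control functions above fit together, the following minimal sketch (not the official sample referenced in Example) connects with PQconnectdbParams, checks the result with PQstatus, and calls PQreset if the connection is later reported as broken. All connection values are placeholders.
+
+```c
+/* Minimal sketch: connect, verify, and recover a libpq connection.
+ * Host, port, database, user, and password are placeholder values. */
+#include <stdio.h>
+#include <libpq-fe.h>
+
+int main(void)
+{
+    static const char *const keywords[] = {"host", "port", "dbname", "user", "password", NULL};
+    static const char *const values[]   = {"127.0.0.1", "26000", "postgres", "test", "test_1234", NULL};
+
+    PGconn *conn = PQconnectdbParams(keywords, values, 0);
+    if (PQstatus(conn) != CONNECTION_OK) {
+        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
+        PQfinish(conn);              /* always release the PGconn object */
+        return 1;
+    }
+
+    /* ... run queries on the connection ... */
+
+    if (PQstatus(conn) == CONNECTION_BAD)
+        PQreset(conn);               /* try to re-establish the same connection */
+
+    PQfinish(conn);
+    return 0;
+}
+```
+
+PQfinish is still called in the failure branch because PQconnectdbParams returns a PGconn object even when the connection attempt fails, and that object must be released.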
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/1-PQclear.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/1-PQclear.md new file mode 100644 index 0000000000000000000000000000000000000000..d31b1cec3b23dcfba12db140a4c9ed4dbf350aaf --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/1-PQclear.md @@ -0,0 +1,34 @@ +--- +title: PQclear +summary: PQclear +author: Guo Huan +date: 2021-05-17 +--- + +# PQclear + +## Function + +PQclear is used to release the storage associated with PGresult. Any query result should be released by PQclear when it is no longer needed. + +## Prototype + +``` +void PQclear(PGresult *res); +``` + +## Parameters + +**Table 1** PQclear parameter + +| **Keyword** | **Parameter Description** | +| :---------- | :--------------------------------------------- | +| res | Object pointer that contains the query result. | + +## Precautions + +PGresult is not automatically released. That is, it does not disappear when a new query is submitted or even if you close the connection. To delete it, you must call PQclear. Otherwise, memory leakage occurs. + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/10-PQntuples.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/10-PQntuples.md new file mode 100644 index 0000000000000000000000000000000000000000..e3100a76e980ca08dd6ad832623893b1e9b0c5a7 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/10-PQntuples.md @@ -0,0 +1,34 @@ +--- +title: PQntuples +summary: PQntuples +author: Guo Huan +date: 2021-05-17 +--- + +# PQntuples + +## Function + +PQntuples is used to return the number of rows (tuples) in the query result. An overflow may occur if the return value is out of the value range allowed in a 32-bit OS. + +## Prototype + +``` +int PQntuples(const PGresult *res); +``` + +## Parameter + +**Table 1** + +| **Keyword** | **Parameter Description** | +| :---------- | :------------------------ | +| res | Operation result handle. | + +## Return Value + +Value of the int type + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/11-PQprepare.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/11-PQprepare.md new file mode 100644 index 0000000000000000000000000000000000000000..27ec13ad93ccd18e00ae688c3147de9300532f2e --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/11-PQprepare.md @@ -0,0 +1,50 @@ +--- +title: PQprepare +summary: PQprepare +author: Guo Huan +date: 2021-05-17 +--- + +# PQprepare + +## Function + +PQprepare is used to submit a request to create a prepared statement with given parameters and wait for completion. 
+ +## Prototype + +``` +PGresult *PQprepare(PGconn *conn, + const char *stmtName, + const char *query, + int nParams, + const Oid *paramTypes); +``` + +## Parameters + +**Table 1** PQprepare parameters + +| **Keyword** | **Parameter Description** | +| :---------- | :----------------------------------------------------------- | +| conn | Points to the object pointer that contains the connection information. | +| stmtName | Name of **stmt** to be executed. | +| query | Query string to be executed. | +| nParams | Parameter quantity. | +| paramTypes | Array of the parameter type. | + +## Return Value + +**PGresult** indicates the object pointer that contains the query result. + +## Precautions + +- PQprepare creates a prepared statement for later execution with PQexecPrepared. This function allows commands to be repeatedly executed, without being parsed and planned each time they are executed. PQprepare is supported only in protocol 3.0 or later. It will fail when protocol 2.0 is used. +- This function creates a prepared statement named **stmtName** from the query string, which must contain an SQL command. **stmtName** can be **""** to create an unnamed statement. In this case, any pre-existing unnamed statement will be automatically replaced. Otherwise, this is an error if the statement name has been defined in the current session. If any parameters are used, they are referred to in the query as $1, $2, and so on. **nParams** is the number of parameters for which types are pre-specified in the array paramTypes[]. (The array pointer can be **NULL** when **nParams** is **0**.) paramTypes[] specifies the data types to be assigned to the parameter symbols by OID. If **paramTypes** is **NULL**, or any element in the array is **0**, the server assigns a data type to the parameter symbol in the same way as it does for an untyped literal string. In addition, the query can use parameter symbols whose numbers are greater than **nParams**. Data types of these symbols will also be inferred. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:** +> You can also execute the **SQLPREPARE** statement to create a prepared statement that is used with PQexecPrepared. Although there is no libpq function of deleting a prepared statement, the **SQL DEALLOCATE** statement can be used for this purpose. + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/12-PQresultStatus.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/12-PQresultStatus.md new file mode 100644 index 0000000000000000000000000000000000000000..4a2958410cda31749cd23f455a0b3522066a2f20 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/12-PQresultStatus.md @@ -0,0 +1,72 @@ +--- +title: PQresultStatus +summary: PQresultStatus +author: Guo Huan +date: 2021-05-17 +--- + +# PQresultStatus + +## Function + +PQresultStatus is used to return the result status of a command. + +## Prototype + +``` +ExecStatusType PQresultStatus(const PGresult *res); +``` + +## Parameter + +**Table 1** PQresultStatus parameter + +| **Keyword** | **Parameter Description** | +| :---------- | :--------------------------------------------- | +| res | Object pointer that contains the query result. 
| + +## Return Value + +**PQresultStatus** indicates the command execution status. The enumerated values are as follows: + +``` +PQresultStatus can return one of the following values: +PGRES_EMPTY_QUERY +The string sent to the server was empty. + +PGRES_COMMAND_OK +A command that does not return data was successfully executed. + +PGRES_TUPLES_OK +A query (such as SELECT or SHOW) that returns data was successfully executed. + +PGRES_COPY_OUT +Copy Out (from the server) data transfer started. + +PGRES_COPY_IN +Copy In (to the server) data transfer started. + +PGRES_BAD_RESPONSE +The response from the server cannot be understood. + +PGRES_NONFATAL_ERROR +A non-fatal error (notification or warning) occurred. + +PGRES_FATAL_ERROR +A fatal error occurred. + +PGRES_COPY_BOTH +Copy In/Out (to and from the server) data transfer started. This state occurs only in streaming replication. + +PGRES_SINGLE_TUPLE +PGresult contains a result tuple from the current command. This state occurs in a single-row query. +``` + +## Precautions + +- Note that the SELECT command that happens to retrieve zero rows still returns **PGRES_TUPLES_OK**. **PGRES_COMMAND_OK** is used for commands that can never return rows (such as INSERT or UPDATE, without return clauses). The result status **PGRES_EMPTY_QUERY** might indicate a bug in the client software. +- The result status **PGRES_NONFATAL_ERROR** will never be returned directly by PQexec or other query execution functions. Instead, such results will be passed to the notice processor. + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/2-PQexec.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/2-PQexec.md new file mode 100644 index 0000000000000000000000000000000000000000..c752c487dfdf9a72784249550ad4ac912b0262fa --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/2-PQexec.md @@ -0,0 +1,42 @@ +--- +title: PQexec +summary: PQexec +author: Guo Huan +date: 2021-05-17 +--- + +# PQexec + +## Function + +PQexec is used to commit a command to the server and wait for the result. + +## Prototype + +``` +PGresult *PQexec(PGconn *conn, const char *command); +``` + +## Parameter + +**Table 1** PQexec parameters + +| **Keyword** | **Parameter Description** | +| :---------- | :----------------------------------------------------------- | +| conn | Points to the object pointer that contains the connection information. | +| command | Query string to be executed. | + +## Return Value + +**PGresult** indicates the object pointer that contains the query result. + +## Precautions + +The PQresultStatus function should be called to check the return value for any errors (including the value of a null pointer, in which **PGRES_FATAL_ERROR** will be returned). The PQerrorMessage function can be called to obtain more information about such errors. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:** +> The command string can contain multiple SQL commands separated by semicolons (;). Multiple queries sent in a PQexec call are processed in one transaction, unless there are specific BEGIN/COMMIT commands in the query string to divide the string into multiple transactions. 
Note that the returned PGresult structure describes only the result of the last command executed from the string. If a command fails, the string processing stops and the returned PGresult describes the error condition. + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/3-PQexecParams.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/3-PQexecParams.md new file mode 100644 index 0000000000000000000000000000000000000000..f8c2120cfb02b653d1cae499a403cb59f366cee0 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/3-PQexecParams.md @@ -0,0 +1,44 @@ +--- +title: PQexecParams +summary: PQexecParams +author: Guo Huan +date: 2021-05-17 +--- + +# PQexecParams + +## Function + +PQexecParams is used to run a command to bind one or more parameters. + +## Prototype + +``` +PGresult* PQexecParams(PGconn* conn, + const char* command, + int nParams, + const Oid* paramTypes, + const char* const* paramValues, + const int* paramLengths, + const int* paramFormats, + int resultFormat); +``` + +## Parameter + +**Table 1** + +| **Keyword** | **Parameter Description** | +| :----------- | :---------------------------------- | +| conn | Connection handle. | +| command | SQL text string. | +| nParams | Number of parameters to be bound. | +| paramTypes | Types of parameters to be bound. | +| paramValues | Values of parameters to be bound. | +| paramLengths | Parameter lengths. | +| paramFormats | Parameter formats (text or binary). | +| resultFormat | Result format (text or binary). | + +## Return Value + +PGresult pointers diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/4-PQexecParamsBatch.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/4-PQexecParamsBatch.md new file mode 100644 index 0000000000000000000000000000000000000000..f7e4b3c0daab95406454d0c8ebe6a710b86b4813 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/4-PQexecParamsBatch.md @@ -0,0 +1,46 @@ +--- +title: PQexecParamsBatch +summary: PQexecParamsBatch +author: Guo Huan +date: 2021-05-17 +--- + +# PQexecParamsBatch + +## Function + +PQexecParamsBatch is used to run a command to bind batches of parameters. + +## Prototype + +``` +PGresult* PQexecParamsBatch(PGconn* conn, + const char* command, + int nParams, + int nBatch, + const Oid* paramTypes, + const char* const* paramValues, + const int* paramLengths, + const int* paramFormats, + int resultFormat); +``` + +## Parameter + +**Table 1** + +| **Keyword** | **Parameter Description** | +| :----------- | :---------------------------------- | +| conn | Connection handle. | +| command | SQL text string. | +| nParams | Number of parameters to be bound. | +| nBatch | Number of batch operations. | +| paramTypes | Types of parameters to be bound. | +| paramValues | Values of parameters to be bound. | +| paramLengths | Parameter lengths. | +| paramFormats | Parameter formats (text or binary). | +| resultFormat | Result format (text or binary). 
| + +## Return Value + +PGresult pointers diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/5-PQexecPrepared.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/5-PQexecPrepared.md new file mode 100644 index 0000000000000000000000000000000000000000..b2eab81bdce3baedbb35350692afc3f52e353e77 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/5-PQexecPrepared.md @@ -0,0 +1,42 @@ +--- +title: PQexecPrepared +summary: PQexecPrepared +author: Guo Huan +date: 2021-05-17 +--- + +# PQexecPrepared + +## Function + +PQexecPrepared is used to send a request to execute a prepared statement with given parameters and wait for the result. + +## Prototype + +``` +PGresult* PQexecPrepared(PGconn* conn, + const char* stmtName, + int nParams, + const char* const* paramValues, + const int* paramLengths, + const int* paramFormats, + int resultFormat); +``` + +## Parameter + +**Table 1** + +| **Keyword** | **Parameter Description** | +| :----------- | :----------------------------------------------------------- | +| conn | Connection handle. | +| stmtName | **stmt** name, which can be set to "" or NULL to reference an unnamed statement. Otherwise, it must be the name of an existing prepared statement. | +| nParams | Parameter quantity. | +| paramValues | Actual values of parameters. | +| paramLengths | Actual data lengths of parameters. | +| paramFormats | Parameter formats (text or binary). | +| resultFormat | Return result format (text or binary). | + +## Return Value + +PGresult pointers diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/6-PQexecPreparedBatch.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/6-PQexecPreparedBatch.md new file mode 100644 index 0000000000000000000000000000000000000000..23ed2c96854988ad26cf1519fdcc2f7e77519d51 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/6-PQexecPreparedBatch.md @@ -0,0 +1,44 @@ +--- +title: PQexecPreparedBatch +summary: PQexecPreparedBatch +author: Guo Huan +date: 2021-05-17 +--- + +# PQexecPreparedBatch + +## Function + +PQexecPreparedBatch is used to send a request to execute a prepared statement with batches of given parameters and wait for the result. + +## Prototype + +``` +PGresult* PQexecPreparedBatch(PGconn* conn, + const char* stmtName, + int nParams, + int nBatchCount, + const char* const* paramValues, + const int* paramLengths, + const int* paramFormats, + int resultFormat); +``` + +## Parameter + +**Table 1** + +| **Keyword** | **Parameter Description** | +| :----------- | :----------------------------------------------------------- | +| conn | Connection handle. | +| stmtName | **stmt** name, which can be set to "" or NULL to reference an unnamed statement. Otherwise, it must be the name of an existing prepared statement. | +| nParams | Parameter quantity. | +| nBatchCount | Number of batches. | +| paramValues | Actual values of parameters. | +| paramLengths | Actual data lengths of parameters. | +| paramFormats | Parameter formats (text or binary). | +| resultFormat | Return result format (text or binary). 
| + +## Return Value + +PGresult pointers diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/7-PQfname.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/7-PQfname.md new file mode 100644 index 0000000000000000000000000000000000000000..f62035545dacd5991ec91d386f339a0ad52acbc8 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/7-PQfname.md @@ -0,0 +1,36 @@ +--- +title: PQfname +summary: PQfname +author: Guo Huan +date: 2021-05-17 +--- + +# PQfname + +## Function + +PQfname is used to return the column name associated with the given column number. Column numbers start from 0. The caller should not release the result directly. The result will be released when the associated PGresult handle is passed to PQclear. + +## Prototype + +``` +char *PQfname(const PGresult *res, + int column_number); +``` + +## Parameter + +**Table 1** + +| **Keyword** | **Parameter Description** | +| :------------ | :------------------------ | +| res | Operation result handle. | +| column_number | Number of columns. | + +## Return Value + +char pointers + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/8-PQgetvalue.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/8-PQgetvalue.md new file mode 100644 index 0000000000000000000000000000000000000000..86a7fa972a42441e7a1e0d7847afaf391e21c14d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/8-PQgetvalue.md @@ -0,0 +1,42 @@ +--- +title: PQgetvalue +summary: PQgetvalue +author: Guo Huan +date: 2021-05-17 +--- + +# PQgetvalue + +## Function + +PQgetvalue is used to return a single field value of one row of a PGresult. Row and column numbers start from 0. The caller should not release the result directly. The result will be released when the associated PGresult handle is passed to PQclear. + +## Prototype + +``` +char *PQgetvalue(const PGresult *res, + int row_number, + int column_number); +``` + +## Parameter + +**Table 1** + +| **Keyword** | **Parameter Description** | +| :------------ | :------------------------ | +| res | Operation result handle. | +| row_number | Number of rows. | +| column_number | Number of columns. | + +## Return Value + +For data in text format, the value returned by PQgetvalue is a null-terminated string representation of the field value. + +For binary data, the value is a binary representation determined by the typsend and typreceive functions of the data type. + +If this field is left blank, an empty string is returned. + +## Example + +For details, see Example. 
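+
+The statement-execution functions above are typically used together. The following minimal sketch (placeholder connection string; not the official sample referenced in Example) prepares a statement with PQprepare, runs it with PQexecPrepared, checks PQresultStatus, and walks the result with PQnfields, PQntuples, PQfname, and PQgetvalue before releasing it with PQclear.
+
+```c
+/* Minimal sketch: prepare, execute, and read a result set.
+ * The connection string values are placeholders. */
+#include <stdio.h>
+#include <libpq-fe.h>
+
+int main(void)
+{
+    PGconn *conn = PQconnectdb("host=127.0.0.1 port=26000 dbname=postgres user=test password=test_1234");
+    if (PQstatus(conn) != CONNECTION_OK) {
+        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
+        PQfinish(conn);
+        return 1;
+    }
+
+    /* Create a named prepared statement with one parameter; its type is inferred. */
+    PGresult *res = PQprepare(conn, "find_db",
+                              "SELECT datname, oid FROM pg_database WHERE datname = $1",
+                              1, NULL);
+    if (PQresultStatus(res) != PGRES_COMMAND_OK) {
+        fprintf(stderr, "PQprepare failed: %s", PQerrorMessage(conn));
+        PQclear(res);
+        PQfinish(conn);
+        return 1;
+    }
+    PQclear(res);
+
+    /* Execute the prepared statement; the parameter is passed in text format. */
+    const char *const paramValues[1] = {"postgres"};
+    res = PQexecPrepared(conn, "find_db", 1, paramValues, NULL, NULL, 0);
+    if (PQresultStatus(res) != PGRES_TUPLES_OK) {
+        fprintf(stderr, "PQexecPrepared failed: %s", PQerrorMessage(conn));
+    } else {
+        int nFields = PQnfields(res);
+        for (int i = 0; i < nFields; i++)
+            printf("%-15s", PQfname(res, i));
+        printf("\n");
+        for (int i = 0; i < PQntuples(res); i++) {
+            for (int j = 0; j < nFields; j++)
+                printf("%-15s", PQgetvalue(res, i, j));
+            printf("\n");
+        }
+    }
+    PQclear(res);
+
+    PQfinish(conn);
+    return 0;
+}
+```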
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/9-PQnfields.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/9-PQnfields.md new file mode 100644 index 0000000000000000000000000000000000000000..459ecaa8aa32218edcb98e0c5e3dbe3b495fd18d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/2-database-statement-execution-functions/9-PQnfields.md @@ -0,0 +1,34 @@ +--- +title: PQnfields +summary: PQnfields +author: Guo Huan +date: 2021-05-17 +--- + +# PQnfields + +## Function + +PQnfields is used to return the number of columns (fields) in each row of the query result. + +## Prototype + +``` +int PQnfields(const PGresult *res); +``` + +## Parameter + +**Table 1** + +| **Keyword** | **Parameter Description** | +| :---------- | :------------------------ | +| res | Operation result handle. | + +## Return Value + +Value of the int type + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/1-functions-for-asynchronous-command-processing-overview.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/1-functions-for-asynchronous-command-processing-overview.md new file mode 100644 index 0000000000000000000000000000000000000000..df0fbcb1114ea986a47ee14e8da2aa4194ab269f --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/1-functions-for-asynchronous-command-processing-overview.md @@ -0,0 +1,17 @@ +--- +title: Description +summary: Description +author: Guo Huan +date: 2021-05-17 +--- + +# Description + +The PQexec function is adequate for committing commands in common, synchronous applications. However, it has several defects, which may be important to some users: + +- PQexec waits for the end of the command, but the application may have other work to do (for example, maintaining a user interface). In this case, PQexec would not want to be blocked to wait for the response. +- As the client application is suspended while waiting for the result, it is difficult for the application to determine whether to cancel the ongoing command. +- PQexec can return only one PGresult structure. If the committed command string contains multiple SQL commands, all the PGresult structures except the last PGresult are discarded by PQexec. +- PQexec always collects the entire result of the command and caches it in a PGresult. Although this mode simplifies the error handling logic for applications, it is impractical for results that contain multiple rows. + +Applications that do not want to be restricted by these limitations can use the following functions that PQexec is built from: PQsendQuery and PQgetResult. The functions PQsendQueryParams, PQsendPrepare, and PQsendQueryPrepared can also be used with PQgetResult. 
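+
+The following minimal sketch (placeholder connection string) outlines the asynchronous pattern described above: the query is dispatched with PQsendQuery, the application waits on the connection socket instead of blocking inside libpq, and the results are drained with PQgetResult. PQsocket, PQconsumeInput, and PQisBusy are the standard libpq calls used for the wait loop; error handling is abbreviated.
+
+```c
+/* Minimal sketch of asynchronous processing: send a query, wait without
+ * blocking in libpq, then drain the results. Connection values are placeholders. */
+#include <stdio.h>
+#include <sys/select.h>
+#include <libpq-fe.h>
+
+int main(void)
+{
+    PGconn *conn = PQconnectdb("host=127.0.0.1 port=26000 dbname=postgres user=test password=test_1234");
+    if (PQstatus(conn) != CONNECTION_OK) {
+        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
+        PQfinish(conn);
+        return 1;
+    }
+
+    if (PQsendQuery(conn, "SELECT datname FROM pg_database") != 1) {
+        fprintf(stderr, "PQsendQuery failed: %s", PQerrorMessage(conn));
+        PQfinish(conn);
+        return 1;
+    }
+
+    /* The application can do other work here; poll the socket until the result is ready. */
+    while (PQisBusy(conn)) {
+        int sock = PQsocket(conn);
+        fd_set readable;
+        FD_ZERO(&readable);
+        FD_SET(sock, &readable);
+        if (select(sock + 1, &readable, NULL, NULL, NULL) < 0)
+            break;                      /* a real program would inspect errno here */
+        if (PQconsumeInput(conn) != 1)  /* read whatever the server has sent so far */
+            break;
+    }
+
+    /* Drain every result; PQsendQuery cannot be reused until PQgetResult returns NULL. */
+    PGresult *res;
+    while ((res = PQgetResult(conn)) != NULL) {
+        if (PQresultStatus(res) == PGRES_TUPLES_OK)
+            printf("%d row(s) returned\n", PQntuples(res));
+        PQclear(res);
+    }
+
+    PQfinish(conn);
+    return 0;
+}
+```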
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/2-PQsendQuery.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/2-PQsendQuery.md new file mode 100644 index 0000000000000000000000000000000000000000..c8237c713138fe440129269872011057a531536a --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/2-PQsendQuery.md @@ -0,0 +1,39 @@ +--- +title: PQsendQuery +summary: PQsendQuery +author: Guo Huan +date: 2021-05-17 +--- + +# PQsendQuery + +## Function + +PQsendQuery is used to commit a command to the server without waiting for the result. If the query is successful, **1** is returned. Otherwise, **0** is returned. + +## Prototype + +```c +int PQsendQuery(PGconn *conn, const char *command); +``` + +## Parameter + +**Table 1** PQsendQuery parameters + +| **Keyword** | **Parameter Description** | +| :---------- | :----------------------------------------------------------- | +| conn | Points to the object pointer that contains the connection information. | +| command | Query string to be executed. | + +## Return Value + +**int** indicates the execution result. **1** indicates successful execution and **0** indicates an execution failure. The failure cause is stored in **conn->errorMessage**. + +## Precautions + +After PQsendQuery is successfully called, call PQgetResult one or more times to obtain the results. PQsendQuery cannot be called again (on the same connection) until PQgetResult returns a null pointer, indicating that the command execution is complete. + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/3-PQsendQueryParams.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/3-PQsendQueryParams.md new file mode 100644 index 0000000000000000000000000000000000000000..d8e5c99f9f3437c389dce7cade907d7ee417f12e --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/3-PQsendQueryParams.md @@ -0,0 +1,52 @@ +--- +title: PQsendQueryParams +summary: PQsendQueryParams +author: Guo Huan +date: 2021-05-17 +--- + +# PQsendQueryParams + +## Function + +PQsendQueryParams is used to commit a command and separate parameters to the server without waiting for the result. + +## Prototype + +```c +int PQsendQueryParams(PGconn *conn, + const char *command, + int nParams, + const Oid *paramTypes, + const char * const *paramValues, + const int *paramLengths, + const int *paramFormats, + int resultFormat); +``` + +## Parameter + +**Table 1** PQsendQueryParams parameters + +| **Keyword** | **Parameter Description** | +| :----------- | :----------------------------------------------------------- | +| conn | Points to the object pointer that contains the connection information. | +| command | Query string to be executed. | +| nParams | Parameter quantity. | +| paramTypes | Parameter type. | +| paramValues | Parameter value. | +| paramLengths | Parameter length. | +| paramFormats | Parameter format. | +| resultFormat | Result format. | + +## Return Value + +**int** indicates the execution result. 
**1** indicates successful execution and **0** indicates an execution failure. The failure cause is stored in **conn->errorMessage**. + +## Precautions + +PQsendQueryParams is equivalent to PQsendQuery. The only difference is that query parameters can be specified separately from the query string. PQsendQueryParams parameters are handled in the same way as PQexecParams parameters. Like PQexecParams, PQsendQueryParams cannot work on connections using protocol 2.0 and it allows only one command in the query string. + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/4-PQsendPrepare.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/4-PQsendPrepare.md new file mode 100644 index 0000000000000000000000000000000000000000..550c4f1ea98aa02fd94db396a62bc9272aef3aa8 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/4-PQsendPrepare.md @@ -0,0 +1,46 @@ +--- +title: PQsendPrepare +summary: PQsendPrepare +author: Guo Huan +date: 2021-05-17 +--- + +# PQsendPrepare + +## Function + +PQsendPrepare is used to send a request to create a prepared statement with given parameters, without waiting for completion. + +## Prototype + +```c +int PQsendPrepare(PGconn *conn, + const char *stmtName, + const char *query, + int nParams, + const Oid *paramTypes); +``` + +## Parameters + +**Table 1** PQsendPrepare parameters + +| **Keyword** | **Parameter Description** | +| :---------- | :----------------------------------------------------------- | +| conn | Points to the object pointer that contains the connection information. | +| stmtName | Name of **stmt** to be executed. | +| query | Query string to be executed. | +| nParams | Parameter quantity. | +| paramTypes | Array of the parameter type. | + +## Return Value + +**int** indicates the execution result. **1** indicates successful execution and **0** indicates an execution failure. The failure cause is stored in **conn->errorMessage**. + +## Precautions + +PQsendPrepare is an asynchronous version of PQprepare. If it can dispatch a request, **1** is returned. Otherwise, **0** is returned. After a successful calling of PQsendPrepare, call PQgetResult to check whether the server successfully created the prepared statement. PQsendPrepare parameters are handled in the same way as PQprepare parameters. Like PQprepare, PQsendPrepare cannot work on connections using protocol 2.0. + +## Example + +For details, see Example. 
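+
+As an illustration (placeholder connection string and statement name; not the official sample referenced in Example), the sketch below dispatches PQsendPrepare and then drains PQgetResult to confirm that the server created the prepared statement before it is used.
+
+```c
+/* Minimal sketch: asynchronously create a prepared statement and confirm it.
+ * Connection values and the statement name are placeholders. */
+#include <stdio.h>
+#include <libpq-fe.h>
+
+int main(void)
+{
+    PGconn *conn = PQconnectdb("host=127.0.0.1 port=26000 dbname=postgres user=test password=test_1234");
+    if (PQstatus(conn) != CONNECTION_OK) {
+        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
+        PQfinish(conn);
+        return 1;
+    }
+
+    /* Dispatch the prepare request without waiting for completion. */
+    if (PQsendPrepare(conn, "find_db",
+                      "SELECT datname FROM pg_database WHERE datname = $1", 1, NULL) != 1) {
+        fprintf(stderr, "PQsendPrepare failed: %s", PQerrorMessage(conn));
+        PQfinish(conn);
+        return 1;
+    }
+
+    /* Collect the outcome; PQgetResult must be drained to NULL before the
+     * connection can be used for another command. */
+    PGresult *res;
+    while ((res = PQgetResult(conn)) != NULL) {
+        if (PQresultStatus(res) != PGRES_COMMAND_OK)
+            fprintf(stderr, "prepare failed: %s", PQerrorMessage(conn));
+        PQclear(res);
+    }
+
+    /* "find_db" can now be executed with PQsendQueryPrepared or PQexecPrepared. */
+    PQfinish(conn);
+    return 0;
+}
+```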
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/5-PQsendQueryPrepared.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/5-PQsendQueryPrepared.md
new file mode 100644
index 0000000000000000000000000000000000000000..4ce131cc997b91c09897c106ec204977bf8fbac5
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/5-PQsendQueryPrepared.md
@@ -0,0 +1,50 @@
+---
+title: PQsendQueryPrepared
+summary: PQsendQueryPrepared
+author: Guo Huan
+date: 2021-05-17
+---
+
+# PQsendQueryPrepared
+
+## Function
+
+PQsendQueryPrepared is used to send a request to execute a prepared statement with given parameters, without waiting for the result.
+
+## Prototype
+
+```c
+int PQsendQueryPrepared(PGconn *conn,
+                        const char *stmtName,
+                        int nParams,
+                        const char * const *paramValues,
+                        const int *paramLengths,
+                        const int *paramFormats,
+                        int resultFormat);
+```
+
+## Parameters
+
+**Table 1** PQsendQueryPrepared parameters
+
+| **Keyword**  | **Parameter Description**                                    |
+| :----------- | :----------------------------------------------------------- |
+| conn         | Points to the object pointer that contains the connection information. |
+| stmtName     | Name of **stmt** to be executed.                             |
+| nParams      | Parameter quantity.                                          |
+| paramValues  | Parameter value.                                             |
+| paramLengths | Parameter length.                                            |
+| paramFormats | Parameter format.                                            |
+| resultFormat | Result format.                                               |
+
+## Return Value
+
+**int** indicates the execution result. **1** indicates successful execution and **0** indicates an execution failure. The failure cause is stored in **conn->errorMessage**.
+
+## Precautions
+
+PQsendQueryPrepared is similar to PQsendQueryParams, but the command to be executed is specified by naming a previously-prepared statement, instead of providing a query string. PQsendQueryPrepared parameters are handled in the same way as PQexecPrepared parameters. Like PQexecPrepared, PQsendQueryPrepared cannot work on connections using protocol 2.0.
+
+## Example
+
+For details, see Example.
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/6-PQflush.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/6-PQflush.md
new file mode 100644
index 0000000000000000000000000000000000000000..485875afc9bba8f3479e9a60ebc3eb8c0eb0ed2b
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/3-functions-for-asynchronous-command-processing/6-PQflush.md
@@ -0,0 +1,38 @@
+---
+title: PQflush
+summary: PQflush
+author: Guo Huan
+date: 2021-05-17
+---
+
+# PQflush
+
+## Function
+
+PQflush is used to try to flush any queued output data to the server.
+
+## Prototype
+
+```c
+int PQflush(PGconn *conn);
+```
+
+## Parameter
+
+**Table 1** PQflush parameter
+
+| **Keyword** | **Parameter Description**                                    |
+| :---------- | :----------------------------------------------------------- |
+| conn        | Points to the object pointer that contains the connection information. |
+
+## Return Value
+
+**int** indicates the execution result. If the operation is successful (or the send queue is empty), **0** is returned. If the operation fails, **-1** is returned.
If all data in the send queue fails to be sent, **1** is returned. (This case occurs only when the connection is non-blocking.) The failure cause is stored in **conn->error_message**. + +## Precautions + +Call PQflush after sending any command or data over a non-blocking connection. If **1** is returned, wait for the socket to become read- or write-ready. If the socket becomes write-ready, call PQflush again. If the socket becomes read-ready, call PQconsumeInput and then call PQflush again. Repeat the operation until the value **0** is returned for PQflush. (It is necessary to check for read-ready and drain the input using PQconsumeInput. This is because the server can block trying to send us data, for example, notification messages, and will not read our data until we read it.) Once PQflush returns **0**, wait for the socket to be read-ready and then read the response as described above. + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/4-functions-for-canceling-queries-in-progress/1-PQgetCancel.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/4-functions-for-canceling-queries-in-progress/1-PQgetCancel.md new file mode 100644 index 0000000000000000000000000000000000000000..234b1051b81532f0a6a84885251b9599579d3346 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/4-functions-for-canceling-queries-in-progress/1-PQgetCancel.md @@ -0,0 +1,38 @@ +--- +title: PQgetCancel +summary: PQgetCancel +author: Guo Huan +date: 2021-05-17 +--- + +# PQgetCancel + +## Function + +PQgetCancel is used to create a data structure that contains the information required to cancel a command issued through a specific database connection. + +## Prototype + +```c +PGcancel *PQgetCancel(PGconn *conn); +``` + +## Parameter + +**Table 1** PQgetCancel parameter + +| **Keyword** | **Parameter Description** | +| :---------- | :----------------------------------------------------------- | +| conn | Points to the object pointer that contains the connection information. | + +## Return Value + +**PGcancel** points to the object pointer that contains the cancel information. + +## Precautions + +PQgetCancel creates a PGcancel object for a given PGconn connection object. If the given connection object (**conn**) is NULL or an invalid connection, PQgetCancel will return NULL. The PGcancel object is an opaque structure that cannot be directly accessed by applications. It can be transferred only to PQcancel or PQfreeCancel. + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/4-functions-for-canceling-queries-in-progress/2-PQfreeCancel.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/4-functions-for-canceling-queries-in-progress/2-PQfreeCancel.md new file mode 100644 index 0000000000000000000000000000000000000000..3df29bc62ea1c9870179d967dcb595f3bc19a543 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/4-functions-for-canceling-queries-in-progress/2-PQfreeCancel.md @@ -0,0 +1,34 @@ +--- +title: PQfreeCancel +summary: PQfreeCancel +author: Guo Huan +date: 2021-05-17 +--- + +# PQfreeCancel + +## Function + +PQfreeCancel is used to release the data structure created by PQgetCancel. 
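+
+Before the prototype is given below, the following fragment sketches how PQgetCancel, PQcancel, and PQfreeCancel are used together. It is illustrative only and assumes an existing, valid connection **conn** on which a long-running command was started.
+
+```c
+/* Illustrative fragment: request cancellation of the command currently
+ * running on an existing connection. conn is assumed to be a valid PGconn. */
+#include <stdio.h>
+#include <libpq-fe.h>
+
+void cancel_current_command(PGconn *conn)
+{
+    char errbuf[256];
+
+    PGcancel *cancel = PQgetCancel(conn);
+    if (cancel == NULL) {
+        fprintf(stderr, "PQgetCancel failed: invalid connection\n");
+        return;
+    }
+
+    /* A return value of 1 only means the request was sent; if the server has
+     * already finished the command, nothing is cancelled. */
+    if (PQcancel(cancel, errbuf, (int)sizeof(errbuf)) != 1)
+        fprintf(stderr, "PQcancel failed: %s\n", errbuf);
+
+    PQfreeCancel(cancel);   /* release the data structure created by PQgetCancel */
+}
+```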
+ +## Prototype + +```c +void PQfreeCancel(PGcancel *cancel); +``` + +## Parameter + +**Table 1** PQfreeCancel parameter + +| **Keyword** | **Parameter Description** | +| :---------- | :----------------------------------------------------------- | +| cancel | Points to the object pointer that contains the cancel information. | + +## Precautions + +PQfreeCancel releases a data object previously created by PQgetCancel. + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/4-functions-for-canceling-queries-in-progress/3-PQcancel.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/4-functions-for-canceling-queries-in-progress/3-PQcancel.md new file mode 100644 index 0000000000000000000000000000000000000000..1a71f6b718054f756db1770cfb69807ecd846436 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/4-functions-for-canceling-queries-in-progress/3-PQcancel.md @@ -0,0 +1,41 @@ +--- +title: PQcancel +summary: PQcancel +author: Guo Huan +date: 2021-05-17 +--- + +# PQcancel + +## Function + +PQcancel is used to request the server to abandon processing of the current command. + +## Prototype + +```c +int PQcancel(PGcancel *cancel, char *errbuf, int errbufsize); +``` + +## Parameter + +**Table 1** PQcancel parameters + +| **Keyword** | **Parameter Description** | +| :---------- | :----------------------------------------------------------- | +| cancel | Points to the object pointer that contains the cancel information. | +| errbuf | Buffer for storing error information. | +| errbufsize | Size of the buffer for storing error information. | + +## Return Value + +**int** indicates the execution result. **1** indicates successful execution and **0** indicates an execution failure. The failure cause is stored in **errbuf**. + +## Precautions + +- Successful sending does not guarantee that the request will have any effect. If the cancellation is valid, the current command is terminated early and an error is returned. If the cancellation fails (for example, because the server has processed the command), no result is returned. +- If **errbuf** is a local variable in a signal handler, you can safely call PQcancel from the signal handler. For PQcancel, the PGcancel object is read-only, so it can also be called from a thread that is separate from the thread that is operating the PGconn object. + +## Example + +For details, see Example. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/5-libpq-example.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/5-libpq-example.md new file mode 100644 index 0000000000000000000000000000000000000000..131cd599ec1ba2bbf13181cc95a0ad85dbb2202f --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/5-libpq-example.md @@ -0,0 +1,282 @@ +--- +title: Example +summary: Example +author: Guo Huan +date: 2021-05-17 +--- + +# Example + +## Code for Common Functions + +Example 1: + +```c +/* + * testlibpq.c + */ +#include +#include +#include + +static void +exit_nicely(PGconn *conn) +{ + PQfinish(conn); + exit(1); +} + +int +main(int argc, char **argv) +{ + const char *conninfo; + PGconn *conn; + PGresult *res; + int nFields; + int i,j; + + /* + * This value is used when the user provides the value of the conninfo character string in the command line. 
+ * Otherwise, the environment variables or the default values + * are used for all other connection parameters. + */ + if (argc > 1) + conninfo = argv[1]; + else + conninfo = "dbname=postgres port=42121 host='10.44.133.171' application_name=test connect_timeout=5 sslmode=allow user='test' password='test_1234'"; + + /* Connect to the database. */ + conn = PQconnectdb(conninfo); + + /* Check whether the backend connection has been successfully established. */ + if (PQstatus(conn) != CONNECTION_OK) + { + fprintf(stderr, "Connection to database failed: %s", + PQerrorMessage(conn)); + exit_nicely(conn); + } + + /* + * Since a cursor is used in the test case, a transaction block is required. + * Put all data in one "select * from pg_database" + * PQexec() is too simple and is not recommended. + */ + + /* Start a transaction block. */ + res = PQexec(conn, "BEGIN"); + if (PQresultStatus(res) != PGRES_COMMAND_OK) + { + fprintf(stderr, "BEGIN command failed: %s", PQerrorMessage(conn)); + PQclear(res); + exit_nicely(conn); + } + + /* + * PQclear PGresult should be executed when it is no longer needed, to avoid memory leakage. + */ + PQclear(res); + + /* + * Fetch data from the pg_database system catalog. + */ + res = PQexec(conn, "DECLARE myportal CURSOR FOR select * from pg_database"); + if (PQresultStatus(res) != PGRES_COMMAND_OK) + { + fprintf(stderr, "DECLARE CURSOR failed: %s", PQerrorMessage(conn)); + PQclear(res); + exit_nicely(conn); + } + PQclear(res); + + res = PQexec(conn, "FETCH ALL in myportal"); + if (PQresultStatus(res) != PGRES_TUPLES_OK) + { + fprintf(stderr, "FETCH ALL failed: %s", PQerrorMessage(conn)); + PQclear(res); + exit_nicely(conn); + } + + /* First, print out the attribute name. */ + nFields = PQnfields(res); + for (i = 0; i < nFields; i++) + printf("%-15s", PQfname(res, i)); + printf("\n\n"); + + /* Print lines. */ + for (i = 0; i < PQntuples(res); i++) + { + for (j = 0; j < nFields; j++) + printf("%-15s", PQgetvalue(res, i, j)); + printf("\n"); + } + + PQclear(res); + + /* Close the portal. We do not need to check for errors. */ + res = PQexec(conn, "CLOSE myportal"); + PQclear(res); + + /* End the transaction. */ + res = PQexec(conn, "END"); + PQclear(res); + + /* Close the database connection and clean up the database. */ + PQfinish(conn); + + return 0; +} +``` + +
+ +Example 2: + +```c +/* + * testlibpq2.c + * Test out-of-line parameters and binary I/Os. + * + * Before running this example, run the following command to populate a database: + * + * + * CREATE TABLE test1 (i int4, t text); + * + * INSERT INTO test1 values (2, 'ho there'); + * + * The expected output is as follows: + * + * + * tuple 0: got + * i = (4 bytes) 2 + * t = (8 bytes) 'ho there' + * + */ +#include +#include +#include +#include +#include + +/* for ntohl/htonl */ +#include +#include + +static void +exit_nicely(PGconn *conn) +{ + PQfinish(conn); + exit(1); +} + +/* + * This function is used to print out the query results. The results are in binary format +* and fetched from the table created in the comment above. + */ +static void +show_binary_results(PGresult *res) +{ + int i; + int i_fnum, + t_fnum; + + /* Use PQfnumber to avoid assumptions about field order in the result. */ + i_fnum = PQfnumber(res, "i"); + t_fnum = PQfnumber(res, "t"); + + for (i = 0; i < PQntuples(res); i++) + { + char *iptr; + char *tptr; + int ival; + + /* Obtain the field value. (Ignore the possibility that they may be null). */ + iptr = PQgetvalue(res, i, i_fnum); + tptr = PQgetvalue(res, i, t_fnum); + + /* + * The binary representation of INT4 is the network byte order, + * which is better to be replaced with the local byte order. + */ + ival = ntohl(*((uint32_t *) iptr)); + + /* + * The binary representation of TEXT is text. Since libpq can append a zero byte to it, + * and think of it as a C string. + * + */ + + printf("tuple %d: got\n", i); + printf(" i = (%d bytes) %d\n", + PQgetlength(res, i, i_fnum), ival); + printf(" t = (%d bytes) '%s'\n", + PQgetlength(res, i, t_fnum), tptr); + printf("\n\n"); + } +} + +int +main(int argc, char **argv) +{ + const char *conninfo; + PGconn *conn; + PGresult *res; + const char *paramValues[1]; + int paramLengths[1]; + int paramFormats[1]; + uint32_t binaryIntVal; + + /* + * If the user provides a parameter on the command line, + * The value of this parameter is a conninfo character string. Otherwise, + * Use environment variables or default values. + */ + if (argc > 1) + conninfo = argv[1]; + else + conninfo = "dbname=postgres port=42121 host='10.44.133.171' application_name=test connect_timeout=5 sslmode=allow user='test' password='test_1234'"; + + /* Connect to the database. */ + conn = PQconnectdb(conninfo); + + /* Check whether the connection to the server was successfully established. */ + if (PQstatus(conn) != CONNECTION_OK) + { + fprintf(stderr, "Connection to database failed: %s", + PQerrorMessage(conn)); + exit_nicely(conn); + } + + /* Convert the integer value "2" to the network byte order. */ + binaryIntVal = htonl((uint32_t) 2); + + /* Set the parameter array for PQexecParams. */ + paramValues[0] = (char *) &binaryIntVal; + paramLengths[0] = sizeof(binaryIntVal); + paramFormats[0] = 1; /* Binary */ + + res = PQexecParams(conn, + "SELECT * FROM test1 WHERE i = $1::int4", + 1, /* One parameter */ + NULL, /* Enable the backend to deduce the parameter type. */ + paramValues, + paramLengths, + paramFormats, + 1); /* require binary results. */ + + if (PQresultStatus(res) != PGRES_TUPLES_OK) + { + fprintf(stderr, "SELECT failed: %s", PQerrorMessage(conn)); + PQclear(res); + exit_nicely(conn); + } + + show_binary_results(res); + + PQclear(res); + + /* Close the database connection and clean up the database. 
*/ + PQfinish(conn); + + return 0; +} +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/6-connection-characters.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/6-connection-characters.md new file mode 100644 index 0000000000000000000000000000000000000000..8dae6386d2e085b45644484f66255eafca2960cf --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4-development-based-on-libpq/2-libpq/6-connection-characters.md @@ -0,0 +1,27 @@ +--- +title: Connection Characters +summary: Connection Characters +author: Guo Huan +date: 2021-05-17 +--- + +# Connection Characters + +**Table 1** Connection strings + +| Character String | Description | +| :------------------ | :----------------------------------------------------------- | +| host | Name of the host to connect to. If the host name starts with a slash (/), Unix-domain socket communications instead of TCP/IP communications are used. The value is the directory where the socket file is stored. If **host** is not specified, the default behavior is to connect to the Unix-domain socket in the **/tmp** directory (or the socket directory specified during MogDB installation). On a machine without a Unix-domain socket, the default behavior is to connect to **localhost**. | +| hostaddr | IP address of the host to connect to. The value is in standard IPv4 address format, for example, 172.28.40.9. If your machine supports IPv6, IPv6 address can also be used. If a non-null string is specified, TCP/IP communications are used.
Replacing **host** with **hostaddr** can prevent applications from querying host names, which may be important for applications with time constraints. However, a host name is required for GSSAPI or SSPI authentication methods. Therefore, the following rules are used:
1. If **host** is specified but **hostaddr** is not, a query for the host name will be executed.
2. If **hostaddr** is specified but **host** is not, the value of **hostaddr** is the server network address. If the host name is required by authentication, the connection attempt fails.
3. If both **host** and **hostaddr** are specified, the value of **hostaddr** is the server network address. The value of **host** is ignored unless it is required by authentication, in which case it is used as the host name.
NOTICE:
- If **host** is not the server name in the network address specified by **hostaddr**, the authentication may fail.
- If neither **host** nor **hostaddr** is specified, libpq will use a local Unix-domain socket for connection. If the machine does not have a Unix-domain socket, it will attempt to connect to **localhost**. | +| port | Port number of the host server, or the socket file name extension for Unix-domain connections. | +| user | Name of the user to connect as. By default, the username is the same as the operating system name of the user running the application. | +| dbname | Database name. The default value is the same as the username. | +| password | Password to be used if the server requires password authentication. | +| connect_timeout | Maximum timeout period of the connection, in seconds (in decimal integer string). The value **0** or null indicates infinity. You are not advised to set the connection timeout period to a value less than 2 seconds. | +| client_encoding | Client encoding for the connection. In addition to the values accepted by the corresponding server options, you can use **auto** to determine the correct encoding from the current environment in the client (the LC_CTYPE environment variable in the Unix system). | +| options | Adds command-line options to send to the server at runtime. | +| application_name | Current user identity. | +| keepalives | Whether TCP keepalive is enabled on the client side. The default value is **1**, indicating that the function is enabled. The value **0** indicates that the function is disabled. Ignore this parameter for Unix-domain connections. | +| keepalives_idle | The number of seconds of inactivity after which TCP should send a keepalive message to the server. The value **0** indicates that the default value is used. Ignore this parameter for Unix-domain connections or if keep-alive is disabled. | +| keepalives_interval | The number of seconds after which a TCP keepalive message that is not acknowledged by the server should be retransmitted. The value **0** indicates that the default value is used. Ignore this parameter for Unix-domain connections or if keep-alive is disabled. | +| keepalives_count | Adds command-line options to send to the server at runtime. For example, adding **-c comm_debug_mode=off** to set the value of the GUC parameter **comm_debug_mode** to **off**. | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/1-psycopg-based-development.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/1-psycopg-based-development.md new file mode 100644 index 0000000000000000000000000000000000000000..a144c443330c0224b7923561984827a8cffb2d41 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/1-psycopg-based-development.md @@ -0,0 +1,19 @@ +--- +title: Psycopg-Based Development +summary: Psycopg-Based Development +author: Zhang Cuiping +date: 2021-10-11 +--- + +# Psycopg-Based Development + +Psycopg is a Python API used to execute SQL statements and provides a unified access API for PostgreSQL and GaussDB. Applications can perform data operations based on psycopg. Psycopg2 is an encapsulation of libpq and is implemented using the C language, which is efficient and secure. It provides cursors on both clients and servers, asynchronous communication and notification, and the COPY TO and COPY FROM functions. Psycopg2 supports multiple types of Python out-of-the-box and adapts to PostgreSQL data types. Through the flexible object adaptation system, you can extend and customize the adaptation. 
Psycopg2 is compatible with Unicode and Python 3. + +MogDB supports the psycopg2 feature and allows psycopg2 to be connected in SSL mode. + +**Table 1** Platforms supported by Psycopg + +| OS | Platform | +| :---------- | :------- | +| EulerOS 2.5 | x86_64 | +| EulerOS 2.8 | ARM64 | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/10.1-example-common-operations.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/10.1-example-common-operations.md new file mode 100644 index 0000000000000000000000000000000000000000..42e76fe7977319efe2bd44c5a5dbc6a5d077f311 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/10.1-example-common-operations.md @@ -0,0 +1,243 @@ +--- +title: Common Operations +summary: Common Operations +author: Zhang Cuiping +date: 2021-10-11 +--- + +# Examples: Common Operations + +## Example 1 + +The following illustrates how to develop applications based on MogDB JDBC interfaces. + +``` +//DBtest.java +// This example illustrates the main processes of JDBC-based development, covering database connection creation, table creation, and data insertion. + +import java.sql.Connection; +import java.sql.DriverManager; +import java.sql.PreparedStatement; +import java.sql.SQLException; +import java.sql.Statement; +import java.sql.CallableStatement; + +public class DBTest { + + // Create a database connection. + public static Connection GetConnection(String username, String passwd) { + String driver = "org.postgresql.Driver"; + String sourceURL = "jdbc:postgresql://localhost:8000/postgres"; + Connection conn = null; + try { + // Load the database driver. + Class.forName(driver).newInstance(); + } catch (Exception e) { + e.printStackTrace(); + return null; + } + + try { + // Create a database connection. + conn = DriverManager.getConnection(sourceURL, username, passwd); + System.out.println("Connection succeed!"); + } catch (Exception e) { + e.printStackTrace(); + return null; + } + + return conn; + }; + + // Run a common SQL statement to create table customer_t1. + public static void CreateTable(Connection conn) { + Statement stmt = null; + try { + stmt = conn.createStatement(); + + // Run a common SQL statement. + int rc = stmt + .executeUpdate("CREATE TABLE customer_t1(c_customer_sk INTEGER, c_customer_name VARCHAR(32));"); + + stmt.close(); + } catch (SQLException e) { + if (stmt != null) { + try { + stmt.close(); + } catch (SQLException e1) { + e1.printStackTrace(); + } + } + e.printStackTrace(); + } + } + + // Run a prepared statement to insert data in batches. + public static void BatchInsertData(Connection conn) { + PreparedStatement pst = null; + + try { + // Generate a prepared statement. + pst = conn.prepareStatement("INSERT INTO customer_t1 VALUES (?,?)"); + for (int i = 0; i < 3; i++) { + // Add parameters. + pst.setInt(1, i); + pst.setString(2, "data " + i); + pst.addBatch(); + } + // Perform batch processing. + pst.executeBatch(); + pst.close(); + } catch (SQLException e) { + if (pst != null) { + try { + pst.close(); + } catch (SQLException e1) { + e1.printStackTrace(); + } + } + e.printStackTrace(); + } + } + + // Run a prepared statement to update data. + public static void ExecPreparedSQL(Connection conn) { + PreparedStatement pstmt = null; + try { + pstmt = conn + .prepareStatement("UPDATE customer_t1 SET c_customer_name = ? 
WHERE c_customer_sk = 1"); + pstmt.setString(1, "new Data"); + int rowcount = pstmt.executeUpdate(); + pstmt.close(); + } catch (SQLException e) { + if (pstmt != null) { + try { + pstmt.close(); + } catch (SQLException e1) { + e1.printStackTrace(); + } + } + e.printStackTrace(); + } + } + + +// Run a stored procedure. + public static void ExecCallableSQL(Connection conn) { + CallableStatement cstmt = null; + try { + + cstmt=conn.prepareCall("{? = CALL TESTPROC(?,?,?)}"); + cstmt.setInt(2, 50); + cstmt.setInt(1, 20); + cstmt.setInt(3, 90); + cstmt.registerOutParameter(4, Types.INTEGER); // Register an OUT parameter of the integer type. + cstmt.execute(); + int out = cstmt.getInt(4); // Obtain the OUT parameter. + System.out.println("The CallableStatment TESTPROC returns:"+out); + cstmt.close(); + } catch (SQLException e) { + if (cstmt != null) { + try { + cstmt.close(); + } catch (SQLException e1) { + e1.printStackTrace(); + } + } + e.printStackTrace(); + } + } + + + /** + * Main process. Call static methods one by one. + * @param args + */ + public static void main(String[] args) { + // Create a database connection. + Connection conn = GetConnection("tester", "Password1234"); + + // Create a table. + CreateTable(conn); + + // Insert data in batches. + BatchInsertData(conn); + + // Run a prepared statement to update data. + ExecPreparedSQL(conn); + + // Run a stored procedure. + ExecCallableSQL(conn); + + // Close the connection to the database. + try { + conn.close(); + } catch (SQLException e) { + e.printStackTrace(); + } + + } + +} +``` + +## Example 2: High Client Memory Usage + +In this example, **setFetchSize** adjusts the memory usage of the client by using the database cursor to obtain server data in batches. It may increase network interaction and damage some performance. + +The cursor is valid within a transaction. Therefore, disable automatic commit and then manually commit the code. + +``` +// Disable automatic commit. +conn.setAutoCommit(false); +Statement st = conn.createStatement(); + +// Open the cursor and obtain 50 lines of data each time. +st.setFetchSize(50); +ResultSet rs = st.executeQuery("SELECT * FROM mytable"); +conn.commit(); +while (rs.next()) +{ + System.out.print("a row was returned."); +} +rs.close(); + +// Disable the server cursor. +st.setFetchSize(0); +rs = st.executeQuery("SELECT * FROM mytable"); +conn.commit(); +while (rs.next()) +{ + System.out.print("many rows were returned."); +} +rs.close(); + +// Close the statement. +st.close(); +conn.close(); +``` + +Run the following command to enable automatic commit: + +``` +conn.setAutoCommit(true); +``` + +## Example 3: Example of Common Data Types + +``` +//Example of the bit type. Note that the value range of the bit type is [0,1]. +Statement st = conn.createStatement(); +String sqlstr = "create or replace function fun_1()\n" + + "returns bit AS $$\n" + + "select col_bit from t_bit limit 1;\n" + + "$$\n" + + "LANGUAGE SQL;"; +st.execute(sqlstr); +CallableStatement c = conn.prepareCall("{ ? = call fun_1() }"); +//Register the output type, which is a bit string. +c.registerOutParameter(1, Types.BIT); +c.execute(); +//Use the Boolean type to obtain the result. 
+System.out.println(c.getBoolean(1)); +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/1-psycopg2-connect.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/1-psycopg2-connect.md new file mode 100644 index 0000000000000000000000000000000000000000..d4ca8b008709b0967d1cbf27dddb54e3eb810713 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/1-psycopg2-connect.md @@ -0,0 +1,42 @@ +--- +title: psycopg2.connect() +summary: psycopg2.connect() +author: Zhang Cuiping +date: 2021-10-11 +--- + +# psycopg2.connect() + +## Function + +This method creates a database session and returns a new connection object. + +## Prototype + +``` +conn=psycopg2.connect(dbname="test",user="postgres",password="secret",host="127.0.0.1",port="5432") +``` + +## Parameter + +**Table 1** psycopg2.connect parameters + +| **Keyword** | **Description** | +| :---------- | :----------------------------------------------------------- | +| dbname | Database name. | +| user | Username. | +| password | Password. | +| host | Database IP address. The default type is UNIX socket. | +| port | Connection port number. The default value is **5432**. | +| sslmode | SSL mode, which is used for SSL connection. | +| sslcert | Path of the client certificate, which is used for SSL connection. | +| sslkey | Path of the client key, which is used for SSL connection. | +| sslrootcert | Path of the root certificate, which is used for SSL connection. | + +## Return Value + +Connection object (for connecting to the openGauss DB instance) + +## Examples + +For details, see [Example: Common Operations](10.1-example-common-operations). diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/10-connection-close.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/10-connection-close.md new file mode 100644 index 0000000000000000000000000000000000000000..111a510f406c66d5a652a2910eec9d2c49507d2c --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/10-connection-close.md @@ -0,0 +1,32 @@ +--- +title: connection.close() +summary: connection.close() +author: Zhang Cuiping +date: 2021-10-11 +--- + +# connection.close() + +## Function + +This method closes the database connection. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-caution.gif) **CAUTION:** This method closes the database connection and does not automatically call **commit()**. If you just close the database connection without calling **commit()** first, changes will be lost. + +## Prototype + +``` +connection.close() +``` + +## Parameter + +None + +## Return Value + +None + +## Examples + +For details, see [Example: Common Operations](10.1-example-common-operations). 
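+
+A minimal sketch of the connect-and-close pattern is shown below; the connection parameters (**dbname**, **user**, **password**, **host**, **port**) are placeholders and must be replaced with the settings of your own MogDB instance:
+
+```python
+import psycopg2
+
+# Placeholder connection parameters; replace them with your own instance settings.
+conn = psycopg2.connect(dbname="postgres", user="test", password="test_1234",
+                        host="127.0.0.1", port=26000)
+try:
+    with conn.cursor() as cur:
+        cur.execute("SELECT current_database()")
+        print(cur.fetchone())
+    conn.commit()   # commit() must be called explicitly; close() does not commit.
+finally:
+    conn.close()    # Always release the connection, even if an error occurred.
+```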
\ No newline at end of file diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/2-connection-cursor.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/2-connection-cursor.md new file mode 100644 index 0000000000000000000000000000000000000000..b78fb863759ad6ff3179bbc072a43743ccccdecc --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/2-connection-cursor.md @@ -0,0 +1,37 @@ +--- +title: connection.cursor() +summary: connection.cursor() +author: Zhang Cuiping +date: 2021-10-11 +--- + +# connection.cursor() + +## Function + +This method returns a new cursor object. + +## Prototype + +``` +cursor(name=None, cursor_factory=None, scrollable=None, withhold=False) +``` + +## Parameter + +**Table 1** connection.cursor parameters + +| **Keyword** | **Description** | +| :------------- | :----------------------------------------------------------- | +| name | Cursor name. The default value is **None**. | +| cursor_factory | Creates a non-standard cursor. The default value is **None**. | +| scrollable | Sets the SCROLL option. The default value is **None**. | +| withhold | Sets the HOLD option. The default value is **False**. | + +## Return Value + +Cursor object (used for cusors that are programmed using Python in the entire database) + +## Examples + +For details, see [Example: Common Operations](10.1-example-common-operations). diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/3-cursor-execute-query-vars-list.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/3-cursor-execute-query-vars-list.md new file mode 100644 index 0000000000000000000000000000000000000000..ea04a2585fd45bc41ea812dc3e675a7581c7a097 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/3-cursor-execute-query-vars-list.md @@ -0,0 +1,35 @@ +--- +title: cursor.execute(query,vars_list) +summary: cursor.execute(query,vars_list) +author: Zhang Cuiping +date: 2021-10-11 +--- + +# cursor.execute(query,vars_list) + +## Function + +This method executes the parameterized SQL statements (that is, placeholders instead of SQL literals). The psycopg2 module supports placeholders marked with **%s**. + +## Prototype + +``` +curosr.execute(query,vars_list) +``` + +## Parameters + +**Table 1** curosr.execute parameters + +| **Keyword** | **Description** | +| :---------- | :----------------------------------------------------------- | +| query | SQL statement to be executed. | +| vars_list | Variable list, which matches the **%s** placeholder in the query. | + +## Return Value + +None + +## Examples + +For details, see [Example: Common Operations](10.1-example-common-operations). 
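+
+As an illustrative sketch (the table **customer_t1** and the connection parameters are placeholders), the second argument of **execute()** supplies the values bound to the **%s** placeholders, so SQL strings do not need to be assembled manually:
+
+```python
+import psycopg2
+
+# Placeholder connection parameters; replace them with your own instance settings.
+conn = psycopg2.connect(dbname="postgres", user="test", password="test_1234",
+                        host="127.0.0.1", port=26000)
+cur = conn.cursor()
+
+# psycopg2 quotes and escapes the bound values itself.
+cur.execute("CREATE TABLE IF NOT EXISTS customer_t1 (c_customer_sk integer, c_customer_name varchar(32))")
+cur.execute("INSERT INTO customer_t1 VALUES (%s, %s)", (1, "data 1"))
+cur.execute("SELECT c_customer_name FROM customer_t1 WHERE c_customer_sk = %s", (1,))
+print(cur.fetchone())
+
+conn.commit()
+cur.close()
+conn.close()
+```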
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/4-curosr-executemany-query-vars-list.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/4-curosr-executemany-query-vars-list.md new file mode 100644 index 0000000000000000000000000000000000000000..b4a57a6c16b7b2ff5a044feafb3d853821fab966 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/4-curosr-executemany-query-vars-list.md @@ -0,0 +1,35 @@ +--- +title: curosr.executemany(query,vars_list) +summary: curosr.executemany(query,vars_list) +author: Zhang Cuiping +date: 2021-10-11 +--- + +# curosr.executemany(query,vars_list) + +## Function + +This method executes an SQL command against all parameter sequences or mappings found in the sequence SQL. + +## Prototype + +``` +curosr.executemany(query,vars_list) +``` + +## Parameter + +**Table 1** curosr.executemany parameters + +| **Keyword** | **Description** | +| :---------- | :----------------------------------------------------------- | +| query | SQL statement that you want to execute. | +| vars_list | Variable list, which matches the **%s** placeholder in the query. | + +## Return Value + +None + +## Examples + +For details, see [Example: Common Operations](10.1-example-common-operations). diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/5-connection-commit.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/5-connection-commit.md new file mode 100644 index 0000000000000000000000000000000000000000..1f2cdd5251427edb135a58567a61e24836e69816 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/5-connection-commit.md @@ -0,0 +1,32 @@ +--- +title: connection.commit() +summary: connection.commit() +author: Zhang Cuiping +date: 2021-10-11 +--- + +# connection.commit() + +## Function + +This method commits the currently pending transaction to the database. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-caution.gif) **CAUTION:** By default, Psycopg opens a transaction before executing the first command. If **commit()** is not called, the effect of any data operation will be lost. + +## Prototype + +``` +connection.commit() +``` + +## Parameter + +None + +## Return Value + +None + +## Examples + +For details, see [Example: Common Operations](10.1-example-common-operations). \ No newline at end of file diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/6-connection-rollback.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/6-connection-rollback.md new file mode 100644 index 0000000000000000000000000000000000000000..0284fcd43997b32d6ca7267b4e3ff97be1be6de4 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/6-connection-rollback.md @@ -0,0 +1,32 @@ +--- +title: connection.rollback() +summary: connection.rollback() +author: Zhang Cuiping +date: 2021-10-11 +--- + +# connection.rollback() + +## Function + +This method rolls back the current pending transaction. 
+ +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-caution.gif) **CAUTION:** If you close the connection using **close()** but do not commit the change using **commit()**, an implicit rollback will be performed. + +## Prototype + +``` +connection.rollback() +``` + +## Parameter + +None + +## Return Value + +None + +## Examples + +For details, see [Example: Common Operations](10.1-example-common-operations). \ No newline at end of file diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/7-cursor-fetchone.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/7-cursor-fetchone.md new file mode 100644 index 0000000000000000000000000000000000000000..a9af28472feac88dae831f1459c6e89d236f650e --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/7-cursor-fetchone.md @@ -0,0 +1,30 @@ +--- +title: cursor.fetchone() +summary: cursor.fetchone() +author: Zhang Cuiping +date: 2021-10-11 +--- + +# cursor.fetchone() + +## Function + +This method extracts the next row of the query result set and returns a tuple. + +## Prototype + +``` +cursor.fetchone() +``` + +## Parameter + +None + +## Return Value + +A single tuple is the first result in the result set. If no more data is available, **None** is returned. + +## Examples + +For details, see [Example: Common Operations](10.1-example-common-operations). \ No newline at end of file diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/8-cursor-fetchall.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/8-cursor-fetchall.md new file mode 100644 index 0000000000000000000000000000000000000000..c8949edd5d7f484e89f95ec1dfbdb9e15220b782 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/8-cursor-fetchall.md @@ -0,0 +1,30 @@ +--- +title: cursor.fetchall() +summary: cursor.fetchall() +author: Zhang Cuiping +date: 2021-10-11 +--- + +# cursor.fetchall() + +## Function + +This method obtains all the (remaining) rows of the query result and returns them as a list of tuples. + +## Prototype + +``` +cursor.fetchall() +``` + +## Parameter + +None + +## Return Value + +Tuple list, which contains all results of the result set. An empty list is returned when no rows are available. + +## Examples + +For details, see [Example: Common Operations](10.1-example-common-operations). \ No newline at end of file diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/9-cursor-close.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/9-cursor-close.md new file mode 100644 index 0000000000000000000000000000000000000000..7fa9acd286fd3029c09d5e3c92468e9d3ab6e1c2 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/11-psycopg-api-reference/9-cursor-close.md @@ -0,0 +1,30 @@ +--- +title: cursor.close() +summary: cursor.close() +author: Zhang Cuiping +date: 2021-10-11 +--- + +# cursor.close() + +## Function + +This method closes the cursor of the current connection. + +## Prototype + +``` +cursor.close() +``` + +## Parameter + +None + +## Return Value + +None + +## Examples + +For details, see [Example: Common Operations](10.1-example-common-operations). 
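+
+The following sketch (connection parameters are placeholders) shows how **fetchone()** and **fetchall()** consume the result set of the last **execute()** call, and how the cursor is closed once the results have been processed:
+
+```python
+import psycopg2
+
+# Placeholder connection parameters; replace them with your own instance settings.
+conn = psycopg2.connect(dbname="postgres", user="test", password="test_1234",
+                        host="127.0.0.1", port=26000)
+cur = conn.cursor()
+
+cur.execute("SELECT datname FROM pg_database")
+first = cur.fetchone()   # A single tuple, or None if the result set is empty.
+rest = cur.fetchall()    # A list containing all remaining rows (possibly empty).
+print(first, len(rest))
+
+cur.close()              # Release the cursor once the results are no longer needed.
+conn.close()
+```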
\ No newline at end of file diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/2-psycopg-package.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/2-psycopg-package.md new file mode 100644 index 0000000000000000000000000000000000000000..1a6e0577c32670b9e4ddab9b5082ebb891f6ea73 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/2-psycopg-package.md @@ -0,0 +1,15 @@ +--- +title: Psycopg Package +summary: Psycopg Package +author: Zhang Cuiping +date: 2021-10-11 +--- + +# Psycopg Package + +The psycopg package is obtained from the release package. Its name is **GaussDB-Kernel-VxxxRxxxCxx-OS version number-64bit-Python.tar.gz**. + +After the decompression, the following folders are generated: + +- **psycopg2**: **psycopg2** library file +- **lib**: **lib** library file \ No newline at end of file diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/3.1-development-process.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/3.1-development-process.md new file mode 100644 index 0000000000000000000000000000000000000000..9e2f3d5b39a81a023d1db4536d204d6ce9a0a8b3 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/3.1-development-process.md @@ -0,0 +1,12 @@ +--- +title: Development Process +summary: Development Process +author: Zhang Cuiping +date: 2021-10-11 +--- + +# Development Process + +**Figure 1** Application development process based on psycopg2 + +![application-development-process-based-on-psycopg2](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/development-process-2.png) diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/4-loading-a-driver.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/4-loading-a-driver.md new file mode 100644 index 0000000000000000000000000000000000000000..952882e00dc0e038b500f40914acd2e53a1cb72a --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/4-loading-a-driver.md @@ -0,0 +1,21 @@ +--- +title: Loading a Driver +summary: Loading a Driver +author: Zhang Cuiping +date: 2021-10-11 +--- + +# Loading a Driver + +- Before using the driver, perform the following operations: + + 1. Decompress the driver package of the corresponding version and copy psycopg2 to the **site-packages** folder in the Python installation directory as the **root** user. + 2. Change the **psycopg2** directory permission to **755**. + 3. Add the **psycopg2** directory to the environment variable *$PYTHONPATH* and validate it. + 4. For non-database users, configure the **lib** directory in *LD_LIBRARY_PATH* after decompression. 
+ +- Load a database driver before creating a database connection: + + ```bash + import psycopg2 + ``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/5.1-connecting-to-a-database.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/5.1-connecting-to-a-database.md new file mode 100644 index 0000000000000000000000000000000000000000..556bf4391713b2f183702b5fe62c4dbd64e5252f --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/5.1-connecting-to-a-database.md @@ -0,0 +1,12 @@ +--- +title: Connecting to a Database +summary: Connecting to a Database +author: Zhang Cuiping +date: 2021-10-11 +--- + +# Connecting to a Database + +1. Use the .ini file (the **configparser** package of Python can parse this type of configuration file) to save the configuration information about the database connection. +2. Use the **psycopg2.connect** function to obtain the connection object. +3. Use the connection object to create a cursor object. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/6-executing-sql-statements.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/6-executing-sql-statements.md new file mode 100644 index 0000000000000000000000000000000000000000..78313f70fa3ec3772f63aeeb1417c2436a4a268e --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/6-executing-sql-statements.md @@ -0,0 +1,11 @@ +--- +title: Executing SQL Statements +summary: Executing SQL Statements +author: Zhang Cuiping +date: 2021-10-11 +--- + +# Executing SQL Statements + +1. Construct an operation statement and use **%s** as a placeholder. During execution, psycopg2 will replace the placeholder with the parameter value. You can add the RETURNING clause to obtain the automatically generated column values. +2. The **cursor.execute** method is used to perform operations on one row, and the **cursor.executemany** method is used to perform operations on multiple rows. \ No newline at end of file diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/7-processing-the-result-set.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/7-processing-the-result-set.md new file mode 100644 index 0000000000000000000000000000000000000000..c24096ab1b5dd0318a1bbd8ce2984dc552945fe4 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/7-processing-the-result-set.md @@ -0,0 +1,11 @@ +--- +title: Processing the Result Set +summary: Processing the Result Set +author: Zhang Cuiping +date: 2021-10-11 +--- + +# Processing the Result Set + +1. **cursor.fetchone()**: Fetches the next row in a query result set and returns a sequence. If no data is available, null is returned. +2. **cursor.fetchall()**: Fetches all remaining rows in a query result and returns a list. An empty list is returned when no rows are available. 
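+
+Putting the preceding steps together, the following minimal sketch connects, executes parameterized statements, and processes the result set. The configuration file **database.ini**, its section name, and the table **t_demo** are illustrative assumptions; error handling is omitted for brevity:
+
+```python
+import configparser
+import psycopg2
+
+# Read the connection settings from an .ini file (file and section names are illustrative).
+config = configparser.ConfigParser()
+config.read("database.ini")
+params = dict(config["mogdb"])   # e.g. dbname, user, password, host, port
+
+conn = psycopg2.connect(**params)
+cur = conn.cursor()
+
+# %s placeholders are bound to the supplied values by psycopg2.
+cur.execute("CREATE TABLE IF NOT EXISTS t_demo (id integer, name varchar(32))")
+cur.executemany("INSERT INTO t_demo VALUES (%s, %s)", [(1, "a"), (2, "b"), (3, "c")])
+conn.commit()
+
+cur.execute("SELECT id, name FROM t_demo WHERE id > %s", (1,))
+for row in cur.fetchall():       # Each row is returned as a tuple.
+    print(row)
+
+cur.close()
+conn.close()
+```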
\ No newline at end of file diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/8-closing-the-connection.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/8-closing-the-connection.md new file mode 100644 index 0000000000000000000000000000000000000000..6e0d0c06b4b563815b4ede3fc4691063ac85666d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/8-closing-the-connection.md @@ -0,0 +1,12 @@ +--- +title: Closing the Connection +summary: Closing the Connection +author: Zhang Cuiping +date: 2021-10-11 +--- + +# Closing the Connection + +After you complete required data operations in a database, close the database connection. Call the close method such as **connection.close()** to close the connection. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-caution.gif) **CAUTION:** This method closes the database connection and does not automatically call **commit()**. If you just close the database connection without calling **commit()** first, changes will be lost. \ No newline at end of file diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/9-connecting-to-the-database-using-ssl.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/9-connecting-to-the-database-using-ssl.md new file mode 100644 index 0000000000000000000000000000000000000000..17dffd022da56403d52b2402997109d7afdc900f --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/4.1-psycopg-based-development/9-connecting-to-the-database-using-ssl.md @@ -0,0 +1,30 @@ +--- +title: Connecting to the Database +summary: Connecting to the Database +author: Zhang Cuiping +date: 2021-10-11 +--- + +# Connecting to the Database (Using SSL) + +When you use psycopy2 to connect to the MogDB server, you can enable SSL to encrypt the communication between the client and server. To enable SSL, you must have the server certificate, client certificate, and private key files. For details on how to obtain these files, see related documents and commands of OpenSSL. + +1. Use the .ini file (the **configparser** package of Python can parse this type of configuration file) to save the configuration information about the database connection. +2. Add SSL connection parameters **sslmode**, **sslcert**, **sslkey**, and **sslrootcert** to the connection options. + 1. **sslmode**: [Table 1](#table1.1) + 2. **sslcert**: client certificate path + 3. **sslkey**: client key path + 4. **sslrootcert**: root certificate path +3. Use the **psycopg2.connect** function to obtain the connection object. +4. Use the connection object to create a cursor object. + +**Table 1** sslmode options + +| sslmode | Whether SSL Encryption Is Enabled | Description | +| :---------- | :-------------------------------- | :----------------------------------------------------------- | +| disable | No | SSL connection is not enabled. | +| allow | Possible | If the database server requires SSL connection, SSL connection can be enabled. However, authenticity of the database server will not be verified. | +| prefer | Possible | If the database supports SSL connection, SSL connection is preferred. However, authenticity of the database server will not be verified. | +| require | Yes | SSL connection is required and data is encrypted. However, authenticity of the database server will not be verified. | +| verify-ca | Yes | The SSL connection must be enabled. 
| +| verify-full | Yes | The SSL connection must be enabled, which is not supported by MogDB currently. | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/5-commissioning.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/5-commissioning.md new file mode 100644 index 0000000000000000000000000000000000000000..848fff962eb9fe65080f09da7b53fc17109ce741 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/5-commissioning.md @@ -0,0 +1,40 @@ +--- +title: Commissioning +summary: Commissioning +author: Guo Huan +date: 2021-04-27 +--- + +# Commissioning + +To control the output of log files and better understand the operating status of the database, modify specific configuration parameters in the **postgresql.conf** file in the instance data directory. + +[Table 1](#Configuration parameters) describes the adjustable configuration parameters. + +**Table 1** Configuration parameters + +| Parameter | Description | Value Range | Remarks | +| ------------------------------------------------------------ | ------------------------------------- | ------------------------------------- | ------------------------------------- | +| client_min_messages | Level of messages to be sent to clients. | - DEBUG5
- DEBUG4
- DEBUG3
- DEBUG2
- DEBUG1
- LOG
- NOTICE
WARNING
- ERROR
- FATAL
- PANIC
Default value: NOTICE | Messages of the set level or lower will be sent to clients. The lower the level is, the fewer the messages will be sent. | +| log_min_messages | Level of messages to be recorded in server logs. | - DEBUG5
- DEBUG4
- DEBUG3
- DEBUG2
- DEBUG1
- INFO
- NOTICE
- WARNING
- ERROR
- LOG
- FATAL
- PANIC
Default value: WARNING | Messages of the set level or higher will be recorded in logs. The higher the level is, the fewer the server logs will be recorded. | +| log_min_error_statement | Level of SQL error statements to be recorded in server logs. | - DEBUG5<br>
- DEBUG4
- DEBUG3
- DEBUG2
- DEBUG1
- INFO
- NOTICE
- WARNING
- ERROR
- FATAL
- PANIC
Default value: ERROR。 | SQL error statements of the set level or higher will be recorded in server logs.Only a system administrator is allowed to modify this parameter. | +| log_min_duration_statement | Minimum execution duration of a statement. If the execution duration of a statement is equal to or longer than the set milliseconds, the statement and its duration will be recorded in logs. Enabling this function can help you track the query attempts to be optimized. | INT type
Default value: 30min
Unit: millisecond | Setting this parameter to **-1** disables the function. Only a system administrator is allowed to modify this parameter. | +| log_connections/log_disconnections | Whether to record a server log message when each session is connected or disconnected. | - **on**: The system records a server log message when each session is connected or disconnected.<br>
- **off**: The system does not record a server log message when each session is connected or disconnected.<br>
Default value: off | - | +| log_duration | Whether to record the duration of each executed statement. | - **on**: The system records the duration of each executed statement.
- **off**: The system does not record the duration of each executed statement.
Default value: on | Only a system administrator is allowed to modify this parameter. | +| log_statement | SQL statements to be recorded in logs. | - **none**: The system does not record any SQL statements.
- **ddl**: The system records data definition statements.
- **mod**: The system records data definition statements and data operation statements.
- **all**: The system records all statements.
Default value: none | Only a system administrator is allowed to modify this parameter. | +| log_hostname | Whether to record host names. | - **on**: The system records host names.
- **off**: The system does not record host names.
Default value: off | By default, connection logs only record the IP addresses of connected hosts. With this function, the host names will also be recorded.This parameter affects parameters in **Querying Audit Results**, GS_WLM_SESSION_HISTORY, PG_STAT_ACTIVITY, and **log_line_prefix**. | + +[Table 2](#description) describes the preceding parameter levels. + +**Table 2** Description of log level parameters + +| Level | Description | +| ---------- | ------------------------------------------------------------ | +| DEBUG[1-5] | Provides information that can be used by developers. Level 1 is the lowest level whereas level 5 is the highest level. | +| INFO | Provides information about users' hidden requests, for example, information about the VACUUM VERBOSE process. | +| NOTICE | Provides information that may be important to users, for example, truncations of long identifiers or indexes created as a part of a primary key. | +| WARNING | Provides warning information for users, for example, COMMIT out of transaction blocks. | +| ERROR | Reports an error that causes a command to terminate. | +| LOG | Reports information that administrators may be interested in, for example, the activity levels of check points. | +| FATAL | Reports the reason that causes a session to terminate. | +| PANIC | Reports the reason that causes all sessions to terminate. | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/dev/6-appendix.md b/product/en/docs-mogdb/v3.0/developer-guide/dev/6-appendix.md new file mode 100644 index 0000000000000000000000000000000000000000..137e0bc4f7625aa5f2b37de549a14b438dacd1fa --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/dev/6-appendix.md @@ -0,0 +1,107 @@ +--- +title: Appendices +summary: Appendices +author: Guo Huan +date: 2021-04-27 +--- + +# Appendices + +## **Table 1** PG_STAT_ACTIVITY Columns + +| Name | Type | Description | +| ---------------- | ------------------------ | ------------------------------------------------------------ | +| datid | oid | OID of the database that the user session connects to in the backend | +| datname | name | Name of the database that the user session connects to in the backend | +| pid | bigint | Thread ID of the backend | +| sessionid | bigint | Session ID | +| usesysid | oid | OID of the user logged in to the backend | +| usename | name | Name of the user logged in to the backend | +| application_name | text | Name of the application connected to the backend | +| client_addr | inet | IP address of the client connected to the backend. If this column is null, it indicates either the client is connected via a Unix socket on the server or this is an internal process, such as autovacuum. | +| client_hostname | text | Host name of the connected client, as reported by a reverse DNS lookup of `client_addr`. This column will be non-null only for IP connections and only when **log_hostname** is enabled. | +| client_port | integer | TCP port number that the client uses for communication with this backend (**-1** if a Unix socket is used) | +| backend_start | timestamp with time zone | Time when this process was started, that is, when the client connected to the server | +| xact_start | timestamp with time zone | Time when current transaction was started (null if no transaction is active) If the current query is the first of its transaction, the value of this column is the same as that of the **query_start** column. 
| +| query_start | timestamp with time zone | Time when the currently active query was started, or if **state** is not **active**, when the last query was started | +| state_change | timestamp with time zone | Time when the **state** was last changed | +| waiting | Boolean | Whether the backend is currently waiting on a lock. If yes, the value is **true**. | +| enqueue | text | Unsupported currently | +| state | text | Overall status of this backend. The value must be one of the following:
- **active**: The backend is executing a query.
- **idle**: The backend is waiting for a new client command.
- **idle in transaction**: The backend is in a transaction, but there is no statement being executed in the transaction.
- **idle in transaction (aborted)**: The backend is in a transaction, but a statement in the transaction has failed.<br>
- **fastpath function call**: The backend is executing a fast-path function.
- **disabled**: This state is reported if **track_activities** is disabled in this backend.
NOTE:
Common users can view their own session status only. The state information of other accounts is empty. For example, after user **judy** is connected to the database, the state information of user **joe** and the initial user omm in **pg_stat_activity** is empty.
`SELECT datname, usename, usesysid, state, pid FROM pg_stat_activity;`<br>
`datname | usename | usesysid | state | pid ----+---+----+---+------ postgres | omm | 10 | | 139968752121616 postgres | omm | 10 | | 139968903116560 db_tpcc | judy | 16398 | active | 139968391403280 postgres | omm | 10 | | 139968643069712 postgres | omm | 10 | | 139968680818448 postgres | joe | 16390 | | 139968563377936 (6 rows)` | +| resource_pool | name | Resource pool used by the user | +| query_id | bigint | ID of a query | +| query | text | Text of this backend's most recent query. If **state** is **active**, this column shows the ongoing query. In all other states, it shows the last query that was executed. | +| connection_info | text | A string in JSON format recording the driver type, driver version, driver deployment path, and process owner of the connected database. For details, see **connection_info**. | + +## **Table 2** GS_WLM_SESSION_HISTORY Columns + +| Name | Type | Description | +| ------------------------ | ---------------------- | ---------------------- | +| datid | oid | OID of the database that the backend is connected to | +| dbname | text | Name of the database that the backend is connected to | +| schemaname | text | Schema name | +| nodename | text | Name of the database node where the statement is executed | +| username | text | Username used for connecting to the backend | +| application_name | text | Name of the application connected to the backend | +| client_addr | inet | IP address of the client connected to the backend. If this column is null, it indicates either the client is connected via a Unix socket on the server or this is an internal process, such as autovacuum. | +| client_hostname | text | Host name of the connected client, as reported by a reverse DNS lookup of `client_addr`. This column will be non-null only for IP connections and only when **log_hostname** is enabled. | +| client_port | integer | TCP port number that the client uses for communication with this backend (**-1** if a Unix socket is used) | +| query_band | text | Job type, which is specified by the GUC parameter **query_band**. The default value is a null string. | +| block_time | bigint | Duration that the statement is blocked before being executed, including the statement parsing and optimization duration (unit: ms) | +| start_time | timestamp with time zone | Time when the statement starts to be executed | +| finish_time | timestamp with time zone | Time when the statement execution ends | +| duration | bigint | Execution time of the statement, in ms | +| estimate_total_time | bigint | Estimated execution time of the statement, in ms | +| status | text | Final statement execution status, which can be **finished** (normal) or **aborted** (abnormal). | +| abort_info | text | Exception information displayed if the final statement execution status is **aborted** | +| resource_pool | text | Resource pool used by the user | +| control_group | text | The function is not supported currently. | +| estimate_memory | integer | Estimated memory size of the statement. | +| min_peak_memory | integer | Minimum memory peak of the statement across the database nodes, in MB | +| max_peak_memory | integer | Maximum memory peak of the statement across the database nodes, in MB | +| average_peak_memory | integer | Average memory usage during statement execution (unit: MB) | +| memory_skew_percent | integer | Memory usage skew of the statement among the database nodes | +| spill_info | text | Information about statement spill to the database nodes
- **None**: The statement has not been spilled to disks on the database nodes.
- **All**: The statement has been spilled to disks on the database nodes.
- **[a:b]**: The statement has been spilled to disks on *a* of *b* database nodes. | +| min_spill_size | integer | Minimum spilled data among database nodes when a spill occurs, in MB (default value:**0**) | +| max_spill_size | integer | Maximum spilled data among database nodes when a spill occurs, in MB (default value:**0**) | +| average_spill_size | integer | Average spilled data among database nodes when a spill occurs, in MB (default value:**0**) | +| spill_skew_percent | integer | database node spill skew when a spill occurs | +| min_dn_time | bigint | Minimum execution time of the statement across the database nodes, in ms | +| max_dn_time | bigint | Maximum execution time of the statement across the database nodes, in ms | +| average_dn_time | bigint | Average execution time of the statement across the database nodes, in ms | +| dntime_skew_percent | integer | Execution time skew of the statement among the database nodes | +| min_cpu_time | bigint | Minimum CPU time of the statement across the database nodes, in ms | +| max_cpu_time | bigint | Maximum CPU time of the statement across the database nodes, in ms | +| total_cpu_time | bigint | Total CPU time of the statement across the database nodes, in ms | +| cpu_skew_percent | integer | CPU time skew of the statement among database nodes | +| min_peak_iops | integer | Minimum IOPS peak of the statement across the database nodes. It is counted by ones in a column-store table and by ten thousands in a row-store table. | +| max_peak_iops | integer | Maximum IOPS peak of the statement across the database nodes. It is counted by ones in a column-store table and by ten thousands in a row-store table. | +| average_peak_iops | integer | Average IOPS peak of the statement across the database nodes. It is counted by ones in a column-store table and by ten thousands in a row-store table. | +| iops_skew_percent | integer | I/O skew across database nodes | +| warning | text | Warning. The following warnings are displayed:
- Spill file size large than 256 MB
- Broadcast size large than 100 MB
- Early spill
- Spill times is greater than 3
- Spill on memory adaptive
- Hash table conflict | +| queryid | bigint | Internal query ID used for statement execution | +| query | text | Statement executed | +| query_plan | text | Execution plan of the statement | +| node_group | text | Unsupported currently | +| cpu_top1_node_name | text | Name of the current database node | +| cpu_top2_node_name | text | Unsupported currently | +| cpu_top3_node_name | text | Unsupported currently | +| cpu_top4_node_name | text | Unsupported currently | +| cpu_top5_node_name | text | Unsupported currently | +| mem_top1_node_name | text | Current database name | +| mem_top2_node_name | text | Unsupported currently | +| mem_top3_node_name | text | Unsupported currently | +| mem_top4_node_name | text | Unsupported currently | +| mem_top5_node_name | text | Unsupported currently | +| cpu_top1_value | bigint | CPU usage of the current database node | +| cpu_top2_value | bigint | Unsupported currently | +| cpu_top3_value | bigint | Unsupported currently | +| cpu_top4_value | bigint | Unsupported currently | +| cpu_top5_value | bigint | Unsupported currently | +| mem_top1_value | bigint | Memory usage of the current database node | +| mem_top2_value | bigint | Unsupported currently | +| mem_top3_value | bigint | Unsupported currently | +| mem_top4_value | bigint | Unsupported currently | +| mem_top5_value | bigint | Unsupported currently | +| top_mem_dn | text | Memory usage information of the current database node | +| top_cpu_dn | text | CPU usage information of the current database node | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-design/constraint-design.md b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-design/constraint-design.md new file mode 100644 index 0000000000000000000000000000000000000000..d595b4e68fdd43b1433832386af86d5911445b22 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-design/constraint-design.md @@ -0,0 +1,37 @@ +--- +title: Constraint Design +summary: Constraint Design +author: Guo Huan +date: 2021-10-14 +--- + +# Constraint Design + +## DEFAULT and NULL Constraints + +- [Proposal] If all the column values can be obtained from services, you are not advised to use the **DEFAULT** constraint. Otherwise, unexpected results will be generated during data loading. +- [Proposal] Add **NOT NULL** constraints to columns that never have NULL values. The optimizer automatically optimizes the columns in certain scenarios. +- [Proposal] Explicitly name all constraints excluding **NOT NULL** and **DEFAULT**. + +## Partial Cluster Keys + +A partial cluster key (PCK) is a local clustering technology used for column-store tables. After creating a PCK, you can quickly filter and scan fact tables using min or max sparse indexes in MogDB. Comply with the following rules to create a PCK: + +- [Notice] Only one PCK can be created in a table. A PCK can contain multiple columns, preferably no more than two columns. +- [Proposal] Create a PCK on simple expression filter conditions in a query. Such filter conditions are usually in the form of **col op const**, where **col** specifies a column name, **op** specifies an operator (such as =, >, >=, <=, and <), and **const** specifies a constant. +- [Proposal] If the preceding conditions are met, create a PCK on the column having the most distinct values. + +## Unique Constraints + +- [Notice] Unique constraints can be used in row-store tables and column-store tables. 
+- [Proposal] The constraint name should indicate that it is a unique constraint, for example, **UNIIncluded columns**. + +## Primary Key Constraints + +- [Notice] Primary key constraints can be used in row-store tables and column-store tables. +- [Proposal] The constraint name should indicate that it is a primary key constraint, for example, **PKIncluded columns**. + +## Check Constraints + +- [Notice] Check constraints can be used in row-store tables but not in column-store tables. +- [Proposal] The constraint name should indicate that it is a check constraint, for example, **CKIncluded columns**. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-design/database-and-schema-design.md b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-design/database-and-schema-design.md new file mode 100644 index 0000000000000000000000000000000000000000..c72ba82a19d03e6d95d0146ad58024999e86691f --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-design/database-and-schema-design.md @@ -0,0 +1,27 @@ +--- +title: Database and Schema Design +summary: Database and Schema Design +author: Guo Huan +date: 2021-10-14 +--- + +# Database and Schema Design + +In MogDB, services can be isolated by databases and schemas. Databases share little resources and cannot directly access each other. Connections to and permissions on them are also isolated. Schemas share more resources than databases do. User permissions on schemas and subordinate objects can be controlled using the **GRANT** and **REVOKE** syntax. + +- You are advised to use schemas to isolate services for convenience and resource sharing. +- It is recommended that system administrators create schemas and databases and then assign required permissions to users. + +## Database Design + +- [Rule] Create databases as required by your service. Do not use the default **postgres** database of a database instance. +- [Proposal] Create a maximum of three user-defined databases in a database instance. +- [Proposal] To make your database compatible with most characters, you are advised to use the UTF-8 encoding when creating a database. +- [Notice] When you create a database, exercise caution when you set **ENCODING** and **DBCOMPATIBILITY** configuration items. MogDB supports the A, B and PG compatibility modes, which are compatible with the Oracle syntax, MySQL syntax and PostgreSQL syntax, respectively. The syntax behavior varies according to the compatibility mode. By default, the A compatibility mode is used. +- [Notice] By default, a database owner has all permissions for all objects in the database, including the deletion permission. Exercise caution when deleting a permission. + +## Schema Design + +- [Notice] To let a user access an object in a schema, assign the usage permission and the permissions for the object to the user, unless the user has the **sysadmin** permission or is the schema owner. +- [Notice] To let a user create an object in the schema, grant the create permission for the schema to the user. +- [Notice] By default, a schema owner has all permissions for all objects in the schema, including the deletion permission. Exercise caution when deleting a permission. 
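
The schema permission rules above can be summarized in a short sketch. This is only an illustrative sequence, not a prescribed setup procedure; the user name, schema name, table name, and password are placeholders:

```sql
-- A minimal sketch of the schema-isolation pattern described above; all names are placeholders.
CREATE USER app_user PASSWORD 'Changeme@123';   -- placeholder password
CREATE SCHEMA app_schema;
CREATE TABLE app_schema.app_table(id int, name varchar(64));

-- Usage permission: lets app_user access objects in the schema.
GRANT USAGE ON SCHEMA app_schema TO app_user;
-- Create permission: lets app_user create objects in the schema.
GRANT CREATE ON SCHEMA app_schema TO app_user;
-- Object-level permissions are still required for each object.
GRANT SELECT ON app_schema.app_table TO app_user;
```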
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-design/field-design.md b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-design/field-design.md new file mode 100644 index 0000000000000000000000000000000000000000..1759925cfc57c182b18f59d9815aad1e4c1ecafc --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-design/field-design.md @@ -0,0 +1,40 @@ +--- +title: Field Design +summary: Field Design +author: Guo Huan +date: 2021-10-14 +--- + +# Field Design + +## Selecting a Data Type + +To improve query efficiency, comply with the following rules when designing columns: + +- [Proposal] Use the most efficient data types allowed. + + If all of the following number types provide the required service precision, they are recommended in descending order of priority: integer, floating point, and numeric. + +- [Proposal] In tables that are logically related, columns having the same meaning should use the same data type. + +- [Proposal] For string data, you are advised to use variable-length strings and specify the maximum length. To avoid truncation, ensure that the specified maximum length is greater than the maximum number of characters to be stored. You are not advised to use CHAR(n), BPCHAR(n), NCHAR(n), or CHARACTER(n), unless you know that the string length is fixed. + + For details about string types, see below. + +## Common String Types + +Every column requires a data type suitable for its data characteristics. The following table lists common string types in MogDB. + +**Table 1** Common string types + +| **Name** | **Description** | **Max. Storage Capacity** | +| :------------------- | :----------------------------------------------------------- | :------------------------ | +| CHAR(n) | Fixed-length string, where *n* indicates the stored bytes. If the length of an input string is smaller than *n*, the string is automatically padded to *n* bytes using NULL characters. | 10 MB | +| CHARACTER(n) | Fixed-length string, where *n* indicates the stored bytes. If the length of an input string is smaller than *n*, the string is automatically padded to *n* bytes using NULL characters. | 10 MB | +| NCHAR(n) | Fixed-length string, where *n* indicates the stored bytes. If the length of an input string is smaller than *n*, the string is automatically padded to *n* bytes using NULL characters. | 10 MB | +| BPCHAR(n) | Fixed-length string, where *n* indicates the stored bytes. If the length of an input string is smaller than *n*, the string is automatically padded to *n* bytes using NULL characters. | 10 MB | +| VARCHAR(n) | Variable-length string, where *n* indicates the maximum number of bytes that can be stored. | 10 MB | +| CHARACTER VARYING(n) | Variable-length string, where *n* indicates the maximum number of bytes that can be stored. This data type and VARCHAR(n) are different representations of the same data type. | 10 MB | +| VARCHAR2(n) | Variable-length string, where *n* indicates the maximum number of bytes that can be stored. This data type is added to be compatible with the Oracle database, and its behavior is the same as that of VARCHAR(n). | 10 MB | +| NVARCHAR2(n) | Variable-length string, where *n* indicates the maximum number of bytes that can be stored. | 10 MB | +| TEXT | Variable-length string. Its maximum length is 1 GB minus 8203 bytes. 
| 1 GB minus 8203 bytes | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-design/table-design.md b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-design/table-design.md new file mode 100644 index 0000000000000000000000000000000000000000..c883964861f4d0a56ce02115536e592c0eddaa60 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-design/table-design.md @@ -0,0 +1,57 @@ +--- +title: Table Design +summary: Table Design +author: Guo Huan +date: 2021-10-14 +--- + +# Table Design + +Comply with the following principles to properly design a table: + +- [Notice] Reduce the amount of data to be scanned. You can use the pruning mechanism of a partitioned table. +- [Notice] Minimize random I/Os. By clustering or local clustering, you can sequentially store hot data, converting random I/O to sequential I/O to reduce the cost of I/O scanning. + +## Selecting a Storage Mode + +[Proposal] Selecting a storage model is the first step in defining a table. The storage model mainly depends on the customer's service type. For details, see Table 1. + +**Table 1** Table storage modes and scenarios + +| Storage Type | Application Scenario | +| :----------- | :----------------------------------------------------------- | +| Row store | - Point queries (simple index-based queries that only return a few records).
- Scenarios requiring frequent addition, deletion, and modification. | +| Column store | - Statistical analysis queries (requiring a large number of association and grouping operations).
- Ad hoc queries (using uncertain query conditions and unable to utilize indexes to scan row-store tables). | + +## Selecting a Partitioning Mode + +If a table contains a large amount of data, partition the table based on the following rules: + +- [Proposal] Create partitions on columns that indicate certain ranges, such as dates and regions. +- [Proposal] A partition name should show the data characteristics of a partition. For example, its format can be **Keyword+Range** characteristics. +- [Proposal] Set the upper limit of a partition to **MAXVALUE** to prevent data overflow. + +The example of a partitioned table definition is as follows: + +```sql +CREATE TABLE staffS_p1 +( + staff_ID NUMBER(6) not null, + FIRST_NAME VARCHAR2(20), + LAST_NAME VARCHAR2(25), + EMAIL VARCHAR2(25), + PHONE_NUMBER VARCHAR2(20), + HIRE_DATE DATE, + employment_ID VARCHAR2(10), + SALARY NUMBER(8,2), + COMMISSION_PCT NUMBER(4,2), + MANAGER_ID NUMBER(6), + section_ID NUMBER(4) +) +PARTITION BY RANGE (HIRE_DATE) +( + PARTITION HIRE_19950501 VALUES LESS THAN ('1995-05-01 00:00:00'), + PARTITION HIRE_19950502 VALUES LESS THAN ('1995-05-02 00:00:00'), + PARTITION HIRE_maxvalue VALUES LESS THAN (MAXVALUE) +); +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-design/view-and-joined-table-design.md b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-design/view-and-joined-table-design.md new file mode 100644 index 0000000000000000000000000000000000000000..ea5629a424839598bbc19b85b225528c28c53278 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-design/view-and-joined-table-design.md @@ -0,0 +1,19 @@ +--- +title: View and Joined Table Design +summary: View and Joined Table Design +author: Guo Huan +date: 2021-10-14 +--- + +# View and Joined Table Design + +## View Design + +- [Proposal] Do not nest views unless they have strong dependency on each other. +- [Proposal] Try to avoid collation operations in a view definition. + +## Joined Table Design + +- [Proposal] Minimize joined columns across tables. +- [Proposal] Use the same data type for joined columns. +- [Proposal] The names of joined columns should indicate their relationship. For example, they can use the same name. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-naming-conventions.md b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-naming-conventions.md new file mode 100644 index 0000000000000000000000000000000000000000..1f8e5b2e593267d572b5218bb0caedb2741d7ec4 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/database-object-naming-conventions.md @@ -0,0 +1,31 @@ +--- +title: Database Object Naming Conventions +summary: Database Object Naming Conventions +author: Guo Huan +date: 2021-10-14 +--- + +# Database Object Naming Conventions + +The name of a database object must meet the following requirements: The name of a non-time series table ranges from 1 to 63 characters and that of a time series table ranges from 1 to 53 characters. The name must start with a letter or underscore (_), and can contain letters, digits, underscores (_), dollar signs ($), and number signs (#). + +- [Proposal] Do not use reserved or non-reserved keywords to name database objects. 
+ + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** You can use the select * from pg_get_keywords() query openGauss keyword or view the keyword in [Keywords](2-keywords). + +- [Proposal] Do not use a string enclosed in double quotation marks ("") to define the database object name, unless you need to specify its capitalization. Case sensitivity of database object names makes problem location difficult. + +- [Proposal] Use the same naming format for database objects. + + - In a system undergoing incremental development or service migration, you are advised to comply with its historical naming conventions. + - You are advised to use multiple words separated with underscores (_). + - You are advised to use intelligible names and common acronyms or abbreviations for database objects. Acronyms or abbreviations that are generally understood are recommended. For example, you can use English words or Chinese pinyin indicating actual business terms. The naming format should be consistent within a database instance. + - A variable name must be descriptive and meaningful. It must have a prefix indicating its type. + +- [Proposal] The name of a table object should indicate its main characteristics, for example, whether it is an ordinary, temporary, or unlogged table. + + - An ordinary table name should indicate the business relevant to a dataset. + - Temporary tables are named in the format of **tmp_Suffix**. + - Unlogged tables are named in the format of **ul_Suffix**. + - Foreign tables are named in the format of **f_Suffix**. + - Do not create database objects whose names start with **redis_**. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/overview-of-development-and-design-proposal.md b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/overview-of-development-and-design-proposal.md new file mode 100644 index 0000000000000000000000000000000000000000..9b98deb87e6cb87a2b02359953634464b613bf43 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/overview-of-development-and-design-proposal.md @@ -0,0 +1,15 @@ +--- +title: Overview of Development and Design Proposal +summary: Overview of Development and Design Proposal +author: Guo Huan +date: 2021-10-14 +--- + +# Overview of Development and Design Proposal + +This section describes the design specifications for database modeling and application development. Modeling based on these specifications can better fit the distributed processing architecture of MogDB and output more efficient service SQL code. + +The meaning of "Proposal" and "Notice" in this section is as follows: + +- **Proposal**: Design rules. Services complying with the rules can run efficiently, and those violating the rules may have low performance or logic errors. +- **Notice**: Details requiring attention during service development. This term identifies SQL behavior that complies with SQL standards but users may have misconceptions about, and default behavior that users may be unaware of in a program. 
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/sql-compilation.md b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/sql-compilation.md new file mode 100644 index 0000000000000000000000000000000000000000..5935dd148cca60eaf42f7880532c6e2851df1281 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/sql-compilation.md @@ -0,0 +1,143 @@ +--- +title: SQL Compilation +summary: SQL Compilation +author: Guo Huan +date: 2021-10-14 +--- + +# SQL Compilation + +## DDL + +- [Proposal] In MogDB, you are advised to execute DDL operations, such as creating table or making comments, separately from batch processing jobs to avoid performance deterioration caused by many concurrent transactions. +- [Proposal] Execute data truncation after unlogged tables are used because MogDB cannot ensure the security of unlogged tables in abnormal scenarios. +- [Proposal] Suggestions on the storage mode of temporary and unlogged tables are the same as those on base tables. Create temporary tables in the same storage mode as the base tables to avoid high computing costs caused by hybrid row and column correlation. +- [Proposal] The total length of an index column cannot exceed 50 bytes. Otherwise, the index size will increase greatly, resulting in large storage cost and low index performance. +- [Proposal] Do not delete objects using **DROP…CASCADE**, unless the dependency between objects is specified. Otherwise, the objects may be deleted by mistake. + +## Data Loading and Uninstalling + +- [Proposal] Provide the inserted column list in the insert statement. Example: + + ```sql + INSERT INTO task(name,id,comment) VALUES ('task1','100','100th task'); + ``` + +- [Proposal] After data is imported to the database in batches or the data increment reaches the threshold, you are advised to analyze tables to prevent the execution plan from being degraded due to inaccurate statistics. + +- [Proposal] To clear all data in a table, you are advised to use **TRUNCATE TABLE** instead of **DELETE TABLE**. **DELETE TABLE** is not efficient and cannot release disk space occupied by the deleted data. + +## Type Conversion + +- [Proposal] Convert data types explicitly. If you perform implicit conversion, the result may differ from expected. +- [Proposal] During data query, explicitly specify the data type for constants, and do not attempt to perform any implicit data type conversion. +- [Notice] If **sql_compatibility** is set to **A**, null strings will be automatically converted to **NULL** during data import. If null strings need to be reserved, set **sql_compatibility** to **C**. + +## Query Operation + +- [Proposal] Do not return a large number of result sets to a client except the ETL program. If a large result set is returned, consider modifying your service design. + +- [Proposal] Perform DDL and DML operations encapsulated in transactions. Operations like table truncation, update, deletion, and dropping, cannot be rolled back once committed. You are advised to encapsulate such operations in transactions so that you can roll back the operations if necessary. + +- [Proposal] During query compilation, you are advised to list all columns to be queried and avoid using **SELECT \***. Doing so reduces output lines, improves query performance, and avoids the impact of adding or deleting columns on front-end service compatibility. 
+ +- [Proposal] During table object access, add the schema prefix to the table object to avoid accessing an unexpected table due to schema switchover. + +- [Proposal] The cost of joining more than three tables or views, especially full joins, is difficult to be estimated. You are advised to use the **WITH TABLE AS** statement to create interim tables to improve the readability of SQL statements. + +- [Proposal] Avoid using Cartesian products or full joins. Cartesian products and full joins will result in a sharp expansion of result sets and poor performance. + +- [Notice] Only **IS NULL** and **IS NOT NULL** can be used to determine NULL value comparison results. If any other method is used, **NULL** is returned. For example, **NULL** instead of expected Boolean values is returned for **NULL<>NULL**, **NULL=NULL**, and **NULL<>1**. + +- [Notice] Do not use **count(col)** instead of **count(\*)** to count the total number of records in a table. **count(\*)** counts the NULL value (actual rows) while **count(col)** does not. + +- [Notice] While executing **count(col)**, the number of NULL record rows is counted as 0. While executing **sum(col)**, **NULL** is returned if all records are NULL. If not all the records are NULL, the number of NULL record rows is counted as 0. + +- [Notice] To count multiple columns using **count()**, column names must be enclosed in parentheses. For example, count ((col1, col2, col3)). Note: When multiple columns are used to count the number of NULL record rows, a row is counted even if all the selected columns are NULL. The result is the same as that when **count(\*)** is executed. + +- [Notice] NULL records are not counted when **count(distinct col)** is used to calculate the number of non-NULL columns that are not repeated. + +- [Notice] If all statistical columns are NULL when **count(distinct (col1,col2,…))** is used to count the number of unique values in multiple columns, NULL records are also counted, and the records are considered the same. + +- [Proposal] Use the connection operator || to replace the **concat** function for string connection because the execution plan generated by the **concat** function cannot be pushed down to disks. As a result, the query performance severely deteriorates. + +- [Proposal] Use the following time-related macros to replace the **now** function and obtain the current time because the execution plan generated by the **now** function cannot be pushed down to disks. As a result, the query performance severely deteriorates. + + **Table 1** Time-related macros + + | **Macro Name** | **Description** | **Example** | + | :------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | + | CURRENT_DATE | Obtains the current date, excluding the hour, minute, and second details. | `mogdb=# select CURRENT_DATE; date ----- 2018-02-02 (1 row)` | + | CURRENT_TIME | Obtains the current time, excluding the year, month, and day. | `mogdb=# select CURRENT_TIME; timetz -------- 00:39:34.633938+08 (1 row)` | + | CURRENT_TIMESTAMP(n) | Obtains the current date and time, including year, month, day, hour, minute, and second.
NOTE:
**n** indicates the number of digits after the decimal point in the time string. | `mogdb=# select CURRENT_TIMESTAMP(6); timestamptz ----------- 2018-02-02 00:39:55.231689+08 (1 row)` | + +- [Proposal] Do not use scalar subquery statements. A scalar subquery appears in the output list of a SELECT statement. In the following example, the underlined part is a scalar subquery statement: + + ```sql + SELECT id, (SELECT COUNT(*) FROM films f WHERE f.did = s.id) FROM staffs_p1 s; + ``` + + Scalar subqueries often result in query performance deterioration. During application development, scalar subqueries need to be converted into equivalent table associations based on the service logic. + +- [Proposal] In **WHERE** clauses, the filter conditions should be collated. The condition that few records are selected for reading (the number of filtered records is small) is listed at the beginning. + +- [Proposal] Filter conditions in **WHERE** clauses should comply with unilateral rules, that is, to place the column name on one side of a comparison operator. In this way, the optimizer automatically performs pruning optimization in some scenarios. Filter conditions in a **WHERE** clause will be displayed in **col op expression** format, where **col** indicates a table column, **op** indicates a comparison operator, such as = and >, and **expression** indicates an expression that does not contain a column name. Example: + + ```sql + SELECT id, from_image_id, from_person_id, from_video_id FROM face_data WHERE current_timestamp(6) - time < '1 days'::interval; + ``` + + The modification is as follows: + + ```sql + SELECT id, from_image_id, from_person_id, from_video_id FROM face_data where time > current_timestamp(6) - '1 days'::interval; + ``` + +- [Proposal] Do not perform unnecessary collation operations. Collation requires a large amount of memory and CPU. If service logic permits, **ORDER BY** and **LIMIT** can be combined to reduce resource overheads. By default, MogDB perform collation by ASC & NULL LAST. + +- [Proposal] When the **ORDER BY** clause is used for collation, specify collation modes (ASC or DESC), and use NULL FIRST or NULL LAST for NULL record sorting. + +- [Proposal] Do not rely on only the **LIMIT** clause to return the result set displayed in a specific sequence. Combine **ORDER BY** and **LIMIT** clauses for some specific result sets and use **OFFSET** to skip specific results if necessary. + +- [Proposal] If the service logic is accurate, you are advised to use **UNION ALL** instead of **UNION**. + +- [Proposal] If a filter condition contains only an **OR** expression, convert the **OR** expression to **UNION ALL** to improve performance. SQL statements that use **OR** expressions cannot be optimized, resulting in slow execution. Example: + + ```sql + SELECT * FROM scdc.pub_menu + WHERE (cdp= 300 AND inline=301) OR (cdp= 301 AND inline=302) OR (cdp= 302 ANDinline=301); + ``` + + Convert the statement to the following: + + ```sql + SELECT * FROM scdc.pub_menu + WHERE (cdp= 300 AND inline=301) + union all + SELECT * FROM scdc.pub_menu + WHERE (cdp= 301 AND inline=302) + union all + SELECT * FROM tablename + WHERE (cdp= 302 AND inline=301) + ``` + +- [Proposal] If an **in(val1, val2, val3…)** expression contains a large number of columns, you are advised to replace it with the **in (values (va11), (val2),(val3)…)** statement. The optimizer will automatically convert the **IN** constraint into a non-correlated subquery to improve the query performance. 
+ +- [Proposal] Replace **(not) in** with **(not) exist** when associated columns do not contain **NULL** values. For example, in the following query statement, if the **T1.C1** column does not contain any **NULL** value, add the **NOT NULL** constraint to the **T1.C1** column, and then rewrite the statements. + + ```sql + SELECT * FROM T1 WHERE T1.C1 NOT IN (SELECT T2.C2 FROM T2); + ``` + + Rewrite the statement as follows: + + ```sql + SELECT * FROM T1 WHERE NOT EXISTS (SELECT * FROM T1,T2 WHERE T1.C1=T2.C2); + ``` + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > + > - If the value of the T1.C1 column is not **NOT NULL**, the preceding rewriting cannot be performed. + > - If the **T1.C1** column is the output of a subquery, check whether the output is **NOT NULL** based on the service logic. + +- [Proposal] Use cursors instead of the **LIMIT OFFSET** syntax to perform pagination queries to avoid resource overheads caused by multiple executions. A cursor must be used in a transaction, and you must disable the cursor and commit the transaction once the query is finished. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/tool-interconnection/jdbc-configuration.md b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/tool-interconnection/jdbc-configuration.md new file mode 100644 index 0000000000000000000000000000000000000000..cd91b5d436344dfb0036885b0dc28475a40e3042 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/development-and-design-proposal/tool-interconnection/jdbc-configuration.md @@ -0,0 +1,62 @@ +--- +title: JDBC Configuration +summary: JDBC Configuration +author: Guo Huan +date: 2021-10-14 +--- + +# JDBC Configuration + +Currently, third-party tools related to MogDB are connected through JDBC. This section describes the precautions for configuring the tool. + +## Connection Parameters + +- [Notice] When a third-party tool connects to MogDB through JDBC, JDBC sends a connection request to MogDB. By default, the following configuration parameters are added. For details, see the implementation of the ConnectionFactoryImpl class in the JDBC code. + + ``` + params = { + { "user", user }, + { "database", database }, + { "client_encoding", "UTF8" }, + { "DateStyle", "ISO" }, + { "extra_float_digits", "3" }, + { "TimeZone", createPostgresTimeZone() }, + }; + ``` + + These parameters may cause the JDBC and **gsql** clients to display inconsistent data, for example, date data display mode, floating point precision representation, and timezone. + + If the result is not as expected, you are advised to explicitly set these parameters in the Java connection setting. + +- [Proposal] When connecting to the database through JDBC, ensure that the following three time zones are the same: + + - Time zone of the host where the JDBC client is located + + - Time zone of the host where the MogDB database instance is located. + + - Time zone used during MogDB database instance configuration. + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** For details about how to set the time zone, see "[Setting the Time Zone and Time](3-modifying-os-configuration#setting-the-time-zone-and-time)" in *Installation Guide*. + +## fetchsize + +[Notice] To use **fetchsize** in applications, disable **autocommit**. Enabling the **autocommit** switch makes the **fetchsize** configuration invalid. 
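
A minimal Java sketch of this rule is shown below: **autocommit** is turned off first so that the **fetchsize** setting actually takes effect, and the transaction is committed explicitly. The connection URL, user name, password, and table name are placeholders and should be replaced with values for your own environment:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class FetchSizeExample {
    public static void main(String[] args) throws Exception {
        // The URL, user name, and password are placeholders for illustration only.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://127.0.0.1:26000/postgres", "jdbc_user", "********");
        conn.setAutoCommit(false);          // fetchsize is ignored when autocommit is enabled
        try (Statement stmt = conn.createStatement()) {
            stmt.setFetchSize(1000);        // fetch 1000 rows per round trip instead of the whole result set
            try (ResultSet rs = stmt.executeQuery("SELECT id FROM big_table")) {
                while (rs.next()) {
                    // process each row here
                }
            }
            conn.commit();                  // commit explicitly because autocommit is disabled
        } finally {
            conn.close();
        }
    }
}
```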
+ +## autocommit + +[Proposal] You are advised to enable **autocommit** in the code for connecting to MogDB by the JDBC. If **autocommit** needs to be disabled to improve performance or for other purposes, applications need to ensure their transactions are committed. For example, explicitly commit translations after specifying service SQL statements. Particularly, ensure that all transactions are committed before the client exits. + +## Connection Releasing + +[Proposal] You are advised to use connection pools to limit the number of connections from applications. Do not connect to a database every time you run an SQL statement. + +[Proposal] After an application completes its tasks, disconnect its connection to MogDB to release occupied resources. You are advised to set the session timeout interval in the jobs. + +[Proposal] Reset the session environment before releasing connections to the JDBC connection tool. Otherwise, historical session information may cause object conflicts. + +- If GUC parameters are set in the connection, run **SET SESSION AUTHORIZATION DEFAULT;RESET ALL;** to clear the connection status before you return the connection to the connection pool. +- If a temporary table is used, delete the temporary table before you return the connection to the connection pool. + +## CopyManager + +[Proposal] In the scenario where the ETL tool is not used and real-time data import is required, it is recommended that you use the **CopyManager** interface driven by the MogDB JDBC to import data in batches during application development. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/1-oracle_fdw.md b/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/1-oracle_fdw.md new file mode 100644 index 0000000000000000000000000000000000000000..18d35ec9e1bb03b517824d2064416eb9b2ca3b38 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/1-oracle_fdw.md @@ -0,0 +1,49 @@ +--- +title: oracle_fdw +summary: oracle_fdw +author: Zhang Cuiping +date: 2021-05-17 +--- + +# oracle_fdw + +oracle_fdw is an open-source plug-in. MogDB is developed and adapted based on the open-source oracle_fdw Release 2.2.0. + +To compile and use oracle_fdw, the Oracle development packages must be included in the environment. Therefore, MogDB does not compile oracle_fdw by default. The following describes how to compile and use oracle_fdw. + +## Compiling oracle_fdw + +To compile oracle_fdw, install the Oracle development library and header files from the [Oracle official website](https://www.oracle.com/database/technologies/instant-client/downloads.html). + +Select a proper running environment and version, download **Basic Package** and **SDK Package**, and install them. In addition, **SQLPlus Package** is a client tool of the Oracle server. You can install it as required to connect to the Oracle server for testing. + +After installing the development packages, start oracle_fdw compilation. Add the **-enable-oracle-fdw** option when running the **configure** command. Perform compilation using the common MogDB compilation method. + +After the compilation is complete, the **oracle_fdw.so** file is generated in **lib/postgresql/** in the installation directory. SQL files and control files related to oracle_fdw are stored in **share/postgresql/extension/** in the installation directory. 
+ +If the **-enable-oracle-fdw** option is not added during compilation and installation, compile oracle_fdw again after MogDB is installed, and then manually place the **oracle\_fdw.so** file to **lib/postgresql/** in the installation directory, and place **oracle\_fdw-1.0-1.1.sql**, **oracle\_fdw-1.1.sql**, and **oracle\_fdw.control** to **share/postgresql/extension/** in the installation directory. + +## Using oracle_fdw + +- To use oracle_fdw, install and connect to the Oracle server. +- Load the oracle_fdw extension using **CREATE EXTENSION oracle_fdw;**. +- Create a server object using **CREATE SERVER**. +- Create a user mapping using **CREATE USER MAPPING**. +- Create a foreign table using **CREATE FOREIGN TABLE**. The structure of the foreign table must be the same as that of the Oracle table. The first column in the table on the Oracle server must be unique, for example, **PRIMARY KEY** and **UNIQUE**. +- Perform normal operations on the foreign table, such as **INSERT**, **UPDATE**, **DELETE**, **SELECT**, **EXPLAIN**, **ANALYZE** and **COPY**. +- Drop a foreign table using **DROP FOREIGN TABLE**. +- Drop a user mapping using **DROP USER MAPPING**. +- Drop a server object using **DROP SERVER**. +- Drop an extension using **DROP EXTENSION oracle_fdw;**. + +## Common Issues + +- When a foreign table is created on the MogDB, the table is not created on the Oracle server. You need to use the Oracle client to connect to the Oracle server to create a table. +- The Oracle user used for executing **CREATE USER MAPPING** must have the permission to remotely connect to the Oracle server and perform operations on tables. Before using a foreign table, you can use the Oracle client on the machine where the MogDB server is located and use the corresponding user name and password to check whether the Oracle server can be successfully connected and operations can be performed. +- When **CREATE EXTENSION oracle_fdw;** is executed, the message **libclntsh.so: cannot open shared object file: No such file or directory** is displayed. The reason is that the Oracle development library **libclntsh.so** is not in the related path of the system. You can find the specific path of **libclntsh.so**, and then add the folder where the **libclntsh.so** file is located to **/etc/ld.so.conf**. For example, if the path of **libclntsh.so** is **/usr/lib/oracle/11.2/client64/lib/libclntsh.so.11.1**, add **/usr/lib/oracle/11.2/client64/lib/** to the end of **/etc/ld.so.conf**. Run the **ldconfig** command for the modification to take effect. Note that this operation requires the **root** permission. + +## Precautions + +- **SELECT JOIN** between two Oracle foreign tables cannot be pushed down to the Oracle server for execution. Instead, **SELECT JOIN** is divided into two SQL statements and transferred to the Oracle server for execution. Then the processing result is summarized in the MogDB. +- The **IMPORT FOREIGN SCHEMA** syntax is not supported. +- **CREATE TRIGGER** cannot be executed for foreign tables. 
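
The steps listed in "Using oracle_fdw" above can be combined into a sketch such as the following. It assumes a reachable Oracle server; the connection string, user name, password, and table definition are placeholders, and the **dbserver**, **user**, **password**, **schema**, and **table** options follow the conventions of the open-source oracle_fdw plug-in:

```sql
-- A minimal usage sketch; connection information and object names are placeholders.
CREATE EXTENSION oracle_fdw;

CREATE SERVER ora_srv FOREIGN DATA WRAPPER oracle_fdw
    OPTIONS (dbserver '//192.168.0.100:1521/ORCL');

CREATE USER MAPPING FOR CURRENT_USER SERVER ora_srv
    OPTIONS (user 'scott', password '********');

-- The column definitions must match the table on the Oracle server,
-- and the first column must be unique (for example, a primary key).
CREATE FOREIGN TABLE ora_emp (
    empno   integer,
    ename   varchar(20)
) SERVER ora_srv OPTIONS (schema 'SCOTT', table 'EMP');

SELECT * FROM ora_emp;

DROP FOREIGN TABLE ora_emp;
DROP USER MAPPING FOR CURRENT_USER SERVER ora_srv;
DROP SERVER ora_srv;
DROP EXTENSION oracle_fdw;
```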
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/2-mysql_fdw.md b/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/2-mysql_fdw.md new file mode 100644 index 0000000000000000000000000000000000000000..981b7691d5f1796fb341a0da2d269f51348d1ee9 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/2-mysql_fdw.md @@ -0,0 +1,49 @@ +--- +title: mysql_fdw +summary: mysql_fdw +author: Zhang Cuiping +date: 2021-05-17 +--- + +# mysql_fdw + +mysql_fdw is an open-source plug-in. MogDB is developed and adapted based on the open-source [mysql_fdw Release 2.5.3](https://github.com/EnterpriseDB/mysql_fdw/archive/REL-2_5_3.tar.gz). + +To compile and use mysql_fdw, the MariaDB development packages must be included in the environment. Therefore, MogDB does not compile mysql_fdw by default. The following describes how to compile and use mysql_fdw. + +## Compiling mysql_fdw + +To compile mysql_fdw, install the development library and header file of MariaDB. You are advised to use the official MariaDB repositories. For details about how to select a repository, visit [http://downloads.mariadb.org/mariadb/repositories/](http://downloads.mariadb.org/mariadb/repositories). + +After the repository is configured, run the **yum install MariaDB-devel MariaDB-shared** command to install the related development libraries. In addition, **MariaDB-client** is a client tool of the MariaDB. You can install it as required to connect to the MariaDB for testing. + +After installing the development packages, start mysql_fdw compilation. Add the **-enable-mysql-fdw** option when running the **configure** command. Perform compilation using the common MogDB compilation method. + +After the compilation is complete, the **mysql_fdw.so** file is generated in **lib/postgresql/** in the installation directory. SQL files and control files related to mysql_fdw are stored in **share/postgresql/extension/** in the installation directory. + +If the **-enable-mysql-fdw** option is not added during compilation and installation, compile mysql_fdw again after MogDB is installed, and then manually place the **mysql\_fdw.so** file to **lib/postgresql/** in the installation directory, and place **mysql\_fdw-1.0-1.1.sql**, **mysql\_fdw-1.1.sql**, and **mysql\_fdw.control** to **share/postgresql/extension/** in the installation directory. + +## Using mysql_fdw + +- To use mysql_fdw, install and connect to MariaDB or MySQL server. +- Load the mysql_fdw extension using **CREATE EXTENSION mysql_fdw;**. +- Create a server object using **CREATE SERVER**. +- Create a user mapping using **CREATE USER MAPPING**. +- Create a foreign table using **CREATE FOREIGN TABLE**. The structure of the foreign table must be the same as that of the MySQL or MariaDB table. The first column in the table on the MySQL or MariaDB must be unique, for example, **PRIMARY KEY** and **UNIQUE**. +- Perform normal operations on the foreign table, such as **INSERT**, **UPDATE**, **DELETE**, **SELECT**, **EXPLAIN**, **ANALYZE** and **COPY**. +- Drop a foreign table using **DROP FOREIGN TABLE**. +- Drop a user mapping using **DROP USER MAPPING**. +- Drop a server object using **DROP SERVER**. +- Drop an extension using **DROP EXTENSION mysql_fdw;**. + +## Common Issues + +- When a foreign table is created on the MogDB, the table is not created on the MariaDB or MySQL server. You need to use the MariaDB or MySQL server client to connect to the MariaDB or MySQL server to create a table. 
+- The MariaDB or MySQL server user used for creating **USER MAPPING** must have the permission to remotely connect to the MariaDB or MySQL server and perform operations on tables. Before using a foreign table, you can use the MariaDB or MySQL server client on the machine where the MogDB server is located and use the corresponding user name and password to check whether the MariaDB or MySQL server can be successfully connected and operations can be performed. +- The **Can't initialize character set SQL_ASCII (path: compiled_in)** error occurs when the DML operation is performed on a foreign table. MariaDB does not support the **SQL_ASCII** encoding format. Currently, this problem can be resolved only by modifying the encoding format of the MogDB database. Change the database encoding format to **update pg_database set encoding = pg_char_to_encoding('UTF-8') where datname = 'postgres';**. Set **datname** based on the actual requirements. After the encoding format is changed, start a gsql session again so that mysql_fdw can use the updated parameters. You can also use **-locale=LOCALE** when running **gs_initdb** to set the default encoding format to non-SQL_ASCII. + +## Precautions + +- **SELECT JOIN** between two MySQL foreign tables cannot be pushed down to the MariaDB or MySQL server for execution. Instead, **SELECT JOIN** is divided into two SQL statements and transferred to the MariaDB or MySQL server for execution. Then the processing result is summarized in the MogDB. +- The **IMPORT FOREIGN SCHEMA** syntax is not supported. +- **CREATE TRIGGER** cannot be executed for foreign tables. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/3-postgres_fdw.md b/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/3-postgres_fdw.md new file mode 100644 index 0000000000000000000000000000000000000000..50864c9e199e3024164a19597cd6f9f662f09489 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/3-postgres_fdw.md @@ -0,0 +1,37 @@ +--- +title: postgres_fdw +summary: postgres_fdw +author: Zhang Cuiping +date: 2021-05-17 +--- + +# postgres_fdw + +postgres_fdw is an open-source plug-in. Its code is released with the PostgreSQL source code. MogDB is developed and adapted based on the open-source postgres_fdw source code () in PostgreSQL 9.4.26. + +The postgres_fdw plug-in is involved in compilation by default. After installing MogDB using the installation package, you can directly use postgres_fdw without performing other operations. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** Currently, postgres_fdw supports only connection between MogDB databases. + +## Using postgres_fdw + +- Load the postgres_fdw extension using **CREATE EXTENSION postgres_fdw;**. +- Create a server object using **CREATE SERVER**. +- Create a user mapping using **CREATE USER MAPPING**. +- Create a foreign table using **CREATE FOREIGN TABLE**. The structure of the foreign table must be the same as that of the remote MogDB table. +- Perform normal operations on the foreign table, such as **INSERT**, **UPDATE**, **DELETE**, **SELECT**, **EXPLAIN**, **ANALYZE** and **COPY**. +- Drop a foreign table using **DROP FOREIGN TABLE**. +- Drop a user mapping using **DROP USER MAPPING**. +- Drop a server object using **DROP SERVER**. +- Drop an extension using **DROP EXTENSION postgres_fdw;**. + +## Common Issues + +- When a foreign table is created on the MogDB, the table is not created on the remote MogDB database. 
You need to use the Oracle client to connect to the remote MogDB database to create a table. +- The MogDB user used for executing **CREATE USER MAPPING** must have the permission to remotely connect to the MogDB database and perform operations on tables. Before using a foreign table, you can use the gsql client on the local machine and use the corresponding user name and password to check whether the remote MogDB database can be successfully connected and operations can be performed. + +## Precautions + +- **SELECT JOIN** between two postgres_fdw foreign tables cannot be pushed down to the remote MogDB database for execution. Instead, **SELECT JOIN** is divided into two SQL statements and transferred to the remote MogDB database for execution. Then the processing result is summarized locally. +- The **IMPORT FOREIGN SCHEMA** syntax is not supported. +- **CREATE TRIGGER** cannot be executed for foreign tables. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/dblink.md b/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/dblink.md new file mode 100644 index 0000000000000000000000000000000000000000..9206f7629751653ee8a5a9b1f8f7b8cdd594e498 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/dblink.md @@ -0,0 +1,80 @@ +--- +title: dblink +summary: dblink +author: Guo Huan +date: 2021-11-22 +--- + +# dblink + +dblink is a tool that can connect to other MogDB databases in an MogDB database session. The connection parameters supported by dblink are the same as those supported by libpq. For details, see [Connection Characters](6-connection-characters). By default, MogDB does not compile dblink. The following describes how to compile and use dblink. + +## Compiling dblink + +Currently, the source code of dblink is stored in the [contrib/dblink](https://gitee.com/opengauss/openGauss-server/tree/master/contrib/dblink) directory. After the MogDB database is compiled and installed, if you need to use the dblink, go to the preceding directory and run the following command to compile and install the dblink: + +```bash +make +make install +``` + +## Common dblink Functions + +- Load the dblink extension. + + ```sql + CREATE EXTENSION dblink; + ``` + +- Open a persistent connection to a remote database. + + ```sql + SELECT dblink_connect(text connstr); + ``` + +- Close a persistent connection to a remote database. + + ```sql + SELECT dblink_disconnect(); + ``` + +- Query data in a remote database. + + ```sql + SELECT * FROM dblink(text connstr, text sql); + ``` + +- Execute commands in a remote database. + + ```sql + SELECT dblink_exec(text connstr, text sql); + ``` + +- Return the names of all opened dblinks. + + ```sql + SELECT dblink_get_connections(); + ``` + +- Send an asynchronous query to a remote database. + + ```sql + SELECT dblink_send_query(text connname, text sql); + ``` + +- Check whether the connection is busy with an asynchronous query. + + ```sql + SELECT dblink_is_busy(text connname); + ``` + +- Delete the extension. + + ```sql + DROP EXTENSION dblink; + ``` + +## Precautions + +- Currently, dblink allows only the MogDB database to access another MogDB database and does not allow the MogDB database to access a PostgreSQL database. +- Currently, dblink does not support the thread pool mode. 
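
As an illustration of the functions listed in "Common dblink Functions", a typical one-off query session might look like the following sketch. The connection string values and the table **t1** are placeholders for a reachable remote MogDB instance; note that the column definition list after **AS** must match the remote result set:

```sql
-- A minimal usage sketch; host, port, dbname, user, and password are placeholders.
CREATE EXTENSION dblink;

-- Query the remote database with a one-off connection string.
SELECT * FROM dblink('host=192.168.0.2 port=26000 dbname=postgres user=remote_user password=********',
                     'SELECT c1, c2 FROM t1') AS t(c1 int, c2 int);

-- Execute a command on the remote database.
SELECT dblink_exec('host=192.168.0.2 port=26000 dbname=postgres user=remote_user password=********',
                   'INSERT INTO t1 VALUES (1, 1)');

DROP EXTENSION dblink;
```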
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/fdw-introduction.md b/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/fdw-introduction.md new file mode 100644 index 0000000000000000000000000000000000000000..d1d655fbafa870f9a6b002f80a28d0d3a95bce1d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/fdw-introduction.md @@ -0,0 +1,10 @@ +--- +title: Introduction +summary: Introduction +author: Zhang Cuiping +date: 2021-05-17 +--- + +# Introduction + +The foreign data wrapper (FDW) of the MogDB can implement cross-database operations between MogDB databases and remote databases. Currently, the following remote servers are supported: Oracle, MySQL(MariaDB), PostgreSQL/openGauss/MogDB(postgres_fdw), file_fdw, dblink. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/file_fdw.md b/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/file_fdw.md new file mode 100644 index 0000000000000000000000000000000000000000..b85647201c5843241e64e57ffd28b5d1d9e7f03c --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/foreign-data-wrapper/file_fdw.md @@ -0,0 +1,81 @@ +--- +title: file_fdw +summary: file_fdw +author: Guo Huan +date: 2021-10-19 +--- + +# file_fdw + +The file_fdw module provides the external data wrapper file_fdw, which can be used to access data files in the file system of a server. The format of the data files must be readable by the **COPY FROM** command. For details, see [COPY](41-COPY). file_fdw is only used to access readable data files, but cannot write data to the data files. + +By default, the file_fdw is compiled in MogDB. During database initialization, the plug-in is created in the **pg_catalog** schema. + +When you create a foreign table using file_fdw, you can add the following options: + +- filename + + File to be read. This parameter is mandatory and must be an absolute path. + +- format + + File format of the remote server, which is the same as the **FORMAT** option in the **COPY** statement. The value can be **text**, **csv**, **binary**, or **fixed**. + +- header + + Whether the specified file has a header, which is the same as the **HEADER** option of the **COPY** statement. + +- delimiter + + File delimiter, which is the same as the **DELIMITER** option of the **COPY** statement. + +- quote + + Quote character of a file, which is the same as the **QUOTE** option of the **COPY** statement. + +- escape + + Escape character of a file, which is the same as the **ESCAPE** option of the **COPY** statement. + +- null + + Null string of a file, which is the same as the **NULL** option of the **COPY** statement. + +- encoding + + Encoding of a file, which is the same as the **ENCODING** option of the **COPY** statement. + +- force_not_null + + File-level null option, which is a Boolean option. If it is true, the value of the declared field cannot be an empty string. This option is the same as the **FORCE_NOT_NULL** option of the **COPY** statement. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> +> - file_fdw does not support the **OIDS** and **FORCE\_QUOTE** options of the **COPY** statement. +> - These options can only be declared for a foreign table or the columns of the foreign table, not for the file_fdw itself, nor for the server or user mapping that uses file_fdw. +> - To modify table-level options, you must obtain the system administrator role permissions. 
For security reasons, only the system administrator can determine the files to be read. +> - For an external table that uses file_fdw, **EXPLAIN** displays the name and size (in bytes) of the file to be read. If the keyword **COSTS OFF** is specified, the file size is not displayed. + +## Using file_fdw + +- To create a server object, use **CREATE SERVER**. + +- To create a user mapping, use **CREATE USER MAPPING**. + +- To create a foreign table, use **CREATE FOREIGN TABLE**. + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > + > - The structure of the foreign table must be consistent with the data in the specified file. + > - When a foreign table is queried, no write operation is allowed. + +- To drop a foreign table, use **DROP FOREIGN TABLE**. + +- To drop a user mapping, use **DROP USER MAPPING**. + +- To drop a server object, use **DROP SERVER**. + +## Precautions + +- To use file_fdw, you need to specify the file to be read. Prepare the file and grant the read permission on the file for the database to access the file. +- **DROP EXTENSION file_fdw** is not supported. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/logical-replication/1-logical-decoding.md b/product/en/docs-mogdb/v3.0/developer-guide/logical-replication/1-logical-decoding.md new file mode 100644 index 0000000000000000000000000000000000000000..66f24ddaf79e33b3c232365eaed8e65731e7880d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/logical-replication/1-logical-decoding.md @@ -0,0 +1,53 @@ +--- +title: Overview +summary: Overview +author: Zhang Cuiping +date: 2021-05-10 +--- + +# Overview + +## Function + +The data replication capabilities supported by MogDB are as follows: + +Data is periodically synchronized to heterogeneous databases (such as Oracle databases) using a data migration tool. Real-time data replication is not supported. Therefore, the requirements for real-time data synchronization to heterogeneous databases are not satisfied. + +MogDB provides the logical decoding function to generate logical logs by decoding Xlogs. A target database parses logical logs to replicate data in real time. For details, see Figure 1. Logical replication reduces the restrictions on target databases, allowing for data synchronization between heterogeneous databases and homogeneous databases with different forms. It allows data to be read and written during data synchronization on a target database, reducing the data synchronization latency. + +**Figure 1** Logical replication + +![image-20210512181021060](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/logical-decoding-2.png) + +Logical replication consists of logical decoding and data replication. Logical decoding outputs logical logs by transaction. The database service or middleware parses the logical logs to implement data replication. Currently, MogDB supports only logical decoding. Therefore, this section involves only logical decoding. + +Logical decoding provides basic transaction decoding capabilities for logical replication. MogDB uses SQL functions for logical decoding. This method features easy function calling, requires no tools to obtain logical logs, and provides specific interfaces for interconnecting with external replay tools, saving the need of additional adaptation. + +Logical logs are output only after transactions are committed because they use transactions as the unit and logical decoding is driven by users. 
Therefore, to prevent Xlogs from being reclaimed by the system when transactions start and to prevent required transaction information from being reclaimed by **VACUUM**, MogDB introduces logical replication slots to block Xlog reclaiming.

A logical replication slot represents a stream of changes that can be replayed in other databases in the order in which they were generated in the original database. Each owner of logical logs maintains one logical replication slot.

## Precautions

- DDL statement decoding is not supported. When a specific DDL statement (for example, to truncate an ordinary table or exchange a partitioned table) is executed, decoded data may be lost.
- Decoding for column-store data and data page replication is not supported.
- Logical decoding is not supported on the cascaded standby node.
- After a DDL statement (for example, **ALTER TABLE**) is executed, the physical logs that are not decoded before the DDL statement execution may be lost.
- The size of a single tuple cannot exceed 1 GB, and decoded data may be larger than the inserted data. Therefore, it is recommended that the size of a single tuple be less than or equal to 500 MB.
- MogDB supports the following data types for decoding: **INTEGER**, **BIGINT**, **SMALLINT**, **TINYINT**, **SERIAL**, **SMALLSERIAL**, **BIGSERIAL**, **FLOAT**, **DOUBLE PRECISION**, **DATE**, **TIME[WITHOUT TIME ZONE]**, **TIMESTAMP[WITHOUT TIME ZONE]**, **CHAR(***n***)**, **VARCHAR(***n***)**, and **TEXT**.
- Currently, SSL connections are not supported by default. If SSL connections are required, set the GUC parameter **ssl** to **on**.
- The logical replication slot name must contain fewer than 64 characters and can contain only lowercase letters, digits, and underscores (_).
- Currently, logical replication does not support the MOT feature.
- After the database where a logical replication slot resides is deleted, the replication slot becomes unavailable and needs to be deleted manually.
- Only the UTF-8 character set is supported.
- To decode multiple databases, you need to create a stream replication slot in each database and start decoding. Logs need to be scanned for each database to be decoded.
- Forcible startup is not supported. After a forcible startup, you need to export all data again.
- During decoding on the standby node, the amount of decoded data may increase during switchover and failover, and the extra data needs to be filtered out manually. When the quorum protocol is used, switchover and failover should be performed on the standby node that is to be promoted to primary, and logs must be synchronized from the primary node to the standby node.
- The same replication slot cannot be used for decoding on the primary node and a standby node, or on different standby nodes, at the same time. Otherwise, data inconsistency occurs.
- Replication slots can only be created or deleted on hosts.
- After the database is restarted due to a fault, or the logical replication process is restarted, duplicate decoded data is generated and needs to be filtered out.
- If the computer kernel is faulty, garbled characters may be displayed during decoding and need to be filtered out manually or automatically.
- Currently, logical decoding on the standby node does not support the ultimate RTO feature.
- Ensure that no long transaction is running when the logical replication slot is created. Otherwise, the creation of the logical replication slot will be blocked.
+- Interval partitioned tables cannot be replicated. +- After a DDL statement is executed in a transaction, the DDL statement and subsequent statements are not decoded. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/logical-replication/2-logical-decoding-by-sql-function-interfaces.md b/product/en/docs-mogdb/v3.0/developer-guide/logical-replication/2-logical-decoding-by-sql-function-interfaces.md new file mode 100644 index 0000000000000000000000000000000000000000..f02c7c08f98b9706bf07248d63c26cd4e9d4edbc --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/logical-replication/2-logical-decoding-by-sql-function-interfaces.md @@ -0,0 +1,83 @@ +--- +title: Logical Decoding by SQL Function Interfaces +summary: Logical Decoding by SQL Function Interfaces +author: Zhang Cuiping +date: 2021-05-10 +--- + +# Logical Decoding by SQL Function Interfaces + +In MogDB, you can call SQL functions to create, delete, and push logical replication slots, as well as obtain decoded transaction logs. + +## Prerequisites + +- Currently, logical logs are extracted from host nodes. Since SSL connections are disabled by default, to perform logical replication, set the GUC parameter **ssl** to **on** on host nodes. + + > **NOTE:** For security purposes, ensure that SSL connections are enabled. + +- The GUC parameter **wal_level** is set to **logical**. + +- The GUC parameter max_replication_slots is set to a value greater than the number of physical replication slots and logical replication slots required by each node. + + Physical replication slots provide an automatic method to ensure that Xlogs are not removed from a primary node before they are received by all the standby nodes and secondary nodes. That is, physical replication slots are used to support HA clusters. The number of physical replication slots required by a cluster is equal to the ratio of standby and secondary nodes to the primary node. For example, if an HA cluster has 1 primary node, 1 standby node, and 1 secondary node, the number of required physical replication slots will be 2. If an HA cluster has 1 primary node and 3 standby nodes, the number of required physical replication slots will be 3. + + Plan the number of logical replication slots as follows: + + - A logical replication slot can carry changes of only one database for decoding. If multiple databases are involved, create multiple logical replication slots. + - If logical replication is needed by multiple target databases, create multiple logical replication slots in the source database. Each logical replication slot corresponds to one logical replication link. + +- Only database administrators and users with the **REPLICATION** permission can perform operations in this scenario. + +## Procedure + +1. Log in to the primary node of the MogDB cluster as the cluster installation user. + +2. Run the following command to connect to the default database **postgres**: + + ``` + gsql -d postgres -p 16000 -r + ``` + + In this command, **16000** is the database port number. It can be replaced by an actual port number. + +3. Create a logical replication slot named **slot1**. + + ``` + mogdb=# SELECT * FROM pg_create_logical_replication_slot('slot1', 'mppdb_decoding'); + slotname | xlog_position + ----------+--------------- + slot1 | 0/601C150 + (1 row) + ``` + +4. Create a table **t** in the database and insert data into it. + + ``` + mogdb=# CREATE TABLE t(a int PRIMARY KEY, b int); + mogdb=# INSERT INTO t VALUES(3,3); + ``` + +5. Read the decoding result of **slot1**. 
The maximum number of decoded records is 4096.
+
+   ```
+   mogdb=# SELECT * FROM pg_logical_slot_peek_changes('slot1', NULL, 4096);
+    location  |   xid   | data
+   -----------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------
+   -------------------------------------------
+    0/601C188 | 1010023 | BEGIN 1010023
+    0/601ED60 | 1010023 | COMMIT 1010023 CSN 1010022
+    0/601ED60 | 1010024 | BEGIN 1010024
+    0/601ED60 | 1010024 | {"table_name":"public.t","op_type":"INSERT","columns_name":["a","b"],"columns_type":["integer","integer"],"columns_val":["3","3"],"old_keys_name":[],"old_keys_type":[],"old_keys_val":[]}
+    0/601EED8 | 1010024 | COMMIT 1010024 CSN 1010023
+   (5 rows)
+   ```
+
+6. Delete the logical replication slot **slot1**.
+
+   ```
+   mogdb=# SELECT * FROM pg_drop_replication_slot('slot1');
+    pg_drop_replication_slot
+   --------------------------
+
+   (1 row)
+   ```
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/1-materialized-view-overview.md b/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/1-materialized-view-overview.md
new file mode 100644
index 0000000000000000000000000000000000000000..b5ddb309b5b48e0c4fc630fab240b65e1230c17a
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/1-materialized-view-overview.md
@@ -0,0 +1,10 @@
+---
+title: Materialized View Overview
+summary: Materialized View Overview
+author: Guo Huan
+date: 2021-05-21
+---
+
+# Materialized View Overview
+
+A materialized view is a special physical table, in contrast to a common view. A common view is a virtual table and has many application limitations: any query on the view is actually converted into a query on the SQL statement that defines it, so performance is not actually improved. A materialized view stores the result set of its defining SQL statement and is therefore used to cache query results.
\ No newline at end of file
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/2-full-materialized-view/1-full-materialized-view-overview.md b/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/2-full-materialized-view/1-full-materialized-view-overview.md
new file mode 100644
index 0000000000000000000000000000000000000000..721d69a42595935535c885dd0a4e700b934aca3a
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/2-full-materialized-view/1-full-materialized-view-overview.md
@@ -0,0 +1,10 @@
+---
+title: Overview
+summary: Overview
+author: Liuxu
+date: 2021-05-21
+---
+
+# Overview
+
+Full materialized views can be fully refreshed only. The syntax for creating a full materialized view is similar to the CREATE TABLE AS syntax.
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/2-full-materialized-view/2-full-materialized-view-usage.md b/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/2-full-materialized-view/2-full-materialized-view-usage.md
new file mode 100644
index 0000000000000000000000000000000000000000..80e76ef110c0f773e274b73aa839e50c74a4dd13
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/2-full-materialized-view/2-full-materialized-view-usage.md
@@ -0,0 +1,69 @@
+---
+title: Usage
+summary: Usage
+author: Guo Huan
+date: 2021-05-21
+---
+
+# Usage
+
+## Syntax
+
+- Create a full materialized view.
+
+  ```
+  CREATE MATERIALIZED VIEW [ view_name ] AS { query_block };
+  ```
+
+- Fully refresh a materialized view.
+ + ``` + REFRESH MATERIALIZED VIEW [ view_name ]; + ``` + +- Delete a materialized view. + + ``` + DROP MATERIALIZED VIEW [ view_name ]; + ``` + +- Query a materialized view. + + ``` + SELECT * FROM [ view_name ]; + ``` + +## Examples + +``` +-- Prepare data. +mogdb=# CREATE TABLE t1(c1 int, c2 int); +mogdb=# INSERT INTO t1 VALUES(1, 1); +mogdb=# INSERT INTO t1 VALUES(2, 2); + +-- Create a full materialized view. +mogdb=# CREATE MATERIALIZED VIEW mv AS select count(*) from t1; + +-- Query the materialized view result. +mogdb=# SELECT * FROM mv; + count +------- + 2 +(1 row) + +-- Insert data into the base table in the materialized view. +mogdb=# INSERT INTO t1 VALUES(3, 3); + +-- Fully refresh a full materialized view. +mogdb=# REFRESH MATERIALIZED VIEW mv; + +-- Query the materialized view result. +mogdb=# SELECT * FROM mv; + count +------- + 3 +(1 row) + +-- Delete a materialized view. +mogdb=# DROP MATERIALIZED VIEW mv; +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/2-full-materialized-view/3-full-materialized-view-support-and-constraints.md b/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/2-full-materialized-view/3-full-materialized-view-support-and-constraints.md new file mode 100644 index 0000000000000000000000000000000000000000..d0f35b92976a8681d9c77614c624b19b67093d98 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/2-full-materialized-view/3-full-materialized-view-support-and-constraints.md @@ -0,0 +1,18 @@ +--- +title: Support and Constraints +summary: Support and Constraints +author: Liuxu +date: 2021-05-21 +--- + +# Support and Constraints + +## Supported Scenarios + +- Supports the same query scope as the CREATE TABLE AS statement does. +- Supports index creation in full materialized views. +- Supports ANALYZE and EXPLAIN. + +## Unsupported Scenarios + +Materialized views cannot be added, deleted, or modified. They support only query statements. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/3-incremental-materialized-view/1-incremental-materialized-view-overview.md b/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/3-incremental-materialized-view/1-incremental-materialized-view-overview.md new file mode 100644 index 0000000000000000000000000000000000000000..8da9c3df584795fc5ebd719cbfb74bcdb7a67037 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/3-incremental-materialized-view/1-incremental-materialized-view-overview.md @@ -0,0 +1,10 @@ +--- +title: Overview +summary: Overview +author: Guo Huan +date: 2021-05-21 +--- + +# Overview + +Incremental materialized views can be incrementally refreshed. You need to manually execute statements to incrementally refresh materialized views in a period of time. The difference between the incremental and the full materialized views is that the incremental materialized view supports only a small number of scenarios. Currently, only base table scanning statements or UNION ALL can be used to create materialized views. 
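+
+The following sketch illustrates the UNION ALL form mentioned above. The table names (`orders_2021`, `orders_2022`) are hypothetical and are used only to show a definition in which each branch is a plain base table scan:
+
+```sql
+-- Hypothetical base tables with identical column lists.
+CREATE TABLE orders_2021(id int, amount int);
+CREATE TABLE orders_2022(id int, amount int);
+
+-- An incremental materialized view built from base table scans combined with UNION ALL;
+-- each UNION ALL branch scans a different base table.
+CREATE INCREMENTAL MATERIALIZED VIEW mv_orders AS
+SELECT id, amount FROM orders_2021
+UNION ALL
+SELECT id, amount FROM orders_2022;
+```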
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/3-incremental-materialized-view/2-incremental-materialized-view-usage.md b/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/3-incremental-materialized-view/2-incremental-materialized-view-usage.md
new file mode 100644
index 0000000000000000000000000000000000000000..e0f81c2f02f11179bce9812a85cc1f7b04e1afaf
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/3-incremental-materialized-view/2-incremental-materialized-view-usage.md
@@ -0,0 +1,92 @@
+---
+title: Usage
+summary: Usage
+author: Guo Huan
+date: 2021-05-21
+---
+
+# Usage
+
+## Syntax
+
+- Create an incremental materialized view.
+
+  ```
+  CREATE INCREMENTAL MATERIALIZED VIEW [ view_name ] AS { query_block };
+  ```
+
+- Fully refresh a materialized view.
+
+  ```
+  REFRESH MATERIALIZED VIEW [ view_name ];
+  ```
+
+- Incrementally refresh a materialized view.
+
+  ```
+  REFRESH INCREMENTAL MATERIALIZED VIEW [ view_name ];
+  ```
+
+- Delete a materialized view.
+
+  ```
+  DROP MATERIALIZED VIEW [ view_name ];
+  ```
+
+- Query a materialized view.
+
+  ```
+  SELECT * FROM [ view_name ];
+  ```
+
+## Examples
+
+```
+-- Prepare data.
+CREATE TABLE t1(c1 int, c2 int);
+INSERT INTO t1 VALUES(1, 1);
+INSERT INTO t1 VALUES(2, 2);
+
+-- Create an incremental materialized view.
+mogdb=# CREATE INCREMENTAL MATERIALIZED VIEW mv AS SELECT * FROM t1;
+CREATE MATERIALIZED VIEW
+
+-- Insert data.
+mogdb=# INSERT INTO t1 VALUES(3, 3);
+INSERT 0 1
+
+-- Incrementally refresh a materialized view.
+mogdb=# REFRESH INCREMENTAL MATERIALIZED VIEW mv;
+REFRESH MATERIALIZED VIEW
+
+-- Query the materialized view result.
+mogdb=# SELECT * FROM mv;
+ c1 | c2
+----+----
+  1 |  1
+  2 |  2
+  3 |  3
+(3 rows)
+
+-- Insert data.
+mogdb=# INSERT INTO t1 VALUES(4, 4);
+INSERT 0 1
+
+-- Fully refresh a materialized view.
+mogdb=# REFRESH MATERIALIZED VIEW mv;
+REFRESH MATERIALIZED VIEW
+
+-- Query the materialized view result.
+mogdb=# select * from mv;
+ c1 | c2
+----+----
+  1 |  1
+  2 |  2
+  3 |  3
+  4 |  4
+(4 rows)
+
+-- Delete a materialized view.
+mogdb=# DROP MATERIALIZED VIEW mv;
+DROP MATERIALIZED VIEW
+```
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/3-incremental-materialized-view/3-incremental-materialized-view-support-and-constraints.md b/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/3-incremental-materialized-view/3-incremental-materialized-view-support-and-constraints.md
new file mode 100644
index 0000000000000000000000000000000000000000..6bcde8f8819415fd63c2383b54e65b198a921ff8
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/materialized-view/3-incremental-materialized-view/3-incremental-materialized-view-support-and-constraints.md
@@ -0,0 +1,29 @@
+---
+title: Support and Constraints
+summary: Support and Constraints
+author: Guo Huan
+date: 2021-05-21
+---
+
+# Support and Constraints
+
+## Supported Scenarios
+
+- Supports statements for querying a single table.
+- Supports UNION ALL for querying multiple single tables.
+- Supports index creation in materialized views.
+- Supports the Analyze operation in materialized views.
+
+## Unsupported Scenarios
+
+- Multi-table join plans and subquery plans are not supported in materialized views.
+- Except for a few ALTER operations, most DDL operations cannot be performed on base tables in materialized views.
+- Materialized views cannot be added, deleted, or modified. They support only query statements.
+- Temporary tables, hashbucket tables, unlogged tables, and partitioned tables cannot be used to create materialized views.
+- Materialized views cannot be created in nested mode (that is, a materialized view cannot be created in another materialized view).
+- Column-store tables are not supported. Only row-store tables are supported.
+- Materialized views of the UNLOGGED type are not supported, and the WITH syntax is not supported.
+
+## Constraints
+
+If the materialized view definition is UNION ALL, each subquery needs to use a different base table.
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-1-plpgsql-overview.md b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-1-plpgsql-overview.md
new file mode 100644
index 0000000000000000000000000000000000000000..2e19bda3747159d690ade3a33a22ef567d9a0cb7
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-1-plpgsql-overview.md
@@ -0,0 +1,24 @@
+---
+title: Overview of PL/pgSQL Functions
+summary: Overview of PL/pgSQL Functions
+author: Guo Huan
+date: 2021-11-10
+---
+
+# Overview of PL/pgSQL Functions
+
+PL/pgSQL is a loadable procedural language.
+
+The functions created using PL/pgSQL can be used in any place where you can use built-in functions. For example, you can create calculation functions with complex conditions and use them to define operators or use them for index expressions.
+
+SQL is the query language used by most databases. It is portable and easy to learn, but each SQL statement must be executed independently by the database server.
+
+As a result, when a client application sends a query to the server, it must wait for the query to be processed, receive and process the results, and then perform some calculation before sending more queries to the server. If the client and server are not on the same machine, all these operations cause inter-process communication and increase network load.
+
+PL/pgSQL allows an entire block of computation and a series of queries to be grouped and executed inside the database server. This provides the power of a procedural language while keeping SQL easy to use, and it reduces the client/server communication cost.
+
+- Extra round-trip communication between clients and servers is eliminated.
+- Intermediate results that are not required by clients do not need to be sorted or transmitted between the clients and servers.
+- Parsing can be skipped in multiple rounds of queries.
+
+PL/pgSQL can use all data types, operators, and functions in SQL. There are some common functions, such as gs_extend_library.
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-10-other-statements.md b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-10-other-statements.md
new file mode 100644
index 0000000000000000000000000000000000000000..a5e5be684ea02c9c5ed0bd1a6c54bf4d1789d6a5
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-10-other-statements.md
@@ -0,0 +1,20 @@
+---
+title: Other Statements
+summary: Other Statements
+author: Guo Huan
+date: 2021-03-04
+---
+
+# Other Statements
+
+## Lock Operations
+
+MogDB provides multiple lock modes to control concurrent access to table data. These modes are used when Multi-Version Concurrency Control (MVCC) does not provide the expected behavior. Likewise, most MogDB commands automatically apply appropriate locks to ensure that the referenced tables are not deleted or modified in an incompatible manner while the command executes. For example, when concurrent operations exist, **ALTER TABLE** cannot be executed on the same table.
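+
+As an illustration only, an explicit table lock can be taken inside a transaction when the automatically acquired locks are not strict enough. This sketch reuses the **hr.staffs** example table that appears elsewhere in this guide:
+
+```sql
+-- Sketch: explicitly lock a table for the duration of a transaction.
+START TRANSACTION;
+LOCK TABLE hr.staffs IN SHARE MODE;   -- blocks concurrent data changes, still allows reads
+-- ... perform work that must see a stable view of hr.staffs ...
+COMMIT;                               -- the lock is released when the transaction ends
+```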
+
+## Cursor Operations
+
+MogDB provides cursors as a data buffer to store the execution results of SQL statements. Each cursor region has a name. Users can use SQL statements to fetch records one by one from a cursor and assign them to host variables, which are then further processed by the host language.
+
+Cursor operations include cursor definition, open, fetch, and close operations.
+
+For the complete example of cursor operations, see [Explicit Cursor](1-11-cursors#explicit-cursor).
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-11-cursors.md b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-11-cursors.md
new file mode 100644
index 0000000000000000000000000000000000000000..4a011600737b2d86b2516f1ada39f53ab7026af0
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-11-cursors.md
@@ -0,0 +1,178 @@
+---
+title: Cursors
+summary: Cursors
+author: Guo Huan
+date: 2021-03-04
+---
+
+# Cursors
+
+## Overview
+
+To process SQL statements, a stored procedure allocates a memory segment that stores the processing context. A cursor is a handle or pointer to such a context region. With cursors, stored procedures can control how the context region changes.
+
+> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:**
+> If JDBC is used to call a stored procedure whose returned value is a cursor, the returned cursor cannot be used.
+
+Cursors are classified into explicit cursors and implicit cursors. [Table 1](#Table 1) shows the usage conditions of explicit and implicit cursors for different SQL statements.
+
+**Table 1** Cursor usage conditions
+
+| SQL Statement                              | Cursor               |
+| :----------------------------------------- | :------------------- |
+| Non-query statements                       | Implicit             |
+| Query statements with single-line results  | Implicit or explicit |
+| Query statements with multi-line results   | Explicit             |
+
+## Explicit Cursor
+
+An explicit cursor is used to process query statements, particularly when a query returns multiple records.
+
+**Procedure**
+
+An explicit cursor performs the following six PL/SQL steps to process query statements:
+
+1. **Define a static cursor:** Define a cursor name and its corresponding **SELECT** statement.
+
+   [Figure 1](#static_cursor_define) shows the syntax diagram for defining a static cursor.
+
+   **Figure 1** static_cursor_define::=
+
+   ![static_cursor_define](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/cursors-1.jpg)
+
+   Parameter description:
+
+   - **cursor_name**: defines a cursor name.
+
+   - **parameter**: specifies cursor parameters. Only input parameters are allowed in the following format:
+
+     ```
+     parameter_name datatype
+     ```
+
+   - **select_statement**: specifies a query statement.
+
+   > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:**
+   > The system automatically determines whether the cursor can be used for backward fetches based on the execution plan.
+
+   **Define a dynamic cursor:** Define a **ref** cursor, which means that the cursor can be opened dynamically by a set of static SQL statements. First define the type of the **ref** cursor and then the cursor variable of this cursor type. Dynamically bind a **SELECT** statement through **OPEN FOR** when the cursor is opened.
+
+   [Figure 2](#cursor_typename) and [Figure 3](#dynamic_cursor_define) show the syntax diagrams for defining a dynamic cursor.
+ + **Figure 2** cursor_typename::= + + ![cursor_typename](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/cursors-2.png) + + **Figure 3** dynamic_cursor_define::= + + ![dynamic_cursor_define](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/cursors-3.png) + +2. **Open the static cursor:** Execute the **SELECT** statement corresponding to the cursor. The query result is placed in the workspace and the pointer directs to the head of the workspace to identify the cursor result set. If the cursor query statement carries the **FOR UPDATE** option, the **OPEN** statement locks the data row corresponding to the cursor result set in the database table. + + [Figure 4](#open_static_cursor) shows the syntax diagram for opening a static cursor. + + **Figure 4** open_static_cursor::= + + ![open_static_cursor](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/cursors-4.png) + + **Open the dynamic cursor:** Use the **OPEN FOR** statement to open the dynamic cursor and the SQL statement is dynamically bound. + + [Figure 5](#open_dynamic_cursor) shows the syntax diagrams for opening a dynamic cursor. + + **Figure 5** open_dynamic_cursor::= + + ![open_dynamic_cursor](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/cursors-5.png) + + A PL/SQL program cannot use the OPEN statement to repeatedly open a cursor. + +3. **Fetch cursor data**: Retrieve data rows in the result set and place them in specified output variables. + + [Figure 6](#fetch_cursor) shows the syntax diagrams for fetching cursor data. + + **Figure 6** fetch_cursor::= + + ![fetch_cursor](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/cursors-6.png) + +4. Process the record. + +5. Continue to process until the active set has no record. + +6. **Close the cursor**: When fetching and finishing the data in the cursor result set, close the cursor immediately to release system resources used by the cursor and invalidate the workspace of the cursor so that the **FETCH** statement cannot be used to fetch data any more. A closed cursor can be reopened by an OPEN statement. + + [Figure 7](#close_cursor) shows the syntax diagram for closing a cursor. + + **Figure 7** close_cursor::= + + ![close_cursor](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/cursors-7.jpg) + +**Attribute** + +Cursor attributes are used to control program procedures or know program status. When a DML statement is executed, the PL/SQL opens a built-in cursor and processes its result. A cursor is a memory segment for maintaining query results. It is opened when a DML statement is executed and closed when the execution is finished. An explicit cursor has the following attributes: + +- **%FOUND** attribute: returns **TRUE** if the last fetch returns a row. +- **%NOTFOUND** attribute: works opposite to the **%FOUND** attribute. +- **%ISOPEN** attribute: returns **TRUE** if the cursor has been opened. +- **%ROWCOUNT** attribute: returns the number of records fetched from the cursor. + +## Implicit Cursor + +Implicit cursors are automatically set by the system for non-query statements such as modify or delete operations, along with their workspace. Implicit cursors are named **SQL**, which is defined by the system. + +**Overview** + +Implicit cursor operations, such as definition, open, value-grant, and close operations, are automatically performed by the system and do not need users to process. Users can use only attributes related to implicit cursors to complete operations. 
In workspace of implicit cursors, the data of the latest SQL statement is stored and is not related to explicit cursors defined by users. + +Format call:**SQL%** + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> **INSERT**, **UPDATE**, **DELETE**, and **SELECT** statements do not need defined cursors. + +**Attributes** + +An implicit cursor has the following attributes: + +- **SQL%FOUND**: Boolean attribute, which returns **TRUE** if the last fetch returns a row. +- **SQL%NOTFOUND**: Boolean attribute, which works opposite to the **SQL%FOUND** attribute. +- **SQL%ROWCOUNT**: numeric attribute, which returns the number of records fetched from the cursor. +- **SQL%ISOPEN**: Boolean attribute, whose value is always **FALSE**. Close implicit cursors immediately after an SQL statement is run. + +**Examples** + +```sql +-- Delete all employees in a department from the hr.staffs table. If the department has no employees, delete the department from the hr.sections table. +CREATE OR REPLACE PROCEDURE proc_cursor3() +AS + DECLARE + V_DEPTNO NUMBER(4) := 100; + BEGIN + DELETE FROM hr.staffs WHERE section_ID = V_DEPTNO; + -- Proceed based on cursor status. + IF SQL%NOTFOUND THEN + DELETE FROM hr.sections WHERE section_ID = V_DEPTNO; + END IF; + END; +/ + +CALL proc_cursor3(); + +-- Delete the stored procedure and the temporary table. +DROP PROCEDURE proc_cursor3; +``` + +## Cursor Loop + +Use of cursors in WHILE and LOOP statements is called a cursor loop. Generally, OPEN, FETCH, and CLOSE statements are involved in this kind of loop. The following describes a loop that simplifies a cursor loop without the need for these operations. This kind of loop is applicable to a static cursor loop, without executing four steps about a static cursor. + +**Syntax** + +[Figure 8](#FOR_AS_loop) shows the syntax diagram of the **FOR AS** loop. + +**Figure 8** FOR_AS_loop::= + +![for_as_loop](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/cursors-8.png) + +**Precautions** + +- The **UPDATE** operation for the queried table is not allowed in the loop statement. +- The variable loop_name is automatically defined and is valid only in this loop. Its type is the same as that in the select_statement query result. The value of **loop_name** is the query result of **select_statement**. +- The **%FOUND**, **%NOTFOUND**, and **%ROWCOUNT** attributes access the same internal variable in MogDB. Transactions and the anonymous block do not support multiple cursor accesses at the same time. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-12-retry-management.md b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-12-retry-management.md new file mode 100644 index 0000000000000000000000000000000000000000..f45df13aa74998f278abfa5ec533637858f46ea0 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-12-retry-management.md @@ -0,0 +1,26 @@ +--- +title: Retry Management +summary: Retry Management +author: Guo Huan +date: 2021-03-04 +--- + +# Retry Management + +Retry is a process in which the database executes a SQL statement or stored procedure (including anonymous block) again in the case of execution failure, improving the execution success rate and user experience. The database checks the error code and retry configuration to determine whether to retry. + +- If the execution fails, the system rolls back the executed statements and executes the stored procedure again. 
+ + Example: + + ```sql + mogdb=# CREATE OR REPLACE PROCEDURE retry_basic ( IN x INT) + AS + BEGIN + INSERT INTO t1 (a) VALUES (x); + INSERT INTO t1 (a) VALUES (x+1); + END; + / + + mogdb=# CALL retry_basic(1); + ``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-13-debugging.md b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-13-debugging.md new file mode 100644 index 0000000000000000000000000000000000000000..65081dcdcbda572a26c90880efe654cad715fea7 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-13-debugging.md @@ -0,0 +1,136 @@ +--- +title: Debugging +summary: Debugging +author: Guo Huan +date: 2021-03-04 +--- + +# Debugging + +**Syntax** + +RAISE has the following five syntax formats: + +**Figure 1** raise_format::= + +![raise_format](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/debugging-1.png) + +**Figure 2** raise_condition::= + +![raise_condition](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/debugging-2.png) + +**Figure 3** raise_sqlstate::= + +![raise_sqlstate](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/debugging-3.png) + +**Figure 4** raise_option::= + +![raise_option](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/debugging-4.png) + +**Figure 5** raise::= + +![raise](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/debugging-5.png) + +**Parameter description**: + +- The level option is used to specify the error level, that is, **DEBUG**, **LOG**, **INFO**, **NOTICE**, **WARNING**, or **EXCEPTION** (default). **EXCEPTION** throws an error that normally terminates the current transaction and the others only generate information at their levels. The log_min_messages and client_min_messages parameters control whether the error messages of specific levels are reported to the client and are written to the server log. + +- **format**: specifies the error message text to be reported, a format string. The format string can be appended with an expression for insertion to the message text. In a format string, **%** is replaced by the parameter value attached to format and **%%** is used to print **%**. For example: + + ``` + --v_job_id replaces % in the string. + RAISE NOTICE 'Calling cs_create_job(%)',v_job_id; + ``` + +- **option = expression**: inserts additional information to an error report. The keyword option can be **MESSAGE**, **DETAIL**, **HINT**, or **ERRCODE**, and each expression can be any string. + + - **MESSAGE**: specifies the error message text. This option cannot be used in a **RAISE** statement that contains a format character string in front of **USING**. + - **DETAIL**: specifies detailed information of an error. + - **HINT**: prints hint information. + - **ERRCODE**: designates an error code (SQLSTATE) to a report. A condition name or a five-character SQLSTATE error code can be used. + +- **condition_name**: specifies the condition name corresponding to the error code. + +- **sqlstate**: specifies the error code. + +If neither a condition name nor an **SQLSTATE** is designated in a **RAISE EXCEPTION** command, the **RAISE EXCEPTION (P0001)** is used by default. If no message text is designated, the condition name or SQLSTATE is used as the message text by default. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:** +> If the **SQLSTATE** designates an error code, the error code is not limited to a defined error code. 
It can be any error code containing five digits or ASCII uppercase rather than **00000**. Do not use an error code ended with three zeros because this kind of error codes are type codes and can be captured by the whole category. +> +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> The syntax described in [Figure 5](#raise) does not append any parameter. This form is used only for the **EXCEPTION** statement in a **BEGIN** block so that the error can be re-processed. + +**Example** + +Display error and hint information when a transaction terminates: + +```sql +CREATE OR REPLACE PROCEDURE proc_raise1(user_id in integer) +AS +BEGIN +RAISE EXCEPTION 'Noexistence ID --> %',user_id USING HINT = 'Please check your user ID'; +END; +/ + +call proc_raise1(300011); + +-- Execution result: +ERROR: Noexistence ID --> 300011 +HINT: Please check your user ID +``` + +Two methods are available for setting **SQLSTATE**: + +```sql +CREATE OR REPLACE PROCEDURE proc_raise2(user_id in integer) +AS +BEGIN +RAISE 'Duplicate user ID: %',user_id USING ERRCODE = 'unique_violation'; +END; +/ + +\set VERBOSITY verbose +call proc_raise2(300011); + +-- Execution result: +ERROR: Duplicate user ID: 300011 +SQLSTATE: 23505 +LOCATION: exec_stmt_raise, pl_exec.cpp:3482 +``` + +If the main parameter is a condition name or **SQLSTATE**, the following applies: + +RAISE division_by_zero; + +RAISE SQLSTATE '22012'; + +For example: + +```sql +CREATE OR REPLACE PROCEDURE division(div in integer, dividend in integer) +AS +DECLARE +res int; + BEGIN + IF dividend=0 THEN + RAISE division_by_zero; + RETURN; + ELSE + res := div/dividend; + RAISE INFO 'division result: %', res; + RETURN; + END IF; + END; +/ +call division(3,0); + +-- Execution result: +ERROR: division_by_zero +``` + +Alternatively: + +``` +RAISE unique_violation USING MESSAGE = 'Duplicate user ID: ' || user_id; +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-2-data-types.md b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-2-data-types.md new file mode 100644 index 0000000000000000000000000000000000000000..e43bc6ea5ee4c8e9239cdbb57171b4d25418ae1d --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-2-data-types.md @@ -0,0 +1,10 @@ +--- +title: Data Types +summary: Data Types +author: Guo Huan +date: 2021-03-04 +--- + +# Data Types + +A data type refers to a value set and an operation set defined on the value set. The MogDB database consists of tables, each of which is defined by its own columns. Each column corresponds to a data type. The MogDB uses corresponding functions to perform operations on data based on data types. For example, the MogDB can perform addition, subtraction, multiplication, and division operations on data of numeric values. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-3-data-type-conversion.md b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-3-data-type-conversion.md new file mode 100644 index 0000000000000000000000000000000000000000..f37d0f9bf0c1e8eb4d4ed5d76990ed1cb36dcc21 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-3-data-type-conversion.md @@ -0,0 +1,41 @@ +--- +title: Data Type Conversion +summary: Data Type Conversion +author: Guo Huan +date: 2021-03-04 +--- + +# Data Type Conversion + +Certain data types in the database support implicit data type conversions, such as assignments and parameters called by functions. 
For other data types, you can use the type conversion functions provided by MogDB, such as the **CAST** function, to forcibly convert them. + +MogDB lists common implicit data type conversions in [Table 1](#Implicit data type conversions). + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:** +> The valid value range of **DATE** supported by MogDB is from 4713 B.C. to 294276 A.D. + +**Table 1** Implicit data type conversions + +| Raw Data Type | Target Data Type | **Remarks** | +| :------------ | :--------------- | :------------------------------------------- | +| CHAR | VARCHAR2 | - | +| CHAR | NUMBER | Raw data must consist of digits. | +| CHAR | DATE | Raw data cannot exceed the valid date range. | +| CHAR | RAW | - | +| CHAR | CLOB | - | +| VARCHAR2 | CHAR | - | +| VARCHAR2 | NUMBER | Raw data must consist of digits. | +| VARCHAR2 | DATE | Raw data cannot exceed the valid date range. | +| VARCHAR2 | CLOB | - | +| NUMBER | CHAR | - | +| NUMBER | VARCHAR2 | - | +| DATE | CHAR | - | +| DATE | VARCHAR2 | - | +| RAW | CHAR | - | +| RAW | VARCHAR2 | - | +| CLOB | CHAR | - | +| CLOB | VARCHAR2 | - | +| CLOB | NUMBER | Raw data must consist of digits. | +| INT4 | CHAR | - | +| INT4 | BOOLEAN | - | +| BOOLEAN | INT4 | - | diff --git a/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-4-arrays-and-records.md b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-4-arrays-and-records.md new file mode 100644 index 0000000000000000000000000000000000000000..1b8266af0426574424b442cd620376ab5e090e14 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-4-arrays-and-records.md @@ -0,0 +1,156 @@ +--- +title: Arrays and Records +summary: Arrays and Records +author: Guo Huan +date: 2021-03-04 +--- + +# Arrays and Records + +## Arrays + +**Use of Array Types** + +Before the use of arrays, an array type needs to be defined: + +Define an array type immediately after the **AS** keyword in a stored procedure. The method is as follows: + +``` +TYPE array_type IS VARRAY(size) OF data_type; +``` + +In the preceding information: + +- **array_type**: indicates the name of the array type to be defined. +- **VARRAY**: indicates the array type to be defined. +- **size**: indicates the maximum number of members in the array type to be defined. The value is a positive integer. +- **data_type**: indicates the types of members in the array type to be created. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> +> - In MogDB, an array automatically increases. If an access violation occurs, a null value is returned, and no error message is reported. +> - The scope of an array type defined in a stored procedure takes effect only in this storage process. +> - It is recommended that you use one of the preceding methods to define an array type. If both methods are used to define the same array type, MogDB prefers the array type defined in a stored procedure to declare array variables. +> - **data_type** can also be a **record** type defined in a stored procedure (anonymous blocks are not supported), but cannot be an array or collection type defined in a stored procedure. + +MogDB supports the access of contents in an array by using parentheses, and the **extend**, **count**, **first**, and **last** functions. 
+ +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> If the stored procedure contains the DML statement (SELECT, UPDATE, INSERT, or DELETE), DML statements can access array elements only using brackets. In this way, it may be separated from the function expression area. + +## record + +**record Variables** + +Perform the following operations to create a record variable: + +Define a record type and use this type to declare a variable. + +**Syntax** + +For the syntax of the record type, see [Figure 1](#Syntax of the record type). + +**Figure 1** Syntax of the record type + +![syntax-of-the-record-type](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/arrays-and-records-1.png) + +The above syntax diagram is explained as follows: + +- **record_type**: record name +- **field**: record columns +- **datatype**: record data type +- **expression**: expression for setting a default value + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> In MogDB: +> +> - When assigning values to record variables, you can: +> - Declare a record type and define member variables of this type when you declare a function or stored procedure. +> - Assign the value of a record variable to another record variable. +> - Use **SELECT INTO** or **FETCH** to assign values to a record type. +> - Assign the **NULL** value to a record variable. +> - The **INSERT** and **UPDATE** statements cannot use a record variable to insert or update data. +> - Just like a variable, a record column of the compound type does not have a default value in the declaration. +> - **date_type** can also be the **record** type, array type, and collection type defined in the stored procedure (anonymous blocks are not supported). + +**Example** + +```sql +The table used in the following example is defined as follows: +mogdb=# \d emp_rec + Table "public.emp_rec" + Column | Type | Modifiers +----------+--------------------------------+----------- + empno | numeric(4,0) | not null + ename | character varying(10) | + job | character varying(9) | + mgr | numeric(4,0) | + hiredate | timestamp(0) without time zone | + sal | numeric(7,2) | + comm | numeric(7,2) | + deptno | numeric(2,0) | + +-- Perform array operations in the function. +mogdb=# CREATE OR REPLACE FUNCTION regress_record(p_w VARCHAR2) +RETURNS +VARCHAR2 AS $$ +DECLARE + + -- Declare a record type. + type rec_type is record (name varchar2(100), epno int); + employer rec_type; + + -- Use %type to declare the record type. + type rec_type1 is record (name emp_rec.ename%type, epno int not null :=10); + employer1 rec_type1; + + -- Declare a record type with a default value. + type rec_type2 is record ( + name varchar2 not null := 'SCOTT', + epno int not null :=10); + employer2 rec_type2; + CURSOR C1 IS select ename,empno from emp_rec order by 1 limit 1; + +BEGIN + -- Assign a value to a member record variable. + employer.name := 'WARD'; + employer.epno = 18; + raise info 'employer name: % , epno:%', employer.name, employer.epno; + + -- Assign the value of a record variable to another variable. + employer1 := employer; + raise info 'employer1 name: % , epno: %',employer1.name, employer1.epno; + + -- Assign the NULL value to a record variable. + employer1 := NULL; + raise info 'employer1 name: % , epno: %',employer1.name, employer1.epno; + + -- Obtain the default value of a record variable. + raise info 'employer2 name: % ,epno: %', employer2.name, employer2.epno; + + -- Use a record variable in the FOR loop. 
+ for employer in select ename,empno from emp_rec order by 1 limit 1 + loop + raise info 'employer name: % , epno: %', employer.name, employer.epno; + end loop; + + -- Use a record variable in the SELECT INTO statement. + select ename,empno into employer2 from emp_rec order by 1 limit 1; + raise info 'employer name: % , epno: %', employer2.name, employer2.epno; + + -- Use a record variable in a cursor. + OPEN C1; + FETCH C1 INTO employer2; + raise info 'employer name: % , epno: %', employer2.name, employer2.epno; + CLOSE C1; + RETURN employer.name; +END; +$$ +LANGUAGE plpgsql; + +-- Invoke the function. +mogdb=# CALL regress_record('abc'); + +-- Delete the function. +mogdb=# DROP FUNCTION regress_record; +``` diff --git a/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-5-declare-syntax.md b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-5-declare-syntax.md new file mode 100644 index 0000000000000000000000000000000000000000..0ae3fb792494249ae60c66515e851e1bb0362118 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-5-declare-syntax.md @@ -0,0 +1,82 @@ +--- +title: DECLARE Syntax +summary: DECLARE Syntax +author: Guo Huan +date: 2021-03-04 +--- + +# DECLARE Syntax + +## Basic Structure + +**Structure** + +A PL/SQL block can contain a sub-block which can be placed in any section. The following describes the architecture of a PL/SQL block: + +- **DECLARE**: declares variables, types, cursors, and regional stored procedures and functions used in the PL/SQL block. + + ```sql + DECLARE + ``` + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** + > This part is optional if no variable needs to be declared. + > + > - An anonymous block may omit the **DECLARE** keyword if no variable needs to be declared. + > - For a stored procedure, **AS** is used, which is equivalent to **DECLARE**. The **AS** keyword must be reserved even if there is no variable declaration part. + +- **EXECUTION**: specifies procedure and SQL statements. It is the main part of a program. Mandatory. + + ```sql + BEGIN + ``` + +- Exception part: processes errors. Optional. + + ```sql + EXCEPTION + ``` + +- End + + ``` + END; + / + ``` + + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:** + > You are not allowed to use consecutive tabs in the PL/SQL block because they may result in an exception when the **gsql** tool is executed with the **-r** parameter specified. + +**Category** + +PL/SQL blocks are classified into the following types: + +- Anonymous block: a dynamic block that can be executed only for once. For details about the syntax, see [Figure 1](#anonymous_block::=). +- Subprogram: a stored procedure, function, operator, or packages stored in a database. A subprogram created in a database can be called by other programs. + +## Anonymous Blocks + +An anonymous block applies to a script infrequently executed or a one-off activity. An anonymous block is executed in a session and is not stored. + +**Syntax** + +[Figure 1](#anonymous_block::=) shows the syntax diagrams for an anonymous block. + +**Figure 1** anonymous_block::= + +![anonymous_block](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/declare-syntax-1.png) + +Details about the syntax diagram are as follows: + +- The execute part of an anonymous block starts with a **BEGIN** statement, has a break with an **END** statement, and ends with a semicolon (;). Type a slash (/) and press **Enter** to execute the statement. 
+ + > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:** + > The terminator "/" must be written in an independent row. + +- The declaration section includes the variable definition, type, and cursor definition. + +- A simplest anonymous block does not execute any commands. At least one statement, even a **NULL** statement, must be presented in any implementation blocks. + +## Subprogram + +A subprogram stores stored procedures, functions, operators, and advanced packages. A subprogram created in a database can be called by other programs. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-6-basic-statements.md b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-6-basic-statements.md new file mode 100644 index 0000000000000000000000000000000000000000..8441f154af2b468ef00776cdabde183c2c503b84 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-6-basic-statements.md @@ -0,0 +1,169 @@ +--- +title: Basic Statements +summary: Basic Statements +author: Guo Huan +date: 2021-03-04 +--- + +# Basic Statements + +During PL/SQL programming, you may define some variables, assign values to variables, and call other stored procedures. This chapter describes basic PL/SQL statements, including variable definition statements, value assignment statements, call statements, and return statements. + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:** +> You are advised not to call the SQL statements containing passwords in the stored procedures because authorized users may view the stored procedure file in the database and password information is leaked. If a stored procedure contains other sensitive information, permission to access this procedure must be configured, preventing information leakage. + +## Define Variable + +This section describes the declaration of variables in the PL/SQL and the scope of this variable in codes. + +**Variable Declaration** + +For details about the variable declaration syntax, see [Figure 1](#declare_variable::=). + +**Figure 1** declare_variable::= + +![declare_variable](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/basic-statements-1.png) + +The above syntax diagram is explained as follows: + +- **variable_name** indicates the name of a variable. +- **type** indicates the type of a variable. +- **value** indicates the initial value of the variable. (If the initial value is not given, NULL is taken as the initial value.) **value** can also be an expression. + +**Examples** + +```sql +mogdb=# DECLARE + emp_id INTEGER := 7788; -- Define a variable and assign a value to it. +BEGIN + emp_id := 5*7784; -- Assign a value to the variable. +END; +/ +``` + +In addition to the declaration of basic variable types, **%TYPE** and **%ROWTYPE** can be used to declare variables related to table columns or table structures. + +**%TYPE Attribute** + +**%TYPE** declares a variable to be of the same data type as a previously declared variable (for example, a column in a table). For example, if you want to define a *my_name* variable whose data type is the same as the data type of the **firstname** column in the **employee** table, you can define the variable as follows: + +``` +my_name employee.firstname%TYPE +``` + +In this way, you can declare *my_name* without the need of knowing the data type of **firstname** in **employee**, and the data type of **my_name** can be automatically updated when the data type of **firstname** changes. 
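+
+A minimal sketch of the **%TYPE** attribute, assuming the **employee** table with a **firstname** column used in the description above actually exists:
+
+```sql
+mogdb=# DECLARE
+    my_name employee.firstname%TYPE;   -- inherits the data type of employee.firstname
+BEGIN
+    SELECT firstname INTO my_name FROM employee LIMIT 1;
+    RAISE INFO 'first employee name: %', my_name;
+END;
+/
+```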
+ +**%ROWTYPE Attribute** + +**%ROWTYPE** declares data types of a set of data. It stores a row of table data or results fetched from a cursor. For example, if you want to define a set of data with the same column names and column data types as the **employee** table, you can define the data as follows: + +``` +my_employee employee%ROWTYPE +``` + +**Scope of a Variable** + +The scope of a variable indicates the accessibility and availability of the variable in code block. In other words, a variable takes effect only within its scope. + +- To define a function scope, a variable must declare and create a **BEGIN-END** block in the declaration section. The necessity of such declaration is also determined by block structure, which requires that a variable has different scopes and lifetime during a process. +- A variable can be defined multiple times in different scopes, and inner definition can cover outer one. +- A variable defined in an outer block can also be used in a nested block. However, the outer block cannot access variables in the nested block. + +## Assignment Statements + +**Syntax** + +[Figure 2](#assignment_value) shows the syntax diagram for assigning a value to a variable. + +**Figure 2** assignment_value::= + +![assignment_value](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/basic-statements-2.png) + +The above syntax diagram is explained as follows: + +- **variable_name** indicates the name of a variable. +- **value** can be a value or an expression. The type of **value** must be compatible with the type of **variable_name**. + +**Example** + +```sql +mogdb=# DECLARE + emp_id INTEGER := 7788; --Assignment +BEGIN + emp_id := 5; --Assignment + emp_id := 5*7784; +END; +/ +``` + +**Nested Value Assignment** + +[Figure 3](#nested_assignment_value) shows the syntax diagram for assigning a nested value to a variable. + +**Figure 3** nested_assignment_value::= + +![nested_assignment_value](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/basic-statements-3.png) + +The syntax in Figure 3 is described as follows: + +- **variable_name**: variable name +- **col_name**: column name +- **subscript**: subscript, which is used for an array variable. The value can be a value or an expression and must be of the int type. +- **value**: value or expression. The type of **value** must be compatible with the type of **variable_name**. + +## Examples + +```sql +mogdb=# CREATE TYPE o1 as (a int, b int); +mogdb=# DECLARE + TYPE r1 is VARRAY(10) of o1; + emp_id r1; +BEGIN + emp_id(1).a := 5;-- Assign a value. + emp_id(1).b := 5*7784; +END; +/ +``` + +> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:** +> +> - In INTO mode, values can be assigned only to the columns at the first layer. Two-dimensional or above arrays are not supported. +> - When a nested column value is referenced, if an array subscript exists, only one parenthesis can exist in the first three layers of columns. You are advised to use square brackets to reference the subscript. + +## Call Statement + +**Syntax** + +[Figure 4](#call_clause) shows the syntax diagram for calling a clause. + +**Figure 4** call_clause::= + +![call_clause](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/basic-statements-4.png) + +The above syntax diagram is explained as follows: + +- **procedure_name** specifies the name of a stored procedure. +- **parameter** specifies the parameters for the stored procedure. You can set no parameter or multiple parameters. 
+
+**Example**
+
+```sql
+-- Create the stored procedure proc_staffs:
+mogdb=# CREATE OR REPLACE PROCEDURE proc_staffs
+(
+section NUMBER(6),
+salary_sum out NUMBER(8,2),
+staffs_count out INTEGER
+)
+IS
+BEGIN
+SELECT sum(salary), count(*) INTO salary_sum, staffs_count FROM hr.staffs where section_id = section;
+END;
+/
+
+-- Invoke the stored procedure proc_staffs:
+mogdb=# CALL proc_staffs(2,8,6);
+
+-- Delete a stored procedure:
+mogdb=# DROP PROCEDURE proc_staffs;
+```
diff --git a/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-7-dynamic-statements.md b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-7-dynamic-statements.md
new file mode 100644
index 0000000000000000000000000000000000000000..7d982b4672c660c46cd401edf7ac95478069d3d4
--- /dev/null
+++ b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-7-dynamic-statements.md
@@ -0,0 +1,166 @@
+---
+title: Dynamic Statements
+summary: Dynamic Statements
+author: Guo Huan
+date: 2021-03-04
+---
+
+# Dynamic Statements
+
+## Executing Dynamic Query Statements
+
+You can perform dynamic queries in either of the two modes provided by MogDB: EXECUTE IMMEDIATE and OPEN FOR. **EXECUTE IMMEDIATE** dynamically executes **SELECT** statements, and **OPEN FOR** is used together with cursors. If you need to store query results in a data set, use **OPEN FOR**.
+
+**EXECUTE IMMEDIATE**
+
+[Figure 1](#EXECUTE IMMEDIATE) shows the syntax diagram.
+
+**Figure 1** EXECUTE IMMEDIATE dynamic_select_clause::=
+
+![execute-immediate-dynamic_select_clause](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/dynamic-statements-1.png)
+
+[Figure 2](#using_clause) shows the syntax diagram for **using_clause**.
+
+**Figure 2** using_clause::=
+
+![using_clause](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/dynamic-statements-2.png)
+
+The above syntax diagram is explained as follows:
+
+- **define_variable**: specifies variables to store single-line query results.
+
+- **USING IN bind_argument**: specifies where the variable passed to the dynamic SQL value is stored, that is, in the dynamic placeholder of **dynamic_select_string**.
+
+- **USING OUT bind_argument**: specifies where the dynamic SQL returns the value of the variable.
+
+  > ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:**
+  >
+  > - In query statements, **INTO** and **OUT** cannot coexist.
+  > - A placeholder name starts with a colon (:) followed by digits, characters, or strings, corresponding to **bind_argument** in the **USING** clause.
+  > - **bind_argument** can only be a value, variable, or expression. It cannot be a database object such as a table name, column name, and data type. That is, **bind_argument** cannot be used to transfer schema objects for dynamic SQL statements. If a stored procedure needs to transfer database objects through **bind_argument** to construct dynamic SQL statements (generally, DDL statements), you are advised to use double vertical bars (||) to concatenate **dynamic_select_clause** with a database object.
+  > - A dynamic PL/SQL block allows duplicate placeholders. That is, a placeholder can correspond to only one **bind_argument** in the **USING** clause.
+
+**OPEN FOR**
+
+Dynamic query statements can be executed by using **OPEN FOR** to open dynamic cursors.
+
+[Figure 3](#open_for) shows the syntax diagram.
+ +**Figure 3** open_for::= + +![open_for](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/dynamic-statements-3.png) + +Parameter description: + +- **cursor_name**: specifies the name of the cursor to be opened. +- **dynamic_string**: specifies the dynamic query statement. +- **USING value**: applies when a placeholder exists in dynamic_string. + +For use of cursors, see [Cursors](1-11-cursors). + +## Executing Dynamic Non-query Statements + +**Syntax** + +[Figure 4](#noselect) shows the syntax diagram. + +**Figure 4** noselect::= + +![noselect](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/dynamic-statements-4.png) + +[Figure 5](#using_clause::=) shows the syntax diagram for **using_clause**. + +**Figure 5** using_clause::= + +![using_clause-0](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/dynamic-statements-5.png) + +The above syntax diagram is explained as follows: + +**USING IN bind_argument** is used to specify the variable whose value is passed to the dynamic SQL statement. The variable is used when a placeholder exists in **dynamic_noselect_string**. That is, a placeholder is replaced by the corresponding **bind_argument** when a dynamic SQL statement is executed. Note that **bind_argument** can only be a value, variable, or expression, and cannot be a database object such as a table name, column name, and data type. If a stored procedure needs to transfer database objects through **bind_argument** to construct dynamic SQL statements (generally, DDL statements), you are advised to use double vertical bars (||) to concatenate **dynamic_select_clause** with a database object. In addition, a dynamic PL/SQL block allows duplicate placeholders. That is, a placeholder can correspond to only one **bind_argument**. + +**Example** + +```sql +-- Create a table: +mogdb=# CREATE TABLE sections_t1 +( + section NUMBER(4) , + section_name VARCHAR2(30), + manager_id NUMBER(6), + place_id NUMBER(4) +); + +-- Declare a variable: +mogdb=# DECLARE + section NUMBER(4) := 280; + section_name VARCHAR2(30) := 'Info support'; + manager_id NUMBER(6) := 103; + place_id NUMBER(4) := 1400; + new_colname VARCHAR2(10) := 'sec_name'; +BEGIN +-- Execute the query: + EXECUTE IMMEDIATE 'insert into sections_t1 values(:1, :2, :3, :4)' + USING section, section_name, manager_id,place_id; +-- Execute the query (duplicate placeholders): + EXECUTE IMMEDIATE 'insert into sections_t1 values(:1, :2, :3, :1)' + USING section, section_name, manager_id; +-- Run the ALTER statement. (You are advised to use double vertical bars (||) to concatenate the dynamic DDL statement with a database object.) + EXECUTE IMMEDIATE 'alter table sections_t1 rename section_name to ' || new_colname; +END; +/ + +-- Query data: +mogdb=# SELECT * FROM sections_t1; + +--Delete the table. +mogdb=# DROP TABLE sections_t1; +``` + +## Dynamically Calling Stored Procedures + +This section describes how to dynamically call store procedures. You must use anonymous statement blocks to package stored procedures or statement blocks and append **IN** and **OUT** behind the **EXECUTE IMMEDIATE…USING** statement to input and output parameters. + +**Syntax** + +[Figure 6](#call_procedure) shows the syntax diagram. + +**Figure 6** call_procedure::= + +![call_procedure](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/dynamic-statements-6.png) + +[Figure 7](#Figure 2) shows the syntax diagram for **using_clause**. 
+ +**Figure 7** using_clause::= + +![using_clause-1](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/dynamic-statements-7.png) + +The above syntax diagram is explained as follows: + +- **CALL procedure_name**: calls the stored procedure. +- **[:placeholder1,:placeholder2,…]**: specifies the placeholder list of the stored procedure parameters. The numbers of the placeholders and parameters are the same. +- **USING [IN|OUT|IN OUT]bind_argument**: specifies where the variable passed to the stored procedure parameter value is stored. The modifiers in front of **bind_argument** and of the corresponding parameter are the same. + +## Dynamically Calling Anonymous Blocks + +This section describes how to execute anonymous blocks in dynamic statements. Append **IN** and **OUT** behind the **EXECUTE IMMEDIATE…USING** statement to input and output parameters. + +**Syntax** + +[Figure 8](#call_anonymous_block) shows the syntax diagram. + +**Figure 8** call_anonymous_block::= + +![call_anonymous_block](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/dynamic-statements-8.png) + +[Figure 9](#Figure 2using_clause) shows the syntax diagram for **using_clause**. + +**Figure 9** using_clause::= + +![using_clause-2](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/dynamic-statements-9.png) + +The above syntax diagram is explained as follows: + +- The execute part of an anonymous block starts with a **BEGIN** statement, has a break with an **END** statement, and ends with a semicolon (;). +- **USING [IN|OUT|IN OUT]bind_argument**: specifies where the variable passed to the stored procedure parameter value is stored. The modifiers in front of **bind_argument** and of the corresponding parameter are the same. +- The input and output parameters in the middle of an anonymous block are designated by placeholders. The numbers of the placeholders and parameters are the same. The sequences of the parameters corresponding to the placeholders and the USING parameters are the same. +- Currently in MogDB, when dynamic statements call anonymous blocks, placeholders cannot be used to pass input and output parameters in an **EXCEPTION** statement. diff --git a/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-8-control-statements.md b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-8-control-statements.md new file mode 100644 index 0000000000000000000000000000000000000000..32e01601c65c22490ff452de4f1e99c64e2dd087 --- /dev/null +++ b/product/en/docs-mogdb/v3.0/developer-guide/plpgsql/1-8-control-statements.md @@ -0,0 +1,647 @@ +--- +title: Control Statements +summary: Control Statements +author: Guo Huan +date: 2021-03-04 +--- + +# Control Statements + +## RETURN Statements + +In MogDB, data can be returned in either of the following ways:**RETURN**, **RETURN NEXT**, or **RETURN QUERY**. **RETURN NEXT** and **RETURN QUERY** are used only for functions and cannot be used for stored procedures. + +### RETURN + +**Syntax** + +[Figure 1](#return_clause::=) shows the syntax diagram for a return statement. + +**Figure 1** return_clause::= + +![return_clause](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/control-statements-1.jpg) + +The above syntax diagram is explained as follows: + +This statement returns control from a stored procedure or function to a caller. + +**Examples** + +See [Example](1-6-basic-statements#call-statement) for call statement examples. 
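+
+A brief sketch of **RETURN** handing control back to the caller early; the procedure name and logic are illustrative only:
+
+```sql
+CREATE OR REPLACE PROCEDURE proc_return_demo(i IN integer)
+AS
+BEGIN
+    IF i <= 0 THEN
+        RETURN;                     -- return control to the caller immediately
+    END IF;
+    RAISE INFO 'processing %', i;   -- reached only when i is positive
+END;
+/
+
+CALL proc_return_demo(0);
+
+-- Delete the stored procedure.
+DROP PROCEDURE proc_return_demo;
+```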

### RETURN NEXT and RETURN QUERY

**Syntax**

When creating a function, specify **SETOF datatype** for the return values.

return_next_clause::=

![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/control-statements-2.png)

return_query_clause::=

![img](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/control-statements-3.png)

The above syntax diagram is explained as follows:

If a function needs to return a result set, use **RETURN NEXT** or **RETURN QUERY** to add results to the result set and then continue executing the next statement of the function. As the **RETURN NEXT** or **RETURN QUERY** statement is executed repeatedly, more and more results are added to the result set. After the function finishes executing, all results are returned.

**RETURN NEXT** can be used for scalar and compound data types.

**RETURN QUERY** has a variant, **RETURN QUERY EXECUTE**, which lets you specify a dynamic query and pass parameters to it by using **USING**.

**Examples**

```sql
mogdb=# CREATE TABLE t1(a int);
mogdb=# INSERT INTO t1 VALUES(1),(10);

-- RETURN NEXT
mogdb=# CREATE OR REPLACE FUNCTION fun_for_return_next() RETURNS SETOF t1 AS $$
DECLARE
    r t1%ROWTYPE;
BEGIN
    FOR r IN select * from t1
    LOOP
        RETURN NEXT r;
    END LOOP;
    RETURN;
END;
$$ LANGUAGE PLPGSQL;
mogdb=# call fun_for_return_next();
 a
---
 1
 10
(2 rows)

-- RETURN QUERY
mogdb=# CREATE OR REPLACE FUNCTION fun_for_return_query() RETURNS SETOF t1 AS $$
DECLARE
    r t1%ROWTYPE;
BEGIN
    RETURN QUERY select * from t1;
END;
$$
language plpgsql;
mogdb=# call fun_for_return_query();
 a
---
 1
 10
(2 rows)
```

## Conditional Statements

Conditional statements are used to decide whether given conditions are met. Operations are executed based on the decisions made.

MogDB supports five usages of **IF**:

- IF_THEN

  **Figure 2** IF_THEN::=

  ![if_then](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/control-statements-4.jpg)

  **IF_THEN** is the simplest form of **IF**. If the condition is true, the statements are executed; if it is false, they are skipped.

  Example:

  ```sql
  mogdb=# IF v_user_id <> 0 THEN
      UPDATE users SET email = v_email WHERE user_id = v_user_id;
  END IF;
  ```

- IF_THEN_ELSE

  **Figure 3** IF_THEN_ELSE::=

  ![if_then_else](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/control-statements-5.jpg)

  **IF_THEN_ELSE** statements add an **ELSE** branch, which is executed if the condition is false.

  Example:

  ```sql
  mogdb=# IF parentid IS NULL OR parentid = ''
  THEN
      RETURN;
  ELSE
      hp_true_filename(parentid); -- Call the stored procedure.
  END IF;
  ```

- IF_THEN_ELSE IF

  **IF** statements can be nested in the following way:

  ```sql
  mogdb=# IF sex = 'm' THEN
      pretty_sex := 'man';
  ELSE
      IF sex = 'f' THEN
          pretty_sex := 'woman';
      END IF;
  END IF;
  ```

  This is effectively an **IF** statement nested in the **ELSE** part of another **IF** statement. Therefore, an **END IF** statement is required for each nested **IF** statement, and another **END IF** statement is required to end the parent **IF-ELSE** statement.
  To set multiple options, use the following form:

- IF_THEN_ELSIF_ELSE

  **Figure 4** IF_THEN_ELSIF_ELSE::=

  ![if_then_elsif_else](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/control-statements-6.png)

  Example:

  ```sql
  IF number_tmp = 0 THEN
      result := 'zero';
  ELSIF number_tmp > 0 THEN
      result := 'positive';
  ELSIF number_tmp < 0 THEN
      result := 'negative';
  ELSE
      result := 'NULL';
  END IF;
  ```

- IF_THEN_ELSEIF_ELSE

  **ELSEIF** is an alias of **ELSIF**.

  Example:

  ```sql
  CREATE OR REPLACE PROCEDURE proc_control_structure(i in integer)
  AS
  BEGIN
      IF i > 0 THEN
          raise info 'i:% is greater than 0. ',i;
      ELSIF i < 0 THEN
          raise info 'i:% is smaller than 0. ',i;
      ELSE
          raise info 'i:% is equal to 0. ',i;
      END IF;
      RETURN;
  END;
  /

  CALL proc_control_structure(3);

  -- Delete the stored procedure.
  DROP PROCEDURE proc_control_structure;
  ```

## Loop Statements

**Simple LOOP Statements**

The syntax diagram is as follows:

**Figure 5** loop::=

![loop](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/control-statements-7.png)

**Example**

```sql
CREATE OR REPLACE PROCEDURE proc_loop(i in integer, count out integer)
AS
BEGIN
    count := 0;
    LOOP
        IF count > i THEN
            raise info 'count is %. ', count;
            EXIT;
        ELSE
            count := count + 1;
        END IF;
    END LOOP;
END;
/

CALL proc_loop(10,5);
```

> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-notice.gif) **NOTICE:**
> The loop must be used together with **EXIT**; otherwise, an endless loop occurs.

**WHILE-LOOP Statements**

**Syntax diagram**

**Figure 6** while_loop::=

![while_loop](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/control-statements-8.png)

As long as the conditional expression is true, the statements in the WHILE loop body are executed repeatedly. The condition is evaluated before each execution of the loop body.

**Example**

```sql
CREATE TABLE integertable(c1 integer);
CREATE OR REPLACE PROCEDURE proc_while_loop(maxval in integer)
AS
DECLARE
    i int := 1;
BEGIN
    WHILE i < maxval LOOP
        INSERT INTO integertable VALUES(i);
        i := i + 1;
    END LOOP;
END;
/

-- Invoke the stored procedure:
CALL proc_while_loop(10);

-- Delete the stored procedure and table.
DROP PROCEDURE proc_while_loop;
DROP TABLE integertable;
```

**FOR_LOOP (Integer variable) Statement**

**Syntax diagram**

**Figure 7** for_loop::=

![for_loop](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/control-statements-9.png)

> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:**
>
> - The variable **name** is automatically defined as the **integer** type and exists only in this loop. Its value falls between lower_bound and upper_bound.
> - When the keyword **REVERSE** is used, the lower bound must be greater than or equal to the upper bound; otherwise, the loop body is not executed.

**FOR_LOOP Query Statements**

**Syntax diagram**

**Figure 8** for_loop_query::=

![for_loop_query](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/control-statements-10.png)

> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:**
> The variable **target** is automatically defined, its type is the same as that of the **query** result, and it is valid only in this loop. The target value is the query result.
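
As a concrete illustration of the note above, here is a minimal sketch of a query-based FOR loop. The table fruit_t1, the procedure proc_for_loop_query, and the loop variable rec are hypothetical names introduced only for this sketch, and the loop variable is declared explicitly as a plain VARCHAR2 scalar:

```sql
-- A minimal sketch of a query-based FOR loop.
CREATE TABLE fruit_t1(name VARCHAR2(20));
INSERT INTO fruit_t1 VALUES('apple'),('pear');

CREATE OR REPLACE PROCEDURE proc_for_loop_query()
AS
    rec VARCHAR2(20);
BEGIN
    -- rec receives the value returned by the query, one row per iteration.
    FOR rec IN SELECT name FROM fruit_t1 LOOP
        raise info 'fruit name is %', rec;
    END LOOP;
END;
/

-- Invoke the stored procedure, then clean up.
CALL proc_for_loop_query();
DROP PROCEDURE proc_for_loop_query;
DROP TABLE fruit_t1;
```

Each call prints one info message per row in fruit_t1.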

**FORALL Batch Query Statements**

**Syntax diagram**

**Figure 9** forall::=

![forall](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/control-statements-11.png)

> ![img](https://cdn-mogdb.enmotech.com/docs-media/icon/icon-note.gif) **NOTE:**
> The variable **index** is automatically defined as the **integer** type and exists only in this loop. The index value falls between low_bound and upper_bound.

**Example**

```sql
CREATE TABLE hdfs_t1 (
  title NUMBER(6),
  did VARCHAR2(20),
  data_period VARCHAR2(25),
  kind VARCHAR2(25),
  interval VARCHAR2(20),
  time DATE,
  isModified VARCHAR2(10)
);

INSERT INTO hdfs_t1 VALUES( 8, 'Donald', 'OConnell', 'DOCONNEL', '650.507.9833', to_date('21-06-1999', 'dd-mm-yyyy'), 'SH_CLERK' );

CREATE OR REPLACE PROCEDURE proc_forall()
AS
BEGIN
    FORALL i IN 100..120
        update hdfs_t1 set title = title + 100*i;
END;
/

-- Invoke the stored procedure:
CALL proc_forall();

-- Query the invocation result of the stored procedure.
SELECT * FROM hdfs_t1;

-- Delete the stored procedure and table.
DROP PROCEDURE proc_forall;
DROP TABLE hdfs_t1;
```

## Branch Statements

**Syntax**

[Figure 10](#case_when) shows the syntax diagram for a branch statement.

**Figure 10** case_when::=

![case_when](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/control-statements-12.png)

[Figure 11](#when_clause) shows the syntax diagram for **when_clause**.

**Figure 11** when_clause::=

![when_clause](https://cdn-mogdb.enmotech.com/docs-media/mogdb/developer-guide/control-statements-13.png)

Parameter description:

- **case_expression**: specifies the variable or expression.
- **when_expression**: specifies the constant or conditional expression.
- **statement**: specifies the statement to be executed.

**Examples**

```sql
CREATE OR REPLACE PROCEDURE proc_case_branch(pi_result in integer, pi_return out integer)
AS
BEGIN
    CASE pi_result
        WHEN 1 THEN
            pi_return := 111;
        WHEN 2 THEN
            pi_return := 222;
        WHEN 3 THEN
            pi_return := 333;
        WHEN 6 THEN
            pi_return := 444;
        WHEN 7 THEN
            pi_return := 555;
        WHEN 8 THEN
            pi_return := 666;
        WHEN 9 THEN
            pi_return := 777;
        WHEN 10 THEN
            pi_return := 888;
        ELSE
            pi_return := 999;
    END CASE;
    raise info 'pi_return : %', pi_return;
END;
/

CALL proc_case_branch(3,0);

-- Delete the stored procedure.
DROP PROCEDURE proc_case_branch;
```

## NULL Statements

In PL/SQL programs, **NULL** statements are used to indicate that nothing should be done; they act as placeholders. They make otherwise empty branches explicit and improve program readability.

**Syntax**

The following shows example use of **NULL** statements.

```sql
DECLARE
    ...
BEGIN
    ...
    IF v_num IS NULL THEN
        NULL; -- No data needs to be processed.
    END IF;
END;
/
```

## Error Trapping Statements

By default, any error occurring in a PL/SQL function aborts execution of the function, and indeed of the surrounding transaction as well. You can trap errors and recover from them by using a **BEGIN** block with an **EXCEPTION** clause. The syntax is an extension of the normal syntax for a **BEGIN** block:

```sql
[<