# powervs_watsonx_APIbased_nlp2sql_toolkit **Repository Path**: mirrors_ibm/powervs_watsonx_APIbased_nlp2sql_toolkit ## Basic Information - **Project Name**: powervs_watsonx_APIbased_nlp2sql_toolkit - **Description**: API-based Toolkit for leveraging watsonx with Power Virtual Servers - **Primary Language**: Unknown - **License**: Apache-2.0 - **Default Branch**: main - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2025-06-04 - **Last Updated**: 2026-05-05 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # NLP2SQL Toolkit: watsonx™ and IBM® Power® Virtual Server With the ready-to-use NLP2SQL Toolkit, businesses can simplify data analysis. Whether it's a sales manager, assessing product performance, or a finance team monitor revenue trend, this AI-powered tool makes data analysis more accessible, efficient, and actionable for all.
The Toolkit converts simple text-based questions such as What were the top-selling products last quarter? - into SQL queries that retrieve the necessary data. The use of this toolkit eliminates reliance on technical teams, speeding up decision-making processes and significantly cutting down on the time that is required to implement these new features to fruition. ## Reference Architecture

The above reference architecture diagram illustrates the Toolkit architecture for NLP2 SQL, highlighting its modular design and key considerations. RedHat OpenShift Container Platform is optional, and Toolkit can be installed directly on RHEL as explained in further sections. The overall structure is divided into several components: - Databases which have mission critical data on Power VS - An Enterprise Application for example a core banking enterprise application - API-based NLP2SQL Toolkit and - watsonx AI Services. Databases supported include SAP HANA, Oracle and Postgres SQL which interact with the Enterprise Application. The Toolkit comprises three main layers: - Gen AI Asst - UI layer - API Layer - Database Layer. These layers facilitate API calls and NLP queries, converting natural language inputs into SQL statements. The watsonx.ai services, provided by IBM Cloud SaaS, include foundation models, prompt lab, and watsonx machine learning, which support the Toolkit by deploying pre-packaged LLM models and tuning them as needed. Overall, the diagram conveys a modular and scalable architecture designed to integrate watsonx.ai and PowerVS functionalities seamlessly. ## Installation **Step 1:**
Login to VM , git clone the toolkit repo : (https://github.com/IBM/powervs_watsonx_APIbased_nlp2sql_toolkit)
**Step 2:**
Ensure Python3.8+ and pip is installed

``#python -version``
OR
``#python3 -version``
``#pip version``
If Python or pip is not installed, download Python from python.org and pip typically come bundled with it. **Step 3:** Install packages from requirements.txt(present in the code) ``#pip install -r requirements.txt``

Working with different database systems in Python, specific adapters and extension modules are required to establish connections and run database operations. - **psycopg2:** A database adapter that follows the DB API 2.0 standard, which is designed specifically for PostgreSQL. It is essential for interacting with PostgreSQL databases.
- **oracledb:** A Python extension module that enables seamless connections to Oracle databases, allowing efficient data access and manipulation.
- **hdbcli:** A dedicated Python extension module for SAP HANA, facilitating integration and database operations.

By default, the Toolkit supports all three databases: Oracle, PostgreSQL, and SAP HANA. If your project does not involve PostgreSQL, Oracle, or SAP HANA, you can exclude psycopg2, oracledb, or hdbcli from the requirements.txt file, keeping dependencies minimal and relevant.
**Step 4:** Ensure all packages were installed correctly by listing installed packages: ` #pip list` **Step 5:** Go to the folder “watsonx-integration-server” open the configuration files and update the following parameters

- [apiserver]
Port: Provide the port number at which the flask server must run. List of available ports on IBM Power Virtual Server : https://cloud.ibm.com/docs/power-iaas?topic=power-iaas-network-security - [llmurl]
url: Provide the LLM scoring endpoint deployed on watsonx - [apikey]
api_key: Create a personal API key here : https://cloud.ibm.com/iam/apikeys, and use it to create temporary access tokens.

The resp_config.json file defines the expected structured response format from an LLM that interacts with the toolkit. Defining the format allows an LLM to generate structured, machine-readable responses, ensuring easy integration with API layer.
type: "agent": Indicates that the response is coming from an AI agent.
sections: A list that contains different types of response elements.
- First section:

- type: "text" → This section contains textual data.
- data: A string message informing the user about retrieved transactions.
- Second section:

- type: "table" → This section is meant to hold tabular data.
- data: [] (Empty array) → In case no transactions were found.

The Json structure here constitutes the body of the request sent to watsonx.ai service . Below are the Key description : -
- input: Contains a text prompt formatted in a specific syntax indicating roles and their inputs. Can include Database schema with sample NLP statement and equivalent SQL Query
- parameters: This object contains various parameters for the text generation process:
- decoding_method: The method used to generate the text. In this case, it's set to "greedy".
- max_new_tokens: The maximum number of new tokens (words) to generate. Here, it's set to 100.
- repetition_penalty: A value that discourages the model from repeating the same text. Here, it's set to 1.
- model_id: The ID of the model to use for text generation
- project_id: The ID of the project associated with the model.
- moderations: This object contains settings for moderating the generated text. Here, it includes settings for handling sensitive information (PII) and harmful content (HAP). Both are set to mask any sensitive information with a threshold of 0.5.
**Step 6:** Go to the folder “database-integration-service” open the db configuration files and update accordingly

Below are the values for databases: - - dbtype = 1 for Oracle DB - dbtype = 2 for postgres - dbtype = 3 for HANA DB where
- User: username used to authenticate with the database. - Password: password associated with the username - Host: host / IP address where the database is located - Port: port number where the database is listening for connections - Dbname: Name of the database and in case of SAP HANA it also serves as the schema name Open the file "database_integrate.py" and comment the lines based on the databases you are not using (Oracle, PostgreSQL, or SAP HANA):

**Step 7:** Go to the folder “watsonx-integration-server” and run flask application ` #FLASK_APP=flask_api.py FLASK_RUN_HOST=0.0.0.0 FLASK_RUN_PORT=9476 flask run` Sample Output:

**Step 8:** To set up Gen AI Assistant follow the instructions in the below readme link https://github.com/IBM/powervs_watsonx_APIbased_nlp2sql_toolkit/blob/main/chatbot_ui/README.md