Text to SQL: Building an LLM-Powered Analytics Dashboard

In today’s world, data is being generated at an unprecedented rate, and extracting meaningful insights from this data has become a crucial task for businesses and organizations. Traditional analytics tools can often be complex, requiring technical expertise to understand and use effectively. But what if we could simplify this process, making it as easy as asking a question?

Imagine a tool that combines the power of natural language processing with the precision of structured data analytics. That’s exactly what we can achieve by building an LLM-powered analytics dashboard. By leveraging large language models (LLMs) like OpenAI’s GPT and integrating them with database querying capabilities, we can empower users to get valuable insights simply by asking questions in plain language.

In this blog, we’ll walk you through the process of building an LLM-powered analytics dashboard using Langchain, OpenAI’s GPT models, and a simple SQLite database. Whether you’re new to LLMs or just looking to enhance your existing data tools, this guide will help you create a powerful, intuitive interface for querying and analyzing data. The GitHub repository for this project is available at: https://github.com/cloudxlab/text-to-sql.

What is an LLM?

An LLM, or Large Language Model, is a type of artificial intelligence (AI) model designed to understand, generate, and process human-like text based on vast amounts of data it has been trained on. These models, such as OpenAI’s GPT (Generative Pre-trained Transformer), use advanced machine learning techniques, particularly deep learning, to predict and generate text based on given prompts.

Key Features of LLMs:

  1. Natural Language Understanding: LLMs can interpret and respond to natural language inputs, making them ideal for conversational interfaces, summarization, and more.
  2. Context Awareness: They maintain contextual coherence in conversations, enabling meaningful and contextually relevant responses.
  3. Versatility: LLMs can perform a wide range of tasks, including answering questions, generating text and code, creative writing, and assisting in research.

Application in Analytics Dashboards:

When integrated with analytics tools, LLMs enhance user experience by:

  • Allowing users to query data in plain language without requiring SQL or coding expertise.
  • Providing intuitive and human-like interactions.
  • Simplifying the process of extracting insights from structured and unstructured data.

For example, an LLM-powered analytics dashboard can translate a user’s plain-language question like “What were the top-selling products last quarter?” into a database query, retrieve the data, and present it in an easy-to-understand format. This integration democratizes data analytics, making it accessible to non-technical users.

What is an LLM-Powered Analytics Dashboard?

An LLM-Powered Analytics Dashboard is a smart solution that lets users ask questions in natural language, and in return, the system automatically generates and executes the corresponding SQL queries to retrieve the desired data. This creates a seamless, user-friendly interface that does not require any knowledge of SQL or database schema.

For example, instead of manually crafting SQL queries like:

sql Code:
SELECT COUNT("EmployeeId") FROM employees; 

Users can simply ask:

  • “How many employees are there currently?”
  • “Where does Andrew Adams live?”

The system will handle the rest, generating SQL queries, executing them, and providing an easy-to-understand response.
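To see how little glue this takes, here is a minimal end-to-end sketch using the raw OpenAI client (this assumes the openai package, version 1.x, is installed and OPENAI_API_KEY is set; the schema string is illustrative, and the rest of this post builds the same flow more robustly with Langchain):

python Code:
 import sqlite3
 from openai import OpenAI

 client = OpenAI()
 question = "How many employees are there currently?"
 # Illustrative schema snippet -- in practice you would pass the real one
 schema = "CREATE TABLE employees (EmployeeId INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);"

 def ask(prompt):
     resp = client.chat.completions.create(
         model="gpt-3.5-turbo",
         messages=[{"role": "user", "content": prompt}],
     )
     return resp.choices[0].message.content

 # 1. Generate the SQL, 2. execute it, 3. phrase the answer
 sql = ask(f"Schema:\n{schema}\n\nWrite a single SQLite query that answers: "
           f"{question}\nReply with SQL only, no explanation or code fences.")
 rows = sqlite3.connect("chinook.db").execute(sql).fetchall()
 print(ask(f"Question: {question}\nSQL: {sql}\nResult: {rows}\nAnswer concisely."))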

What is Query Generation with LLMs?

Query generation with LLMs involves using AI models like GPT to transform natural language questions into database queries, such as SQL. This capability allows users to interact with databases using plain language, eliminating the need for advanced SQL knowledge.

1. System Prompt:

The system prompt instructs the Large Language Model (LLM) on how to handle the user’s query. It provides the context that the LLM is a MySQL expert. The system prompt template is:

plaintext Code:
 "You are a MySQL expert. Given the database schema {database_schema} and below user's query, generate a SQL query. {user_question}"

This means that the LLM will:

  • Understand the structure of the database from the Database Schema.
  • Use this knowledge to interpret the User’s Query.
  • Generate a valid and accurate SQL Query.
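As a quick sketch, the same template can be expressed with Langchain’s PromptTemplate (the variable name system_prompt is our own):

python Code:
 from langchain.prompts import PromptTemplate

 # The template above, with the two placeholders made explicit
 system_prompt = PromptTemplate.from_template(
     "You are a MySQL expert. Given the database schema {database_schema} "
     "and below user's query, generate a SQL query. {user_question}"
 )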

2. Database Schema:

The Database Schema defines the structure of the database. In this example, the schema describes a table named Orders:

sql Code:
 CREATE TABLE Orders (
     OrderID INTEGER NOT NULL AUTO_INCREMENT,
     UserID INTEGER,
     OrderDate DATETIME NOT NULL,
     TotalAmount DECIMAL(10, 2) NOT NULL,
     PRIMARY KEY (OrderID),
     CONSTRAINT orders_ibfk_1 FOREIGN KEY(UserID) REFERENCES User(UserID)
 ); 

3. User’s Question:

The User’s Question specifies the information they are looking for. In this case:

plaintext Code:
"How many smartphones of model 10 were ordered in February?" 

This question implies:

  • Smartphone Model: Focus is on “model 10.”
  • Month: The month is February.
  • Count: The result should return the number of such orders.

4. Final Prompt:

The Final Prompt is a combination of the System Prompt, Database Schema, and User’s Question. It provides all necessary context for the LLM to generate the SQL query.

The Final Prompt combines:

  • System Prompt: The LLM’s role as a MySQL expert.
  • Database Schema: The structure of the database.
  • User’s Question: A natural language query, such as “How many smartphones of model 10 were ordered in February?”

The final prompt given to the LLM is:

plaintext Code:
System Prompt + Database Schema + User's question

Final Prompt:

plaintext Code:
 "You are a MySQL expert. Given the database schema:
 
 CREATE TABLE `Orders` (
     OrderID INTEGER NOT NULL AUTO_INCREMENT,
     UserID INTEGER,
     OrderDate DATETIME NOT NULL,
     TotalAmount DECIMAL(10, 2) NOT NULL,
     PRIMARY KEY (OrderID),
     CONSTRAINT orders_ibfk_1 FOREIGN KEY(UserID) REFERENCES User(UserID)
 );
 
 and below user's query, generate a SQL query.
 
 User's question: 'How many smartphones of model 10 were ordered in February?'"

5. LLM:

The LLM (Large Language Model) processes the Final Prompt to generate an appropriate SQL query based on:

  • Its understanding of the database schema.
  • The user’s question.
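Putting the pieces together in code, a sketch of this step might look like the following (system_prompt is the PromptTemplate defined in the sketch above, and orders_schema is simply the CREATE TABLE statement from section 2):

python Code:
 from langchain_openai import ChatOpenAI

 # The schema from section 2, passed to the prompt as a plain string
 orders_schema = """CREATE TABLE Orders (
     OrderID INTEGER NOT NULL AUTO_INCREMENT,
     UserID INTEGER,
     OrderDate DATETIME NOT NULL,
     TotalAmount DECIMAL(10, 2) NOT NULL,
     PRIMARY KEY (OrderID),
     CONSTRAINT orders_ibfk_1 FOREIGN KEY(UserID) REFERENCES User(UserID)
 );"""

 final_prompt = system_prompt.format(
     database_schema=orders_schema,
     user_question="How many smartphones of model 10 were ordered in February?",
 )
 llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
 print(llm.invoke(final_prompt).content)  # the generated SQL query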

6. SQL Query:

The SQL Query is the output generated by the LLM:

sql Code:
 SELECT COUNT(*) AS NumberOfSmartphones
 FROM Orders o
 JOIN OrderDetails od ON o.OrderID = od.OrderID
 JOIN Products p ON od.ProductID = p.ProductID
 WHERE p.ProductModel = 10
   AND MONTH(o.OrderDate) = 2
   AND YEAR(o.OrderDate) = YEAR(CURRENT_DATE);

Notice that the generated query joins OrderDetails and Products tables that were not part of the schema we supplied; the Orders table alone cannot answer the question, so the model guessed at the missing structure. In practice, you should include every relevant table in the schema portion of the prompt, so the generated query only references tables that actually exist.

Setting Up the Environment

Before diving into the code, let’s first set up the necessary environment. You’ll need to install Langchain and OpenAI’s integration. You can do this easily using the following command:

bash Code:
 !pip install --upgrade --quiet langchain langchain-community langchain-openai 

After installation, set up your environment by importing the required libraries and configuring the connection to your OpenAI API key.

python Code:
 import openai_config
 import os
 os.environ["OPENAI_API_KEY"] = openai_config.OPENAI_API_KEY 

Here, openai_config is a small local module that holds your actual OpenAI API key (alternatively, you can set the OPENAI_API_KEY environment variable directly).
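A minimal version of that module might look like this; keep the file out of version control:

python Code:
 # openai_config.py -- your own local file, not a published package
 OPENAI_API_KEY = "sk-..."  # paste your real API key here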

Connecting to the Database

For this demonstration, we’ll be using an SQLite database (chinook.db), which contains a variety of data, such as artists, albums, customers, employees, invoices, and more. Langchain’s SQLDatabase utility helps connect to the database and manage queries.

python Code:
 from langchain_community.utilities import SQLDatabase
 
 db = SQLDatabase.from_uri("sqlite:///chinook.db")
 print(db.dialect)
 print(db.get_usable_table_names())
 db.run("SELECT * FROM Artists LIMIT 10;") 

Here, we print the SQL dialect, list the available tables, and fetch a few sample rows from the artists table to confirm that the connection works.
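If you want to see exactly what schema text the LLM will receive for a given table, SQLDatabase can print it. A quick sketch (pass a name exactly as it appears in db.get_usable_table_names()):

python Code:
 # Show the CREATE TABLE statement plus a few sample rows for one table
 print(db.get_table_info(table_names=["artists"]))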

Building the Query Chain with Langchain

Next, we need to create a query chain that translates user input into SQL queries. This is done using Langchain’s create_sql_query_chain function, which integrates an LLM-based model (e.g., OpenAI’s GPT) to help translate natural language questions into SQL queries.

python Code:
 from langchain.chains import create_sql_query_chain
 from langchain_openai import ChatOpenAI
 from langchain_community.llms import OpenAI

 llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0, verbose=True)
 chain = create_sql_query_chain(llm, db)

The create_sql_query_chain function takes care of translating the natural-language question into a SQL query; executing that query and turning the result into a readable answer are separate steps, which we wire up below.
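You can try the chain on its own before adding execution; as a quick sketch, it returns only the SQL text:

python Code:
 # The chain produces a SQL string; nothing is executed yet
 query = chain.invoke({"question": "How many employees are there currently?"})
 print(query)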

Understanding the Generated Prompts

Langchain provides useful prompts that guide the LLM in generating the right SQL queries. Here’s an example of a prompt that helps the model understand how to interact with the database.

python Code:
 chain.get_prompts()[0].pretty_print() 

This prompt essentially tells the LLM to create syntactically correct SQLite queries and use the results to generate human-readable responses:

plaintext Code:
You are a SQLite expert. Given an input question, first create a syntactically correct SQLite query to run, then look at the results of the query and return the answer to the input question.

Answering User Questions with SQL

Once the chain is in place, it can respond to user queries. For instance, when a user asks, “How many employees are there currently?”, the system generates the SQL query:

sql Code:
 SELECT COUNT("EmployeeId") AS "TotalEmployees" FROM employees;

The query is executed on the database, and the result, such as 8, is processed by the LLM, which then returns a response like: “There are currently 8 employees.”

Here’s how the interaction works:

python Code:
 invoke_chain("How many employees are there currently?") 

This would output:

plaintext Code:
 SQL query: SELECT COUNT("EmployeeId") AS "TotalEmployees" FROM employees;
 Result: [(8,)]
 Response: There are currently 8 employees. 

Handling More Complex Questions

The LLM can also handle more complex queries. For example, if a user asks, “Where does Andrew Adams live?”, the system generates a query that filters the employees table based on the name and fetches the relevant fields.

python Code:
 invoke_chain("Where does Andrew Adams live?") 

This would result in:

plaintext Code:
 SQL query: SELECT "Address", "City", "State", "Country" FROM employees WHERE "FirstName" = 'Andrew' AND "LastName" = 'Adams' LIMIT 1;
 Result: [('11120 Jasper Ave NW', 'Edmonton', 'AB', 'Canada')]
 Response: Andrew Adams lives at 11120 Jasper Ave NW, Edmonton, AB, Canada. 

Full Code for Invoking the Chain

Here’s the complete code for generating SQL queries and returning user-friendly answers:

python Code:
 from langchain.prompts import PromptTemplate
 from langchain.chains import LLMChain
 from langchain_community.llms import OpenAI

 def invoke_chain(user_question):
     # Generate the SQL query from the user's question
     sql_query = chain.invoke({"question": user_question})
     print("SQL query:", sql_query, end="\n\n")

     # Execute the SQL query against the database and get the result
     result = db.run(sql_query)
     print("Result:", result, end="\n\n")

     # Create a prompt template to generate a response from the SQL result
     answer_prompt = PromptTemplate.from_template(
         """Given the following user question, corresponding SQL query, and SQL result, generate a proper reply to give to the user

         Question: {question}
         SQL Query: {query}
         SQL Result: {result}
         Answer: """
     )

     # Generate the answer with an LLMChain (named answer_chain so it does
     # not shadow the llm variable that holds the ChatOpenAI model above)
     answer_chain = LLMChain(llm=OpenAI(), prompt=answer_prompt)
     ans = answer_chain.invoke(
         {"question": user_question, "query": sql_query, "result": result}
     )

     # Print the final response
     print("Response:", ans["text"])

Conclusion

Building an LLM-Powered Analytics Dashboard combines the best of AI and traditional database querying. By integrating Langchain and OpenAI’s GPT models, you can create a powerful tool that allows users to interact with data using natural language, eliminating the need to understand SQL syntax or database structures. This solution offers a streamlined approach to business analysis, customer support, or general data exploration, making data-driven decisions more accessible and efficient for all.

By enabling faster and simpler access to valuable insights, this dashboard makes interacting with complex data systems easier and more intuitive for non-technical users.

Revolutionizing Banking: The Power of Personalization and Intelligent Analytics

The banking world is evolving rapidly, shifting from generic, one-size-fits-all services to personalized experiences tailored to each customer. With the help of modern technologies like AI, big data, and automation, banks are delivering services that feel more human and intuitive—building stronger connections with their customers.

In this blog, we’ll explore how banks are transforming themselves with personalization technologies, the benefits of these innovations, and how banks can implement personalized experiences effectively.

What Is Personalization in Banking?

Personalization in banking is about delivering services and products that match individual customers’ needs, preferences, and goals. Instead of bombarding customers with irrelevant offers, banks now focus on understanding their unique journeys to provide meaningful solutions.

For example:

  • Recommending savings plans for customers who aim to buy a home.
  • Offering travel credit cards to frequent flyers.
  • Alerting customers about duplicate subscriptions to help them save money.

This approach ensures customers feel valued and supported, making banking more enjoyable and efficient.

Continue reading “Revolutionizing Banking: The Power of Personalization and Intelligent Analytics”

Intelligent Email Response: Transforming Customer Service with AI

Imagine you have a big toy box with lots of toys scattered everywhere. It takes a lot of time to sort them out, right? Businesses have something like that too, but instead of toys, they get tons of emails every single day! Some emails are like questions, some are like complaints, and some are friendly suggestions. It’s hard to go through all of them one by one. But guess what? AI, like a super-smart robot helper, can make this easy!

The Problem with Manual Email Handling

Now, let’s say a bank gets 5,000 emails in just one day. That’s like a huge mountain of messages! If people try to read and reply to all of them, it will take forever. Customers might get tired of waiting, and some might even leave because they’re upset. Also, people make mistakes sometimes. They might not understand the email properly or reply in a way that doesn’t help much. This could make customers even more unhappy and might make them leave the bank.

But don’t worry—there’s a cool way to fix this! Wanna know how? 😊

Continue reading “Intelligent Email Response: Transforming Customer Service with AI”

AI-Powered Innovations for a Sustainable Supply Chain Future

The supply chain is the backbone of global commerce, connecting manufacturers, suppliers, distributors, and consumers. However, traditional supply chain models are under increasing pressure to become more sustainable, efficient, and resilient in response to environmental challenges, consumer demands, and economic disruptions. Artificial Intelligence (AI) is emerging as a game-changer, revolutionizing supply chains to meet these demands while fostering sustainability.

The Challenges with Traditional Supply Chains

Traditional supply chains face significant challenges in achieving sustainability and efficiency due to:

  • Inefficient Resource Utilization: Overproduction, waste, and underutilization of resources increase environmental footprints.
  • Limited Visibility: Lack of end-to-end transparency makes it difficult to identify inefficiencies or environmental impacts.
  • Demand Uncertainty: Fluctuating consumer demands lead to inventory mismatches and wastage.
  • Complex Logistics: Managing global supply chains with multiple stakeholders is prone to errors and delays.
  • Environmental Regulations: Increasingly stringent policies demand green practices and accountability.

These challenges necessitate innovative approaches, and AI is stepping in to transform supply chains into sustainable and efficient ecosystems.

How AI is Transforming Supply Chains

Continue reading “AI-Powered Innovations for a Sustainable Supply Chain Future”

Generative AI: Revolutionizing Drug Discovery and Development

In this blog, we’ll explore how Generative Artificial Intelligence (AI) is transforming the drug discovery process, making it faster, more accurate, and more cost-effective. Developing new drugs has always been a complex, costly, and time-consuming journey. From the initial research stages to the final approval of a drug, the entire process can take over a decade and cost billions of dollars. But with the rise of AI, the pharmaceutical industry is experiencing a major shift. AI is helping researchers uncover new treatments faster, streamline the discovery process, and improve the chances of success—ultimately benefiting both patients and the healthcare system.

Challenges in Traditional Drug Discovery

Drug discovery and development have long been expensive and slow due to several key challenges:

Continue reading “Generative AI: Revolutionizing Drug Discovery and Development”

Personalized Financial Planning Made Easy with AI-Powered Bots

Managing your money can feel like a big puzzle. You might wonder, “How much should I save?” or “Are my investments enough for the future?” These are important questions, but finding answers often means talking to financial experts. This can take a lot of time and money. What if there was a simpler, easier way?

In this blog, we’ll discuss how a smart digital helper called an AI-powered bot can make financial planning easy by giving tips tailored just for you.

Why a Bot for Financial Planning?

Think of this bot as your friendly money assistant. It helps you decide how much to save and invest to live comfortably now and after retirement. You don’t need to be a math genius or a financial expert.

Continue reading “Personalized Financial Planning Made Easy with AI-Powered Bots”

How AI is Revolutionizing Cybersecurity: Detecting and Preventing Modern Threats

Cybersecurity has become one of the biggest concerns in today’s digital age. From online banking and shopping to social media and cloud storage, we depend heavily on the internet for almost everything. However, this reliance comes with risks—hackers, malware, phishing attacks, and data leaks are becoming more advanced every day. Fortunately, Artificial Intelligence (AI) is stepping in to make the digital world safer.

The Challenges with Traditional Cybersecurity

Cybersecurity traditionally depended on human experts and predefined rules to detect and stop threats. While this approach worked for many years, it struggles to keep up with today’s cybercriminals. Here’s why:

  • Evolving Threats: Hackers are constantly creating new types of attacks that traditional systems can’t recognize.
  • Massive Data: The huge volume of data generated every second makes it impossible for humans to monitor everything manually.
  • Speed of Attacks: Cyberattacks happen in seconds, leaving little time for manual responses.
  • Hidden Threats: Advanced malware often hides within normal-looking traffic, making detection harder.

Continue reading “How AI is Revolutionizing Cybersecurity: Detecting and Preventing Modern Threats”

Personalized Customer Service: The Power of Chatbots and Virtual Assistants

Providing excellent customer service has always been a cornerstone of successful businesses. But in today’s fast-paced world, customers expect immediate, personalized, and accurate responses. This is where AI-powered chatbots and virtual assistants come into play. With the advancements in generative AI architectures, such as Retrieval-Augmented Generation (RAG), industries are now able to tailor these technologies for specific needs, offering unmatched personalization and efficiency.

The Problem with Traditional Customer Service

Continue reading “Personalized Customer Service: The Power of Chatbots and Virtual Assistants”

How AI is Revolutionizing Claims Management and Personalized Auto Insurance

Managing insurance claims is often seen as a complicated and lengthy process. Insurance companies receive numerous claims daily, covering everything from vehicle accidents and medical expenses to property damage. Handling these claims manually can result in delays, errors, and fraud. Artificial Intelligence can simplify the process.

The Problem with Traditional Claims Management

When you make an insurance claim, here’s what usually happens:

  1. You submit your documents (medical bills, photos of damage, etc.).
  2. The insurance company reviews everything manually—a process that can take weeks.
  3. They assess your claim to determine if it’s valid and how much money should be paid.
  4. The claim is either approved or rejected.

While this process sounds straightforward, it’s full of challenges, such as:

  1. It’s Slow: Manually going through forms, photos, and receipts takes a great deal of time.
  2. It’s Expensive: Insurance companies need big teams to process claims.
  3. It’s Prone to Errors: Humans can make mistakes when reviewing claims.
  4. It’s Vulnerable to Fraud: Detecting fake claims is difficult without proper tools.

All these issues make it clear that insurance companies need smarter solutions—and that’s where AI comes into the picture.

How AI is Solving These Challenges

Continue reading “How AI is Revolutionizing Claims Management and Personalized Auto Insurance”

Building Real-Time Analytics Dashboard Using Apache Spark

In this blog post, we will learn how to build a real-time analytics dashboard using Apache Spark streaming, Kafka, Node.js, Socket.IO and Highcharts.

Complete the Spark Streaming topic on CloudxLab to refresh your Spark Streaming and Kafka concepts and get the most out of this guide.

Problem Statement

An e-commerce portal (http://www.aaaa.com) wants to build a real-time analytics dashboard to visualize the number of orders getting shipped every minute to improve the performance of their logistics.

Solution

Before working on the solution, let’s take a quick look at all the tools we will be using:

Apache Spark – A fast and general engine for large-scale data processing. It runs up to 100x faster than Hadoop MapReduce in memory and up to 10x faster on disk. Learn more about Apache Spark here

Python – Python is a widely used high-level, general-purpose, interpreted, dynamic programming language. Learn more about Python here

Kafka – A high-throughput, distributed, publish-subscribe messaging system. Learn more about Kafka here

Node.js – Event-driven I/O server-side JavaScript environment based on V8. Learn more about Node.js here

Socket.IO – Socket.IO is a JavaScript library for real-time web applications. It enables real-time, bi-directional communication between web clients and servers. Read more about Socket.IO here

Highcharts – Interactive JavaScript charts for web pages. Read more about Highcharts here

CloudxLab – Provides a real cloud-based environment for practicing and learning various tools. You can start practicing right away by just signing up online.

How To Build A Data Pipeline?

Below is the high-level architecture of the data pipeline:

Data Pipeline

Our real-time analytics dashboard will look like this:

Real-Time Analytics Dashboard

Continue reading “Building Real-Time Analytics Dashboard Using Apache Spark”