Overcoming AI Hallucinations Using Knowledge Graphs

[Illustration: a robotic mind, a confused person, and a graph network representing AI hallucinations]

Artificial Intelligence systems like Large Language Models (LLMs) have captured global attention since the launch of ChatGPT, even though LLMs themselves have been around much longer. These systems now power everything from chatbots and content generation to brainstorming ideas and writing code. However, as these models become more sophisticated, so does their potential for errors. 

Recently, there have been many cases of large language models like ChatGPT generating inaccurate reports, asserting incorrect facts, and reproducing real-world biases in their answers. For example, the following tweet shows ChatGPT writing code that reflects gender and race biases.

Source: Twitter

This has resulted in growing concerns surrounding the reliability of large language models. 

In this article, we will explore the problems with ChatGPT and other large language models and discuss how knowledge graphs can help to improve upon them to deliver reliable AI-generated outputs.

What Are AI Hallucinations?

AI hallucinations refer to a limitation of AI systems that causes them to generate outputs that sound plausible but are not grounded in reality or are inconsistent with real-world knowledge. In extreme cases, the outputs can be completely unreasonable and nonsensical. For instance, when a healthcare company called Nabla tested an OpenAI GPT-3 chatbot for medical advice, the chatbot suggested that a mock patient kill themselves.

These hallucinations occur when models generate outputs based on statistical patterns in the data without fully understanding the underlying meaning or context, resulting in nonsensical outcomes. In some cases, they can also result from training on biased or incomplete data, leading AI systems to make assumptions or draw conclusions that do not align with reality. Recently, Google launched a rival chatbot to ChatGPT called Bard. However, Bard got off to a rocky start after making a factual error during its first demo, wiping around $100 billion off Google's market value as its shares tumbled. 

AI hallucinations are a significant challenge in developing reliable and trustworthy AI systems, particularly in applications where accuracy is critical.

What Are Large Language Models?

Large Language Models (LLMs) are machine learning models trained to interpret, translate, generate, and summarize natural language text. These models are typically hundreds of gigabytes in size, with parameter counts running into the hundreds of billions.

These models use deep neural networks to learn from extensive training data and generate appropriate outputs. By implementing the self-attention mechanism of the transformer architecture, a large language model can relate words in a sentence to each other, even when they are far apart in the sequence. The relationships between words are captured by computing and comparing attention scores for every word in a text sequence. The training data for large language models is accumulated from multiple sources depending on the model, such as books, the open internet, articles, social media, or research papers. 
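The attention-score step described above can be sketched in a few lines. This is a minimal, single-head illustration using NumPy only; real transformers derive queries, keys, and values from learned projections and run many attention heads in parallel.

```python
import numpy as np

def self_attention(X):
    """Minimal single-head self-attention over token embeddings X (tokens x dims).

    Real transformers compute queries, keys, and values from learned
    projections of X; here X is used directly to keep the sketch small.
    """
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)                   # pairwise attention scores
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # softmax over each row
    return weights @ X                              # context-aware representations

# Three toy "token" embeddings: the first two are similar, the third differs.
tokens = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
contextualized = self_attention(tokens)
```

Each output row is a weighted mixture of all token embeddings, which is how every word's representation comes to reflect the words around it.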

Large language models are used in a wide range of applications such as conversational AI, content creation engines, search engines, customer service agents, and more. They are a powerful innovation designed to automate and enhance natural language processing tasks.

What Is Wrong With Large Language Models?

While one cannot deny the sophistication and fluency of large language models like ChatGPT, it is critical not to rely completely on their results because they have a tendency to hallucinate. Models like ChatGPT often cannot support factual claims with hard evidence from up-to-date, verifiable sources. As a result, the model hallucinates and generates inaccurate or outdated responses.

Fundamentally, one of the significant limitations of Large Language Models (LLMs) is their lack of non-linguistic knowledge and common sense reasoning. According to Yann LeCun, an AI researcher, “LLMs do not have a clear understanding of the underlying reality that language describes, and most human knowledge is non-linguistic.” This means that while LLMs can generate grammatically and semantically sound text, they lack grounded experience of the real world. This makes it difficult for them to generate accurate outputs, particularly when dealing with complex and nuanced topics that require real-world observation. For example, humans learn to ski through practical trial and error rather than by studying theories of skiing.


Figure: How machines need to incorporate symbols alongside text to replicate human reasoning. Source: Discrete and continuous representations and processing in deep learning: Looking forward

As a result, the usefulness of LLMs in generating precise outputs remains limited, and hallucinations can have serious consequences in mission-critical industries like healthcare, finance, and national security. For example, an LLM that generates factually inaccurate medical information may lead to an incorrect diagnosis and cost human lives. A hallucinating LLM could deliver inaccurate legal analysis to a financial company, leading to decisions that incur significant losses. Or an LLM could help a cybercriminal generate phishing emails to gain unauthorized access to secure military systems, endangering national security.

What Is a Knowledge Graph?

A knowledge graph is a representation of knowledge as a network of nodes and edges, where nodes depict real-world entities and edges depict the relationships between them. Knowledge graphs give context and meaning to structured and unstructured data, making it understandable by both humans and machines. All the information about a knowledge graph's entities and relationships is stored in a graph database, which serves as the graph's knowledge base. 
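As a concrete illustration, the node-and-edge structure above can be modeled as a set of (subject, relation, object) triples. The entities and relations below are made up for the example; production systems use a dedicated graph database and a query language such as SPARQL or Cypher rather than a Python list.

```python
# A tiny in-memory knowledge graph stored as (subject, relation, object) triples.
triples = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "is_a", "NSAID"),
    ("ibuprofen", "is_a", "NSAID"),
    ("NSAID", "may_cause", "stomach irritation"),
]

def query(subject=None, relation=None, obj=None):
    """Return all triples matching a (possibly partial) pattern."""
    return [t for t in triples
            if (subject is None or t[0] == subject)
            and (relation is None or t[1] == relation)
            and (obj is None or t[2] == obj)]

# Which entities are NSAIDs?
nsaids = [s for s, _, _ in query(relation="is_a", obj="NSAID")]
```

Because both entities and relationships are explicit, a machine can traverse the graph (aspirin → is_a → NSAID → may_cause → stomach irritation) to derive facts that no single record states directly.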

Knowledge graphs integrate various sources and map relationships across any data store to help organizations retrieve meaningful facts from organizational data and discover new facts through modern data analysis. Their ability to manage the fluctuating nature of real-world data enables them to adapt to changing data, which makes them an essential tool in discovering hidden insights from data.

How Can Knowledge Graphs and LLMs Work Together?

Combining knowledge graphs and large language models can provide a powerful solution to the limitations of an LLM’s linguistic knowledge and can potentially solve the hallucination problem to improve the accuracy of query results. 

Integrating a knowledge graph with a large language model involves incorporating a contextual knowledge base into the model, allowing the model to make logical connections between concepts. This enables the large language model to draw on a variety of information sources, including structured and unstructured data, to generate more accurate and relevant outputs. Moreover, it allows the model to reason with greater depth of understanding and generate more meaningful text.

For example, a biopharma company wants to improve its drug discovery. The organization may want to implement an LLM-based chatbot that can intuitively answer inquiries about clinical trials. However, the LLM may not have access to all the necessary information to provide accurate answers.

To address this issue, the company combines its LLM with a knowledge graph engine to create a detailed medical knowledge base that includes structured and unstructured information about drugs and their trials. If a user asks about clinical trials of a drug compound, the LLM can refer to the knowledge graph's contextual knowledge base to identify and analyze all the information related to that compound. This integration can enable the company to extract powerful insights from its data and use them to make ground-breaking drug discoveries.
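A common way to wire this up is retrieval augmentation: look up the facts surrounding the entity in the question and prepend them to the model's prompt so it answers from verified graph knowledge rather than its parametric memory alone. The sketch below uses made-up trial data and shows only the retrieval and prompt-building step; the actual LLM call is omitted.

```python
def retrieve_context(graph, entity, hops=2):
    """Collect all facts within `hops` relations of an entity (breadth-first)."""
    frontier, collected = {entity}, []
    for _ in range(hops):
        reached = set(frontier)
        for s, r, o in graph:
            if (s in frontier or o in frontier) and (s, r, o) not in collected:
                collected.append((s, r, o))
                reached.update((s, o))
        frontier = reached
    return [f"{s} {r} {o}" for s, r, o in collected]

def grounded_prompt(graph, question, entity):
    facts = retrieve_context(graph, entity)
    # Retrieved facts are prepended so the LLM answers from graph knowledge.
    return "Known facts:\n" + "\n".join(facts) + f"\n\nQuestion: {question}"

# Hypothetical trial data for illustration only.
graph = [
    ("compound_x", "evaluated_in", "trial NCT-001"),
    ("trial NCT-001", "phase", "II"),
    ("trial NCT-001", "status", "recruiting"),
    ("compound_y", "evaluated_in", "trial NCT-002"),
]
prompt = grounded_prompt(graph, "What is the status of compound_x's trial?", "compound_x")
```

Only facts reachable from the queried compound end up in the prompt, which keeps the model's context focused and verifiable.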

Benefits of Combining Knowledge Graphs and LLMs

Combining knowledge graphs and large language models can offer several benefits, including:

  • Centralized source of accurate knowledge: By connecting the output of a large language model to a knowledge graph engine, data can be centralized in a standardized format, making it easier to access and analyze. The knowledge graph provides a visual representation of context and semantic insights that can be used to accurately answer questions about the data, train machine learning models, or power business analytics.
  • Structured knowledge fusion of information in different formats: Knowledge graphs provide LLMs with a structured approach to relating concepts together. By incorporating data into a single, unified view, knowledge graphs can help organize data in an easily understandable format that can be used to make better decisions, identify new insights, and gain a more comprehensive understanding of data.
  • Increased informative value of collected data: By linking together previously siloed and inaccessible data, a knowledge graph engine presents all gathered data as a single source of truth that can be analyzed to uncover hidden gems of information. This gives a contextual depth to the informative value of large language models that the model could not have obtained independently.
  • Gives LLMs a human reference frame of the real world: Large language models are excellent at processing natural language text but lack a real-world reference frame. By connecting them to a knowledge graph, LLMs can access structured knowledge that reflects the patterns of real-world data and gain a deeper grasp of non-linguistic knowledge. This can help them generate more accurate and contextually relevant responses.

How Does Wisecube Leverage LLMs to Centralize Biomedical Data?

With the large language models growing in popularity, investing in solutions to improve their accuracy and performance has become crucial. This is especially critical for industries like biopharma, responsible for developing effective treatments and improving patient outcomes. This industry requires factual large language models for clinical documentation, analyzing and interpreting large volumes of complex medical data to identify potential drug targets, medical coding, predicting patient outcomes, and detecting medical errors.

Wisecube is a biomedical knowledge graph platform built with cutting-edge AI and NLP algorithms. For biopharma researchers, Wisecube's knowledge graph engine can help identify new avenues for exploration and discovery in the biomedical field.

The combination of ChatGPT-like large language models and Wisecube’s Knowledge Graph platform can potentially revolutionize biopharma information research and retrieval. You can uncover implicit connections and infer undiscovered links in your medical data by centralizing biomedical knowledge and powering it with large language models. If you are ready to try an AI-based approach to explore patterns, uncover insights and make discoveries in your biomedical research area, contact us today to get started.
