Question Answering (QA) has long been a benchmark task in Natural Language Processing (NLP) for evaluating the proficiency of language models. Traditional QA systems focus on single-hop questions, where the answer can be found within a single passage or document. For example:
"What is the normal body temperature for a healthy adult?"
However, the real world is far from one-dimensional, and questions often require aggregating information from multiple sources and making multiple inferences. This is where multi-hop question answering comes into play.
In this article, we delve into the dynamic world of multi-hop QA, exploring how Knowledge Graphs and LLMs collaborate to navigate the intricacies of multi-hop queries and enhance the user's interaction with the vast universe of knowledge.
What is Multi-Hop Question Answering?
Multi-hop question answering (MHQA) is a challenging subfield of QA that involves answering questions that cannot be resolved with a direct answer from a single source or passage. Instead, MHQA systems need to perform complex reasoning and navigate through a knowledge base or a collection of interconnected documents to find the correct answer. In essence, MHQA requires multiple "hops" of information retrieval and inference to connect the dots and arrive at a comprehensive response. Consider, for example, the following query:
“What is the link between obesity and diabetes?”
To answer this question effectively, a system would need to go through the following hops:
- Obesity's impact on insulin resistance.
- Insulin resistance as a precursor to type 2 diabetes.
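A "hop" maps naturally onto an edge traversal in a knowledge graph. The toy sketch below illustrates the two hops above as a two-edge path; networkx is used purely for demonstration, and the entity names and relation labels are simplified assumptions for this example:

```python
# Toy illustration: each reasoning "hop" is one edge in a small graph.
# networkx is used only for demonstration; any graph store would do.
import networkx as nx

kg = nx.DiGraph()
kg.add_edge("obesity", "insulin resistance", relation="contributes to")
kg.add_edge("insulin resistance", "type 2 diabetes", relation="precursor to")

# Answering the query means finding a path between the two entities;
# each edge on that path is one "hop" of reasoning.
path = nx.shortest_path(kg, "obesity", "type 2 diabetes")
for src, dst in zip(path, path[1:]):
    print(f"{src} --[{kg.edges[src, dst]['relation']}]--> {dst}")
```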

Multi-Hop Question Answering and LLMs
Large Language Models (LLMs) have proven exceptionally capable in multi-hop QA tasks due to their multifaceted strengths. These models shine in complex reasoning, enabling them to navigate through intricate logical inferences and piece together information from various sources to answer challenging MHQA queries.
Their unparalleled contextual understanding allows LLMs to grasp nuanced cues and adapt responses to the specific context of each question. This is particularly vital for MHQA, where answers may depend on the relationships between pieces of information and the context provided.
Furthermore, LLMs are highly scalable, making them well-suited for MHQA tasks that involve extensive knowledge bases or large volumes of documents. They can efficiently process massive amounts of textual data, ensuring comprehensive information retrieval.
Moreover, LLMs are adaptable across diverse domains, making them valuable tools for MHQA applications in fields such as biomedicine, legal research, education, and customer support. Their language skills and reasoning abilities can be applied effectively to various topics, enabling them to tackle multi-hop questions across different knowledge domains.
Vector Similarity Search: A Traditional Technique For Question Answering
Vector similarity search is a fundamental technique used in question answering systems to retrieve relevant information from a large corpus of documents or knowledge bases. It relies on the representation of text data as vectors in a high-dimensional space, where similarity between vectors indicates the relevance of documents to a given query.
This search commonly retrieves a small number of the most similar documents (the top three, for example), providing contextual information that can significantly enhance the ability of an LLM to generate accurate and contextually relevant answers. This approach is particularly effective when the vector search can successfully identify and retrieve relevant text segments.
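As a minimal sketch of how such retrieval works, the snippet below scores every document by cosine similarity against the query and returns the top k. The `embed` function is a placeholder for whatever embedding model you use, not a specific library API:

```python
# Minimal top-k vector retrieval sketch using cosine similarity.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: return a dense vector for `text`."""
    raise NotImplementedError("plug in your embedding model here")

def top_k(query: str, documents: list[str], k: int = 3) -> list[str]:
    q = embed(query)
    doc_vecs = np.stack([embed(d) for d in documents])
    # Cosine similarity between the query and every document vector.
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    best = np.argsort(sims)[::-1][:k]  # indices of the k most similar docs
    return [documents[i] for i in best]
```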
Challenges of Using Vector Search in Multi-Hop Question Answering For Retrieval-Augmented LLMs
Retrieval-augmented large language models represent a significant advancement in the field of NLP. These models, building upon the capabilities of their LLM counterparts, integrate information retrieval techniques to enhance their performance in various tasks. This integration allows LLMs to access vast external knowledge sources, such as documents, databases, or websites, during the question-answering process, significantly expanding their scope and versatility. Retrieval-augmented LLMs leverage the power of both text generation and information retrieval, enabling them to provide contextually rich and precise responses to user queries.
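A retrieval-augmented answer can be sketched in a few lines: retrieve context, then condition the LLM on it. Here `top_k` is the retriever sketched above and `llm` is a placeholder for any text-completion call, not a specific vendor API:

```python
# Minimal retrieval-augmented QA loop: retrieve context, then generate.
def answer(question: str, documents: list[str]) -> str:
    context = "\n\n".join(top_k(question, documents, k=3))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return llm(prompt)  # placeholder LLM completion call
```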
Vector search is a robust technique for conventional question answering tasks, efficiently retrieving relevant information to answer single-hop questions. However, the more intricate multi-hop queries pose difficulties for retrieval-augmented LLMs: information must be connected across multiple sources or documents, which makes retrieval far less straightforward than in single-hop QA. Here are a few issues that highlight the complexities of using vector search in MHQA:
- Information Redundancy in Top Documents: One significant challenge is the potential for repeated or redundant information in the top N retrieved documents. The documents retrieved during vector search might focus on certain aspects or entities, leaving out other relevant details. For instance, for a query about a patient's diagnosis, the top documents retrieved through vector search may consistently mention a specific symptom or laboratory result. This can potentially overlook other critical clinical details that are essential for a comprehensive diagnosis or treatment plan. This repetition can lead to a lack of completeness in the retrieved context, hindering the system's ability to answer comprehensively.
- Dealing with Missing Entity References: Depending on the size and segmentation of text chunks, there is a risk of losing references to entities or concepts within documents. While overlapping chunks can help mitigate this issue to some extent, there are situations where references point to other documents. Coreference resolution or additional preprocessing may be necessary to establish links between entities mentioned in different parts of the text. Failure to address this challenge can result in a disjointed context, making it difficult to piece together relevant information.
- Defining the Ideal Number of Retrieved Documents: Determining the optimal number of documents to retrieve for a specific MHQA task can be challenging. Some questions may require a larger set of documents to provide sufficient context for accurate answers. However, in other cases, an excessive number of documents could introduce noise and make it harder for the system to discern the most relevant information. Striking the right balance between providing adequate context and avoiding information overload is a nuanced task that depends on the nature of the question and the available document set.
How Can Knowledge Graphs Enhance Multi-Hop QA for Retrieval-Augmented LLMs?
Knowledge graphs are structured representations of information, where data is organized into nodes (representing entities or concepts) and edges (representing relationships between these entities). This structured format makes knowledge graphs an efficient way to store and retrieve interconnected information, providing context and semantics to data.
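As a small illustration of this node/edge structure, the sketch below stores typed entities as nodes and a labeled relationship as an edge (networkx again used only for demonstration; the entities are illustrative):

```python
# Entities are nodes (with properties); relationships are typed,
# directed edges. Traversing edges answers relational queries directly.
import networkx as nx

kg = nx.DiGraph()
kg.add_node("metformin", type="Drug")
kg.add_node("type 2 diabetes", type="Disease")
kg.add_edge("metformin", "type 2 diabetes", relation="treats")

for drug, disease, data in kg.edges(data=True):
    print(f"{drug} {data['relation']} {disease}")
```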
Integrating Knowledge Graphs with Retrieval-Augmented LLMs represents a synergy between structured knowledge representation and advanced language understanding. By seamlessly combining the benefits of structured data from knowledge graphs with the contextual prowess of LLMs, we create a powerful framework for tackling multi-hop question answering challenges. Here are the different ways knowledge graphs can be leveraged for this purpose:
Knowledge Graphs as a Summarized Information Repository
Various techniques can help large language models streamline multi-hop question answering. One approach involves condensing information to make it more accessible at query time. This can be achieved by generating document summaries with LLMs and storing these summaries instead of the complete documents. Whether the contextual summarization occurs during data ingestion or at query time, it serves to reduce noise, improve results, and optimize prompt token space usage.
However, while conducting contextual summarization at query time provides more guided context relevant to the question, it can potentially lead to increased user latency. To address this challenge, knowledge graphs can be used to combine and summarize multiple documents as a single knowledge base. A knowledge graph can efficiently connect structured information extracted individually from documents.
The use of knowledge graphs as a preprocessing step in conjunction with LLMs enables the seamless representation of interconnected data and facilitates answering multi-hop questions spanning across various documents. It empowers information extraction and connection before ingestion, reducing the complexity of addressing these issues during query time.
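A minimal sketch of this ingestion-time pattern might look like the following, reusing the placeholder `llm` function and a networkx graph from the earlier sketches; the prompt wording and node layout are illustrative assumptions:

```python
# Summarize each document once at ingestion and store only the summary
# as a node, so query-time prompts stay small and low-latency.
def ingest(documents: dict[str, str], kg) -> None:
    for doc_id, text in documents.items():
        summary = llm(f"Summarize the key facts in this document:\n\n{text}")
        kg.add_node(doc_id, summary=summary)
        # Entity extraction would then link this node to the entities
        # it mentions, connecting facts across documents.
```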
Knowledge Graphs for Combining Graph and Textual Data in Multi-Hop Queries
In certain multi-hop question-answering scenarios, the integration of structured knowledge graphs with unstructured text data proves invaluable. For instance, imagine a medical inquiry like, "What are the latest treatment breakthroughs for Alzheimer's disease?". This task entails identifying key researchers or institutions related to Alzheimer's treatment via the knowledge graph and extracting the most recent research articles mentioning their work.
A knowledge graph serves as an optimal platform to represent both structured data, such as entity relationships, and unstructured text through node properties. Techniques like named entity recognition seamlessly bridge the gap between unstructured content and relevant entities within the knowledge graph.
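One way to sketch this bridging step uses an off-the-shelf NER model. The example below uses spaCy's small English model for illustration; a domain-specific biomedical model would fit this setting better, and the simple name match stands in for a full entity-linking step:

```python
# Bridge unstructured text to graph entities via named entity recognition.
import spacy

nlp = spacy.load("en_core_web_sm")

def link_entities(text: str, kg) -> list[str]:
    """Return KG nodes whose names appear as named entities in `text`."""
    doc = nlp(text)
    mentions = {ent.text.lower() for ent in doc.ents}
    return [node for node in kg.nodes if node.lower() in mentions]
```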
Retrieval-augmented generation applications can benefit greatly from the synergy between structured and unstructured information that knowledge graphs provide. By storing diverse data types and establishing explicit relationships, knowledge graphs significantly enhance information accessibility and search efficiency.
Knowledge Graphs for Chain-of-Thought QA
Chain-of-thought question answering using LLM agents has emerged as a powerful approach to information retrieval. LLM agents can break down complex medical queries into sequential steps, making them more manageable. This method proves valuable for multi-hop questions, even when no direct links exist between medical entities in knowledge graphs and unstructured text.
For instance, consider the question: "What are the latest advancements in Alzheimer's disease treatment?"
In situations where connections between medical research and treatments are not explicit, LLM agents employing a chain-of-thought approach shine. They deconstruct the question into sub-questions:
- Identify Treatments: "What are the current treatments for Alzheimer's disease?"
- Retrieve Research: "What is the latest research on these treatments?"
With access to a knowledge graph, the agent can retrieve structured data about Alzheimer's treatments, such as "cholinesterase inhibitors" and "memantine." It then refines the question:
- Refined Question: "What is the latest research on cholinesterase inhibitors and memantine in Alzheimer's disease treatment?"
The agent can leverage various tools, including knowledge graphs, medical databases, or APIs, to answer the subsequent question effectively. This approach is particularly potent for analytical inquiries, such as identifying high-valuation medical companies or prolific researchers in Alzheimer's treatment. Analytical questions often involve data aggregation, filtering, or sorting, which can be challenging with plain vector similarity search on unstructured text.
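Put together, the loop described above can be sketched as follows. The graph query and the `llm` call are the placeholders from the earlier sketches, and the sub-questions are hard-coded here for clarity; a real agent framework would generate them dynamically and choose tools at each step:

```python
# Chain-of-thought sketch: answer a sub-question from the knowledge
# graph, refine the original question, then make a second tool/LLM call.
def chain_of_thought_answer(question: str) -> str:
    # Step 1 (Identify Treatments): read structured "treats" edges,
    # assuming the graph holds such edges for Alzheimer's drugs.
    treatments = [drug for drug, disease, data in kg.edges(data=True)
                  if disease == "Alzheimer's disease"
                  and data["relation"] == "treats"]

    # Step 2 (Retrieve Research): refine the question with the
    # retrieved entities, e.g. cholinesterase inhibitors and memantine.
    refined = (f"What is the latest research on {', '.join(treatments)} "
               f"in Alzheimer's disease treatment?")

    # Step 3: answer the refined question with another LLM or tool call.
    return llm(refined)
```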
While chain-of-thought reasoning can introduce some latency due to multiple LLM calls, the potential of integrating knowledge graphs into this approach holds promise for enhancing the depth and precision of multi-hop question answering.
The Future of Knowledge Graph and LLM Synergy For Improved Multi-Hop QA
The integration of knowledge graphs with retrieval-augmented LLMs signifies a groundbreaking advancement in multi-hop question answering and knowledge retrieval. These combined capabilities hold immense potential across diverse domains, from biomedicine to legal research and more.
Despite the challenges, retrieval-augmented LLMs empowered by knowledge graphs, such as Wisecube's Orpheus, stand as adaptable and resilient tools for context-rich question answering. This synergy promises a future where structured data and language understanding converge to offer users deeper insights and more effective information access.
As this collaboration evolves, it opens doors to innovative knowledge retrieval and generation, bringing us closer to a future of seamless, intelligent interaction with the wealth of human knowledge.
Get in touch with us today to learn more about Wisecube’s Biomedical Knowledge Graph, Orpheus!