Why You Should Not Trust AI Completely With Research or Decision Making
Large Language Models (LLMs) and Generative AI are the latest buzzwords, and for good reason. They can save a lot of time by generating ideas and summarizing lengthy content quickly. One area that really caught my interest is using AI to conduct research and draw conclusions. While AI can be incredibly helpful here, it's important to use it carefully. Full disclosure: although AI was used to edit this article, all the insights shared here are personally researched and curated.

Understanding LLMs


Large Language Models (LLMs) are essentially massive predictive models trained on vast collections of text. They work by predicting the next word or token based on probabilities: given a sequence of tokens, they estimate what typically follows. Traditionally, sequence-to-sequence (seq2seq) models were used for text generation and chatbots. LLMs have since surpassed them, largely because the transformer's self-attention mechanism processes tokens in parallel and handles long contexts far more effectively. This advancement has significantly improved the accuracy and coherence of AI-generated text.
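To make the "predicting the next token from probabilities" idea concrete, here is a toy sketch. The probability table below is invented purely for illustration; a real LLM computes these distributions with a neural network over a vocabulary of tens of thousands of tokens.

```python
# Toy illustration, NOT a real model: an LLM repeatedly picks the next
# token from a probability distribution conditioned on the tokens so far.
# This hypothetical lookup table stands in for the neural network.
NEXT_TOKEN_PROBS = {
    ("the",): {"cat": 0.5, "dog": 0.3, "idea": 0.2},
    ("the", "cat"): {"sat": 0.6, "ran": 0.4},
    ("the", "cat", "sat"): {"down": 0.7, "quietly": 0.3},
}

def generate(prompt, steps=3):
    tokens = list(prompt)
    for _ in range(steps):
        probs = NEXT_TOKEN_PROBS.get(tuple(tokens))
        if probs is None:
            break
        # Greedy decoding: take the single most probable continuation.
        tokens.append(max(probs, key=probs.get))
    return " ".join(tokens)

print(generate(["the"]))  # → "the cat sat down"
```

Note that nothing in this loop "understands" the sentence; it only follows the statistics, which is exactly why the limitations below arise.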

Limitation #1: Lack of human-level reasoning

Despite their impressive capabilities, LLMs fall short of human-level reasoning. These systems operate on patterns and probabilities derived from vast datasets, but they do not possess a Theory of Mind: the ability to recognize and understand preferences, intentions, desires, emotional states, and knowledge, both in oneself and in others. Theory of Mind is a critical component of human cognition, allowing us to navigate complex social interactions, infer motivations, and respond empathetically to others' needs and feelings. Without it, AI models can miss the subtleties and nuances that are often crucial for accurate understanding and decision-making. This limitation becomes particularly evident in tasks that require empathy, ethical judgment, or an understanding of context beyond mere data patterns. While LLMs can mimic human-like responses, their lack of genuine comprehension can lead to significant errors and misinterpretations, highlighting the necessity of human oversight and involvement.

Limitation #2: Hallucinations and overconfidence

Another significant limitation of LLMs is their tendency to produce hallucinations, especially when poorly prompted. These models can generate responses that appear confident and authoritative but are actually inaccurate or completely fabricated. This issue is particularly problematic because it can lead to misguided decisions based on false information. For example, there have been instances where lawyers faced sanctions for using generative AI to draft a legal motion, only to discover that the AI had invented case law and citations that did not exist. This overconfidence in AI-generated content underscores the need for careful verification and validation by human experts to ensure the accuracy and reliability of the information provided by AI systems.

Limitation #3: Limited access to real-time information

LLMs also struggle with accessing relevant and up-to-date information. Their knowledge is confined to the data they were trained on, which can quickly become outdated. Attempts have been made to mitigate this issue through Retrieval-Augmented Generation (RAG), a technique that allows LLMs to access a database of information that can be regularly updated without the need to retrain the model. While RAG enhances the LLM's ability to provide more current responses, it is not a perfect solution. The integration process can be complex, and the system still relies on the quality and comprehensiveness of the external database. Thus, while RAG represents a step forward, LLMs still require continuous human oversight to ensure the information they produce is both relevant and accurate.
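The RAG flow described above can be sketched in a few lines. Everything here is a hypothetical stand-in: the document store is a plain list, and naive keyword overlap substitutes for the vector-embedding similarity a production system would use.

```python
# Minimal sketch of Retrieval-Augmented Generation (RAG).
# The documents and scoring are hypothetical; real systems use vector
# embeddings, a vector database, and an actual LLM API call at the end.
DOCUMENTS = [
    "The 2024 report shows revenue grew 12% year over year.",
    "The office relocated to Austin in 2023.",
    "Employee headcount reached 450 in January 2024.",
]

def retrieve(query, k=2):
    # Naive keyword-overlap scoring stands in for embedding similarity.
    q_words = set(query.lower().split())
    scored = sorted(DOCUMENTS,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query):
    # Retrieved passages are injected into the prompt so the model can
    # ground its answer in current text instead of stale training data.
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("How much did revenue grow in the 2024 report?")
```

The key point is that RAG only moves the problem: the answer is now only as good as what `retrieve` returns, which is exactly the quality-of-data caveat discussed next.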

Limitation #4: Garbage in, garbage out


The principle of "garbage in, garbage out" is particularly pertinent when discussing LLMs. Even with advanced techniques like Retrieval-Augmented Generation (RAG), an AI’s response will only be as accurate as the information it has access to. If the AI is provided with false or misleading information, it will process and present that information as if it were accurate. A striking example of this occurred when a Google AI overview suggested that people should eat rocks, highlighting the potential dangers of misinformation. This issue is challenging not only for AI but also for humans, which is why misinformation remains a pervasive problem. The AI's inability to discern the quality or veracity of the data it processes underscores the critical need for careful curation and validation of the information it uses, emphasizing the importance of human oversight in the AI decision-making process.

Limitation #5: Lack of genuine imagination


LLMs do not possess human-level imaginative capabilities, though they can mimic imagination based on their training on large datasets. This limitation arises because LLMs lack consciousness and genuine understanding. Their outputs are generated through pattern recognition and probabilistic modeling rather than true creative thinking. Unlike humans, AI does not have personal experiences, emotions, or intentions, which are essential components of authentic imagination. As a result, while LLMs can produce text that appears creative, they do so without the depth and originality that come from human creativity. This fundamental difference highlights why AI, despite its impressive abilities, cannot fully replicate the nuanced and inherently human aspect of imaginative thinking.

Limitation #6: Complexity in debugging agentic models

Using agentic LLMs to produce complex outputs or perform intricate tasks introduces significant challenges, particularly in debugging. As these agentic pipelines become more sophisticated, pinpointing the source of errors becomes increasingly difficult. Without involving a human in the loop at various stages, there is a higher likelihood of undesired outcomes later in the process. Human supervision is essential to catch potential errors, ensure accuracy, and make necessary adjustments. This involvement helps to mitigate risks associated with the autonomous functioning of LLMs, ensuring that their outputs align with the desired goals and standards.
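One way to keep a human in the loop is to add an explicit review checkpoint between pipeline stages. The sketch below is a minimal, hypothetical illustration: the `plan` and `execute` functions are placeholders, and the reviewer is any callable that can approve, edit, or reject an intermediate output before the pipeline proceeds.

```python
# Minimal sketch of a human-in-the-loop checkpoint in an agentic pipeline.
# The stage functions are hypothetical placeholders; the point is that each
# intermediate output is reviewable, which localizes errors to a stage.
def plan(task):
    return f"plan for: {task}"

def execute(approved_plan):
    return f"result of ({approved_plan})"

def run_pipeline(task, review):
    draft_plan = plan(task)
    # Checkpoint: a human reviewer can approve, edit, or reject the draft
    # plan before the next (possibly expensive or risky) stage runs.
    approved = review("plan", draft_plan)
    if approved is None:
        raise RuntimeError("Plan rejected at human review checkpoint")
    return execute(approved)

# Example reviewer that approves everything unchanged.
output = run_pipeline("summarize Q3 report", lambda stage, draft: draft)
```

Because every stage boundary is inspectable, a bad final output can be traced back to the stage where the intermediate result first went wrong, instead of debugging the whole autonomous run end to end.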

Wrap up on limitations

In conclusion, while LLMs and Generative AI offer powerful tools for various applications, there are significant limitations that necessitate caution. These limitations underscore the importance of not trusting AI completely with research or decision making. Human involvement remains crucial to ensure accuracy, ethical considerations, and the nuanced understanding that AI currently cannot achieve. As powerful as AI can be, it should be viewed as a tool to augment human capabilities rather than replace them.


Incorporating AI into research and decision-making processes can significantly enhance efficiency and effectiveness when used judiciously for specific tasks. AI's strengths lie in its ability to assist with brainstorming and idea generation, providing fresh perspectives and innovative concepts during the initial stages of research. Additionally, AI can efficiently summarize extensive literature, helping researchers quickly grasp key points and insights from large volumes of academic papers and articles. This capability streamlines the review process and aids in identifying relevant information swiftly. Moreover, AI can uncover specific insights through targeted queries, identifying patterns, trends, and correlations that might not be immediately apparent, thus facilitating focused research and hypothesis development.

Beloga, an AI knowledge OS designed specifically for secondary research, is particularly well-suited for this purpose. One of Beloga's standout features is its robust safeguards against hallucinations, helping ensure that the information generated is accurate and reliable. Its context-grounding capabilities are based on your own knowledge base, which allows for more relevant and precise insights. Beloga's proprietary model understands context derived from previous searches, stored information, and user behavior, enabling it to surface more accurate and relevant context for the AI to ingest. This results in higher quality outputs tailored to your specific needs.