Fusing Geoscience Large Language Models and Lightweight RAG for Enhanced Geological Question Answering

Bo Zhou, Kefei Li

Published 2025 in Géosciences

ABSTRACT

Mineral prospecting from vast geological text corpora is impeded by challenges in domain-specific semantic interpretation and knowledge synthesis. General-purpose Large Language Models (LLMs) struggle to parse the complex lexicon and relational semantics of geological texts, limiting their utility for constructing precise knowledge graphs (KGs). Our framework addresses this gap by integrating a domain-specific LLM, GeoGPT, with a lightweight retrieval-augmented generation architecture, LightRAG. Within this framework, GeoGPT automates the construction of a high-quality mineral-prospecting KG by performing ontology definition, entity recognition, and relation extraction. The LightRAG component then uses this KG to power a specialized geological question-answering (Q&A) system featuring a dual-layer retrieval mechanism for enhanced precision and an incremental update capability for dynamic knowledge incorporation. The results indicate that the proposed method achieves a mean F1-score of 0.835 for entity extraction, a 17% to 25% improvement over general-purpose large models using generic prompts. Furthermore, the geological Q&A model, built on the LightRAG framework with GeoGPT as its core, achieves win rates that exceed those of the general-purpose DeepSeek-V3 and Qwen2.5-72B models by 8–29% in the geochemistry domain and by 53–78% in the remote sensing geology domain. This study establishes an effective and scalable methodology for intelligent geological text analysis, enabling lightweight, high-performance Q&A systems that accelerate knowledge discovery in mineral exploration.
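
The entity-extraction result above is reported as a mean F1-score. As a rough illustration of how such a score is typically computed, the sketch below scores predicted entities against gold annotations by exact match on (text, type) pairs and averages F1 over documents. The matching criterion, the function names `entity_f1` and `mean_f1`, and the sample entities are all assumptions for illustration; the abstract does not state the evaluation protocol the authors used.

```python
def entity_f1(predicted, gold):
    """Return (precision, recall, f1) for one document.

    Assumption: an entity counts as correct only if its (text, type)
    pair exactly matches a gold annotation.
    """
    pred_set, gold_set = set(predicted), set(gold)
    true_pos = len(pred_set & gold_set)
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def mean_f1(documents):
    """Average per-document F1 over (predicted, gold) pairs."""
    scores = [entity_f1(pred, gold)[2] for pred, gold in documents]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical annotations, invented purely for this example.
gold = [("gold", "Mineral"), ("quartz vein", "Rock"), ("Jiaodong", "Region")]
predicted = [("gold", "Mineral"), ("quartz vein", "Rock"), ("pyrite", "Mineral")]

precision, recall, f1 = entity_f1(predicted, gold)
```

Here two of the three predicted entities match the gold set, so precision, recall, and F1 all equal 2/3. Relaxed matching (for example, partial span overlap) would give different numbers, which is why the matching rule matters when comparing reported scores.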
