ABSTRACT
Chunking input text is a crucial preprocessing step when applying Large Language Models (LLMs) to long or structured documents, yet its impact on downstream task performance remains underexplored. This study presents a comprehensive empirical analysis of four chunking strategies (fixed-size, overlapping, sentence-based, and paragraph-based) across three fundamental NLP tasks: question answering, text classification, and abstractive summarization. Experiments were conducted with lightweight, open-access models such as Flan-T5, GPT-2, DistilBERT, and RoBERTa on benchmark datasets including SQuAD, CoQA, QuAC, IMDB, Amazon Polarity, CNN/DailyMail, and XSum. Performance was measured with task-appropriate metrics (ROUGE, exact match, F1, precision, recall) alongside latency. The results show that the choice of chunking strategy significantly affects both performance and latency, and that no single approach is universally optimal. These findings highlight the need for task-specific chunking choices in practical LLM deployments, especially under resource constraints.
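To make the four strategies concrete, the following Python sketch shows one plausible implementation of each; the function names, default chunk sizes, and overlap values are illustrative assumptions and do not reproduce the paper's implementation.

# Minimal illustrative sketch (assumed, not the paper's code): one plausible
# implementation of each chunking strategy named in the abstract.
import re
from typing import List

def fixed_size_chunks(text: str, size: int = 200) -> List[str]:
    # Split on whitespace and emit non-overlapping windows of `size` tokens.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def overlapping_chunks(text: str, size: int = 200, overlap: int = 50) -> List[str]:
    # Same as fixed-size, but consecutive windows share `overlap` tokens.
    words = text.split()
    step = max(size - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

def sentence_chunks(text: str, max_sentences: int = 5) -> List[str]:
    # Naive sentence segmentation on terminal punctuation, then group sentences.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [" ".join(sentences[i:i + max_sentences])
            for i in range(0, len(sentences), max_sentences)]

def paragraph_chunks(text: str) -> List[str]:
    # Treat blank lines as paragraph boundaries.
    return [p.strip() for p in text.split("\n\n") if p.strip()]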
PUBLICATION RECORD
- Publication year: 2025
- Publication date: 2025-12-18
- Venue: 2025 OITS International Conference on Information Technology (OCIT)
- Fields of study: not labeled
- Source metadata: Semantic Scholar