Attention and Representation Learning in Byte-Level Digital Forensics: A Survey of Methods, Challenges, and Applications

Teena Mary, Sreeja Cs

Published 2026 in International Journal of Advanced Computer Science and Applications

ABSTRACT

Byte-level analysis has become an essential capability in digital forensics, enabling content-based investigation when file system metadata, headers, or structural information are unavailable or unreliable. Recent advances in deep learning allow forensic systems to learn discriminative features directly from raw byte streams; however, the growing diversity of representation strategies, architectural designs, and attention mechanisms makes it difficult to assess their relative effectiveness and practical suitability. This study presents a structured survey of representation learning and attention-based approaches for byte-level digital forensic analysis. We examine statistical, embedding-based, image-based, sequential, and hybrid representations, and analyze how architectural choices and attention mechanisms influence performance, robustness, and scalability. Across the literature, hybrid representations combined with lightweight convolutional backbones and selective attention mechanisms consistently provide a favorable balance between accuracy and computational efficiency. The survey also reviews key forensic applications, including file fragment classification, malware and binary analysis, network payload forensics, and encrypted or compressed data triage. In addition, we critically discuss challenges related to distribution shift, dataset bias, adversarial vulnerability, interpretability, and reproducibility, along with practical considerations for deployment in large-scale forensic pipelines. By synthesizing architectural trends, operational constraints, and reliability concerns, this work identifies critical research gaps and provides a structured foundation for the development of robust and trustworthy byte-level forensic learning systems.
