Scene Text Detection using Hyperbolic Tangent Binarization and CLIP

Zhao Chen

Published 2024 in International Conference on Robotics, Intelligent Control and Artificial Intelligence

ABSTRACT

Deep learning-based scene text detection has advanced significantly and has promising applications. The hyperbolic tangent binarization network++ (HTBNet++) is designed to address missed detections of small text and dense text in scene images. Its backbone network is built on CLIP, which introduces pretrained text and image features for better feature extraction. Text and image prompts also play a key role in feature fusion, so that the features of text-image pairs can be applied more effectively to text detection. In addition, an auxiliary segmentation loss is designed to better guide gradient backpropagation during training. Experimental results on the Total-Text and TD500 datasets demonstrate that the proposed method significantly enhances text detection accuracy and robustness.
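The abstract does not give the binarization formula, so the following is only an illustrative sketch of what a hyperbolic tangent binarization step could look like. It assumes a form modeled on differentiable binarization, where a per-pixel probability P and a per-pixel threshold T are combined through a steep smooth function; here that function is a rescaled tanh. The function names, the 0.5 * (1 + tanh(...)) rescaling, and the steepness value k = 50 are all assumptions made for illustration, not details taken from the paper.

```python
import math


def tanh_binarize(prob, thresh, k=50.0):
    """Soft-binarize one pixel (hypothetical form, not from the paper).

    prob   -- predicted text probability in [0, 1]
    thresh -- predicted per-pixel threshold in [0, 1]
    k      -- assumed steepness factor; larger values approach a hard step
    """
    # tanh maps to (-1, 1); rescale to (0, 1) so the output reads as a soft mask.
    return 0.5 * (1.0 + math.tanh(k * (prob - thresh)))


def binarize_map(prob_map, thresh_map, k=50.0):
    """Apply the soft binarization element-wise to two equally sized 2D lists."""
    return [
        [tanh_binarize(p, t, k) for p, t in zip(prob_row, thresh_row)]
        for prob_row, thresh_row in zip(prob_map, thresh_map)
    ]


if __name__ == "__main__":
    probs = [[0.9, 0.1], [0.5, 0.7]]
    threshs = [[0.3, 0.3], [0.5, 0.9]]
    for row in binarize_map(probs, threshs):
        print([round(v, 3) for v in row])
```

Because the function is smooth, it stays differentiable with respect to both the probability and the threshold, which is what allows a binarization step of this kind to be trained end to end rather than applied only as post-processing.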
