Low-Bandwidth Self-Improving Transmission of Rare Training Data
S. George, Haithem Turki, Ziqiang Feng, Deva Ramanan, Padmanabhan Pillai, Mahadev Satyanarayanan
Published 2023 in ACM/IEEE International Conference on Mobile Computing and Networking
ABSTRACT
A severe bandwidth mismatch between incoming sensor data rate and wireless backhaul bandwidth often exists on unmanned probes when collecting new training data for machine learning (ML). To overcome this mismatch, we describe a self-improving ML-based transmission system called Hawk. Starting from a weak model that is trained on just a few examples, it seamlessly pipelines semi-supervised learning, active learning, and transfer learning, with asynchronous bandwidth-sensitive data transmission to a distant human for labeling. When a significant number of true positives (TPs) have been labeled, Hawk trains an improved model to replace the old model. This iterative workflow, called Live Learning, continues until a sufficient number of TPs have been collected. For very rare events on challenging datasets, and bandwidths as low as 12 kbps, a team of 7 probes using Hawk discovers up to 87% of the TPs that could have been discovered via full preview, transmission and labeling of all mission data. Hawk also uses diversity sampling and few-shot learning.
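The Live Learning workflow summarized in the abstract can be illustrated with a minimal sketch. This is not Hawk's implementation: every name below (`select_for_transmission`, `live_learning`, and all parameters) is hypothetical, the "model" is any scoring function, and the semi-supervised, active, transfer, and few-shot learning components are reduced to a single caller-supplied `train` callback. It only shows the loop shape: score incoming data, send the highest-scoring items that fit the bandwidth budget for labeling, retrain once enough new true positives are labeled, and stop when the target count is reached.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Model = Callable[[object], float]  # hypothetical: maps an item to a score


def select_for_transmission(scores: Sequence[float],
                            item_bytes: int,
                            budget_bytes: int) -> List[int]:
    """Return indices of the highest-scoring items that fit in the budget.

    Assumes every item has the same size, which is a simplification.
    """
    k = max(0, budget_bytes // item_bytes)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def live_learning(batches: Iterable[Sequence[object]],
                  model: Model,
                  train: Callable[[List[object]], Model],
                  label: Callable[[object], bool],
                  item_bytes: int,
                  budget_bytes: int,
                  retrain_every: int,
                  target_tps: int) -> Tuple[Model, List[object]]:
    """Iterate: score, transmit within budget, label, retrain, repeat."""
    positives: List[object] = []
    new_tps = 0  # true positives labeled since the last retraining
    for batch in batches:
        scores = [model(x) for x in batch]
        for i in select_for_transmission(scores, item_bytes, budget_bytes):
            if label(batch[i]):  # stands in for the distant human labeler
                positives.append(batch[i])
                new_tps += 1
        if new_tps >= retrain_every:
            model = train(positives)  # improved model replaces the old one
            new_tps = 0
        if len(positives) >= target_tps:
            break
    return model, positives
```

As a rough guide to choosing `budget_bytes`, and assuming 1 kbps means 1000 bits per second, a 12 kbps link carries 12,000 × 60 / 8 = 90,000 bytes per minute, so only a handful of image-sized items could be sent per minute in this toy setting.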
PUBLICATION RECORD
- Publication year
2023
- Venue
ACM/IEEE International Conference on Mobile Computing and Networking
- Publication date
2023-10-02
- Fields of study
Computer Science, Engineering
- Source metadata
Semantic Scholar