Enhancing single-stage excavator activity recognition via knowledge distillation of temporal gradient data
DOI: 10.35490/EC3.2023.321
Abstract: Vision-based, single-stage methods for recognizing the activities of construction entities have been gaining popularity in the construction domain. However, their relatively low per-frame performance necessitates additional post-processing to link the per-frame detection results and construct the corresponding action tubes. To address this problem, this study proposes DIGER, which stands for knowledge DIstillation of temporal Gradient data for Excavator activity Recognition. DIGER builds on the You Only Watch Once activity recognition method and improves its performance with an auxiliary backbone designed to exploit, through knowledge distillation, the complementary information present in temporal gradient data, achieving an activity recognition accuracy of 93.6%.
Keywords: Computer Vision, Construction, Knowledge Distillation, Single-Stage Activity Recognition, Site Monitoring, Temporal Gradient
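
Illustrative sketch (not taken from the paper): the abstract does not say how the temporal gradient is computed or which distillation loss DIGER uses. The NumPy code below assumes the common definition of temporal gradient as the difference between consecutive frames, and uses a plain mean-squared-error feature-matching term as a stand-in for the distillation objective. The function names `temporal_gradient` and `feature_distillation_loss` are hypothetical.

```python
import numpy as np


def temporal_gradient(clip: np.ndarray) -> np.ndarray:
    """Return frame-to-frame differences for a clip shaped (T, H, W, C).

    Assumption: temporal gradient = frame[t + 1] - frame[t].
    The result has T - 1 frames.
    """
    if clip.shape[0] < 2:
        raise ValueError("need at least two frames")
    # Cast first so uint8 pixel values do not wrap around on subtraction.
    frames = clip.astype(np.float32)
    return frames[1:] - frames[:-1]


def feature_distillation_loss(student_feat, teacher_feat) -> float:
    """Mean squared error between two feature maps of equal shape.

    Assumption: a generic feature-matching loss, not the paper's objective.
    """
    student = np.asarray(student_feat, dtype=np.float32)
    teacher = np.asarray(teacher_feat, dtype=np.float32)
    if student.shape != teacher.shape:
        raise ValueError("feature maps must have the same shape")
    return float(np.mean((student - teacher) ** 2))
```

In a training loop, a term like this would typically be added to the detection loss with a weighting factor, so that the features of one backbone are pulled toward those of the other; the weighting and the direction of distillation used in DIGER are not given in the abstract.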