Abstract:
We examine the transition from closed environments to open-world environments and its impact on visual perception, focusing on object recognition and detection, and on deep learning more broadly. In open-world environments, software systems must adapt to constantly changing conditions and demands, which poses new challenges for deep learning methods. In particular, open-world visual perception requires systems to understand and process environments and objects not seen during training, a requirement that exceeds the capabilities of traditional closed systems. We first discuss the requirements for dynamic, adaptive systems brought about by technological advances and highlight the advantages of open systems over closed ones. We then review the definition of the open world and existing work along five dimensions of openness: open-set learning, zero-shot learning, few-shot learning, long-tail learning, and incremental learning. For open-world recognition, we analyze the core challenges of each dimension and provide quantitative evaluation metrics for the datasets of each task. For open-world object detection, we discuss challenges beyond those of recognition, such as occlusion, scale, pose, object co-occurrence, and background interference, and emphasize the importance of simulation environments in constructing open-world object detection datasets. Finally, we underscore the new perspectives and opportunities that the open-world concept brings to deep learning: it acts as a catalyst for technological advancement and for a deeper understanding of the challenges of real-world environments. We offer this work as a reference for future research.