Abstract
This paper considers the problem of automatically learning an activity-based semantic scene model from a stream of video data. A scene model is proposed that labels regions according to an identifiable activity in each region, such as entry/exit zones, junctions, paths, and stop zones. We present several unsupervised methods that learn these scene elements, together with results demonstrating the effectiveness of our approach. Finally, we describe how the models can be used to support the interpretation of moving objects in a visual surveillance environment.
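To make the idea of unsupervised scene-element learning concrete, the following is a minimal sketch of clustering trajectory endpoints into candidate entry/exit zones. It uses plain k-means as a stand-in for the mixture-model fitting a surveillance system might employ; the function names, the toy endpoint data, and the choice of k are all illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch: grouping trajectory start/end points into candidate
# entry/exit zones with a simple stdlib-only k-means. This is an
# illustrative stand-in, not the paper's actual method.
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Cluster 2-D points into k groups; returns (centres, labels)."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # initialise from random data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centre.
        for i, p in enumerate(points):
            labels[i] = min(range(k),
                            key=lambda c: math.dist(p, centres[c]))
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centres[c] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))
    return centres, labels

# Toy data: endpoints clustered near two hypothetical doorways.
endpoints = [(10 + dx, 20 + dy) for dx in (0, 1, 2) for dy in (0, 1)] \
          + [(90 + dx, 80 + dy) for dx in (0, 1, 2) for dy in (0, 1)]
zones, assignment = kmeans(endpoints, k=2)
```

Each resulting centre can then be treated as an entry/exit zone hypothesis, with the spread of its member points giving a rough spatial extent.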
| Original language | English |
|---|---|
| Pages (from-to) | 397-408 |
| Journal | IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics |
| Volume | 35 |
| Issue number | 3 |
| DOIs | |
| Publication status | Published - Jun 2005 |
Bibliographical note
Note: This work was supported by the Engineering and Physical Sciences Research Council [grant number GR/M58030].
Keywords
- Computer science and informatics