Pham, L., Le, T., Le, C., Ngo, D., Weissenfeld, A., & Schindler, A. (2023). Deep learning based multimodal with two-phase training strategy for daily life video classification. In Proceedings of the 20th International Conference on Content-Based Multimedia Indexing (pp. 238–242). Association for Computing Machinery. https://doi.org/10.1145/3617233.3617248