Unsupervised Context Distillation from Weakly Supervised Data to Augment Video Question Answering

Paul Gaynor

doi:10.2991/978-94-6463-314-6_2

Authors

Paul Gaynor¹,*

¹The University of the West Indies, Kingston, Jamaica

*Corresponding author. Email: paul.gaynor@uwimona.edu.jm

Available Online 21 December 2023.

DOI: 10.2991/978-94-6463-314-6_2
Keywords: clustering; video question answering
Abstract: Detecting anomalies by tracking whether the context of a video stream has changed could be useful, but supervised training to classify video context can be cumbersome and error-prone. Instead, we apply a cascade of clustering techniques that operates on a weakly supervised video data lake to extract a context representation of a video sequence. We then train a bi-directional LSTM model to mimic the functionality of the cascade and predict a context representation from video. Additional experiments show that when the context is fed as an additional input to a legacy Video Question Answering (VideoQA) solution, loss improves by more than 20% relative to its baseline after training for 120 epochs. This is significant because current state-of-the-art accuracy for VideoQA solutions is close to 50%. This report also demonstrates how to chart a path to freedom from the requirement to explicitly label data while preserving semantics.
Copyright: © 2023 The Author(s)
Open Access: This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title: Proceedings of the International e-Conference on Advances in Computer Engineering and Communication Systems (ICACECS 2023)
Series: Atlantis Highlights in Computer Sciences
Publication Date: 21 December 2023
ISBN: 978-94-6463-314-6
ISSN: 2589-4900

Cite this article

TY  - CONF
AU  - Paul Gaynor
PY  - 2023
DA  - 2023/12/21
TI  - Unsupervised Context Distillation from Weakly Supervised Data to Augment Video Question Answering
BT  - Proceedings of the International e-Conference on Advances in Computer Engineering and Communication Systems (ICACECS 2023)
PB  - Atlantis Press
SP  - 5
EP  - 17
SN  - 2589-4900
UR  - https://doi.org/10.2991/978-94-6463-314-6_2
DO  - 10.2991/978-94-6463-314-6_2
ID  - Gaynor2023
ER  -
