Video understanding requires abundant semantic information. Substantial progress has been made on deep learning models in the image, text, and audio domains, and notable efforts have recently been dedicated to the design of deep networks in the video domain. We discuss state-of-the-art convolutional neural networks (CNNs) and their pipelines for the exploration of video features, various fusion strategies, and their performance; we also discuss the limitations of CNNs for long-term motion cues and the use of sequential learning models such as long short-term memory (LSTM) to overcome these limitations. In addition, we address various multi-model approaches for extracting important cues and score fusion techniques from hybrid deep learning frameworks. We then highlight future directions in this domain, recent trends, and substantial challenges for video understanding. The objectives of this survey are to study the many approaches that have been developed for solving video understanding problems, to comprehensively study spatiotemporal cues, to explore the various models that are available for solving these problems, and to identify the most promising approaches.
International Journal of Multimedia Information Retrieval – Springer Journals
Published: Jun 24, 2020