Journal articles on the topic 'Video frame'

Consult the top journal articles for your research on the topic 'Video frame.'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Liu, Dianting, Mei-Ling Shyu, Chao Chen, and Shu-Ching Chen. "Within and Between Shot Information Utilisation in Video Key Frame Extraction." Journal of Information & Knowledge Management 10, no. 03 (2011): 247–59. http://dx.doi.org/10.1142/s0219649211002961.

Abstract:
With the popularity of home video recorders and the surge of Web 2.0, the growing volume of video has made the management and integration of video information an urgent and important issue in video retrieval. Key frames, as a high-quality summary of videos, play an important role in video browsing, searching, categorisation, and indexing. An effective set of key frames should include the major objects and events of the video sequence and contain minimal content redundancy. In this paper, an innovative key frame extraction method is proposed to select
2

Gong, Tao, Kai Chen, Xinjiang Wang, et al. "Temporal ROI Align for Video Object Recognition." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 2 (2021): 1442–50. http://dx.doi.org/10.1609/aaai.v35i2.16234.

Abstract:
Video object detection is challenging in the presence of appearance deterioration in certain video frames. Therefore, it is a natural choice to aggregate temporal information from other frames of the same video into the current frame. However, ROI Align, one of the core procedures of video detectors, still extracts features from a single-frame feature map for proposals, so the extracted ROI features lack temporal information from videos. In this work, considering that the features of the same object instance are highly similar among frames in a video, a novel Temporal ROI Align
3

Park, Sunghyun, Kangyeol Kim, Junsoo Lee, et al. "Vid-ODE: Continuous-Time Video Generation with Neural Ordinary Differential Equation." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 3 (2021): 2412–22. http://dx.doi.org/10.1609/aaai.v35i3.16342.

Abstract:
Video generation models often operate under the assumption of fixed frame rates, which leads to suboptimal performance when it comes to handling flexible frame rates (e.g., increasing the frame rate of the more dynamic portion of the video as well as handling missing video frames). To resolve the restricted nature of existing video generation models' ability to handle arbitrary timesteps, we propose continuous-time video generation by combining neural ODE (Vid-ODE) with pixel-level video processing techniques. Using ODE-ConvGRU as an encoder, a convolutional version of the recently proposed neural ODE
4

Alsrehin, Nawaf O., and Ahmad F. Klaib. "VMQ: an algorithm for measuring the Video Motion Quality." Bulletin of Electrical Engineering and Informatics 8, no. 1 (2019): 231–38. http://dx.doi.org/10.11591/eei.v8i1.1418.

Abstract:
This paper proposes a new full-reference algorithm, called Video Motion Quality (VMQ), that evaluates the relative motion quality of the distorted video generated from the reference video based on all the frames from both videos. VMQ uses any frame-based metric to compare frames from the original and distorted videos. It uses the time stamp for each frame to measure the intersection values. VMQ combines the comparison values with the intersection values in an aggregation function to produce the final result. To explore the efficiency of the VMQ, we used a set of raw, uncompressed videos to generate
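As a rough illustration of the kind of pipeline this abstract describes, here is a minimal Python sketch, assuming PSNR as the per-frame metric and a simple timestamp-overlap weight; the metric choice, the overlap weighting, and the mean aggregation are all assumptions, since the paper's exact aggregation function is not given here.

```python
# Hypothetical VMQ-style scoring: per-frame metric x timestamp overlap,
# averaged over the sequence. All parameter choices are illustrative.
import numpy as np

def psnr(ref: np.ndarray, dist: np.ndarray) -> float:
    """PSNR between two 8-bit frames of equal shape."""
    mse = np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(255.0**2 / mse)

def timestamp_overlap(t_ref: float, t_dist: float, dt: float) -> float:
    """Fraction of the display interval [t, t + dt) shared by both frames."""
    lo, hi = max(t_ref, t_dist), min(t_ref + dt, t_dist + dt)
    return max(0.0, hi - lo) / dt

def vmq_like(ref_frames, dist_frames, ref_ts, dist_ts, dt=1 / 30):
    """Aggregate weighted per-frame scores over time-aligned frame pairs."""
    scores = [
        psnr(r, d) * timestamp_overlap(tr, td, dt)
        for r, d, tr, td in zip(ref_frames, dist_frames, ref_ts, dist_ts)
    ]
    return float(np.mean(scores))
```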
5

Chang, Yuchou, and Hong Lin. "Irrelevant frame removal for scene analysis using video hyperclique pattern and spectrum analysis." Journal of Advanced Computer Science & Technology 5, no. 1 (2016): 1. http://dx.doi.org/10.14419/jacst.v5i1.4035.

Abstract:
Videos often include frames that are irrelevant to the recorded scenes, mainly due to imperfect shooting, abrupt camera movements, or unintended switching of scenes. The irrelevant frames should be removed before the semantic analysis of the video scene is performed for video retrieval. An unsupervised approach for automatic removal of irrelevant frames is proposed in this paper. A novel log-spectral representation of color video frames based on Fibonacci lattice-quantization has been developed for better description of the global structures of video contents to measure s
6

Hadi Ali, Israa, and Talib T. Al-Fatlawi. "A Proposed Method for Key Frame Extraction." International Journal of Engineering & Technology 7, no. 4.19 (2018): 889–92. http://dx.doi.org/10.14419/ijet.v7i4.19.28063.

Abstract:
Video structure analysis can be considered a major step in many applications, such as video summarization, video browsing, content-based video indexing and retrieval, and so on. Video structure analysis aims to split the video into its major components (scenes, shots, key frames). A key frame is one of the fundamental components of video; it can be defined as a frame or set of frames that gives a good representation and summarization of the whole contents of a shot. It must contain most of the features of the shot that it represents. In this paper, we propose an easy method for key frame extraction
7

Li, Li, Jianfeng Lu, Shanqing Zhang, Linda Mohaisen, and Mahmoud Emam. "Frame Duplication Forgery Detection in Surveillance Video Sequences Using Textural Features." Electronics 12, no. 22 (2023): 4597. http://dx.doi.org/10.3390/electronics12224597.

Abstract:
Frame duplication forgery is the most common inter-frame video forgery type to alter the contents of digital video sequences. It can be used for removing or duplicating some events within the same video sequences. Most of the existing frame duplication forgery detection methods fail to detect highly similar frames in the surveillance videos. In this paper, we propose a frame duplication forgery detection method based on textural feature analysis of video frames for digital video sequences. Firstly, we compute the single-level 2-D wavelet decomposition for each frame in the forged video sequence
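To make the detection idea concrete, here is a hedged sketch assuming grayscale frames as 2-D NumPy arrays; the single-level 2-D wavelet decomposition follows the abstract, but the subband-energy features and distance threshold are stand-ins for the paper's textural descriptors (requires the PyWavelets package).

```python
# Duplication screening via wavelet-domain texture features (illustrative).
import numpy as np
import pywt

def texture_feature(frame: np.ndarray) -> np.ndarray:
    """Single-level 2-D DWT; describe the frame by its per-subband energies."""
    cA, (cH, cV, cD) = pywt.dwt2(frame.astype(np.float64), "haar")
    return np.array([np.mean(band**2) for band in (cA, cH, cV, cD)])

def duplicated_pairs(frames, tol=1e-3):
    """Flag frame pairs whose feature distance falls below tol."""
    feats = [texture_feature(f) for f in frames]
    return [
        (i, j)
        for i in range(len(feats))
        for j in range(i + 1, len(feats))
        if np.linalg.norm(feats[i] - feats[j]) < tol
    ]
```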
8

K, Ragavan, Venkatalakshmi K, and Vijayalakshmi K. "A Case Study of Key Frame Extraction in Video Processing." Perspectives in Communication, Embedded-systems and Signal-processing - PiCES 4, no. 4 (2020): 17–20. https://doi.org/10.5281/zenodo.3974504.

Abstract:
Video is an integral part of our everyday lives and of many fields, such as content-based video browsing, compression, and video analysis. Video has a complex structure that includes scenes, shots, and frames. One of the fundamental techniques in content-based video browsing is key frame extraction. In general, to minimize redundancy, the key frames should be representative of the video content. A video can have more than one key frame. Key frame extraction speeds up the framework by choosing fundamental frames and thereby removing additional computation on redundant frames
9

Li, WenLin, DeYu Qi, ChangJian Zhang, Jing Guo, and JiaJun Yao. "Video Summarization Based on Mutual Information and Entropy Sliding Window Method." Entropy 22, no. 11 (2020): 1285. http://dx.doi.org/10.3390/e22111285.

Abstract:
This paper proposes a video summarization algorithm called the Mutual Information and Entropy based adaptive Sliding Window (MIESW) method, which is specifically for the static summary of gesture videos. Considering that gesture videos usually have uncertain transition postures and unclear movement boundaries or inexplicable frames, we propose a three-step method where the first step involves browsing a video, the second step applies the MIESW method to select candidate key frames, and the third step removes most redundant key frames. In detail, the first step is to convert the video into a se
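The two frame-level quantities the MIESW summary builds on can be sketched directly; the histogram bin counts below are assumptions, and the paper's sliding-window and key-frame selection logic is omitted.

```python
# Per-frame entropy and inter-frame mutual information (illustrative).
import numpy as np

def frame_entropy(frame: np.ndarray, bins: int = 256) -> float:
    """Shannon entropy of a grayscale frame's intensity histogram."""
    counts, _ = np.histogram(frame, bins=bins, range=(0, 256))
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(a: np.ndarray, b: np.ndarray, bins: int = 64) -> float:
    """MI between two frames, from their joint intensity histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal over rows
    py = pxy.sum(axis=0, keepdims=True)   # marginal over columns
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())
```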
10

Li, Xin, QiLin Li, Dawei Yin, Lijun Zhang, and Dezhong Peng. "Unsupervised Video Summarization Based on An Encoder-Decoder Architecture." Journal of Physics: Conference Series 2258, no. 1 (2022): 012067. http://dx.doi.org/10.1088/1742-6596/2258/1/012067.

Abstract:
The purpose of video summarization is to facilitate large-scale video browsing. A video summary is a short and concise synopsis of the original video, usually composed of a set of representative video frames. This paper solves the problem of unsupervised video summarization by developing a Video Summarization Network (VSN), which is formulated as selecting a sparse subset of video frames that best represents the input video. VSN predicts a probability for each video frame, which indicates the possibility of a frame being selected, and the
11

Mahum, Rabbia, Aun Irtaza, Saeed Ur Rehman, Talha Meraj, and Hafiz Tayyab Rauf. "A Player-Specific Framework for Cricket Highlights Generation Using Deep Convolutional Neural Networks." Electronics 12, no. 1 (2022): 65. http://dx.doi.org/10.3390/electronics12010065.

Abstract:
Automatic video summarization is a key technique for managing the huge volume of video content available nowadays. The aim of video summaries is to provide viewers with important information in less time. Some techniques exist for video summarization in the cricket domain; however, to the best of our knowledge, our proposed model is the first to deal successfully with player-specific summaries in cricket videos. In this study, we provide a novel framework and a valuable technique for cricket video summarization and classification. For a video summary specific to a player, the proposed technique e
12

Wang, Yifan, Hao Wang, Kaijie Wang, and Wei Zhang. "Cloud Gaming Video Coding Optimization Based on Camera Motion-Guided Reference Frame Enhancement." Applied Sciences 12, no. 17 (2022): 8504. http://dx.doi.org/10.3390/app12178504.

Abstract:
Recent years have witnessed tremendous advances in cloud gaming. To alleviate the bandwidth pressure due to transmissions of high-quality cloud gaming videos, this paper optimized existing video codecs with deep learning networks to reduce the bitrate consumption of cloud gaming videos. Specifically, a camera motion-guided network, i.e., CMGNet, was proposed for the reference frame enhancement, leveraging the camera motion information of cloud gaming videos and the reconstructed frames in the reference frame list. The obtained high-quality reference frame was then added to the reference frame list
13

Sun, Fan, and Xuedong Tian. "Lecture Video Automatic Summarization System Based on DBNet and Kalman Filtering." Mathematical Problems in Engineering 2022 (August 31, 2022): 1–10. http://dx.doi.org/10.1155/2022/5303503.

Abstract:
Video summarization for educational scenarios aims to extract and locate the most meaningful frames from the original video based on the main contents of the lecture video. To address the limitation of existing computer vision-based lecture video summarization methods, which tend to target specific scenes, a summarization method based on content detection and tracking is proposed. Firstly, DBNet is introduced to detect content such as text and mathematical formulas in the static frames of these videos, combined with the convolutional block attention module (CBAM) to improve the detection
14

He, Fei, Naiyu Gao, Qiaozhe Li, Senyao Du, Xin Zhao, and Kaiqi Huang. "Temporal Context Enhanced Feature Aggregation for Video Object Detection." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (2020): 10941–48. http://dx.doi.org/10.1609/aaai.v34i07.6727.

Abstract:
Video object detection is a challenging task because of the presence of appearance deterioration in certain video frames. One typical solution is to aggregate neighboring features to enhance per-frame appearance features. However, such a method ignores the temporal relations between the aggregated frames, which is critical for improving video recognition accuracy. To handle the appearance deterioration problem, this paper proposes a temporal context enhanced network (TCENet) to exploit temporal context information by temporal aggregation for video object detection. To handle the displacement o
15

Khaing, Thazin Min, Yee Swe Wit, Yi Aung Yi, and Chan Myae Zin Khin. "Key Frame Extraction in Video Stream using Two Stage Method with Colour and Structure." International Journal of Trend in Scientific Research and Development 3, no. 5 (2019): 2536–38. https://doi.org/10.5281/zenodo.3591697.

Abstract:
Key frame extraction, the summarization of videos for applications such as video object recognition and classification, video retrieval and archiving, and surveillance, is an active research area in computer vision. This paper describes a new criterion for well-representative key frames and, correspondingly, a key frame selection algorithm based on a two-stage method. The two-stage method is used to extract accurate key frames that cover the content of the whole video sequence. Firstly, an alternative sequence is obtained based on the color-characteristic difference between adjacent frames of the original
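A minimal sketch of the first stage, assuming RGB frames as NumPy arrays: frames whose colour histogram departs enough from the last kept frame become candidates, and a structural check (not shown) would then refine them. The bin count and threshold are illustrative, not the paper's values.

```python
# Stage one of a two-stage key frame selector: colour-histogram difference.
import numpy as np

def color_hist(frame_rgb: np.ndarray, bins: int = 8) -> np.ndarray:
    """Joint RGB histogram, normalised to sum to one."""
    h, _ = np.histogramdd(
        frame_rgb.reshape(-1, 3),
        bins=(bins, bins, bins),
        range=((0, 256),) * 3,
    )
    return h.ravel() / h.sum()

def stage_one_candidates(frames, thresh: float = 0.25):
    """Keep frames whose L1 histogram distance to the last kept frame is large."""
    kept = [0]
    last = color_hist(frames[0])
    for i, frame in enumerate(frames[1:], start=1):
        h = color_hist(frame)
        if np.abs(h - last).sum() > thresh:
            kept.append(i)
            last = h
    return kept
```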
16

Suraj, M. G., D. S. Guru, and S. Manjunath. "Recognition of Postal Codes from Fingerspelling Video Sequence." International Journal of Image and Graphics 11, no. 01 (2011): 21–41. http://dx.doi.org/10.1142/s021946781100397x.

Abstract:
In this paper, we present a methodology for recognizing fingerspelling signs in videos. A novel user-specific appearance model is proposed for improved recognition performance over the classical appearance-based model. Fingerspelt postal index number (PIN) code signs in a video are recognized by identifying the signs in the individual frames of a video. Decomposition of a video into frames results in a large number of frames even for a video of short duration. Each frame is processed to get only the hand frame. This results in a series of hand frames corresponding to each video frame
17

Huang, Zhitong, Mohan Zhang, and Jing Liao. "LVCD: Reference-based Lineart Video Colorization with Diffusion Models." ACM Transactions on Graphics 43, no. 6 (2024): 1–11. http://dx.doi.org/10.1145/3687910.

Abstract:
We propose the first video diffusion framework for reference-based lineart video colorization. Unlike previous works that rely solely on image generative models to colorize lineart frame by frame, our approach leverages a large-scale pretrained video diffusion model to generate colorized animation videos. This approach leads to more temporally consistent results and is better equipped to handle large motions. Firstly, we introduce Sketch-guided ControlNet which provides additional control to finetune an image-to-video diffusion model for controllable video synthesis, enabling the generation of
18

Kim, Jeongmin, and Yong Ju Jung. "Multi-Stage Network for Event-Based Video Deblurring with Residual Hint Attention." Sensors 23, no. 6 (2023): 2880. http://dx.doi.org/10.3390/s23062880.

Abstract:
Video deblurring aims at removing the motion blur caused by the movement of objects or camera shake. Traditional video deblurring methods have mainly focused on frame-based deblurring, which takes only blurry frames as the input to produce sharp frames. However, frame-based deblurring has shown poor picture quality in challenging cases of video restoration where severely blurred frames are provided as the input. To overcome this issue, recent studies have begun to explore the event-based approach, which uses the event sequence captured by an event camera for motion deblurring. Event cameras ha
19

Li, Xinjie, and Huijuan Xu. "MEID: Mixture-of-Experts with Internal Distillation for Long-Tailed Video Recognition." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 2 (2023): 1451–59. http://dx.doi.org/10.1609/aaai.v37i2.25230.

Abstract:
The long-tailed video recognition problem is especially challenging, as videos tend to be long and untrimmed, and each video may contain multiple classes, causing frame-level class imbalance. The previous method tackles the long-tailed video recognition only through frame-level sampling for class re-balance without distinguishing the frame-level feature representation between head and tail classes. To improve the frame-level feature representation of tail classes, we modulate the frame-level features with an auxiliary distillation loss to reduce the distribution distance between head and tail
20

Jang, Jaeyoung, Yuseok Ban, and Kyungjae Lee. "Dual-Modality Cross-Interaction-Based Hybrid Full-Frame Video Stabilization." Applied Sciences 14, no. 10 (2024): 4290. http://dx.doi.org/10.3390/app14104290.

Abstract:
This study aims to generate visually useful imagery by preventing cropping while maintaining resolution and minimizing the degradation of stability and distortion to enhance the stability of a video for Augmented Reality applications. The focus is placed on conducting research that balances maintaining execution speed with performance improvements. By processing Inertial Measurement Unit (IMU) sensor data using the Versatile Quaternion-based Filter algorithm and optical flow, our research first applies motion compensation to frames of input video. To address cropping, PCA-flow-based video stab
21

Yuan, Ye, Baolei Wu, Zifan Mo, et al. "Temporal-Spatial Redundancy Reduction in Video Sequences: A Motion-Based Entropy-Driven Attention Approach." Biomimetics 10, no. 4 (2025): 192. https://doi.org/10.3390/biomimetics10040192.

Abstract:
The existence of redundant video frames results in a substantial waste of computational resources during video-understanding tasks. Frame sampling is a crucial technique in improving resource utilization. However, existing sampling strategies typically adopt fixed-frame selection, which lacks flexibility in handling different action categories. In this paper, inspired by the neural mechanism of the human visual pathway, we propose an effective and interpretable frame-sampling method called Entropy-Guided Motion Enhancement Sampling (EGMESampler), which can remove redundant spatio-temporal info
22

Pang, Nuo, Songlin Guo, Ming Yan, and Chien Aun Chan. "A Short Video Classification Framework Based on Cross-Modal Fusion." Sensors 23, no. 20 (2023): 8425. http://dx.doi.org/10.3390/s23208425.

Abstract:
The explosive growth of online short videos has brought great challenges to the efficient management of video content classification, retrieval, and recommendation. Video features for video management can be extracted from video image frames by various algorithms, and they have been proven to be effective in the video classification of sensor systems. However, frame-by-frame processing of video image frames not only requires huge computing power, but also classification algorithms based on a single modality of video features cannot meet the accuracy requirements in specific scenarios. In respo
23

Kawin, Bruce. "Video Frame Enlargements." Film Quarterly 61, no. 3 (2008): 52–57. http://dx.doi.org/10.1525/fq.2008.61.3.52.

Abstract:
This essay discusses frame-enlargement technology, comparing digital and photographic alternatives and concluding, after the analysis of specific examples, that frames photographed from a 35mm print are much superior in quality.
24

Al Bdour, Nashat. "Encryption of Dynamic Areas of Images in Video based on Certain Geometric and Color Shapes." WSEAS Transactions on Information Science and Applications 20 (March 29, 2023): 109–18. http://dx.doi.org/10.37394/23209.2023.20.13.

Abstract:
The paper is devoted to the search for new approaches to encrypting selected objects in an image. Videos were analyzed, which were divided into frames, and in each video frame, the necessary objects were detected for further encryption. Images of objects with a designated geometric shape and color characteristics of pixels were considered. To select objects, a method was used based on the calculation of average values, the analysis of which made it possible to determine the convergence with the established image. Dividing the selected field into subregions with different shapes solves the prob
25

Zhou, Yuanding, Baopu Li, Zhihui Wang, and Haojie Li. "Integrating Temporal and Spatial Attention for Video Action Recognition." Security and Communication Networks 2022 (April 26, 2022): 1–8. http://dx.doi.org/10.1155/2022/5094801.

Abstract:
In recent years, deep convolutional neural networks (DCNN) have been widely used in the field of video action recognition. Attention mechanisms are also increasingly utilized in action recognition tasks. In this paper, we want to combine temporal and spatial attention for better video action recognition. Specifically, we learn a set of sparse attention by computing class response maps for finding the most informative region in a video frame. Each video frame is resampled with this information to form two new frames, one focusing on the most discriminative regions of the image and the other on
26

Sinulingga, Hagai R., and Seong G. Kong. "Key-Frame Extraction for Reducing Human Effort in Object Detection Training for Video Surveillance." Electronics 12, no. 13 (2023): 2956. http://dx.doi.org/10.3390/electronics12132956.

Abstract:
This paper presents a supervised learning scheme that employs key-frame extraction to enhance the performance of pre-trained deep learning models for object detection in surveillance videos. Developing supervised deep learning models requires a significant amount of annotated video frames as training data, which demands substantial human effort for preparation. Key frames, which encompass frames containing false negative or false positive objects, can introduce diversity into the training data and contribute to model improvements. Our proposed approach focuses on detecting false negatives by l
27

Guo, Quanmin, Hanlei Wang, and Jianhua Yang. "Night Vision Anti-Halation Method Based on Infrared and Visible Video Fusion." Sensors 22, no. 19 (2022): 7494. http://dx.doi.org/10.3390/s22197494.

Abstract:
In order to address the discontinuity caused by the direct application of the infrared and visible image fusion anti-halation method to a video, an efficient night vision anti-halation method based on video fusion is proposed. The designed frame selection based on inter-frame difference determines the optimal cosine angle threshold by analyzing the relation of cosine angle threshold with nonlinear correlation information entropy and de-frame rate. The proposed time-mark-based adaptive motion compensation constructs the same number of interpolation frames as the redundant frames by taking the r
28

He, Tianyao, Huabin Liu, Yuxi Li, et al. "Collaborative Weakly Supervised Video Correlation Learning for Procedure-Aware Instructional Video Analysis." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 3 (2024): 2112–20. http://dx.doi.org/10.1609/aaai.v38i3.27983.

Abstract:
Video Correlation Learning (VCL), which aims to analyze the relationships between videos, has been widely studied and applied in various general video tasks. However, applying VCL to instructional videos is still quite challenging due to their intrinsic procedural temporal structure. Specifically, procedural knowledge is critical for accurate correlation analyses on instructional videos. Nevertheless, current procedure-learning methods heavily rely on step-level annotations, which are costly and not scalable. To address this problem, we introduce a weakly supervised framework called Collaborat
29

Li, Dengshan, Rujing Wang, Peng Chen, Chengjun Xie, Qiong Zhou, and Xiufang Jia. "Visual Feature Learning on Video Object and Human Action Detection: A Systematic Review." Micromachines 13, no. 1 (2021): 72. http://dx.doi.org/10.3390/mi13010072.

Abstract:
Video object and human action detection are applied in many fields, such as video surveillance, face recognition, etc. Video object detection includes object classification and object location within the frame. Human action recognition is the detection of human actions. Usually, video detection is more challenging than image detection, since video frames are often more blurry than images. Moreover, video detection often has other difficulties, such as video defocus, motion blur, part occlusion, etc. Nowadays, the video detection technology is able to implement real-time detection, or high-accu
30

Guo, Xiaoping. "Intelligent Sports Video Classification Based on Deep Neural Network (DNN) Algorithm and Transfer Learning." Computational Intelligence and Neuroscience 2021 (November 24, 2021): 1–9. http://dx.doi.org/10.1155/2021/1825273.

Abstract:
Traditional text annotation-based video retrieval is done by manually labeling videos with text, which is inefficient and highly subjective and generally cannot accurately describe the meaning of videos. Traditional content-based video retrieval uses convolutional neural networks to extract the underlying feature information of images to build indexes and achieves similarity retrieval of video feature vectors according to certain similarity measure algorithms. In this paper, by studying the characteristics of sports videos, we propose a histogram-difference method based on transfer learning
31

Mielke, Maja, Peter Aerts, Chris Van Ginneken, Sam Van Wassenbergh, and Falk Mielke. "Progressive tracking: a novel procedure to facilitate manual digitization of videos." Biology Open 9, no. 11 (2020): bio055962. http://dx.doi.org/10.1242/bio.055962.

Abstract:
Digitization of video recordings often requires the laborious procedure of manually clicking points of interest on individual video frames. Here, we present progressive tracking, a procedure that facilitates manual digitization of markerless videos. In contrast to existing software, it allows the user to follow points of interest with a cursor in the progressing video, without the need to click. To compare the performance of progressive tracking with the conventional frame-wise tracking, we quantified speed and accuracy of both methods, testing two different input devices (mouse and st
32

Gill, Harsimranjit Singh, Tarandip Singh, Baldeep Kaur, Gurjot Singh Gaba, Mehedi Masud, and Mohammed Baz. "A Metaheuristic Approach to Secure Multimedia Big Data for IoT-Based Smart City Applications." Wireless Communications and Mobile Computing 2021 (October 4, 2021): 1–10. http://dx.doi.org/10.1155/2021/7147940.

Abstract:
Media streaming falls into the category of Big Data. Regardless of the video duration, an enormous amount of information is encoded in accordance with standardized algorithms of videos. In the transmission of videos, the intended recipient is allowed to receive a copy of the broadcasted video; however, the adversary also has access to it which poses a serious concern to the data confidentiality and availability. In this paper, a cryptographic algorithm, Advanced Encryption Standard, is used to conceal the information from malicious intruders. However, in order to utilize fewer system resources
33

Yang, Yixin, Zhiqang Xiang, and Jianbo Li. "Research on Low Frame Rate Video Compression Algorithm in the Context of New Media." Security and Communication Networks 2021 (September 27, 2021): 1–10. http://dx.doi.org/10.1155/2021/7494750.

Abstract:
When current methods are used to compress low frame rate animation video, there is no frame rate compensation for the video images, so the artifacts generated in the compression process cannot be eliminated, resulting in low definition, poor quality, and low compression efficiency of the compressed video. In the context of new media, a linear function model is introduced to study the low frame rate animation video compression algorithm. In this paper, an adaptive detachable convolutional network is used to estimate the offset of low frame rate video a
34

Wu, Wei Qiang, Lei Wang, Qin Yu Zhang, and Chang Jian Zhang. "The RTP Encapsulation Based on Frame Type Method for AVS Video." Applied Mechanics and Materials 263-266 (December 2012): 1803–8. http://dx.doi.org/10.4028/www.scientific.net/amm.263-266.1803.

Abstract:
According to the characteristics of AVS video data, an RTP encapsulation method based on frame type is proposed. When the video data are encapsulated by the RTP protocol, different types of video data, such as sequence header, sequence end, I frame, P frame, and B frame, are encapsulated with different methods. Under the limit of the maximum transmission unit (MTU), sequence headers, sequence ends, and I frames are encapsulated individually to reduce the packet length and protect the important data, while multiple P frames and B frames are encapsulated into one RTP packet to reduce the quantity of the RT
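The packing rule is easy to sketch; the frame representation and MTU value below are assumptions, and real RTP headers, timestamps, and fragmentation of oversized I frames are omitted.

```python
# Frame-type-aware packing: headers/ends/I frames alone, P/B frames batched.
from dataclasses import dataclass

MTU = 1400  # assumed payload budget in bytes

@dataclass
class Frame:
    kind: str   # "SEQ_HDR", "SEQ_END", "I", "P", or "B"
    data: bytes

def packetize(frames):
    packets, batch = [], b""
    for f in frames:
        if f.kind in ("SEQ_HDR", "SEQ_END", "I"):
            if batch:                      # flush any pending P/B batch
                packets.append(batch)
                batch = b""
            packets.append(f.data)         # important data travels alone
        else:                              # P/B: accumulate up to the MTU
            if batch and len(batch) + len(f.data) > MTU:
                packets.append(batch)
                batch = b""
            batch += f.data
    if batch:
        packets.append(batch)
    return packets
```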
35

Alfian, Alfiansyah Imanda Putra, Rusydi Umar, and Abdul Fadlil. "Penerapan Metode Localization Tampering dan Hashing untuk Deteksi Rekayasa Video Digital" [Application of Localization Tampering and Hashing Methods for the Detection of Digital Video Forgery]. Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) 5, no. 2 (2021): 400–406. http://dx.doi.org/10.29207/resti.v5i2.3015.

Abstract:
The increasingly advanced development of digital video technology makes digital video forgery crimes prone to occur. Changes to digital video alter the information it communicates, and this is easy to exploit in digital crime. One way to solve such digital crime cases is to use the NIST (National Institute of Standards and Technology) method for video forensics. The initial stage is carried out by collecting data and extracting the collected material. A localization tampering and hashing algorithm can then be used to analyze the extraction results, which will detect any di
36

Yao, Ping. "Key Frame Extraction Method of Music and Dance Video Based on Multicore Learning Feature Fusion." Scientific Programming 2022 (January 17, 2022): 1–8. http://dx.doi.org/10.1155/2022/9735392.

Abstract:
The purpose of video key frame extraction is to use as few video frames as possible to represent as much video content as possible, reduce redundant video frames, and reduce the amount of computation, so as to facilitate quick browsing, content summarization, indexing, and retrieval of videos. In this paper, a method of dance motion recognition and video key frame extraction based on multifeature fusion is designed to learn the complicated and changeable dancer motion recognition. Firstly, multiple features are fused, and then the similarity is measured. Then, the video sequences are clustered
37

Saqib, Shazia, and Syed Kazmi. "Video Summarization for Sign Languages Using the Median of Entropy of Mean Frames Method." Entropy 20, no. 10 (2018): 748. http://dx.doi.org/10.3390/e20100748.

Abstract:
Multimedia information requires large repositories of audio-video data. Retrieval and delivery of video content is a very time-consuming process and is a great challenge for researchers. An efficient approach for faster browsing of large video collections and more efficient content indexing and access is video summarization. Compression of data through extraction of keyframes is a solution to these challenges. A keyframe is a representative frame of the salient features of the video. The output frames must represent the original video in temporal order. The proposed research presents a method
38

Liu, Yu-Lun, Yi-Tung Liao, Yen-Yu Lin, and Yung-Yu Chuang. "Deep Video Frame Interpolation Using Cyclic Frame Generation." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 8794–802. http://dx.doi.org/10.1609/aaai.v33i01.33018794.

Abstract:
Video frame interpolation algorithms predict intermediate frames to produce videos with higher frame rates and smooth view transitions given two consecutive frames as inputs. We propose that: synthesized frames are more reliable if they can be used to reconstruct the input frames with high quality. Based on this idea, we introduce a new loss term, the cycle consistency loss. The cycle consistency loss can better utilize the training data to not only enhance the interpolation results, but also maintain the performance better with less training data. It can be integrated into any frame interpola
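The loss itself is compact. Below is a hedged PyTorch sketch of the cycle idea, where `interp` is any two-frames-in, middle-frame-out network: two synthesized in-between frames are fed back through the same model, whose output should land on the real middle frame. The pairing scheme is a simplification, not the paper's exact formulation.

```python
# Cycle consistency for frame interpolation (illustrative).
import torch.nn.functional as F

def cycle_consistency_loss(interp, f0, f1, f2):
    mid_a = interp(f0, f1)          # synthetic frame between f0 and f1
    mid_b = interp(f1, f2)          # synthetic frame between f1 and f2
    rebuilt = interp(mid_a, mid_b)  # should reconstruct the real f1
    return F.l1_loss(rebuilt, f1)
```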
39

Lee, Ki-Sun, Eunyoung Lee, Bareun Choi, and Sung-Bom Pyun. "Automatic Pharyngeal Phase Recognition in Untrimmed Videofluoroscopic Swallowing Study Using Transfer Learning with Deep Convolutional Neural Networks." Diagnostics 11, no. 2 (2021): 300. http://dx.doi.org/10.3390/diagnostics11020300.

Abstract:
Background: Video fluoroscopic swallowing study (VFSS) is considered the gold standard diagnostic tool for evaluating dysphagia. However, it is time consuming and labor intensive for the clinician to manually search the long recorded video frame by frame to identify the instantaneous swallowing abnormality in VFSS images. Therefore, this study aims to present a deep learning-based approach using transfer learning with a convolutional neural network (CNN) that automatically annotates pharyngeal phase frames in untrimmed VFSS videos such that frames need not be searched manually. Methods
40

Li, Dengshan, Rujing Wang, Chengjun Xie, et al. "A Recognition Method for Rice Plant Diseases and Pests Video Detection Based on Deep Convolutional Neural Network." Sensors 20, no. 3 (2020): 578. http://dx.doi.org/10.3390/s20030578.

Abstract:
Increasing grain production is essential to those areas where food is scarce. Increasing grain production by controlling crop diseases and pests in time should be effective. To construct a video detection system for plant diseases and pests, and to build a real-time crop diseases and pests video detection system in the future, a deep learning-based video detection architecture with a custom backbone was proposed for detecting plant diseases and pests in videos. We first transformed the video into still frames, then sent each frame to the still-image detector for detection, and finally synthesized
41

Yan, Bo, Chuming Lin, and Weimin Tan. "Frame and Feature-Context Video Super-Resolution." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 5597–604. http://dx.doi.org/10.1609/aaai.v33i01.33015597.

Abstract:
For video super-resolution, current state-of-the-art approaches either process multiple low-resolution (LR) frames to produce each output high-resolution (HR) frame separately in a sliding window fashion or recurrently exploit the previously estimated HR frames to super-resolve the following frame. The main weaknesses of these approaches are: 1) separately generating each output frame may obtain high-quality HR estimates while resulting in unsatisfactory flickering artifacts, and 2) combining previously generated HR frames can produce temporally consistent results in the case of short informat
42

Hosur, Prabhudev, and Rolando Carrasco. "Enhanced Frame-Based Video Coding to Support Content-Based Functionalities." International Journal of Computational Intelligence and Applications 6, no. 2 (2006): 161–75. http://dx.doi.org/10.1142/s1469026806001939.

Abstract:
This paper presents the enhanced frame-based video coding scheme. The input source video to the enhanced frame-based video encoder consists of a rectangular-sized video and shapes of arbitrarily shaped objects on video frames. The rectangular frame texture is encoded by the conventional frame-based coding technique and the video object's shape is encoded using the contour-based vertex coding. It is possible to achieve several useful content-based functionalities by utilizing the shape information in the bitstream at the cost of a very small overhead to the bit-rate.
43

Zhang, Haokai, Dongwei Ren, Zifei Yan, and Wangmeng Zuo. "Arbitrary Timestep Video Frame Interpolation with Time-Dependent Decoding." Mathematics 12, no. 2 (2024): 303. http://dx.doi.org/10.3390/math12020303.

Abstract:
Given an observed low frame rate video, video frame interpolation (VFI) aims to generate a high frame rate video, which has smooth video frames with higher frames per second (FPS). Most existing VFI methods often focus on generating one frame at a specific timestep, e.g., 0.5, between every two frames, thus lacking the flexibility to increase the video’s FPS by an arbitrary scale, e.g., 3. To better address this issue, in this paper, we propose an arbitrary timestep video frame interpolation (ATVFI) network with time-dependent decoding. Generally, the proposed ATVFI is an encoder–decoder archi
44

Chen, Yongjie, and Tieru Wu. "SATVSR: Scenario Adaptive Transformer for Cross Scenarios Video Super-Resolution." Journal of Physics: Conference Series 2456, no. 1 (2023): 012028. http://dx.doi.org/10.1088/1742-6596/2456/1/012028.

Abstract:
Video Super-Resolution (VSR) aims to recover sequences of high-resolution (HR) frames from low-resolution (LR) frames. Previous methods mainly utilize temporally adjacent frames to assist the reconstruction of target frames. However, in the real world, there is a lot of irrelevant information in adjacent frames of videos with fast scene switching, and these VSR methods cannot adaptively distinguish and select useful information. In contrast, with a transformer structure suitable for temporal tasks, we devise a novel adaptive scenario video super-resolution method. Specifically, we use optical flow
45

Qu, Zhong, and Teng Fei Gao. "An Improved Algorithm of Keyframe Extraction for Video Summarization." Advanced Materials Research 225-226 (April 2011): 807–11. http://dx.doi.org/10.4028/www.scientific.net/amr.225-226.807.

Abstract:
Video segmentation and keyframe extraction are the basis of Content-based Video Retrieval (CBVR), in which keyframe selection plays the central role in CBVR. In this paper, as the initialization of keyframe extraction, we proposed an improved approach of key-frame extraction for video summarization. In our approach, videos were firstly segmented into shots according to video content, by our improved histogram-based method, with the use of histogram intersection and nonuniform partitioning and weighting. Then, within each shot, keyframes were determined with the calculation of image entropy as
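As a sketch of the shot-boundary measure, the following scores frame similarity by histogram intersection over image regions with unequal weights; the 2x2 partition and the example weights are assumptions standing in for the paper's nonuniform partitioning and weighting scheme.

```python
# Weighted histogram intersection between two grayscale frames (illustrative).
import numpy as np

def region_hist(region: np.ndarray, bins: int = 32) -> np.ndarray:
    h, _ = np.histogram(region, bins=bins, range=(0, 256))
    return h / max(h.sum(), 1)

def weighted_intersection(a, b, weights=(0.4, 0.2, 0.2, 0.2)) -> float:
    """Higher score = more similar; a score drop suggests a shot boundary."""
    H, W = a.shape
    quads = [
        (slice(0, H // 2), slice(0, W // 2)),
        (slice(0, H // 2), slice(W // 2, W)),
        (slice(H // 2, H), slice(0, W // 2)),
        (slice(H // 2, H), slice(W // 2, W)),
    ]
    return float(sum(
        w * np.minimum(region_hist(a[q]), region_hist(b[q])).sum()
        for w, q in zip(weights, quads)
    ))
```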
46

Rezvan, Hassan, Elham Rezagholi, and Masood Varshosaz. "Critical Examination of 3D Building Modelling through UAV Frame and Video Imaging." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLVIII-1-2024 (May 10, 2024): 573–78. http://dx.doi.org/10.5194/isprs-archives-xlviii-1-2024-573-2024.

Abstract:
Data capture in UAV photogrammetry is carried out using two main methodologies: frame-based and video-based. Frame-based data gathering is the preferred method in UAV projects because of its inherent reliability in calibration. Nonetheless, circumstances involving moving objects or occlusions inside the measured region may produce unsatisfactory results using this method. In response to these challenges, video-based data collecting appears as a potential option, capable of creating a series of successive images that together alleviate the constraints
47

Putra, Arief Bramanto Wicaksono, Rheo Malani, Bedi Suprapty, Achmad Fanany Onnilita Gaffar, and Roman Voliansky. "Inter-Frame Video Compression based on Adaptive Fuzzy Inference System Compression of Multiple Frame Characteristics." Knowledge Engineering and Data Science 6, no. 1 (2023): 1. http://dx.doi.org/10.17977/um018v6i12023p1-14.

Abstract:
Video compression is used for storage and bandwidth efficiency of video clip information. Video compression involves encoders and decoders and uses intra-frame, inter-frame, and block-based methods. Inter-frame compression compresses neighbouring frame pairs into one compressed frame. This study defines odd and even neighbouring frame pairings. Motion estimation, compensation, and frame difference underpin video compression methods. In this study, an adaptive FIS (Fuzzy Inference System) compresses and decompresses each odd-even frame pair. First, adaptive FIS train