
Journal articles on the topic 'In-frame sequence features'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic 'In-frame sequence features.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Bulbul, Mohammad Farhad, Amin Ullah, Hazrat Ali, and Daijin Kim. "A Deep Sequence Learning Framework for Action Recognition in Small-Scale Depth Video Dataset." Sensors 22, no. 18 (2022): 6841. http://dx.doi.org/10.3390/s22186841.

Abstract:
Depth video sequence-based deep models for recognizing human actions are scarce compared to models based on RGB and skeleton video sequences. This scarcity limits research advancements based on depth data, as training deep models with small-scale data is challenging. In this work, we propose a sequence classification deep model using depth video data for scenarios when the video data are limited. Unlike methods that summarize the content of each frame into a single class, our method can directly classify a depth video, i.e., a sequence of depth frames. Firstly, the proposed system transforms an input depth video into three sequences of multi-view temporal motion frames. Together with the three temporal motion sequences, the input depth frame sequence offers a four-stream representation of the input depth action video. Next, the DenseNet121 architecture is employed along with ImageNet pre-trained weights to extract the discriminating frame-level action features of depth and temporal motion frames. The four sets of frame-level feature vectors extracted from the four streams are fed into four bi-directional long short-term memory (BLSTM) networks. The temporal features are further analyzed through multi-head self-attention (MHSA) to capture multi-view sequence correlations. Finally, the concatenation of their outputs is processed through dense layers to classify the input depth video. The experimental results on two small-scale benchmark depth datasets, MSRAction3D and DHA, demonstrate that the proposed framework is efficacious even with insufficient training samples and superior to existing depth data-based action recognition methods.
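As a rough illustration of the pipeline this abstract describes (per-frame CNN features feeding a bidirectional LSTM, self-attention over time, and a dense classifier), here is a minimal single-stream PyTorch sketch. The backbone call, layer sizes, three-channel input and number of classes are illustrative assumptions, not the authors' configuration.

# Minimal single-stream sketch (not the authors' code): per-frame DenseNet121 features ->
# bidirectional LSTM -> multi-head self-attention -> dense classifier.
import torch
import torch.nn as nn
from torchvision.models import densenet121

class DepthSequenceClassifier(nn.Module):
    def __init__(self, num_classes=20, hidden=256, heads=4):
        super().__init__()
        backbone = densenet121(weights="IMAGENET1K_V1")
        # Frame-level feature extractor: 1024-d vector per frame.
        self.cnn = nn.Sequential(backbone.features, nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.blstm = nn.LSTM(1024, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(embed_dim=2 * hidden, num_heads=heads, batch_first=True)
        self.head = nn.Sequential(nn.Linear(2 * hidden, 128), nn.ReLU(), nn.Linear(128, num_classes))

    def forward(self, frames):                                   # frames: (batch, time, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)    # frame-level features
        temporal, _ = self.blstm(feats)                          # temporal features per frame
        attended, _ = self.attn(temporal, temporal, temporal)    # self-attention over the sequence
        return self.head(attended.mean(dim=1))                   # sequence-level class scores

logits = DepthSequenceClassifier()(torch.randn(2, 8, 3, 112, 112))   # e.g. 2 clips of 8 frames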
2

Hu, Yan. "Reliability Analysis of Multi-Objective Spatio-Temporal Segmentation of Human Motion in Video Sequences." International Journal of Distributed Systems and Technologies 12, no. 1 (2021): 16–29. http://dx.doi.org/10.4018/ijdst.2021010102.

Abstract:
To address the problem that the uneven distribution of edge contours of multi-target human motion images in a video sequence degrades target detection, an algorithm for multi-target spatio-temporal segmentation of human motion based on edge contour feature detection and block fusion is proposed. Firstly, a multi-target spatio-temporal detection model of human motion in the video sequence is constructed: the video frame sequence is extracted, a discrete frame fusion method is used to segment and fuse the moving target images, and the moving targets are matched across the video sequence. Secondly, motion features are segmented in the moving target images and combined with the SURF (speeded-up robust features) algorithm to detect and extract human motion objects in the video sequence. The experimental results show that the gray histogram of the multi-target spatio-temporal segmentation is close to the histogram of the original image, and the detection and recognition ability for human motion targets is improved.
3

Shi, Yunyu, Haisheng Yang, Ming Gong, Xiang Liu, and Yongxiang Xia. "A Fast and Robust Key Frame Extraction Method for Video Copyright Protection." Journal of Electrical and Computer Engineering 2017 (2017): 1–7. http://dx.doi.org/10.1155/2017/1231794.

Abstract:
The paper proposes a key frame extraction method for video copyright protection. The fast and robust method is based on frame difference with low-level features, including color and structure features. A two-stage method is used to extract accurate key frames that cover the content of the whole video sequence. Firstly, an alternative sequence is obtained based on the color characteristic difference between adjacent frames of the original sequence. Secondly, by analyzing the structural characteristic difference between adjacent frames of the alternative sequence, the final key frame sequence is obtained. Then, an optimization step is added based on the number of final key frames in order to ensure the effectiveness of key frame extraction. Compared with previous methods, the proposed method has advantages in computational complexity and in robustness across video formats, video resolutions, and so on.
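The two-stage idea described above (a color-difference pass that builds a candidate sequence, then a structural-difference pass that refines it) can be sketched as follows. The histogram settings, the edge-based structural signature and the thresholds are assumptions for illustration, not the paper's exact features.

# Two-stage key-frame selection sketch: color-histogram differences first, edge-based check second.
import cv2
import numpy as np

def color_hist(frame, bins=32):
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    h = cv2.calcHist([hsv], [0, 1], None, [bins, bins], [0, 180, 0, 256])
    return cv2.normalize(h, None).flatten()

def edge_signature(frame):
    return cv2.Canny(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), 100, 200).mean()

def extract_key_frames(frames, color_thr=0.4, edge_thr=5.0):
    # Stage 1: candidate ("alternative") sequence from color differences between adjacent kept frames.
    candidates = [0]
    for i in range(1, len(frames)):
        d = cv2.compareHist(color_hist(frames[candidates[-1]]), color_hist(frames[i]),
                            cv2.HISTCMP_BHATTACHARYYA)
        if d > color_thr:
            candidates.append(i)
    # Stage 2: keep candidates whose structural signature also changes noticeably.
    keys = [candidates[0]]
    for i in candidates[1:]:
        if abs(edge_signature(frames[i]) - edge_signature(frames[keys[-1]])) > edge_thr:
            keys.append(i)
    return keys                     # indices of selected key frames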
4

Du, Ping, and Changhe Tu. "Quadrangulations of Animation Sequence." International Journal of Pattern Recognition and Artificial Intelligence 31, no. 11 (2017): 1754021. http://dx.doi.org/10.1142/s0218001417540210.

Abstract:
This paper presents a novel approach to quadrangulating a sequence of meshes with a global optimization method, generating consistent quad meshes that express global topological and geometric characteristics during animation. The key contribution is a grouping strategy for extracting the geometric and topological features of the sequence in animation quadrangulation. The animation sequence is first divided into groups through picked key frames. The local deformation of each group is then analyzed and the surface features of each reference frame are detected, inducing a set of hard and soft constraints for each key frame of every subgroup; next, the cross field of each reference frame is smoothed and the fields are combined through singularities for the whole sequence; finally, a field-guided global parametrization and quad mesh extraction are computed with an off-the-shelf method. The approach is effective and easy to implement, and experimental results and comparisons show that it achieves better quality.
5

Loeb, D. D., R. W. Padgett, S. C. Hardies, et al. "The sequence of a large L1Md element reveals a tandemly repeated 5' end and several features found in retrotransposons." Molecular and Cellular Biology 6, no. 1 (1986): 168–82. http://dx.doi.org/10.1128/mcb.6.1.168-182.1986.

Abstract:
The complete nucleotide sequence of a 6,851-base pair (bp) member of the L1Md repetitive family from a selected random isolate of the BALB/c mouse genome is reported here. Five kilobases of the element contains two overlapping reading frames of 1,137 and 3,900 bp. The entire 3,900-bp frame and the 3' 600 bp of the 1,137-bp frame, when compared with a composite consensus primate L1 sequence, show a ratio of replacement to silent site differences characteristic of protein coding sequences. This more closely defines the protein coding capacity of this repetitive family, which was previously shown to possess a large open reading frame of undetermined extent. The relative organization of the 1,137- and 3,900-bp reading frames, which overlap by 14 bp, bears resemblance to protein-coding, mobile genetic elements. Homology can be found between the amino acid sequence of the 3,900-bp frame and selected domains of several reverse transcriptases. The 5' ends of the two L1Md elements described in this report have multiple copies, 4 2/3 copies and 1 2/3 copy, of a 208-bp direct tandem repeat. The sequence of this 208-bp element differs from the sequence of a previously defined 5' end for an L1Md element, indicating that there are at least two different 5' end motifs for L1Md.
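As a generic aside on the reading-frame analysis described above, the toy Python scanner below reports open reading frames longer than a minimum length in the three forward frames of a nucleotide sequence. It is purely illustrative and not the authors' analysis pipeline; the length threshold is arbitrary.

# Toy open-reading-frame scanner over the three forward frames of a nucleotide sequence.
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_len=300):
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOPS:
                if pos + 3 - start >= min_len:
                    orfs.append((frame, start, pos + 3))   # (frame, start, end), 0-based
                start = None
    return orfs

print(find_orfs("ATG" + "GCT" * 120 + "TAA", min_len=90))   # one 366-bp ORF in frame 0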
6

Loeb, D. D., R. W. Padgett, S. C. Hardies, et al. "The sequence of a large L1Md element reveals a tandemly repeated 5' end and several features found in retrotransposons." Molecular and Cellular Biology 6, no. 1 (1986): 168–82. http://dx.doi.org/10.1128/mcb.6.1.168.

Abstract:
The complete nucleotide sequence of a 6,851-base pair (bp) member of the L1Md repetitive family from a selected random isolate of the BALB/c mouse genome is reported here. Five kilobases of the element contains two overlapping reading frames of 1,137 and 3,900 bp. The entire 3,900-bp frame and the 3' 600 bp of the 1,137-bp frame, when compared with a composite consensus primate L1 sequence, show a ratio of replacement to silent site differences characteristic of protein coding sequences. This more closely defines the protein coding capacity of this repetitive family, which was previously shown to possess a large open reading frame of undetermined extent. The relative organization of the 1,137- and 3,900-bp reading frames, which overlap by 14 bp, bears resemblance to protein-coding, mobile genetic elements. Homology can be found between the amino acid sequence of the 3,900-bp frame and selected domains of several reverse transcriptases. The 5' ends of the two L1Md elements described in this report have multiple copies, 4 2/3 copies and 1 2/3 copy, of a 208-bp direct tandem repeat. The sequence of this 208-bp element differs from the sequence of a previously defined 5' end for an L1Md element, indicating that there are at least two different 5' end motifs for L1Md.
7

Jiang, Zhuhan. "Object Modelling and Tracking in Videos via Multidimensional Features." ISRN Signal Processing 2011 (February 16, 2011): 1–15. http://dx.doi.org/10.5402/2011/173176.

Abstract:
We propose to model a tracked object in a video sequence by locating a list of object features that are ranked according to their ability to differentiate against the image background. The Bayesian inference is utilised to derive the probabilistic location of the object in the current frame, with the prior being approximated from the previous frame and the posterior achieved via the current pixel distribution of the object. Consideration has also been made to a number of relevant aspects of object tracking including multidimensional features and the mixture of colours, textures, and object motion. The experiment of the proposed method on the video sequences has been conducted and has shown its effectiveness in capturing the target in a moving background and with nonrigid object motion.
8

Wang, Yao, Lihua Cao, Keke Su, Deen Dai, Ning Li, and Di Wu. "Infrared Moving Small Target Detection Based on Space–Time Combination in Complex Scenes." Remote Sensing 15, no. 22 (2023): 5380. http://dx.doi.org/10.3390/rs15225380.

Abstract:
In the infrared small target images with complex backgrounds, there exist various interferences that share similar characteristics with the target (such as building edges). The accurate detection of small targets is crucial in applications involving infrared search and tracking. However, traditional detection methods based on small target feature detection in a single frame image may result in higher error rates due to insufficient features. Therefore, in this paper, we propose an infrared moving object detection method that integrates spatio-temporal information. To address the limitations of single-frame detection, we introduce a temporal sequence of images to suppress false alarms caused by single-frame detection through analyzing motion features within the sequence. Firstly, based on spatial feature detection, we propose a multi-scale layered contrast feature (MLCF) filtering for preliminary target extraction. Secondly, we utilize the spatio-temporal context (STC) as a feature to track the image sequence point by point, obtaining global motion features. Statistical characteristics are calculated to obtain motion vector data that correspond to abnormal motion, enabling the accurate localization of moving targets. Finally, by combining spatial and temporal features, we determine the precise positions of the targets. The effectiveness of our method is evaluated using a real infrared dataset. Through analysis of the experimental results, our approach demonstrates stronger background suppression capabilities and lower false alarm rates compared to other existing methods. Moreover, our detection rate is similar or even superior to these algorithms, providing further evidence of the efficacy of our algorithm.
9

Шевяков, Ю. І., В. В. Ларін, Є. Л. Казаков, and Ахмед Абдалла. "The video processing features research in computer systems and special purpose networks." Системи озброєння і військова техніка, no. 4(64) (December 17, 2020): 126–32. http://dx.doi.org/10.30748/soivt.2020.64.16.

Abstract:
For a typical low-complexity video sequence, the weight of each P-frame in the stream is approximately three times smaller than the I-frame weight. However, taking into account the number of P-frames in a group, they make the main contribution to the total amount of video data. Therefore, the possibility of upgrading coding methods for P-frames is considered, based on preliminary identification of block types with the subsequent formation of block code structures. As the correlation coefficient between adjacent frames increases, the compression ratio of the differentially represented frame's binary mask increases; it varies from 3 to 21 depending on the correlation coefficient between adjacent frames. The most suitable method for constructing a compact representation of the binary masks of differentially represented frames is the approach based on identifying and describing the lengths of one-dimensional binary series. A binary series is a sequence of consecutive binary elements with the same value. In this case, sequences of identical binary elements are replaced by their lengths.
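The run-length idea described above can be sketched in a few lines of Python; the storage layout below (the first bit value plus the run lengths) is an assumption for illustration, not the exact code structure proposed in the article.

# Run-length coding of a binary mask: consecutive identical bits are replaced by their lengths.
import numpy as np

def rle_encode(binary_mask):
    bits = np.asarray(binary_mask, dtype=int).ravel()
    change = np.flatnonzero(np.diff(bits)) + 1                 # positions where the bit value flips
    starts = np.concatenate(([0], change))
    lengths = np.diff(np.concatenate((starts, [bits.size])))
    return int(bits[0]), lengths                               # first bit value + run lengths

def rle_decode(first_bit, lengths):
    values = (first_bit + np.arange(len(lengths))) % 2
    return np.repeat(values, lengths)

mask = np.array([0, 0, 0, 1, 1, 0, 1, 1, 1, 1])
first, runs = rle_encode(mask)                                 # -> 0, [3 2 1 4]
assert np.array_equal(rle_decode(first, runs), mask)
print("compression ratio:", mask.size / runs.size)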
10

Nataraj, Sathees Kumar, M. P. Paulraj, Ahmad Nazri Bin Abdullah, and Sazali Bin Yaacob. "A systematic approach for segmenting voiced/unvoiced signals using fuzzy-logic system and general fusion of neural network models for phonemes-based speech recognition." Journal of Intelligent & Fuzzy Systems 39, no. 5 (2020): 7411–29. http://dx.doi.org/10.3233/jifs-200780.

Abstract:
In this paper, a speech-to-text translation model has been developed for Malaysian speakers based on 41 classes of phonemes. A simple data acquisition algorithm has been used to develop a MATLAB graphical user interface (GUI) for recording isolated-word speech signals from 35 non-native Malaysian speakers. The collected database consists of 86 words with 41 classes of phoneme based on Affricatives, Diphthongs, Fricatives, Liquids, Nasals, Semivowels and Glides, Stops and Vowels. The speech samples are preprocessed to eliminate undesirable artifacts, and a fuzzy voice classifier has been employed to classify the samples into voiced and unvoiced sequences. The voiced sequences are divided into frame segments, and for each frame the linear predictive coefficient (LPC) features are obtained from the voiced sequence. The feature sets are then formed by deriving the LPC features from all the extracted voiced sequences and used for classification. The isolated words chosen based on the phonemes are associated with the extracted features to establish the input-output mapping of the classification system. The data are then normalized and randomized to rearrange the values into a definite range. The Multilayer Neural Network (MLNN) model has been developed with four combinations of input and hidden activation functions. The neural network models are trained with 60%, 70% and 80% of the total data samples. The neural network architecture was aimed at creating a robust model with 60%, 70%, and 80% of the feature set with 25 trials. The trained network model is validated by simulating the network with the remaining 40%, 30%, and 20% of the set. The reliability of the trained network models was compared by measuring true positives, false negatives, and network classification accuracy. The LPC features show better discrimination, and the MLNN models trained using the LPC spectral band features give better recognition.
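A rough sketch of per-frame LPC feature extraction by the autocorrelation method is shown below; the frame length, hop and model order are illustrative assumptions rather than the values used in the paper.

# LPC coefficients per frame via the autocorrelation (Yule-Walker) normal equations.
import numpy as np
from scipy.linalg import solve_toeplitz

def lpc(frame, order=12):
    frame = frame * np.hamming(len(frame))
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]    # autocorrelation r[0], r[1], ...
    return solve_toeplitz((r[:order], r[:order]), r[1:order + 1])   # solve R a = r for predictor a

def frame_features(voiced_signal, frame_len=400, hop=200, order=12):
    feats = []
    for start in range(0, len(voiced_signal) - frame_len + 1, hop):
        feats.append(lpc(voiced_signal[start:start + frame_len], order))
    return np.array(feats)                                          # one LPC vector per frame

features = frame_features(np.random.randn(8000))                    # e.g. a 0.5 s voiced segment at 16 kHz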
11

Thirani, Ekta, Jayshree Jain, and Vaibhav Narawade. "Using SVM and KNN to Evaluate Performance Based on Video Plagiarism Detectors and Descriptors for Global Features." Journal of Soft Computing Paradigm 4, no. 2 (2022): 82–100. http://dx.doi.org/10.36548/jscp.2022.2.004.

Abstract:
The detection of video piracy has improved and emerged as a popular issue in the field of digital video copyright protection because a sequence of videos often comprises a huge amount of data. The major difficulty in achieving efficient and simple video copy detection is to identify compressed and exclusionary video characteristics. To do this, we describe a video copy detection strategy that builds properties in the spatial-temporal domain. The first step is to separate each video sequence into individual video frames and then extract the boundaries of each video frame by using PCA-SIFT and Hessian-Laplace. Next, for each video frame, SVM and KNN are applied to features in the spatial and temporal domains to measure their performance matrices in the feature extraction. Finally, the global features found in the video copy detection are computed uniquely and efficiently. Experiments are arranged on the commonly used VCDB 2014 video dataset to show this result. The proposed approach is based on various copy detection algorithms and compares various features in terms of both accuracy and efficiency.
12

Wang, Meiju, Guoqiang Zhong, Zhaoyang Deng, Kang Zhang, and Peng Jiang. "Recurrent Adversarial Video Prediction Network." Journal of Physics: Conference Series 2278, no. 1 (2022): 012016. http://dx.doi.org/10.1088/1742-6596/2278/1/012016.

Abstract:
Mining the intrinsic information of sequential data to predict the future data has a promising research prospect. Considering the temporal features of sequential data, existing approaches generally adopt recurrent neural network and its variants for the prediction. However, for sequences with complex structure, such as video frame sequence, these approaches cannot guarantee to obtain promising prediction results. In this paper, to address the above issue, we propose a novel architecture, called recurrent adversarial video prediction network (RAVPN), which can not only extract the temporal and spatial features of video sequences, but also optimize the generator and discriminator based on the adversarial strategy. Specifically, we use sliding windows with length t + 1 and set the (t + 1)-th frame as the label of its previous t frames. The generator takes the first t frames as input and tries to generate the (t + 1)-th frame, while the discriminator distinguishes whether a sample is real or fake to boost the performance of the generator. Experimental results show that our novel RAVPN can obtain a promising performance on video prediction tasks compared with other deep sequence prediction models.
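The sliding-window setup described above (windows of length t + 1 with the last frame used as the label for its t predecessors) can be sketched as follows; the array shapes are assumptions for illustration.

# Build (t input frames, (t+1)-th target frame) training pairs from a frame sequence.
import numpy as np

def sliding_pairs(video, t=5):
    inputs, targets = [], []
    for start in range(len(video) - t):
        inputs.append(video[start:start + t])       # frames fed to the generator
        targets.append(video[start + t])            # ground-truth next frame for the discriminator
    return np.stack(inputs), np.stack(targets)

video = np.random.rand(30, 64, 64, 3)               # (num_frames, H, W, C)
x, y = sliding_pairs(video, t=5)                     # x: (25, 5, 64, 64, 3), y: (25, 64, 64, 3)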
13

Xie, Guangda, Yang Li, Yanping Wang, Ziyi Li, and Hongquan Qu. "3D Point Cloud Object Detection Algorithm Based on Temporal Information Fusion and Uncertainty Estimation." Remote Sensing 15, no. 12 (2023): 2986. http://dx.doi.org/10.3390/rs15122986.

Abstract:
In autonomous driving, LiDAR (light detection and ranging) data are acquired over time. Most existing 3D object detection algorithms propose the object bounding box by processing each frame of data independently, which ignores the temporal sequence information. However, the temporal sequence information is usually helpful to detect the object with missing shape information due to long distance or occlusion. To address this problem, we propose a temporal sequence information fusion 3D point cloud object detection algorithm based on the Ada-GRU (adaptive gated recurrent unit). In this method, the feature of each frame for the LiDAR point cloud is extracted through the backbone network and is fed to the Ada-GRU together with the hidden features of the previous frames. Compared to the traditional GRU, the Ada-GRU can adjust the gating mechanism adaptively during the training process by introducing the adaptive activation function. The Ada-GRU outputs the temporal sequence fusion features to predict the 3D object in the current frame and transmits the hidden features of the current frame to the next frame. At the same time, the label uncertainty of the distant and occluded objects affects the training effect of the model. For this problem, this paper proposes a probability distribution model of 3D bounding box coordinates based on the Gaussian distribution function and designs the corresponding bounding box loss function to enable the model to learn and estimate the uncertainty of the positioning of the bounding box coordinates, so as to remove the bounding box with large positioning uncertainty in the post-processing stage to reduce the false positive rate. Finally, the experiments show that the methods proposed in this paper improve the accuracy of the object detection without significantly increasing the complexity of the algorithm.
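One common way to realize Gaussian-distributed box coordinates of the kind described above is to let the network predict a mean and a log-variance per coordinate and train with the Gaussian negative log-likelihood; the sketch below illustrates that idea and is not the paper's exact parameterization.

# Gaussian NLL loss over box coordinates, plus a post-processing filter on predicted uncertainty.
import torch

def gaussian_box_nll(pred_mean, pred_log_var, target):
    # pred_mean, pred_log_var, target: (num_boxes, 7), e.g. (x, y, z, w, l, h, yaw)
    inv_var = torch.exp(-pred_log_var)
    nll = 0.5 * inv_var * (pred_mean - target) ** 2 + 0.5 * pred_log_var
    return nll.mean()

def keep_confident(boxes, pred_log_var, max_sigma=0.5):
    sigma = torch.exp(0.5 * pred_log_var).mean(dim=1)       # average predicted std-dev per box
    return boxes[sigma < max_sigma]                         # drop boxes with large positioning uncertainty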
14

Cho, Du Hyung, M. Naushad Ali, Seok Ju Chun, and Seok Lyong Lee. "Vehicle Association and Tracking in Image Sequences Using Feature-Based Similarity Comparison." Applied Mechanics and Materials 536-537 (April 2014): 176–79. http://dx.doi.org/10.4028/www.scientific.net/amm.536-537.176.

Abstract:
Object association and tracking have attracted great attention in computer vision. In this paper, we present an object association and tracking method for monitoring multiple vehicles on the road based on the objects' visual features and the similarity comparison between them. First, we identify vehicles using the difference operation between the current frame in CCTV image sequences and the referential images that are stored in a database, and then extract various features from the vehicles identified. Finally, we associate the objects in the current frame with those in the next frames using similarity comparison, and track multiple objects over a sequence of CCTV image frames. An empirical study using CCTV images shows that our method achieves considerable effectiveness in tracking vehicles on the road.
15

Luo, Zhipeng, Gongjie Zhang, Changqing Zhou, et al. "Modeling Continuous Motion for 3D Point Cloud Object Tracking." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 5 (2024): 4026–34. http://dx.doi.org/10.1609/aaai.v38i5.28196.

Abstract:
The task of 3D single object tracking (SOT) with LiDAR point clouds is crucial for various applications, such as autonomous driving and robotics. However, existing approaches have primarily relied on appearance matching or motion modeling within only two successive frames, thereby overlooking the long-range continuous motion property of objects in 3D space. To address this issue, this paper presents a novel approach that views each tracklet as a continuous stream: at each timestamp, only the current frame is fed into the network to interact with multi-frame historical features stored in a memory bank, enabling efficient exploitation of sequential information. To achieve effective cross-frame message passing, a hybrid attention mechanism is designed to account for both long-range relation modeling and local geometric feature extraction. Furthermore, to enhance the utilization of multi-frame features for robust tracking, a contrastive sequence enhancement strategy is proposed, which uses ground truth tracklets to augment training sequences and promote discrimination against false positives in a contrastive manner. Extensive experiments demonstrate that the proposed method outperforms the state-of-the-art method by significant margins on multiple benchmarks.
16

Fan, Kun, Chungin Joung, and Seungjun Baek. "Sequence-to-Sequence Video Prediction by Learning Hierarchical Representations." Applied Sciences 10, no. 22 (2020): 8288. http://dx.doi.org/10.3390/app10228288.

Abstract:
Video prediction which maps a sequence of past video frames into realistic future video frames is a challenging task because it is difficult to generate realistic frames and model the coherent relationship between consecutive video frames. In this paper, we propose a hierarchical sequence-to-sequence prediction approach to address this challenge. We present an end-to-end trainable architecture in which the frame generator automatically encodes input frames into different levels of latent Convolutional Neural Network (CNN) features, and then recursively generates future frames conditioned on the estimated hierarchical CNN features and previous prediction. Our design is intended to automatically learn hierarchical representations of video and their temporal dynamics. Convolutional Long Short-Term Memory (ConvLSTM) is used in combination with skip connections so as to separately capture the sequential structures of multiple levels of hierarchy of features. We adopt Scheduled Sampling for training our recurrent network in order to facilitate convergence and to produce high-quality sequence predictions. We evaluate our method on the Bouncing Balls, Moving MNIST, and KTH human action dataset, and report favorable results as compared to existing methods.
17

Artyukhin, S. G., and L. M. Mestetskiy. "DACTYL ALPHABET GESTURE RECOGNITION IN A VIDEO SEQUENCE USING MICROSOFT KINECT." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XL-5/W6 (May 18, 2015): 83–86. http://dx.doi.org/10.5194/isprsarchives-xl-5-w6-83-2015.

Abstract:
This paper presents an efficient framework for solving the problem of static gesture recognition based on data obtained from web cameras and the Kinect depth sensor (RGB-D data). Each gesture is given by a pair of images: a color image and a depth map. The database stores gestures by their feature descriptions, generated per frame for each gesture of the alphabet. The recognition algorithm takes as input a video sequence (a sequence of frames) for labelling, puts each frame of the sequence in correspondence with a gesture from the database, or decides that there is no suitable gesture in the database. First, each frame of the video sequence is classified separately, without inter-frame information. Then, a run of successively labelled frames with the same gesture is grouped into a single static gesture. We propose a method of combined frame segmentation using the depth map and the RGB image. The primary segmentation is based on the depth map; it gives information about the position and yields a rough border of the hands. Then, based on the color image, the border is refined and the shape of the hand is analyzed. The method of the continuous skeleton is used to generate features. We propose a method based on the terminal branches of the skeleton, which makes it possible to determine the position of the fingers and the wrist. The classification features for a gesture are a description of the position of the fingers relative to the wrist. Experiments were carried out with the developed algorithm on the example of American Sign Language. An American Sign Language gesture has several components, including the shape of the hand, its orientation in space and the type of movement. The accuracy of the proposed method is evaluated on a collected gesture base consisting of 2700 frames.
18

Zhang, Nan, Weifeng Liu, and Xingyu Xia. "Video Global Motion Compensation Based on Affine Inverse Transform Model." Sensors 23, no. 18 (2023): 7750. http://dx.doi.org/10.3390/s23187750.

Abstract:
Global motion greatly increases the number of false alarms for object detection in video sequences against dynamic backgrounds. Therefore, before detecting the target in the dynamic background, it is necessary to estimate and compensate the global motion to eliminate the influence of the global motion. In this paper, we use the SURF (speeded up robust features) algorithm combined with the MSAC (M-Estimate Sample Consensus) algorithm to process the video. The global motion of a video sequence is estimated according to the feature point matching pairs of adjacent frames of the video sequence and the global motion parameters of the video sequence under the dynamic background. On this basis, we propose an inverse transformation model of affine transformation, which acts on each adjacent frame of the video sequence in turn. The model compensates the global motion, and outputs a video sequence after global motion compensation from a specific view for object detection. Experimental results show that the algorithm proposed in this paper can accurately perform motion compensation on video sequences containing complex global motion, and the compensated video sequences achieve higher peak signal-to-noise ratio and better visual effects.
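A rough OpenCV sketch of the estimate-and-compensate procedure described above follows. Because SURF is only available in non-free OpenCV builds, ORB is substituted here, and RANSAC stands in for MSAC; these substitutions and the parameter values are assumptions, not the authors' implementation.

# Estimate the inter-frame affine motion from feature matches, then warp the current frame
# with the inverse transform so that the global motion is compensated.
import cv2
import numpy as np

def compensate_global_motion(prev_gray, curr_gray):
    orb = cv2.ORB_create(1000)
    kp1, des1 = orb.detectAndCompute(prev_gray, None)
    kp2, des2 = orb.detectAndCompute(curr_gray, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    src = np.float32([kp1[m.queryIdx].pt for m in matches])
    dst = np.float32([kp2[m.trainIdx].pt for m in matches])
    A, inliers = cv2.estimateAffine2D(src, dst, method=cv2.RANSAC)   # 2x3 affine: prev -> curr
    A_inv = cv2.invertAffineTransform(A)                             # inverse transform: curr -> prev
    h, w = prev_gray.shape
    return cv2.warpAffine(curr_gray, A_inv, (w, h))                  # current frame aligned to previous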
19

Telli, Hichem, Salim Sbaa, Salah Eddine Bekhouche, Fadi Dornaika, Abdelmalik Taleb-Ahmed, and Miguel Bordallo López. "A Novel Multi-Level Pyramid Co-Variance Operators for Estimation of Personality Traits and Job Screening Scores." Traitement du Signal 38, no. 3 (2021): 539–46. http://dx.doi.org/10.18280/ts.380301.

Abstract:
Recently, automatic personality analysis is becoming an interesting topic for computer vision. Many attempts have been proposed to solve this problem using time-based sequence information. In this paper, we present a new framework for estimating the Big-Five personality traits and job candidate screening variable from video sequences. The framework consists of two parts: (1) the use of Pyramid Multi-level (PML) to extract raw facial textures at different scales and levels; (2) the extension of the Covariance Descriptor (COV) to fuse different local texture features of the face image such as Local Binary Patterns (LBP), Local Directional Pattern (LDP), Binarized Statistical Image Features (BSIF), and Local Phase Quantization (LPQ). Therefore, the COV descriptor uses the textures of PML face parts to generate rich low-level face features that are encoded using concatenation of all PML blocks in a feature vector. Finally, the entire video sequence is represented by aggregating these frame vectors and extracting the most relevant features. The exploratory results on the ChaLearn LAP APA2016 dataset compare well with state-of-the-art methods including deep learning-based methods.
20

Huang, Xuefei, Ka-Hou Chan, Wei Ke, and Hao Sheng. "Parallel Dense Video Caption Generation with Multi-Modal Features." Mathematics 11, no. 17 (2023): 3685. http://dx.doi.org/10.3390/math11173685.

Abstract:
The task of dense video captioning is to generate detailed natural-language descriptions for an original video, which requires deep analysis and mining of semantic captions to identify events in the video. Existing methods typically follow a localisation-then-captioning sequence within given frame sequences, resulting in caption generation that is highly dependent on which objects have been detected. This work proposes a parallel-based dense video captioning method that can simultaneously address the mutual constraint between event proposals and captions. Additionally, a deformable Transformer framework is introduced to reduce or free manual threshold of hyperparameters in such methods. An information transfer station is also added as a representation organisation, which receives the hidden features extracted from a frame and implicitly generates multiple event proposals. The proposed method also adopts LSTM (Long short-term memory) with deformable attention as the main layer for caption generation. Experimental results show that the proposed method outperforms other methods in this area to a certain degree on the ActivityNet Caption dataset, providing competitive results.
21

Yu, Hengli, Hao Ding, Zheng Cao, Ningbo Liu, Guoqing Wang, and Zhaoxiang Zhang. "A Floating Small Target Identification Method Based on Doppler Time Series Information." Remote Sensing 16, no. 3 (2024): 505. http://dx.doi.org/10.3390/rs16030505.

Abstract:
Traditional radar detection methods heavily rely on the signal-to-clutter ratio (SCR); a variety of feature-based detection methods have been proposed, providing a new way for radar detection and the recognition of weak targets. Existing feature-based detection methods determine the presence or absence of a target based on whether the feature value is within the judgment region, generally focusing only on the distribution of features and making insufficient use of inter-feature chronological information. This paper uses the autoregressive (AR) model to model and predict the time sequence of radar echoes in the feature domain and takes the chronological information of historical frame features as the prior information to form new features for detection on this basis. A classification method for floating small targets based on the Doppler spectrum centroid sequence is proposed. By using the AR model to fit the Doppler spectrum centroid feature sequence of the target, the model coefficients are regarded as the secondary features for target identification. The measured data show that the correct classification and identification rate of this method for ship targets and floating small targets can reach over 92% by using 50 centroid features.
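The secondary-feature idea described above (fit an AR model to the per-frame Doppler centroid sequence and take its coefficients as features) can be sketched with a simple least-squares fit; the model order and the fitting method are illustrative assumptions.

# Least-squares fit of an AR(order) model; the coefficients become secondary features.
import numpy as np

def ar_coefficients(sequence, order=4):
    x = np.asarray(sequence, dtype=float)
    X = np.column_stack([x[order - k - 1:len(x) - k - 1] for k in range(order)])   # lagged regressors
    y = x[order:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)    # x[n] ~ sum_k coeffs[k] * x[n-1-k]
    return coeffs                                     # secondary features for classification

centroids = np.cumsum(np.random.randn(50))            # stand-in for a centroid sequence over 50 frames
print(ar_coefficients(centroids))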
22

Shen, Wenjie, Fan Ding, Yanping Wang, et al. "Static High Target-Induced False Alarm Suppression in Circular Synthetic Aperture Radar Moving Target Detection Based on Trajectory Features." Remote Sensing 15, no. 12 (2023): 3164. http://dx.doi.org/10.3390/rs15123164.

Abstract:
The new mode of Circular Synthetic Aperture Radar (CSAR) has several advantages, including multi-aspect and long-time observation, and can generate high-frame-rate image sequences to detect moving targets with a single-channel system. Nonetheless, because CSAR is sensitive to 3D structures, static high targets in the scene are observed to display rotational motion within CSAR subaperture image sequences. Such motion can raise false alarms when utilizing image sequence-based moving target detection methods like logarithm background subtraction (LBS). To address this issue, this paper first thoroughly analyzes the difference between the trajectories of moving targets and static high targets in an image sequence. Two new trajectory features, rotation angle and moving distance, are proposed to differentiate them. Based on these features, a new false alarm suppression method is proposed. The method first utilizes LBS to obtain coarse binary detection results comprising both moving and static high targets, then employs morphological filtering to eliminate noise. Next, DBSCAN and target tracking steps are employed to extract the trajectory features of targets and false alarms. Finally, false alarms are suppressed with trajectory-based feature discriminators to output detection results. The W-band CSAR open dataset is used to validate the proposed method's effectiveness.
23

Rakun, Erdefi, and Noer FP Setyono. "Improving Recognition of SIBI Gesture by Combining Skeleton and Hand Shape Features." Jurnal Ilmu Komputer dan Informasi 15, no. 2 (2022): 69–79. http://dx.doi.org/10.21609/jiki.v15i2.1014.

Abstract:
SIBI (Sign System for Indonesian Language) is an official sign language system used in school for hearing impairment students in Indonesia. This work uses the skeleton and hand shape features to classify SIBI gestures. In order to improve the performance of the gesture classification system, we tried to fuse the features in several different ways. The accuracy results achieved by the feature fusion methods are, in descending order of accuracy: 88.016%, when using sequence-feature-vector concatenation, 85.448% when using Conneau feature vector concatenation, 83.723% when using feature-vector concatenation, and 49.618% when using simple feature concatenation. The sequence-feature-vector concatenation techniques yield noticeably better results than those achieved using single features (82.849% with skeleton feature only, 55.530% for the hand shape feature only). The experiment results show that the combined features of the whole gesture sequence can better distinguish one gesture from another in SIBI than the combined features of each gesture frame. In addition to finding the best feature combination technique, this study also found the most suitable Recurrent Neural Network (RNN) model for recognizing SIBI. The models tested are 1-layer, 2-layer LSTM, and GRU. The experimental results show that the 2-layer bidirectional LSTM has the best performance.
24

Iyo, Abiye H., and Cecil W. Forsberg. "Features of the cellodextrinase gene from Fibrobacter succinogenes S85." Canadian Journal of Microbiology 40, no. 7 (1994): 592–96. http://dx.doi.org/10.1139/m94-094.

Abstract:
The nucleotide sequence of a 2.3-kb DNA fragment containing a cellodextrinase gene (cedA) from the ruminal anaerobe Fibrobacter succinogenes S85 was determined. Activity was expressed from this fragment when it was cloned in both orientations in pBluescript KS+ and SK−, indicating a functional F. succinogenes promoter in Escherichia coli. Promoter sequences (TTGAACA and AATAA) were identified upstream of the ATG initiation codon preceded by a putative ribosome binding site. The cedA open reading frame of 1071 base pairs encoded a protein of 357 amino acid residues with a calculated molecular mass of 41.9 kDa, similar to the 40-kDa size of the native protein as determined by gel filtration chromatography. CedA is proposed to belong to family 5 (family A) of the glycosyl hydrolases. The primary structure of the cellodextrinase showed over 40% similarity with endoglucanase 3 from F. succinogenes S85. Short regions of similarity were also demonstrated with endoglucanase C from Clostridium thermocellum, CelA from Ruminococcus flavefaciens, and two exoglucanases from yeast. Key words: Fibrobacter succinogenes, cedA, cellodextrinase, sequence, rumen, gene.
25

Mohamed, Islam, Ibrahim Elhenawy, Ahmed W. Sallam, Andrew Gatt, and Ahmad Salah. "A practical evaluation of correlation filter-based object trackers with new features." PLOS ONE 17, no. 8 (2022): e0273022. http://dx.doi.org/10.1371/journal.pone.0273022.

Abstract:
Visual object tracking is a critical problem in the field of computer vision. Visual object tracking methods can be divided into Correlation Filter (CF) trackers and non-correlation-filter trackers. The main advantage of CF-based trackers is that they have an accepted real-time tracking response. In this article, we focus on CF-based trackers, due to their key role in online applications such as Unmanned Aerial Vehicles (UAVs), through two contributions. In the first contribution, we proposed a set of new video sequences to address two uncovered issues of the existing standard datasets. The first issue is addressed by creating two video sequences of Amoeba movement under the microscope that are difficult for a human being to track; these two proposed video sequences combine background clutter and occlusion features in a unique way, a property we call hard-to-follow-by-human. The second issue is addressed by increasing the difficulty of the existing sequences through larger displacements of the tracked object. Then, we proposed a thorough, practical evaluation of eight top-performing CF-based trackers on the existing sequence features such as out-of-view, background clutter, and fast motion. The evaluation utilized the well-known OTB-2013 dataset as well as the proposed video sequences. The overall assessment of the eight trackers on the standard evaluation metrics, e.g., precision and success rates, revealed that the Large Displacement Estimation of Similarity transformation (LDES) tracker is the best CF-based tracker among the compared trackers. On the contrary, with a deeper analysis, the results on the proposed video sequences show only an average performance of the LDES tracker among the other trackers. The eight trackers failed to capture the moving objects in every frame of the proposed Amoeba movement video sequences, while the same trackers managed to capture the object in almost every frame of the sequences of the standard dataset. These results outline the need to improve CF-based object trackers so that they can process sequences with the proposed feature (i.e., hard-to-follow-by-human).
26

Wasenius, V. M., M. Saraste, and V. P. Lehto. "From the spectrin gene to the assembly of the membrane skeleton." International Journal of Developmental Biology 33, no. 1 (1989): 49–54. https://doi.org/10.1387/ijdb.2485701.

Abstract:
The complete nucleotide sequence coding for the chicken brain alpha-spectrin was determined. It comprises the entire coding frame, 5'- and 3'-untranslated sequences terminating in a poly(A)-tail. The deduced amino acid sequence shows that the alpha-chain contains 22 segments, 20 of which correspond to the typical 106 residue repeat of the human erythrocyte spectrin. Some segments non-homologous to the repeat structure reside in the middle and COOH-terminal regions. Sequence comparisons with other proteins show that these segments evidently harbour some structural and functional features such as: homology to alpha-actinin and dystrophin, two typical EF-hand structures (calcium-binding) and a putative calmodulin-binding site in the COOH-terminus and a sequence homologous to various src-tyrosine kinases and to phospholipase C in the middle of the molecule. Comparison of our sequence with other partial alpha-spectrin sequences shows that alpha-spectrin is well conserved in different species and that the human erythrocyte alpha-spectrin is divergent.
27

Bohush, R. P., and S. V. Ablameyko. "Object detection and tracking in video sequences: formalization, metrics and results." Informatics 18, no. 1 (2021): 43–60. http://dx.doi.org/10.37661/1816-0301-2021-18-1-43-60.

Abstract:
One of the promising areas of development and implementation of artificial intelligence is the automatic detection and tracking of moving objects in video sequence. The paper presents a formalization of the detection and tracking of one and many objects in video. The following metrics are considered: the quality of detection of tracked objects, the accuracy of determining the location of the object in a frame, the trajectory of movement, the accuracy of tracking multiple objects. Based on the considered generalization, an algorithm for tracking people has been developed that uses the tracking through detection method and convolutional neural networks to detect people and form features. Neural network features are included in a composite descriptor that also contains geometric and color features to describe each detected person in the frame. The results of experiments based on the considered criteria are presented, and it is experimentally confirmed that the improvement of the detector operation makes it possible to increase the accuracy of tracking objects. Examples of frames of processed video sequences with visualization of human movement trajectories are presented.
28

Paramanantham, Vinsent, and Dr SureshKumar S. "Multi View Video Summarization Using RNN and SURF Based High Level Moving Object Feature Frames." International Journal of Engineering Research in Computer Science and Engineering 9, no. 5 (2022): 1–14. http://dx.doi.org/10.36647/ijercse/09.05.art001.

Abstract:
Multi-view video summarization is a process that eases storage consumption, facilitates organized storage, and supports other mainline video analytics tasks. This in turn helps to quickly search, browse and retrieve the video data with minimum time and without losing crucial data. In static video summarization, there is less challenge in time and sequence issues when rearranging the video synopsis. The low-level features are easy to compute and retrieve, but high-level features such as event detection, emotion detection, object recognition, face detection, gesture detection and others require comprehension of the video content. This research proposes an approach to overcome the difficulties in handling the high-level features. The distinguishable contents of the videos are identified by object detection and a feature-based area strategy. The major aspect of the proposed solution is to retrieve the attributes of a motion source from a video frame. Wavelet decomposition is achieved by dividing the details of the object that are available in the video frame. The motion frequency scoring method records the time of motions in the video. Using the motion frequency feature of video is a challenge given the continuous change of object shape. Therefore, the object position and corner points are spotted using Speeded Up Robust Features (SURF) feature points. Support vector machine clustering extracts keyframes. The memory-based recurrent neural network (RNN) recognizes the object in the video frame and remembers a long sequence. An RNN is an artificial neural network whose nodes form a temporal relationship. The attention layer in the proposed RNN network extracts the details about the objects in motion. The moving objects identified using the three video clippings are finally summarized using the video summarization algorithm. The simulation was performed using MATLAB R2014b.
29

Li, Jiayue, and Yan Piao. "Video Person Re-Identification with Frame Sampling–Random Erasure and Mutual Information–Temporal Weight Aggregation." Sensors 22, no. 8 (2022): 3047. http://dx.doi.org/10.3390/s22083047.

Abstract:
Partial occlusion and background clutter in camera video surveillance affect the accuracy of video-based person re-identification (re-ID). To address these problems, we propose a person re-ID method based on random erasure of frame sampling and temporal weight aggregation of mutual information of partial and global features. First, for the case in which the target person is interfered or partially occluded, the frame sampling–random erasure (FSE) method is used for data enhancement to effectively alleviate the occlusion problem, improve the generalization ability of the model, and match persons more accurately. Second, to further improve the re-ID accuracy of video-based persons and learn more discriminative feature representations, we use a ResNet-50 network to extract global and partial features and fuse these features to obtain frame-level features. In the time dimension, based on a mutual information–temporal weight aggregation (MI–TWA) module, the partial features are added according to different weights and the global features are added according to equal weights and connected to output sequence features. The proposed method is extensively experimented on three public video datasets, MARS, DukeMTMC-VideoReID, and PRID-2011; the mean average precision (mAP) values are 82.4%, 94.1%, and 95.3% and Rank-1 values are 86.4%, 94.8%, and 95.2%, respectively.
30

Carels, Nicolas, Ramon Vidal, and Diego Frías. "Universal Features for the Classification of Coding and Non-coding DNA Sequences." Bioinformatics and Biology Insights 3 (January 2009): BBI.S2236. http://dx.doi.org/10.4137/bbi.s2236.

Abstract:
In this report, we revisited simple features that allow the classification of coding sequences (CDS) from non-coding DNA. The spectrum of codon usage of our sequence sample is large and suggests that these features are universal. The features that we investigated combine (i) the stop codon distribution, (ii) the product of purine probabilities in the three positions of nucleotide triplets, (iii) the product of Cytosine, Guanine, Adenine probabilities in 1st, 2nd, 3rd position of triplets, respectively, (iv) the product of G and C probabilities in 1st and 2nd position of triplets. These features are a natural consequence of the physico-chemical properties of proteins and their combination is successful in classifying CDS and non-coding DNA (introns) with a success rate >95% above 350 bp. The coding strand and coding frame are implicitly deduced when the sequences are classified as coding.
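A hedged sketch of computing simple triplet features of the kind discussed above for a DNA window is shown below; the feature definitions are simplified stand-ins, not the exact formulation of the paper.

# Simple triplet features for one reading frame of a DNA window: stop-codon frequency and
# products of base probabilities by codon position.
def triplet_features(seq):
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    n = len(codons)
    stop_freq = sum(c in ("TAA", "TAG", "TGA") for c in codons) / n
    def p(base, pos):                                  # probability of `base` at codon position `pos`
        return sum(c[pos] == base for c in codons) / n
    purine = (p("A", 0) + p("G", 0)) * (p("A", 1) + p("G", 1)) * (p("A", 2) + p("G", 2))
    cga = p("C", 0) * p("G", 1) * p("A", 2)
    gc12 = p("G", 0) * p("C", 1)
    return {"stop_freq": stop_freq, "purine": purine, "CGA": cga, "GC12": gc12}

print(triplet_features("ATGGCCGATCGCAAATGA"))          # toy 18-bp window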
31

Renita Kurian. "Video Manipulation Detection using Sequence Learning and Convolution Networks: A Comparative Study." Journal of Information Systems Engineering and Management 10, no. 30s (2025): 712–21. https://doi.org/10.52783/jisem.v10i30s.4894.

Abstract:
Nowadays, the accessible and technologically advanced editing tools, coupled with the surge in photo and video content, pose a great risk to content authenticity. Manipulated content can be used to spread misinformation, cause harassment and infringe human rights. In this article, we compare the effectiveness of two approaches for video manipulation detection using micro and macro information, i.e., a Long Short-Term Memory (LSTM) architecture with frame-level features of videos and their respective ground truths as inputs and a Graph Convolutional Network (GCN) with frame-level video scene graphs concatenated using temporal edges. While the LSTM-based model captures frame-level micro-information, the GCN model captures high-level macro-information inside a video.
32

Fu, Yu, Meng Pan, Xiaoyan Wang, et al. "Complete sequence of a duck astrovirus associated with fatal hepatitis in ducklings." Journal of General Virology 90, no. 5 (2009): 1104–8. http://dx.doi.org/10.1099/vir.0.008599-0.

Abstract:
Duck astroviruses (DAstVs) are known to cause duck viral hepatitis; however, little is known regarding their molecular biology. Here, we report the complete sequence of a DAstV associated with a recent outbreak of fatal hepatitis in ducklings in China. Sequence analyses indicated that the genome of DAstV possessed a typical astrovirus organization and also exhibited two unique features. The polyadenylated genome comprised 7722 nt, which is the largest among astroviruses sequenced to date. The ORF2 of DAstV was not in the same reading frame as either ORF1a or ORF1b, which was distinct from all other astroviruses. Sequence comparisons and phylogenetic analyses revealed that DAstV was more closely related to turkey astrovirus (TAstV) type 2, TAstV-3 and TAstV/MN/01 (a possible new TAstV serotype) than to TAstV-1 or other astroviruses. These findings suggest that astroviruses may transmit across ducks and turkeys.
33

Hardison, Debra M. "Visualizing the acoustic and gestural beats of emphasis in multimodal discourse." Journal of Second Language Pronunciation 4, no. 2 (2018): 232–59. http://dx.doi.org/10.1075/jslp.17006.har.

Abstract:
Perceivers’ attention is entrained to the rhythm of a speaker’s gestural and acoustic beats. When different rhythms (polyrhythms) occur across the visual and auditory modalities of speech simultaneously, attention may be heightened, enhancing memorability of the sequence. In this three-stage study, Stage 1 analyzed videorecordings of native English-speaking instructors, focusing on frame-by-frame analysis of time-aligned annotations from Praat and Anvil (video annotation tool) of polyrhythmic sequences. Stage 2 explored the perceivers’ perspective on the sequences’ discourse role. Stage 3 analyzed 10 international teaching assistants’ gestures, and implemented a multistep technology-assisted program to enhance verbal and nonverbal communication skills. Findings demonstrated (a) a dynamic temporal gesture-speech relationship involving perturbations of beat intervals surrounding pitch-accented vowels, (b) the sequences’ important role as highlighters of information, and (c) improvement of ITA confidence, teaching effectiveness, and ability to communicate important points. Findings support the joint production of gesture and prosodically prominent features.
34

Wei, Tuanjie, Rui Li, Huimin Zhao, et al. "Metric-Based Key Frame Extraction for Gait Recognition." Electronics 11, no. 24 (2022): 4177. http://dx.doi.org/10.3390/electronics11244177.

Abstract:
Gait recognition is one of the most promising biometric technologies that can identify individuals at a long distance. From observation, we find that there are differences in the length of the gait cycle and in the quality of each frame in the sequence. In this paper, we propose a novel gait recognition framework to analyze human gait. On the one hand, we designed the Multi-scale Temporal Aggregation (MTA) module, which models and aggregates temporal contextual information at different scales; on the other hand, we introduce the Metric-based Frame Attention Mechanism (MFAM) to re-weight each frame by an importance score, which is calculated using the distance between frame-level features and sequence-level features. We evaluate our model on two of the most popular public datasets, CASIA-B and OU-MVLP. For normal walking, the rank-1 accuracies on the two datasets are 97.6% and 90.1%, respectively. In complex scenarios, the proposed method achieves accuracies of 94.8% and 84.9% on CASIA-B under bag-carrying and coat-wearing walking conditions. The results show that our method achieves the top level among state-of-the-art methods.
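The metric-based re-weighting described above can be illustrated with a small sketch that scores each frame by its distance to a sequence-level feature and converts the scores into weights; the mean-pooled sequence feature and the softmax weighting are assumptions, not necessarily the paper's exact formulation.

# Re-weight frame-level features by their distance to the sequence-level (mean-pooled) feature.
import torch

def metric_frame_weights(frame_feats):
    # frame_feats: (T, D) frame-level features
    seq_feat = frame_feats.mean(dim=0, keepdim=True)            # sequence-level feature
    dist = torch.cdist(frame_feats, seq_feat).squeeze(1)        # distance of each frame to the sequence
    weights = torch.softmax(-dist, dim=0)                       # closer frames get larger weights
    pooled = (weights.unsqueeze(1) * frame_feats).sum(dim=0)
    return pooled, weights

pooled, w = metric_frame_weights(torch.randn(30, 256))          # 30 frames, 256-d features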
35

Guo, Xiaoming, Fengbao Yang, and Linna Ji. "A Mimic Fusion Algorithm for Dual Channel Video Based on Possibility Distribution Synthesis Theory." Chinese Journal of Information Fusion 1, no. 1 (2024): 33–49. http://dx.doi.org/10.62762/cjif.2024.361886.

Abstract:
Practical fusion of infrared and visible videos often requires collaborative fusion of difference-feature information, yet existing models cannot dynamically adjust the fusion strategy according to the differences between videos, resulting in poor fusion performance. To address this, a mimic fusion algorithm for infrared and visible videos based on the possibility distribution synthesis theory is proposed. Firstly, the various difference features and their attributes in the region of interest of each frame of the dual-channel video sequence are quantitatively described, and the main difference features corresponding to each frame are selected. Secondly, the Pearson correlation coefficient is used to measure the correlation between any two features and obtain the feature correlation matrix. Then, based on the similarity measure, the fusion effectiveness distribution of the variables of each layer for different difference features is constructed, and the difference feature distributions are correlated and synthesized based on the possibility distribution synthesis theory. Finally, the selection of mimic variables is optimized to achieve mimic fusion of infrared and visible videos. The experimental results show that the proposed method achieves significant fusion results in preserving targets and details, and is significantly superior to other single fusion methods in subjective evaluation and objective analysis.
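The correlation step described above (Pearson correlation between any two difference-feature sequences, collected into a matrix) can be sketched as follows; the feature names are hypothetical.

# Pearson correlation matrix between per-frame difference-feature sequences.
import numpy as np

def feature_correlation_matrix(feature_sequences):
    # feature_sequences: dict mapping a feature name to its 1-D array of per-frame values
    names = list(feature_sequences)
    data = np.vstack([feature_sequences[n] for n in names])     # rows = features, columns = frames
    return names, np.corrcoef(data)                             # Pearson correlation matrix

names, R = feature_correlation_matrix({
    "gray_mean_diff": np.random.rand(100),
    "edge_density_diff": np.random.rand(100),
    "contrast_diff": np.random.rand(100),
})
print(names, R.shape)                                           # 3x3 symmetric matrix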
APA, Harvard, Vancouver, ISO, and other styles
36

Chen, Zengqun, Zhiheng Zhou, Junchu Huang, Pengyu Zhang, and Bo Li. "Frame-Guided Region-Aligned Representation for Video Person Re-Identification." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (2020): 10591–98. http://dx.doi.org/10.1609/aaai.v34i07.6632.

Full text
Abstract:
Pedestrians in videos are usually in a moving state, resulting in serious spatial misalignment like scale variations and pose changes, which makes the video-based person re-identification problem more challenging. To address the above issue, in this paper, we propose a Frame-Guided Region-Aligned model (FGRA) for discriminative representation learning in two steps in an end-to-end manner. Firstly, based on a frame-guided feature learning strategy and a non-parametric alignment module, a novel alignment mechanism is proposed to extract well-aligned region features. Secondly, in order to form a sequence representation, an effective feature aggregation strategy that utilizes temporal alignment score and spatial attention is adopted to fuse region features in the temporal and spatial dimensions, respectively. Experiments are conducted on benchmark datasets to demonstrate the effectiveness of the proposed method to solve the misalignment problem and the superiority of the proposed method to the existing video-based person re-identification methods.
APA, Harvard, Vancouver, ISO, and other styles
37

Yan, Bo, Chuming Lin, and Weimin Tan. "Frame and Feature-Context Video Super-Resolution." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 5597–604. http://dx.doi.org/10.1609/aaai.v33i01.33015597.

Full text
Abstract:
For video super-resolution, current state-of-the-art approaches either process multiple low-resolution (LR) frames to produce each output high-resolution (HR) frame separately in a sliding window fashion or recurrently exploit the previously estimated HR frames to super-resolve the following frame. The main weaknesses of these approaches are: 1) separately generating each output frame may obtain high-quality HR estimates while resulting in unsatisfactory flickering artifacts, and 2) combining previously generated HR frames can produce temporally consistent results in the case of short information flow, but it will cause significant jitter and jagged artifacts because the previous super-resolving errors are constantly accumulated to the subsequent frames. In this paper, we propose a fully end-to-end trainable frame and feature-context video super-resolution (FFCVSR) network that consists of two key sub-networks: a local network and a context network, where the first one explicitly utilizes a sequence of consecutive LR frames to generate the local feature and local SR frame, and the other combines the outputs of the local network and the previously estimated HR frames and features to super-resolve the subsequent frame. Our approach takes full advantage of the inter-frame information from multiple LR frames and the context information from previously predicted HR frames, producing temporally consistent high-quality results while maintaining real-time speed by directly reusing previous features and frames. Extensive evaluations and comparisons demonstrate that our approach produces state-of-the-art results on a standard benchmark dataset, with advantages in terms of accuracy, efficiency, and visual quality over the existing approaches.
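A minimal sketch of the local/context split described above is given below, assuming small convolutional stand-ins for both sub-networks and a 4x upscaling factor; it only illustrates how the previously estimated HR frame and features are reused at each step, not the FFCVSR architecture itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalNet(nn.Module):
    """Maps a stack of consecutive LR frames to a local feature map and a
    coarse SR frame (bilinear upscaling keeps the sketch short)."""
    def __init__(self, n_frames=3, feat_ch=32, scale=4):
        super().__init__()
        self.scale = scale
        self.feat = nn.Sequential(
            nn.Conv2d(3 * n_frames, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU())
        self.to_rgb = nn.Conv2d(feat_ch, 3, 3, padding=1)

    def forward(self, lr_stack):                                # (B, 3*n_frames, h, w)
        f = self.feat(lr_stack)
        sr = F.interpolate(self.to_rgb(f), scale_factor=self.scale,
                           mode='bilinear', align_corners=False)
        return f, sr

class ContextNet(nn.Module):
    """Refines the local SR frame with the previous HR estimate and the
    previous feature map, so inter-frame context is reused directly."""
    def __init__(self, feat_ch=32, scale=4):
        super().__init__()
        self.scale = scale
        self.refine = nn.Conv2d(3 + 3 + feat_ch, 3, 3, padding=1)

    def forward(self, sr_local, prev_hr, prev_feat):
        up_feat = F.interpolate(prev_feat, scale_factor=self.scale,
                                mode='bilinear', align_corners=False)
        return sr_local + self.refine(torch.cat([sr_local, prev_hr, up_feat], dim=1))

# One step of the recurrent pipeline on dummy data
local, context = LocalNet(), ContextNet()
lr_stack = torch.randn(1, 9, 32, 32)                            # 3 LR frames of 32x32
prev_hr, prev_feat = torch.randn(1, 3, 128, 128), torch.randn(1, 32, 32, 32)
feat_t, sr_t = local(lr_stack)
hr_t = context(sr_t, prev_hr, prev_feat)                        # reuse previous HR frame + features
```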
APA, Harvard, Vancouver, ISO, and other styles
38

Nibert, Max L. "‘2A-like’ and ‘shifty heptamer’ motifs in penaeid shrimp infectious myonecrosis virus, a monosegmented double-stranded RNA virus." Journal of General Virology 88, no. 4 (2007): 1315–18. http://dx.doi.org/10.1099/vir.0.82681-0.

Full text
Abstract:
Penaeid shrimp infectious myonecrosis virus (IMNV) is a monosegmented double-stranded RNA virus that forms icosahedral virions and is tentatively assigned to the family Totiviridae. New examinations of the IMNV genome sequence revealed features not noted in the original report. These features include (i) two encoded ‘2A-like’ motifs, which are likely involved in open reading frame (ORF) 1 polyprotein ‘cleavage’; (ii) a 199 nt overlap between the end of ORF1 in frame 1 and the start of ORF2 in frame 3; and (iii) a ‘shifty heptamer’ motif and predicted RNA pseudoknot in the region of ORF1–ORF2 overlap, which probably allow ORF2 to be translated as a fusion with ORF1 by −1 ribosomal frameshifting. Features (ii) and (iii) bring the predicted ORF2 coding strategy of IMNV more in line with that of its closest phylogenetic relative, Giardia lamblia virus, as well as with that of several other members of the family Totiviridae including Saccharomyces cerevisiae virus L-A.
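The "shifty heptamer" motif lends itself to a simple sequence scan; the sketch below searches for the canonical X XXY YYZ pattern associated with −1 ribosomal frameshift sites and is a generic illustration rather than the IMNV-specific analysis.

```python
def find_shifty_heptamers(seq: str):
    """Scan a nucleotide sequence for candidate 'shifty heptamer' motifs of
    the canonical X_XXY_YYZ form (three identical bases, then three identical
    bases of a different kind, then one more base). Positions are 0-based."""
    seq = seq.upper().replace("U", "T")
    hits = []
    for i in range(len(seq) - 6):
        x, y = seq[i:i + 3], seq[i + 3:i + 6]
        if len(set(x)) == 1 and len(set(y)) == 1 and x[0] != y[0]:
            hits.append((i, seq[i:i + 7]))
    return hits

# Toy usage: GGGAAAC is a classic slippery heptamer
print(find_shifty_heptamers("ATGCGGGAAACTTAGC"))   # [(4, 'GGGAAAC')]
```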
APA, Harvard, Vancouver, ISO, and other styles
39

Li, Xiaohai, Bineng Zhong, Qihua Liang, Guorong Li, Zhiyi Mo, and Shuxiang Song. "MambaLCT: Boosting Tracking via Long-term Context State Space Model." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 5 (2025): 4986–94. https://doi.org/10.1609/aaai.v39i5.32528.

Full text
Abstract:
Effectively constructing context information with long-term dependencies from video sequences is crucial for object tracking. However, the context length constructed by existing work is limited, only considering object information from adjacent frames or video clips, leading to insufficient utilization of contextual information. To address this issue, we propose MambaLCT, which constructs and utilizes target variation cues from the first frame to the current frame for robust tracking. First, a novel unidirectional Context Mamba module is designed to scan frame features along the temporal dimension, gathering target change cues throughout the entire sequence. Specifically, target-related information in frame features is compressed into a hidden state space through a selective scanning mechanism. The target information across the entire video is continuously aggregated into target variation cues. Next, we inject the target change cues into the attention mechanism, providing temporal information for modeling the relationship between the template and search frames. The advantage of MambaLCT is its ability to continuously extend the length of the context, capturing complete target change cues, which enhances the stability and robustness of the tracker. Extensive experiments show that long-term context information enhances the model's ability to perceive targets in complex scenarios. MambaLCT achieves new SOTA performance on six benchmarks while maintaining real-time running speed.
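As a loose, heavily simplified stand-in for the unidirectional context scan described above, the sketch below accumulates per-frame cues into a gated hidden state along the temporal dimension; the gating scheme and dimensions are illustrative assumptions, not the Mamba-based module itself.

```python
import torch
import torch.nn as nn

class SimpleContextScan(nn.Module):
    """Gated hidden state updated frame by frame, accumulating target-change
    cues over the whole sequence; a drastically simplified stand-in for a
    selective state-space scan. Dimensions are illustrative."""
    def __init__(self, dim=256):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)
        self.update = nn.Linear(dim, dim)

    def forward(self, frame_feats):                    # (T, D), first frame = template
        h = torch.zeros(frame_feats.size(1))
        states = []
        for x in frame_feats:                          # unidirectional scan over time
            g = torch.sigmoid(self.gate(torch.cat([h, x], dim=-1)))
            h = g * h + (1 - g) * torch.tanh(self.update(x))
            states.append(h)
        return torch.stack(states)                     # per-frame context states

ctx = SimpleContextScan()(torch.randn(20, 256))        # 20 frames of 256-d features
```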
APA, Harvard, Vancouver, ISO, and other styles
40

Zhang, Xin, Xiao Tao Wang, Bing Wang, and Yue Hua Gao. "Research on Moving Human Positioning in Complex Background Video." Advanced Materials Research 225-226 (April 2011): 403–6. http://dx.doi.org/10.4028/www.scientific.net/amr.225-226.403.

Full text
Abstract:
Human Skin Color (HSC) features have been widely used in video moving human positioning. However, in complex background video sequences, due to illumination changes or other moving objects that have similar HSC regions, the effect of moving human positioning is not satisfactory. A new method of moving human positioning applied to complex background video sequences is presented in this paper. Firstly, the brightness information of the video sequence images is detected and analyzed based on the HSV color model. Secondly, the multi-frame subtraction method is adopted to extract the moving object regions from the motionless background. Then, the regions with distinctive HSC features are separated from other moving objects using the data fusion model of HSC and brightness information. Finally, the human object is identified among the regions with HSC features according to prior knowledge of the human body. The experimental results show that the method provided in this paper is effective in moving human positioning for complex background video, and has strong adaptability to illumination changes and strong anti-interference ability.
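The combination of multi-frame subtraction and HSV-based skin-colour filtering can be sketched as follows; the thresholds and HSV skin range are illustrative guesses rather than the paper's calibrated values.

```python
import cv2
import numpy as np

def skin_motion_mask(prev_frame, frame, next_frame,
                     motion_thresh=25,
                     skin_low=(0, 40, 60), skin_high=(25, 180, 255)):
    """Combine multi-frame subtraction with an HSV skin-colour range to keep
    only moving, skin-coloured regions of the middle frame (BGR inputs)."""
    gray = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in (prev_frame, frame, next_frame)]
    d1 = cv2.absdiff(gray[1], gray[0])
    d2 = cv2.absdiff(gray[2], gray[1])
    motion = cv2.bitwise_and(
        cv2.threshold(d1, motion_thresh, 255, cv2.THRESH_BINARY)[1],
        cv2.threshold(d2, motion_thresh, 255, cv2.THRESH_BINARY)[1])
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    skin = cv2.inRange(hsv, np.array(skin_low, np.uint8), np.array(skin_high, np.uint8))
    return cv2.bitwise_and(motion, skin)               # moving + skin-coloured pixels
```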
APA, Harvard, Vancouver, ISO, and other styles
41

Searle, S., M. V. McCrossan, and D. F. Smith. "Expression of a mitochondrial stress protein in the protozoan parasite Leishmania major." Journal of Cell Science 104, no. 4 (1993): 1091–100. http://dx.doi.org/10.1242/jcs.104.4.1091.

Full text
Abstract:
The DNA sequence has been determined of a gene from Leishmania major that shares sequence identity with members of the eukaryotic heat shock protein (hsp) 70 gene family. The deduced open reading frame for translation shares a number of features common to hsp70 stress proteins, including conserved amino acids implicated in ATP binding and a putative calmodulin-binding site. In addition, the protein has an N-terminal sequence characteristic of a mitochondrial targeting signal. Specific antibodies to this protein, generated by the use of recombinant fusion peptides, recognise a 65 kDa molecule of pI 6.7. This molecule is constitutively expressed and localises to the mitochondrion in all stages of the parasite life cycle. These features suggest a role for this protein as a molecular chaperone in Leishmania.
APA, Harvard, Vancouver, ISO, and other styles
42

Yu, Chenyang, Xuehu Liu, Yingquan Wang, Pingping Zhang, and Huchuan Lu. "TF-CLIP: Learning Text-Free CLIP for Video-Based Person Re-identification." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 7 (2024): 6764–72. http://dx.doi.org/10.1609/aaai.v38i7.28500.

Full text
Abstract:
Large-scale language-image pre-trained models (e.g., CLIP) have shown superior performances on many cross-modal retrieval tasks. However, the problem of transferring the knowledge learned from such models to video-based person re-identification (ReID) has barely been explored. In addition, there is a lack of decent text descriptions in current ReID benchmarks. To address these issues, in this work, we propose a novel one-stage text-free CLIP-based learning framework named TF-CLIP for video-based person ReID. More specifically, we extract the identity-specific sequence feature as the CLIP-Memory to replace the text feature. Meanwhile, we design a Sequence-Specific Prompt (SSP) module to update the CLIP-Memory online. To capture temporal information, we further propose a Temporal Memory Diffusion (TMD) module, which consists of two key components: Temporal Memory Construction (TMC) and Memory Diffusion (MD). Technically, TMC allows the frame-level memories in a sequence to communicate with each other, and to extract temporal information based on the relations within the sequence. MD further diffuses the temporal memories to each token in the original features to obtain more robust sequence features. Extensive experiments demonstrate that our proposed method shows much better results than other state-of-the-art methods on MARS, LS-VID and iLIDS-VID.
APA, Harvard, Vancouver, ISO, and other styles
43

Yang, Changxuan, Feng Mei, Tuo Zang, Jianfeng Tu, Nan Jiang, and Lingfeng Liu. "Human Action Recognition Using Key-Frame Attention-Based LSTM Networks." Electronics 12, no. 12 (2023): 2622. http://dx.doi.org/10.3390/electronics12122622.

Full text
Abstract:
Human action recognition is a classical problem in computer vision and machine learning, and the task of effectively and efficiently recognising human actions is a concern for researchers. In this paper, we propose a key-frame-based approach to human action recognition. First, we designed a key-frame attention-based LSTM network (KF-LSTM) using the attention mechanism, which can be combined with LSTM to effectively recognise human action sequences by assigning different weight scale values to give more attention to key frames. In addition, we designed a new key-frame extraction method by combining an automatic segmentation model based on the autoregressive moving average (ARMA) algorithm and the K-means clustering algorithm. This method effectively avoids the possibility of inter-frame confusion in the temporal sequence of key frames of different actions and ensures that the subsequent human action recognition task proceeds smoothly. The dataset used in the experiments was acquired with an IMU sensor-based motion capture device, and we separately extracted the motion features of each joint using a manual method and then performed collective inference.
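A minimal sketch of a key-frame attention LSTM of the kind described above is shown below, assuming per-frame joint features as input; the layer sizes and the number of action classes are illustrative.

```python
import torch
import torch.nn as nn

class KeyFrameAttentionLSTM(nn.Module):
    """LSTM over per-frame features followed by a learned attention that
    assigns larger weights to key frames before classification."""
    def __init__(self, in_dim=63, hidden=128, n_classes=10):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        self.att = nn.Linear(hidden, 1)
        self.cls = nn.Linear(hidden, n_classes)

    def forward(self, x):                                   # x: (B, T, in_dim)
        h, _ = self.lstm(x)                                 # (B, T, hidden)
        w = torch.softmax(self.att(h).squeeze(-1), dim=1)   # (B, T) frame weights
        pooled = torch.bmm(w.unsqueeze(1), h).squeeze(1)    # attention-weighted summary
        return self.cls(pooled)

logits = KeyFrameAttentionLSTM()(torch.randn(4, 50, 63))    # 4 clips, 50 frames each
```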
APA, Harvard, Vancouver, ISO, and other styles
44

Tang, Wen, and Linlin Gu. "Harmonic Classification with Enhancing Music Using Deep Learning Techniques." Complexity 2021 (September 29, 2021): 1–10. http://dx.doi.org/10.1155/2021/5590996.

Full text
Abstract:
Automatic extraction of features from the harmonic information of music audio is considered in this paper. Automatically obtaining relevant information is necessary not only for analysis but also for commercial applications such as music tutoring programs and lead-sheet generation. Two aspects of harmony are considered, chords and the global key, and the extraction problem is addressed with machine learning algorithms. The contribution here is to recognize chords in music with a feature extraction method (voiced models) that performs better than the manual one. The modelling works on the chord sequence obtained on a frame-by-frame basis, as is common in chord recognition systems. A machine learning technique such as the convolutional neural network (CNN) systematically extracts the chord sequence to achieve a superior context model. Then, traditional classification is used to create a key classifier that outperforms other or manual approaches. The datasets used to evaluate the proposed model show good results compared with existing ones.
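Frame-by-frame chord sequence extraction with a CNN can be sketched as a small 1D convolutional classifier over chroma frames; the chroma input and the 25-class chord vocabulary are assumptions for the example, not the paper's exact setup.

```python
import torch
import torch.nn as nn

class FrameChordCNN(nn.Module):
    """1D CNN over a chromagram that predicts a chord label for every frame
    (illustrative sizes: 12 chroma bins, 25 chords = 12 maj + 12 min + 'no chord')."""
    def __init__(self, n_chroma=12, n_chords=25):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_chroma, 64, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(64, n_chords, kernel_size=1))

    def forward(self, chroma):                 # (B, 12, T) chroma frames
        return self.net(chroma)                # (B, 25, T) per-frame chord logits

chord_logits = FrameChordCNN()(torch.randn(1, 12, 400))
chord_sequence = chord_logits.argmax(dim=1)    # (1, 400) chord index per frame
```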
APA, Harvard, Vancouver, ISO, and other styles
45

Wang, Yanbin, Zhu-Hong You, Shan Yang, Xiao Li, Tong-Hai Jiang, and Xi Zhou. "A High Efficient Biological Language Model for Predicting Protein–Protein Interactions." Cells 8, no. 2 (2019): 122. http://dx.doi.org/10.3390/cells8020122.

Full text
Abstract:
Many life activities and key functions in organisms are maintained by different types of protein–protein interactions (PPIs). In order to accelerate the discovery of PPIs for different species, many computational methods have been developed. Unfortunately, even though computational methods are constantly evolving, efficient methods for predicting PPIs from protein sequence information have not been found for many years due to limiting factors including both methodology and technology. Inspired by the similarity of biological sequences and languages, developing a biological language processing technology may provide a brand new theoretical perspective and feasible method for the study of biological sequences. In this paper, a pure biological language processing model is proposed for predicting protein–protein interactions only using a protein sequence. The model was constructed based on a feature representation method for biological sequences called bio-to-vector (Bio2Vec) and a convolution neural network (CNN). Bio2Vec obtains protein sequence features by using a “bio-word” segmentation system and a word representation model used for learning the distributed representation for each “bio-word”. Bio2Vec provides a framework that allows researchers to consider the context information and implicit semantic information of a biological sequence. A remarkable improvement in PPIs prediction performance has been observed by using the proposed model compared with state-of-the-art methods. The presentation of this approach marks the start of “bio language processing technology,” which could cause a technological revolution and could be applied to improve the quality of predictions in other problems.
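The "bio-word" idea can be illustrated with a simple overlapping k-mer segmentation followed by an embedding lookup; the 3-mer size, vocabulary construction, and 16-dimensional embedding are assumptions for the sketch and stand in for the full Bio2Vec training procedure.

```python
import torch
import torch.nn as nn

def segment_bio_words(protein: str, k: int = 3):
    """Cut a protein sequence into overlapping k-mer 'bio-words', treating
    the sequence as a sentence of words."""
    return [protein[i:i + k] for i in range(len(protein) - k + 1)]

seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
words = segment_bio_words(seq)                      # ['MKT', 'KTA', 'TAY', ...]

# Toy distributed representation: map each distinct bio-word to a learned vector
vocab = {w: i for i, w in enumerate(sorted(set(words)))}
embed = nn.Embedding(len(vocab), 16)                # 16-d vectors, illustrative size
ids = torch.tensor([vocab[w] for w in words])
word_vectors = embed(ids)                           # (n_words, 16), e.g. input to a CNN
```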
APA, Harvard, Vancouver, ISO, and other styles
46

Ye, Qing, and Zhenghao Liang. "Video behavior recognition algorithm based on two-stream heterogeneous convolutional neural network." Journal of Physics: Conference Series 2258, no. 1 (2022): 012028. http://dx.doi.org/10.1088/1742-6596/2258/1/012028.

Full text
Abstract:
In video behavior recognition, making full use of the spatio-temporal information contained in video frames is the key to further improving recognition accuracy. In our study, we propose a video behavior recognition algorithm based on a two-stream heterogeneous network, so that we can further extract the spatio-temporal feature information in the video frames and take full advantage of the spatio-temporal features in the video sequence. In view of the characteristics of RGB and optical flow images, this paper uses DenseNet121 and Inception-V4 to construct a two-stream network that fully extracts the spatio-temporal feature information in the video. The UCF101 dataset is used to evaluate the algorithm, and the recognition accuracy is 91.7%, which verifies the reliability of the proposed algorithm.
APA, Harvard, Vancouver, ISO, and other styles
47

Mashtalir, Sergii, and Olena Mikhnova. "Key Frame Extraction from Video." International Journal of Computer Vision and Image Processing 4, no. 2 (2014): 68–79. http://dx.doi.org/10.4018/ijcvip.2014040105.

Full text
Abstract:
A complete overview of key frame extraction techniques has been provided. It has been found that such techniques usually have three phases: shot boundary detection as a pre-processing phase; the main phase of key frame detection, where visual, structural, audio and textual features are extracted from each frame and then processed and analyzed with artificial intelligence methods; and a post-processing phase that removes duplicates if they occur in the resulting sequence of key frames. Evaluation techniques and available test video collections have also been surveyed. At the end, conclusions concerning drawbacks of the examined procedures and the basic tendencies of their development have been drawn.
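A toy version of the main key-frame detection phase, using grey-level histogram correlation between frames, might look like the following; the histogram size and threshold are arbitrary choices for the sketch, and real systems add shot boundary detection and duplicate removal around this core step.

```python
import cv2

def extract_key_frames(video_path: str, threshold: float = 0.7):
    """Naive key-frame detector: keep a frame when its grey-level histogram
    correlation with the last key frame drops below a threshold."""
    cap = cv2.VideoCapture(video_path)
    key_frames, last_hist, idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        hist = cv2.calcHist([gray], [0], None, [64], [0, 256])
        cv2.normalize(hist, hist)
        if last_hist is None or cv2.compareHist(last_hist, hist, cv2.HISTCMP_CORREL) < threshold:
            key_frames.append(idx)           # frame differs enough: keep it as a key frame
            last_hist = hist
        idx += 1
    cap.release()
    return key_frames
```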
APA, Harvard, Vancouver, ISO, and other styles
48

Lu, Yujiang, Yaju Liu, Jianwei Fei, and Zhihua Xia. "Channel-Wise Spatiotemporal Aggregation Technology for Face Video Forensics." Security and Communication Networks 2021 (August 27, 2021): 1–13. http://dx.doi.org/10.1155/2021/5524930.

Full text
Abstract:
Recent progress in deep learning, in particular the generative models, makes it easier to synthesize sophisticated forged faces in videos, leading to severe threats on social media about personal privacy and reputation. It is therefore highly necessary to develop forensics approaches to distinguish those forged videos from the authentic ones. Existing works concentrate on exploring frame-level cues but are insufficient in leveraging the rich temporal information. Although some approaches identify forgeries from the perspective of motion inconsistency, there is so far no promising spatiotemporal feature fusion strategy. Towards this end, we propose the Channel-Wise Spatiotemporal Aggregation (CWSA) module to fuse deep features of continuous video frames without any recurrent units. Our approach starts by cropping the face region with some background remained, which transforms the learning objective from manipulations to the difference between pristine and manipulated pixels. A deep convolutional neural network (CNN) with skip connections that are conducive to the preservation of detection-helpful low-level features is then utilized to extract frame-level features. The CWSA module finally makes the real or fake decision by aggregating deep features of the frame sequence. Evaluation against a list of large facial video manipulation benchmarks has illustrated its effectiveness. On all three datasets, FaceForensics++, Celeb-DF, and DeepFake Detection Challenge Preview, the proposed approach outperforms the state-of-the-art methods with significant advantages.
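A loose sketch of aggregating frame-level features along time without recurrent units is given below, using a depthwise (channel-wise) temporal convolution; this is an illustrative stand-in with made-up sizes, not the CWSA module itself.

```python
import torch
import torch.nn as nn

class ChannelWiseTemporalAggregation(nn.Module):
    """Aggregates per-frame CNN features along time with a depthwise
    (channel-wise) temporal convolution instead of a recurrent unit."""
    def __init__(self, channels=512, kernel_size=3):
        super().__init__()
        self.temporal = nn.Conv1d(channels, channels, kernel_size,
                                  padding=kernel_size // 2, groups=channels)
        self.head = nn.Linear(channels, 2)             # real vs. fake

    def forward(self, frame_feats):                    # (B, T, C) frame-level features
        x = self.temporal(frame_feats.transpose(1, 2)) # convolve each channel over time
        x = x.mean(dim=2)                              # (B, C) temporal pooling
        return self.head(x)                            # (B, 2) logits

logits = ChannelWiseTemporalAggregation()(torch.randn(2, 16, 512))   # 2 clips, 16 frames
```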
APA, Harvard, Vancouver, ISO, and other styles
49

Hu, Yuchen, Pengyu Zhang, Beizhen Bi, Yuwei Chen, Qin Xin, and Xiaotao Huang. "Spatiotemporal Localization of GPR Sequence Images With 3DResNet18." Journal of Physics: Conference Series 2887, no. 1 (2024): 012034. http://dx.doi.org/10.1088/1742-6596/2887/1/012034.

Full text
Abstract:
Localizing ground-penetrating radar (GPR) can provide reliable support for current autonomous navigation and localization. Based on the characteristics of ground-penetrating radar images, sequence-based matching and localization can effectively alleviate the problem of a large number of mismatched candidates in single-frame image matching. A convolutional neural network is a deep model that can act directly on the original input; since a 2D convolutional neural network mainly extracts spatial information, we use the 3DResNet18 model to extract features from both the spatial and temporal dimensions by 3D convolution, so as to capture information in multiple neighbouring frames, generate multiple channels of information from the input frames, and finally combine the information from all the channels to obtain a high-level feature for matching. We use the model for localization of ground-penetrating radar sequence images, and it achieves better performance compared to the baseline approach.
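Extracting a sequence-level descriptor with an off-the-shelf 3D ResNet-18 and matching it against reference embeddings can be sketched as follows; the single-channel-to-RGB trick, the clip length, and the cosine-similarity matching are assumptions for the example, not the paper's pipeline.

```python
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18

# 3D-CNN feature extractor for short sequences of GPR image frames.
# GPR B-scans are usually single-channel, so they are repeated to 3 channels here.
model = r3d_18()
model.fc = nn.Identity()                       # keep the 512-d sequence embedding
model.eval()

clip = torch.rand(1, 1, 8, 112, 112)           # (batch, 1 channel, 8 frames, H, W)
clip = clip.repeat(1, 3, 1, 1, 1)              # repeat to pseudo-RGB for the 3-channel stem

with torch.no_grad():
    emb = model(clip)                          # (1, 512) sequence-level descriptor

# Matching: compare against a map of reference embeddings by cosine similarity
refs = torch.randn(100, 512)
scores = torch.cosine_similarity(emb, refs)    # best match = scores.argmax()
```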
APA, Harvard, Vancouver, ISO, and other styles
50

Liu, Zhi, Yunhua Lu, Xiaochuan Zhang, Sen Wang, Shuo Li, and Bo Chen. "Multi-Indices Quantification for Left Ventricle via DenseNet and GRU-Based Encoder-Decoder with Attention." Complexity 2021 (February 20, 2021): 1–9. http://dx.doi.org/10.1155/2021/3260259.

Full text
Abstract:
More and more research on left ventricle quantification skips segmentation due to its requirement of large amounts of pixel-by-pixel labels. In this study, a framework is developed to directly quantify multiple left ventricle indices without the process of segmentation. At first, DenseNet is utilized to extract spatial features for each cardiac frame. Then, in order to take advantage of the time sequence information, the temporal feature for consecutive frames is encoded using a gated recurrent unit (GRU). After that, the attention mechanism is integrated into the decoder to effectively establish the mappings between the input sequence and the corresponding output sequence. Simultaneously, a regression layer with the same decoder output is used to predict multiple indices of the left ventricle. Different weights are set for different types of indices based on experience, and the l2-norm is used to avoid model overfitting. Compared with the state-of-the-art (SOTA), our method can not only produce more competitive results but also be more flexible. This is because the prediction results in our study can be obtained for each frame online, while the SOTA can only output results after all frames are analyzed.
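The GRU-plus-attention regression head can be sketched as below, assuming per-frame spatial features have already been extracted by a CNN backbone such as DenseNet; the feature dimension and the number of indices are illustrative, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class GRUAttentionRegressor(nn.Module):
    """Encodes per-frame spatial features with a GRU and uses a simple
    additive attention over the encoded sequence to regress multiple
    left-ventricle indices per frame (illustrative sizes)."""
    def __init__(self, feat_dim=1024, hidden=256, n_indices=11):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
        self.att = nn.Linear(hidden, 1)
        self.head = nn.Linear(2 * hidden, n_indices)

    def forward(self, frame_feats):                   # (B, T, feat_dim) from a CNN backbone
        h, _ = self.gru(frame_feats)                  # (B, T, hidden)
        w = torch.softmax(self.att(h), dim=1)         # (B, T, 1) attention over frames
        context = (w * h).sum(dim=1, keepdim=True)    # (B, 1, hidden) sequence context
        context = context.expand_as(h)                # give every frame the same context
        return self.head(torch.cat([h, context], dim=-1))  # (B, T, n_indices)

preds = GRUAttentionRegressor()(torch.randn(2, 20, 1024))   # 20 cardiac frames per sequence
```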
APA, Harvard, Vancouver, ISO, and other styles